Thread: Nagios Network Monitor - Installation and configuration

  1. #1
    Administrator Advisor peter's Avatar
    Join Date
    Apr 2004
    Posts
    882

    Nagios Network Monitor - Installation and configuration

    by Wayne E Goodrich (Outlaw)
    (Transferred from the wiki by Peter)

    Introduction

    If you manage a network of any size, you want to be notified of problems before your customers or your bosses find out, but you don't want to be tied to a console checking for the availability of hosts and services. This is where Nagios shines. If you put in the time it takes to install and customize Nagios for your environment, you'll be rewarded with a superb monitoring and notification solution that happens to be free. In this PET, I will guide you through the installation and configuration of Nagios, and I will provide examples of customizations you can add using plugins you can write yourself.
    Gather up our packages

    I will use Red Hat Enterprise Linux AS 4.0 in these examples, but they can be adapted for any Linux distribution. The following packages are required for the HTTPD services that will drive Nagios's web interface:



    Apache


    Code:
    httpd 
    httpd-suexec 
    apr-util
    Optional (for secure sockets layer, HTTPS interface)
    Code:
    mod_ssl
    If you selected the default package set during installation, these are already installed. If you opted not to make Apache available during the Red Hat install, you can grab the packages from RHN using up2date or by manually downloading them.
    The following package provides basic Nagios functionality; really, it's just the Nagios framework. Nagios's checks are accomplished entirely through plugins, which come in a separate package. From here on out, I will suggest getting prebuilt packages from Dag Wieers's collection, and occasionally from CPAN. To make it easier on yourself, add Dag's repositories if you use YUM.



    Nagios

    nagios-2.2-1.el4.rf.i386.rpm http://dag.wieers.com/packages/nagios/
    The following are needed for Nagios to actually perform checks



    Nagios Plugins

    nagios-plugins-1.4.1-1.2.el4.rf.i386.rpm http://dag.wieers.com/packages/nagios-plugins/
    fping-2.4-1.b2.2.el4.rf.i386.rpm http://dag.wieers.com/packages/fping/
    perl-Crypt-DES-2.03-3.2.el4.rf.i386.rpm http://dag.wieers.com/packages/perl-Crypt-DES/
    perl-Net-SNMP-5.0.1-1.2.el4.rf.noarch.rpm http://dag.wieers.com/packages/perl-Net-SNMP/
    perl-IO-Socket-INET6-2.51-1.2.el4.rf.noarch.rpm http://dag.wieers.com/packages/perl-IO-Socket-INET6/
    Digest-HMAC-1.01.tar.gz http://search.cpan.org/~gaas/Digest-HMAC-1.01/lib/Digest/HMAC.pm
    Digest-SHA1-2.11.tar.gz http://search.cpan.org/~gaas/Digest-SHA1-2.11/SHA1.pm
    Socket6-0.19.tar.gz (also from CPAN; it is built from source in the install steps)
    Install Necessary Packages

    We can begin installation of the packages by first installing Nagios:
    Code:
    rpm -ivh nagios-2.2-1.el4.rf.i386.rpm
    Now we begin satisfying nagios-plugins dependencies:
    Code:
    rpm -ivh fping-2.4-1.b2.2.el4.rf.i386.rpm
    rpm -ivh perl-Crypt-DES-2.03-3.2.el4.rf.i386.rpm
    mkdir /tmp/perltmp
    cp *gz /tmp/perltmp
    cd /tmp/perltmp
    find . -name "*gz" -exec tar xvzf {} \;
    cd Digest-SHA1-2.11
    perl Makefile.PL
    make test
    make install
    cd ../Digest-HMAC-1.01
    perl Makefile.PL
    make test
    make install
    cd ../Socket6-0.19
    perl Makefile.PL
    make test
    make install
    The next two Dag perl packages expect SHA1, HMAC and Socket6 to be available as rpms. Since we built those from source instead, we have to tell rpm not to check dependencies for them. With those in place, nagios-plugins itself installs normally.
    Code:
    rpm -ivh --nodeps perl-Net-SNMP-5.0.1-1.2.el4.rf.noarch.rpm
    rpm -ivh --nodeps perl-IO-Socket-INET6-2.51-1.2.el4.rf.noarch.rpm
    rpm -ivh nagios-plugins-1.4.1-1.2.el4.rf.i386.rpm
    Begin Configuration

    Nagios has two methods for arranging its configuration files. One way relies on a single file where you specify hosts, groups, services etc. The other allows you to split these files up by purpose for ease of administration. The single file method can become unwieldy as you add machines and services to monitor. Here, we'll assume the multiple definition file method.
    Configure The Nagios Service

    Let's become familiar with the file locations that the Dag provided packages use as defaults:



    Main Nagios Configs
    Code:
    /etc/nagios
    Plugins and CGIs
    Code:
    /usr/lib/nagios
    Nagios Web Files
    Code:
    /usr/share/nagios
    Here, we see the example config files in /etc/nagios:
    Code:
    [radar@test2 ~]$ ls -lh /etc/nagios
    total 160K
    -rw-rw-r--  1 root root  30K Apr  8 08:28 bigger.cfg
    -rw-rw-r--  1 root root 9.4K Apr  8 08:28 cgi.cfg
    -rw-rw-r--  1 root root 4.8K Apr  8 08:28 checkcommands.cfg
    -rw-r--r--  1 root root  16K Aug  5  2005 command-plugins.cfg
    -rw-rw-r--  1 root root  14K Apr  8 08:28 minimal.cfg
    -rw-rw-r--  1 root root 4.2K Apr  8 08:28 misccommands.cfg
    -rw-rw-r--  1 root root  30K Apr  8 08:28 nagios.cfg
    -rw-rw----  1 root root 1.3K Apr  8 08:28 resource.cfg
    The first file we're interested in is nagios.cfg, the main config file. This file specifies, among other things, the object config (definition) files. Those are what we are most interested in at this point. We want to open /etc/nagios/nagios.cfg in an editor and comment out the line that contains minimal.cfg. Then we'll uncomment the lines containing the object config files that we'll need to create, and populate with our definitions. Let's go ahead and do that, then.

    Code:
    # You can split other types of object definitions across several
    # config files if you wish (as done here), or keep them all in a
    # single config file.
    
    #cfg_file=/etc/nagios/minimal.cfg
    Here, I have commented out minimal.cfg
    Code:
    cfg_file=/etc/nagios/contactgroups.cfg
    cfg_file=/etc/nagios/contacts.cfg
    #cfg_file=/etc/nagios/dependencies.cfg
    #cfg_file=/etc/nagios/escalations.cfg
    cfg_file=/etc/nagios/hostgroups.cfg
    cfg_file=/etc/nagios/hosts.cfg
    cfg_file=/etc/nagios/services.cfg
    cfg_file=/etc/nagios/timeperiods.cfg
    And here I have uncommented the object config files we will work with first, to get basic functionality. We will now create these and populate them with some hosts, services, groups, etc.
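    If you would rather script these edits than make them by hand, here is a minimal sketch. The function names comment_cfg_file and uncomment_cfg_file are my own invention, not part of Nagios, and the sketch assumes the cfg_file lines look exactly like the ones shown above. Try it on a copy of nagios.cfg first.

```shell
# comment_cfg_file FILE NAME
# Comment out the "cfg_file=.../NAME" line in FILE (hypothetical helper).
comment_cfg_file() {
    sed "s|^cfg_file=\(.*/$2\)\$|#cfg_file=\1|" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" && rm -f "$1.tmp"
}

# uncomment_cfg_file FILE NAME
# Remove the leading "#" from the "#cfg_file=.../NAME" line in FILE.
uncomment_cfg_file() {
    sed "s|^#cfg_file=\(.*/$2\)\$|cfg_file=\1|" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" && rm -f "$1.tmp"
}

# Example (run as root, ideally against a copy first):
#   comment_cfg_file   /etc/nagios/nagios.cfg minimal.cfg
#   uncomment_cfg_file /etc/nagios/nagios.cfg hosts.cfg
```

    The result is written back with cat rather than mv so the original file keeps its owner and permissions.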


    While we're at it we want to enable service commands in the CGIs, and enable flap detection:

    Still in nagios.cfg, change:
    Code:
    check_external_commands=0
    to:
    Code:
    check_external_commands=1
    and change:
    Code:
    enable_flap_detection=0
    to:
    Code:
    enable_flap_detection=1
    Open minimal.cfg, copy the timeperiod definition, paste it into a new file called timeperiods.cfg and save it.
    Code:
    define timeperiod{
            timeperiod_name 24x7
            alias           24 Hours A Day, 7 Days A Week
            sunday          00:00-24:00
            monday          00:00-24:00
            tuesday         00:00-24:00
            wednesday       00:00-24:00
            thursday        00:00-24:00
            friday          00:00-24:00
            saturday        00:00-24:00
            }
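    The copy and paste can also be done from the shell. This is only a sketch: extract_defs is a name I made up, and it assumes every definition starts with "define TYPE{" at the beginning of a line and ends with a "}" on a line of its own, as in the sample files.

```shell
# extract_defs TYPE FILE
# Print every "define TYPE{ ... }" block found in FILE (hypothetical helper).
extract_defs() {
    sed -n "/^define[[:space:]][[:space:]]*$1[[:space:]]*{/,/^[[:space:]]*}/p" "$2"
}

# Example:
#   extract_defs timeperiod /etc/nagios/minimal.cfg > /etc/nagios/timeperiods.cfg
```

    Because the pattern requires the "{" right after the type name, asking for contact does not also pull in contactgroup definitions.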
    Do the same for the contact definition (into contacts.cfg) and the contact group definition (into contactgroups.cfg). For hosts, copy the generic-host definition along with the localhost definition and paste them into hosts.cfg.
    Code:
    define host{
            name                            generic-host    ; The name of this host template
            notifications_enabled           1       ; Host notifications are enabled
            event_handler_enabled           1       ; Host event handler is enabled
            flap_detection_enabled          1       ; Flap detection is enabled
            failure_prediction_enabled      1       ; Failure prediction is enabled
            process_perf_data               1       ; Process performance data
            retain_status_information       1       ; Retain status information across program restarts
            retain_nonstatus_information    1       ; Retain non-status information across program restarts
            register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
            }
    
    
    # Since this is a simple configuration file, we only monitor one host - the
    # local host (this machine).
    
    define host{
            use                     generic-host            ; Name of host template to use
            host_name               localhost
            alias                   localhost
            address                 127.0.0.1
            check_command           check-host-alive
            max_check_attempts      10
            notification_interval   120
            notification_period     24x7
            notification_options    d,r
            contact_groups  admins
            }
    
    define host{
            use                     generic-host            ; Name of host template to use
            host_name               testbox
            alias                   Testbox
            address                 192.168.0.4
            check_command           check-host-alive
            max_check_attempts      10
            notification_interval   120
            notification_period     24x7
            notification_options    d,r
            contact_groups  admins
            }
    I have added a networked host to check. Copy the hostgroup definition from minimal.cfg and paste into the new hostgroups.cfg.
    Code:
    define hostgroup{
            hostgroup_name  test
            alias           Test Servers
            members         localhost,testbox
            }


    I added our testbox to this group. We will need to copy the services definitions from minimal.cfg and paste them all into the new services.cfg file. Now we verify our work using nagios:

    Code:
    [radar@test2 nagios]$ sudo nagios -v /etc/nagios/nagios.cfg
    
    Nagios 2.2
    Copyright (c) 1999-2006 Ethan Galstad (http://www.nagios.org)
    Last Modified: 04-07-2006
    License: GPL
    
    Reading configuration data...
    
    Running pre-flight check on configuration data...
    
    Checking services...
            Checked 5 services.
    Checking hosts...
    Warning: Host 'testbox' has no services associated with it!
            Checked 2 hosts.
    Checking host groups...
            Checked 1 host groups.
    Checking service groups...
            Checked 0 service groups.
    Checking contacts...
            Checked 1 contacts.
    Checking contact groups...
            Checked 1 contact groups.
    Checking service escalations...
            Checked 0 service escalations.
    Checking service dependencies...
            Checked 0 service dependencies.
    Checking host escalations...
            Checked 0 host escalations.
    Checking host dependencies...
            Checked 0 host dependencies.
    Checking commands...
            Checked 22 commands.
    Checking time periods...
            Checked 1 time periods.
    Checking extended host info definitions...
            Checked 0 extended host info definitions.
    Checking extended service info definitions...
            Checked 0 extended service info definitions.
    Checking for circular paths between hosts...
    Checking for circular host and service dependencies...
    Checking global event handlers...
    Checking obsessive compulsive processor commands...
    Checking misc settings...
    
    Total Warnings: 1
    Total Errors:   0
    
    Things look okay - No serious problems were detected during the pre-flight check
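    If you want to act on the result of the pre-flight check from a script, the summary line can be parsed. A small sketch, assuming the "Total Errors:" line keeps the format shown above (preflight_errors is a made-up name):

```shell
# preflight_errors
# Read "nagios -v" output on stdin and print the number after "Total Errors:".
# Returns non-zero if no such line was seen (hypothetical helper).
preflight_errors() {
    awk '/^Total Errors:/ { print $3; found = 1 } END { if (!found) exit 1 }'
}

# Example:
#   errors=$(sudo nagios -v /etc/nagios/nagios.cfg | preflight_errors) &&
#       [ "$errors" -eq 0 ] && echo "config looks sane"
```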
    If we had made a mistake, nagios would do its best to hint toward the problem. So all looks good for us to have a basic functioning setup. I will address the warning about no services set up for the testbox in a bit. We will now set up apache for authentication.
    Configure HTTPD authentication and CGI accesses

    Look at /etc/httpd/conf.d/nagios.conf to see how authentication files are set:



    Code:
    AuthName "Nagios Access"
      AuthType Basic
      AuthUserFile /etc/nagios/htpasswd.users
      Require valid-user
    So we need to add nagiosadmin, who's defined as a contact, in htpasswd.users:
    Code:
    sudo /usr/bin/htpasswd -c /etc/nagios/htpasswd.users nagiosadmin
    Make sure this file is readable by the apache user, if not already:
    Code:
    sudo chmod 644 /etc/nagios/htpasswd.users
    Now edit cgi.cfg, uncommenting the lines containing allowed actions for the nagiosadmin user.


    Configure Nagios and Apache Services for Start

    Code:
    [radar@test2 ~]$ sudo /sbin/chkconfig --level 35 httpd on
    [radar@test2 ~]$ sudo /sbin/chkconfig --level 35 nagios on
    Unfortunately, before we proceed, we have to disable SELinux. I know of no policy that allows Nagios to work with an SELinux-enabled Apache. If anyone knows the solution, please see the contact info at the end of this PET and discuss. The easiest way to disable SELinux is to go to Applications, System Settings, Security Level and select the SELinux tab. Uncheck "Enabled (Modification Requires Reboot)", then click OK and reboot.
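    On a box without a desktop, the same setting can be changed from the shell by editing the SELINUX= line in /etc/selinux/config. The helper below is only a sketch (set_selinux_mode is my own name for it), and a reboot is still needed afterwards.

```shell
# set_selinux_mode MODE [FILE]
# Rewrite the SELINUX= line in FILE (default /etc/selinux/config) to MODE.
# The SELINUXTYPE= line is left alone (hypothetical helper).
set_selinux_mode() {
    _cfg="${2:-/etc/selinux/config}"
    sed "s/^SELINUX=.*/SELINUX=$1/" "$_cfg" > "$_cfg.tmp" &&
        cat "$_cfg.tmp" > "$_cfg" && rm -f "$_cfg.tmp"
}

# Example (as root), followed by a reboot:
#   set_selinux_mode disabled
```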


    When the machine is up, we can point the browser to http://machine/nagios (or https://machine/nagios if mod_ssl is installed). We'll see right away in the control panel that there's an issue with the total processes check. Looking in /etc/nagios/services.cfg for check_local_procs, we see the check definition:
    Code:
    check_local_procs!250!400
    So let's look at our checkcommands.cfg file to see how that's defined:
    Code:
    $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
    Right away, we see there's a mismatch. The default service definition supplies only 2 arguments (delimited by the '!'), yet the command definition is looking for 3. Let's see what that -s is for:
    Code:
    cd /usr/lib/nagios/plugins
    ./check_procs -h | less
    The help tells us that the -s is optional:
    Optional Filters:
    -s, --state=STATUSFLAGS
    So we'll remove that from the command definition for now. Change:
    Code:
    define command{
            command_name    check_local_procs
            command_line    $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
            }
    to:
    Code:
    define command{
            command_name    check_local_procs
            command_line    $USER1$/check_procs -w $ARG1$ -c $ARG2$
            }
    We've removed the optional ps status flag.
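    To make the mismatch easier to see, here is a rough emulation of how the '!'-delimited values in check_command are dropped into the $ARGn$ slots of command_line. It is not Nagios code, just an illustration; expand_check_command is a made-up name, and I am assuming that an argument which is not supplied expands to nothing.

```shell
# expand_check_command CHECK_COMMAND COMMAND_LINE
# Split CHECK_COMMAND on "!" and substitute the pieces for $ARG1$, $ARG2$, ...
# in COMMAND_LINE. Unsupplied arguments become empty strings (illustration only).
expand_check_command() {
    printf '%s\n' "$2" | awk -v check="$1" '{
        n = split(check, parts, "!")          # parts[1] is the command name
        for (i = 1; i <= 32; i++) {
            token = "\\$ARG" i "\\$"          # regex matching the literal $ARGi$
            value = (i < n) ? parts[i + 1] : ""
            gsub(token, value)
        }
        print
    }'
}

# Example:
#   expand_check_command 'check_local_procs!250!400' \
#       '$USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$'
```

    With the original three-argument definition, the example ends in a bare -s with nothing after it, which is why the check misbehaved.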
    Restart nagios:
    Code:
    [radar@test2 plugins]$ sudo /sbin/service nagios restart
    Running configuration check...done
    Stopping network monitor: nagios
    Waiting for nagios to exit . done.
    Starting network monitor: nagios
    
    Now all is green! We have basic Nagios functionality and can start adding our customizations.

    Adding Services To Nagios

    Remember that when we verified Nagios's configuration, we got a warning about our testbox host not having any services associated with it. Besides the obvious, this means that Nagios will not do any host alive checks against it. Nagios tries to spread out its checks efficiently and will normally only check a host's alive state when a service is failing. Once we establish a service for testbox, Nagios will count the host as alive whenever that service succeeds. You can set up a service just to ping the box, but we'll set up a custom command using one of the provided plugins.

    Using a Supplied Plugin

    I have started apache on our testbox, and will use the check_http plugin to define a command, and then from that, define a service to run against testbox. We can test the plugin directly so we know what to expect:
    Code:
    /usr/lib/nagios/plugins/check_http -h
    Gives us the usage
    Code:
    [radar@test2 www]$ /usr/lib/nagios/plugins/check_http -H testbox -u /error/noindex.html
    HTTP OK HTTP/1.1 200 OK - 4177 bytes in 0.007 seconds |time=0.006624s;;;0.000000 size=4177B;;;0
    Gives us the default new install page. We can use that to set up a service to test whether apache is up on testbox. Create a new config file in /etc/nagios called custom_cmds.cfg and place the following in it (the -S flag makes check_http connect over SSL, unlike the manual test above; drop it if testbox only serves plain HTTP):
    Code:
    define command{
            command_name    check_apache
            command_line    $USER1$/check_http -H $ARG1$ -S -u $ARG2$
            }
    Now open services.cfg in an editor and define a service to use this command definition:
    Code:
    define service{
           use                             generic-service         ; Name of service template to use
           host_name                       testbox
           service_description             Check Apache
           is_volatile                     0
           check_period                    24x7
           max_check_attempts              4
           normal_check_interval           5
           retry_check_interval            1
           contact_groups                  admins
           notification_options            w,u,c,r
           notification_interval           960
           notification_period             24x7
           check_command                   check_apache!testbox!/error/noindex.html
           }
    We have to tell nagios that this new command file exists by adding its path to nagios.cfg:
    Code:
    cfg_file=/etc/nagios/custom_cmds.cfg
    I added that under the existing command definition entries. Now we can use this file to add custom command definitions. We need to verify that we didn't make any mistakes by running the pre-flight check again:


    Code:
    Total Warnings: 0
    Total Errors:   0
    
    Things look okay - No serious problems were detected during the pre-flight check
    Good. We can restart nagios:
    Code:
    sudo /sbin/service nagios restart
    We see that the new service is there, but it's pending. We can force it by rescheduling the next check and accepting the default time, which is immediate. We now can see that the service is working.
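    Since we enabled check_external_commands earlier, the same reschedule can also be requested by writing a SCHEDULE_FORCED_SVC_CHECK line to the Nagios command file. The sketch below only builds that line; forced_check_line is my own helper name, and the location of the command file varies, so look up the command_file setting in your nagios.cfg rather than trusting any path from me.

```shell
# forced_check_line HOST SERVICE [EPOCH]
# Print an external command asking Nagios to check SERVICE on HOST at EPOCH
# (default: now). Hypothetical helper; it writes nothing by itself.
forced_check_line() {
    _when="${3:-$(date +%s)}"
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' "$_when" "$1" "$2" "$_when"
}

# Example, where CMD_FILE holds the value of command_file from nagios.cfg:
#   forced_check_line testbox "Check Apache" > "$CMD_FILE"
```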


    Pretty easy, but we may also want to write our own plugin and make a service check from that. Let's emulate the functionality of the check_http plugin, for illustration purposes, using available tools and wrap it up in a bash script.

    Create Custom Plugin

    To use this example, curl needs to be installed. It is by default on RHEL.
    Nagios expects plugins to return a code telling what the status of the check is. The following details what the codes are:
    Code:
    0 = OK
    1 = WARNING
    2 = CRITICAL
    3 = UNKNOWN
    The warning and critical exit codes are ideal for setting thresholds, such as CPU usage and load averages. But since our service is either on or off, we can use critical, ok, and unknown (for bad parameters passed).
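    For comparison, this is roughly what the threshold case looks like. It is a sketch and not one of the shipped plugins: check_threshold is an invented name, it only handles whole numbers, and a real plugin would call exit where this function uses return.

```shell
# check_threshold VALUE WARN CRIT
# Print a status line and return the matching Nagios status code
# (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Illustration only.
check_threshold() {
    for _n in "$1" "$2" "$3"; do
        case "$_n" in
            ''|*[!0-9]*)
                echo "UNKNOWN - expected three whole numbers"
                return 3
                ;;
        esac
    done
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL - value $1 is at or above $3"
        return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING - value $1 is at or above $2"
        return 1
    fi
    echo "OK - value $1"
    return 0
}

# Example: check_threshold 300 250 400   (prints a WARNING line, status 1)
```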
    The testweb.sh script below takes arguments and passes them to the curl command. We'll use it to get functionality similar to the check_http plugin.

    Code:
    #!/bin/bash
    #
    # testweb.sh
    #
    # Nagios plugin: fetch the HTTP headers with curl and report on the web server
    #
    
    BADCALL="Wrong combination of parameters: $*"
    
    printuse ()
    {
    cat <<End-of-usage
    
    Usage:   ./testweb.sh -h [hostname] [-H|-S]
             ./testweb.sh -h [hostname] [-H|-S] -p [port]
    
    Example: ./testweb.sh -h www.redhat.com -S
             ./testweb.sh -h 192.168.0.10 -H -p 7778
    
    End-of-usage
    }
    
    # Rudimentary check for proper number and combination of parameters:
    # exactly 3 or 5 of them, -h first, then -H (http) or -S (https)
    
    if { [ "$#" -ne 3 ] && [ "$#" -ne 5 ]; } || [ "$1" != "-h" ] || \
       { [ "$3" != "-S" ] && [ "$3" != "-H" ]; }
    then
        echo "$BADCALL"
        printuse
        exit 3
    elif [ "$#" -eq 5 ] && { [ "$4" != "-p" ] || ! echo "$5" | grep -q '^[0-9][0-9]*$'; }
    then
        # With 5 parameters, the 4th must be -p and the 5th a numeric port
        echo "$BADCALL"
        printuse
        exit 3
    fi
    
    # Set the URL prefix based on parameter 3
    
    if [ "$3" == "-S" ]
    then
        PRE=https://
    else
        PRE=http://
    fi
    
    # Build URL
    
    HOST="$2"
    if [ "$#" -eq 5 ]
    then
        PORT=":$5"
        URL="$PRE$HOST$PORT"
    else
        URL="$PRE$HOST"
    fi
    
    OUT="/tmp/$HOST.header.txt"
    curl -k -s -I -w "%{size_header} bytes in %{time_total} seconds\n\n" "$URL" >"$OUT"
    RC="$?"
    
    case "$RC" in
        "7")
        MSG=`cat "$OUT"`
        rm -f "$OUT"
        echo "CRITICAL - Failed to connect => $MSG"
        exit 2
        ;;
        "0")
        STAT=`grep seconds "$OUT"`
        SRV=`grep Server "$OUT" | awk '{print $2}'`
        echo "OK - $SRV => $STAT"
        rm -f "$OUT"
        exit 0
        ;;
        *)
        # Any other curl failure (DNS, timeout, SSL, ...) must not exit 0
        rm -f "$OUT"
        echo "CRITICAL - curl exited with code $RC"
        exit 2
        ;;
    esac
     
    And we save this in /usr/lib/nagios/plugins as testweb.sh and make it executable:

    Code:
    chmod 755 /usr/lib/nagios/plugins/testweb.sh
    Let's see how to use the plugin:
    Code:
    [radar@test2 nagios]$ /usr/lib/nagios/plugins/testweb.sh -h testbox -S
    OK - Apache/2.0.52 => 199 bytes in 0.354 seconds
    [radar@test2 nagios]$ /usr/lib/nagios/plugins/testweb.sh -h testbox -H
    OK - Apache/2.0.52 => 199 bytes in 0.008 seconds
    SSL seems considerably slower, as can be expected.
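    Nagios acts on the exit status rather than on the text, so it is worth looking at that as well when testing by hand. A small wrapper for doing so, offered as a sketch (plugin_status is a name I made up):

```shell
# plugin_status COMMAND [ARGS...]
# Run a plugin, let its own output through, then print its exit status
# together with the Nagios meaning of that status (hypothetical helper).
plugin_status() {
    _rc=0
    "$@" || _rc=$?
    case "$_rc" in
        0) echo "exit 0 = OK" ;;
        1) echo "exit 1 = WARNING" ;;
        2) echo "exit 2 = CRITICAL" ;;
        3) echo "exit 3 = UNKNOWN" ;;
        *) echo "exit $_rc = not a valid plugin status" ;;
    esac
}

# Example:
#   plugin_status /usr/lib/nagios/plugins/testweb.sh -h testbox -S
```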
    We can use this now to define a new service. Let's edit /etc/nagios/custom_cmds.cfg and add a command.
    Code:
    define command{
                command_name    check_apache_also
                command_line    $USER1$/testweb.sh -h $ARG1$ -S
                }
    Now we edit services.cfg and define the service:
    Code:
    define service{
                 use                             generic-service         ; Name of service template to use
                 host_name                       testbox
                 service_description             Check Apache Also
                 is_volatile                     0
                 check_period                    24x7
                 max_check_attempts              4
                 normal_check_interval           5
                 retry_check_interval            1
                 contact_groups                  admins
                 notification_options            w,u,c,r
                 notification_interval           960
                 notification_period             24x7
                 check_command                   check_apache_also!testbox
                 }


    And we verify our changes with nagios:

    Code:
    [radar@test2 nagios]$ sudo nagios -v /etc/nagios/nagios.cfg
    Total Warnings: 0
    Total Errors:   0
    
    Things look okay - No serious problems were detected during the pre-flight check
    Now restart nagios:
    Code:
    [radar@test2 nagios]$ sudo /sbin/service nagios restart
    The service will show pending, so force its schedule as before. And we see it works!

    Conclusion

    It took a little configuration, but it's quite easy to have a functioning Nagios install, with reliable checks. There is quite a bit more to nagios, all of which you'll want to get working. Things like service groups, notifications, dependencies and escalations will further refine the way Nagios works for you. Nagios is well documented - you can view the help files right from within a working install, or go over to Nagios's project site.

    Links



    Next

    Coming soon: Nagios Remote Process Executor (NRPE) and a custom remote plugin example

  5. #5
    Sir can u please help me how to install nagios in linux and from there how to monitor the windows Desktop and Server machines
