Nagios is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. The monitoring daemon runs intermittent checks on hosts and services you specify using external "plugins" which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser.
It is the plugin mechanism that makes Nagios so powerful. Not everyone is willing or able to write a Nagios clone for themselves, but most Linux/UNIX administrators and developers are able to write a plugin. Another strength of Nagios is that you can write a plugin in any language you like, as long as it can write to stdout and return an exit code. This means that the person writing the plugin can use C/C++, Java, Bash, Perl or any other language of their choice.
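To make the plugin contract concrete, here is a minimal sketch in Bash. It relies only on what Nagios expects from any plugin: one line of status text on stdout and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function name report_file_count, the idea of counting directory entries and the default thresholds are assumptions made up for this illustration, not part of any shipped plugin.

```shell
#!/bin/bash
# Sketch of the Nagios plugin contract: print one status line on stdout
# and hand back one of the four documented state codes.
# report_file_count and its default thresholds are hypothetical.

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# report_file_count DIR [WARN] [CRIT]
# Counts the entries in DIR and compares the count with the thresholds.
report_file_count() {
    local dir="$1" warn="${2:-10}" crit="${3:-20}" count

    if [ ! -d "$dir" ]; then
        echo "UNKNOWN - $dir is not a directory"
        return "$STATE_UNKNOWN"
    fi

    # Count entries, hidden ones included. File names that contain
    # newlines would be counted more than once; good enough for a sketch.
    count=$(ls -A "$dir" | wc -l)
    count=$((count))   # drop the padding some wc implementations add

    if [ "$count" -ge "$crit" ]; then
        echo "CRITICAL - $count entries in $dir"
        return "$STATE_CRITICAL"
    elif [ "$count" -ge "$warn" ]; then
        echo "WARNING - $count entries in $dir"
        return "$STATE_WARNING"
    fi

    echo "OK - $count entries in $dir"
    return "$STATE_OK"
}
```

To turn this into a real plugin file, the script would end by calling report_file_count "$@" followed by exit $?, so that the state code becomes the exit code Nagios reads.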
This is one of the main drivers of the success of Nagios and of its adoption in a wide range of companies. Almost every company that is serious about running Linux/UNIX servers has a Nagios server running, or should think about setting one up. As my new job requires me to monitor a lot of Linux servers besides working on Oracle projects, I will have to learn all the ins and outs of Nagios. This will save me a lot of time and frustration, and in combination with ILO it will save me a lot of trips to the datacenter.
As I will always be an Oracle person, I searched Google for Nagios plugins that can be used to monitor Oracle and found several. A good website I can recommend is NagiosExchange, which offers a large number of plugins, including some for Oracle. There you can find plugins that check whether the database is in archive mode, check the buffer cache, report tablespace usage, execute your own PL/SQL checking scripts, check whether you can write to a database instance, and a lot more.
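As a rough idea of what the reporting half of a tablespace usage check could look like, here is a sketch in Bash. It is not taken from any NagiosExchange plugin. The function name tablespace_status is hypothetical, and the usage percentage is simply passed in as an argument; a real plugin would first have to fetch it from the database (for example with sqlplus), which is left out here. The part after the "|" follows the Nagios performance data format label=value;warn;crit;min;max.

```shell
#!/bin/bash
# Sketch only: tablespace_status is a hypothetical helper, and the usage
# figure is an argument instead of the result of a database query.

# tablespace_status NAME USED_PCT WARN_PCT CRIT_PCT
tablespace_status() {
    local name="$1" used="$2" warn="$3" crit="$4" value state rc

    # Refuse anything that is not a whole number and report UNKNOWN (3)
    for value in "$used" "$warn" "$crit"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "UNKNOWN - usage and thresholds must be whole numbers"
                return 3
                ;;
        esac
    done

    if [ "$used" -ge "$crit" ]; then
        state=CRITICAL; rc=2
    elif [ "$used" -ge "$warn" ]; then
        state=WARNING; rc=1
    else
        state=OK; rc=0
    fi

    # Status text first, performance data after the pipe character
    echo "$state - tablespace $name is ${used}% full | ${name}=${used}%;${warn};${crit};0;100"
    return "$rc"
}
```

Called as tablespace_status USERS 85 80 90, the function reports a WARNING state, because 85 lies between the warning and critical thresholds.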
If we look at the other types of plugins, you will find one for almost every system and piece of network equipment out there, and if not, you will find that it is not that hard to write your own custom plugin. As an example, you can take a look at the plugins in the nagios-plugins-1.4.tar.gz file, which you can download from http://gentoo.osuosl.org/distfiles/.
If you take a look at the "contrib" directory, you will find a lot of files you can use as examples or use out of the box. You can also use Google Code Search to look at the files without downloading the archive. As an example, here is the code used to check whether a process is running on a server:
#!/bin/bash
#
# Check_procr.sh
#
# Program: Process running check plugin for Nagios
# License : GPL
# Copyright (c) 2002 Jerome Tytgat (j.tytgat@sioban.net)
#
# check_procr.sh,v 1.0 2002/09/18 15:28
#
# Description :
#
# This plugin check if at least one process is running
#
# Usage :
#
# check_procr.sh -p process_name
#
# Example :
#
# To know if snort is running
# check_procr.sh -p snort
# > OK - total snort running : PID=23441
#
# Linux Redhat 7.3
#
help_usage() {
    echo "Usage:"
    echo " $0 -p process_name"
    echo " $0 (-v | --version)"
    echo " $0 (-h | --help)"
}
help_version() {
    echo "check_procr.sh (nagios-plugins) 1.0"
    echo "The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute"
    echo "copies of the plugins under the terms of the GNU General Public License."
    echo "For more information about these matters, see the file named COPYING."
    echo "Copyright (c) 2002 Jerome Tytgat - j.tytgat@sioban.net"
    echo "Greetings goes to Websurg which kindly let me took time to develop this"
    echo " Manu Feig and Jacques Kern who were my beta testers, thanks to them !"
}
verify_dep() {
    needed="bash cut egrep expr grep let ps sed sort tail test tr wc"
    for i in $needed
    do
        # Discard stdout and stderr; only the exit status of type matters.
        # (A stray extra /dev/null argument here would make type fail every time.)
        if ! type "$i" > /dev/null 2>&1
        then
            echo "I am missing an important component : $i"
            echo "Cannot continue, sorry, try to find the missing one..."
            exit 3
        fi
    done
}
myself=$0
verify_dep
if [ "$1" = "-h" -o "$1" = "--help" ]
then
    help_version
    echo ""
    echo "This plugin will check if a process is running."
    echo ""
    help_usage
    echo ""
    echo "Required Arguments:"
    echo " -p, --process STRING"
    echo " process name we want to verify"
    echo ""
    exit 3
fi
if [ "$1" = "-v" -o "$1" = "--version" ]
then
    help_version
    exit 3
fi
if [ `echo $@|tr "=" " "|wc -w` -lt 2 ]
then
    echo "Bad arguments number (need two)!"
    help_usage
    exit 3
fi
tt=0
process_name=""
exclude_process_name=""
wt=""
ct=""
# Test of the command lines arguments
while test $# -gt 0
do
    case "$1" in
        -p|--process)
            if [ -n "$process_name" ]
            then
                echo "Only one --process argument is useful..."
                help_usage
                exit 3
            fi
            shift
            # ps -C takes a comma-separated list of names, so keep the value as given
            process_name="$1"
            ;;
        *)
            echo "Unknown argument $1"
            help_usage
            exit 3
            ;;
    esac
    shift
done
# ps line construction set...
for i in `ps ho pid -C "$process_name"`
do
    pid_list="$pid_list $i"
done
if [ -z "$pid_list" ]
then
    crit=1
else
    crit=0
fi
# Finally Inform Nagios of what we found...
if [ $crit -eq 1 ]
then
    echo "CRITICAL - process $process_name is not running !"
    exit 2
else
    echo "OK - process $process_name is running : PID=$pid_list "
    exit 0
fi
# Hey what are we doing here ???
exit 3
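To see what Nagios gets back from a plugin like the one above, you can run it by hand and look at both the output and the exit code. The small wrapper below is a sketch for doing that. The names state_name and run_plugin are my own and are not part of Nagios, and the OUT_OF_RANGE label is just a placeholder for exit codes outside the documented 0 to 3 range.

```shell
#!/bin/bash
# Sketch of a helper for testing plugins by hand.
# state_name and run_plugin are hypothetical names, not Nagios commands.

# state_name CODE: translate a plugin exit code into the Nagios state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT_OF_RANGE" ;;   # not one of the four documented codes
    esac
}

# run_plugin COMMAND [ARGS...]: run a plugin and summarise its result.
run_plugin() {
    local output rc first_line

    output=$("$@" 2>/dev/null)
    rc=$?

    # Keep only the first line of the plugin output for the summary
    first_line=$(printf '%s\n' "$output" | head -n 1)

    echo "state=$(state_name "$rc") exit=$rc output=$first_line"
    return "$rc"
}
```

For example, run_plugin ./check_procr.sh -p snort would print the state name, the raw exit code and the status line in one go, assuming check_procr.sh is in the current directory.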
If you want some more basic insight into how Nagios works, there is a very nice introduction guide written by Mark Duling, which you can find at http://homepage.mac.com/duling/halfdozen/Nagios-Howto-p1.html