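Installing and Configuring Nagios
Monitoring host
1. Pre-installation preparation
(1) Resolve Nagios's dependencies:
The basic Nagios components depend on httpd, gcc, and gd. The following command installs these packages, together with PHP and MySQL, if they are not already present: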
# yum -y install httpd gcc glibc glibc-common gd gd-devel php php-mysql mysql mysql-devel mysql-server
(2) Add the user and groups that Nagios needs to run:
# groupadd nagcmd
# useradd -G nagcmd nagios
# passwd nagios
Add apache to the nagcmd group so that operations performed on Nagios through the web interface have sufficient permissions:
# usermod -a -G nagcmd apache
2. Compile and install Nagios:
# tar zxf nagios-3.3.1.tar.gz
# cd nagios-3.3.1
# ./configure --with-command-group=nagcmd --enable-event-broker
# make all
# make install
# make install-init
# make install-commandmode
# make install-config
Create the Nagios web configuration file in httpd's configuration directory (conf.d):
# make install-webconf
Create a user for the Nagios web interface; this account is used for authentication whenever you log in to Nagios through the web:
# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Once the steps above are complete, restart httpd:
# service httpd restart
3. Compile and install nagios-plugins
Nagios performs all of its monitoring through plugins, so the official plugins must be installed before Nagios is started.
# tar zxf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make
# make install
4. Configure and start Nagios
(1) Register nagios as a system service and enable it at boot:
# chkconfig --add nagios
# chkconfig nagios on
(2) Check the syntax of the main configuration file:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
(3) If the syntax check reports no problems, start the nagios service:
# service nagios start
(4) View Nagios through the web interface:
http://your_nagios_IP/nagios
Monitored host (monitoring a remote Linux host through NRPE)
1. Install and configure the monitored host
1) Add the nagios user first
# useradd -s /sbin/nologin nagios
2) NRPE depends on nagios-plugins, so install that first
# tar zxf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make all
# make install
3) Install NRPE
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nrpe-user=nagios \
  --with-nrpe-group=nagios \
  --with-nagios-user=nagios \
  --with-nagios-group=nagios \
  --enable-command-args \
  --enable-ssl
# make all
# make install-plugin
# make install-daemon
# make install-daemon-config
4) Configure NRPE
# vim /usr/local/nagios/etc/nrpe.cfg
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_address=192.168.210.12
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=192.168.210.11
command_timeout=60
connection_timeout=300
debug=0
5) Start NRPE
# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
To make the NRPE service easier to start, the following can be saved as the script /etc/init.d/nrped:
#!/bin/bash
# chkconfig: 2345 88 12
# description: NRPE DAEMON
NRPE=/usr/local/nagios/bin/nrpe
NRPECONF=/usr/local/nagios/etc/nrpe.cfg
case "$1" in
start)
echo -n "Starting NRPE daemon..."
$NRPE -c $NRPECONF -d
echo " done."
;;
stop)
echo -n "Stopping NRPE daemon..."
pkill -u nagios nrpe
echo " done."
;;
restart)
$0 stop
sleep 2
$0 start
;;
*)
echo "Usage: $0 start|stop|restart"
;;
esac
exit 0
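Whichever way the daemon is started, it is worth confirming that it is listening on server_port before moving on. The following is a minimal sketch, not part of the original setup: port_listening is a hypothetical helper that reads `netstat -tnl` output on standard input and succeeds only if a TCP socket is in the LISTEN state on the given port.

```shell
# Hypothetical helper (not part of NRPE): succeed if the netstat output
# read from stdin shows a TCP socket listening on the given port.
port_listening() {
    awk -v suffix=":$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (suffix "$") { found = 1 }
        END { exit !found }
    '
}

# Example: report whether NRPE is listening on port 5666.
# netstat -tnl | port_listening 5666 && echo "NRPE is listening"
```

From the monitoring host, running /usr/local/nagios/libexec/check_nrpe -H 192.168.210.12 without a -c argument should print the NRPE version, once check_nrpe has been installed there.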
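6) Configure the objects that remote hosts are allowed to monitor
On the monitored host, every service or resource that can be checked through NRPE must be defined as a command in nrpe.cfg: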
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_sda3]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda3
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 40% -c 20%
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_diskdisk]=/usr/local/nagios/libexec/check_diskdisk.sh
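The last entry refers to check_diskdisk.sh, a custom script whose contents the original does not show. Any script run through NRPE has to follow the Nagios plugin convention: print one line of status text and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The sketch below illustrates only that convention; the function names disk_used_percent and classify_usage and the thresholds are assumptions, not the author's script.

```shell
#!/bin/sh
# Illustrative sketch of a custom Nagios plugin; names and thresholds
# are assumptions and do not reproduce check_diskdisk.sh.

# Print the used-space percentage (digits only) of the filesystem holding path $1.
disk_used_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Map a usage percentage to a Nagios status line and return code.
# $1 = used percent, $2 = warning threshold, $3 = critical threshold
classify_usage() {
    case "$1" in
        ''|*[!0-9]*) echo "UNKNOWN - could not read disk usage"; return 3 ;;
    esac
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL - disk usage $1%"; return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING - disk usage $1%"; return 1
    fi
    echo "OK - disk usage $1%"; return 0
}
```

A script built on these two functions could end with classify_usage "$(disk_used_percent /)" 80 90; exit $? so that the return code becomes the plugin's exit status.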
Configuring the monitoring host
1) Install NRPE
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nrpe-user=nagios \
  --with-nrpe-group=nagios \
  --with-nagios-user=nagios \
  --with-nagios-group=nagios \
  --enable-command-args \
  --enable-ssl
# make all
# make install-plugin
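2) Define how to monitor the remote host and its services:
Add one line to the main configuration file nagios.cfg: cfg_file=/usr/local/nagios/etc/objects/192.168.210.12.cfg
The contents of 192.168.210.12.cfg are as follows: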
define host{
use linux-server
host_name 192.168.210.12
alias 0.12
address 192.168.210.12
}
define service{
use generic-service
host_name 192.168.210.12
service_description check_ping
check_command check_ping!100.0,20%!200.0,50%
max_check_attempts 5
normal_check_interval 1
}
define service{
use generic-service
host_name 192.168.210.12
service_description check_ssh
check_command check_ssh
max_check_attempts 5
normal_check_interval 1
notification_interval 60
}
define service{
use generic-service
host_name 192.168.210.12
service_description check_http
check_command check_http
max_check_attempts 5
normal_check_interval 1
contact_groups common
notifications_enabled 1
notification_period 24x7
notification_options w,u,c,r
}
define service{
use generic-service
host_name 192.168.210.12
service_description check_load
check_command check_nrpe!check_load
max_check_attempts 5
normal_check_interval 1
}
define service{
use generic-service
host_name 192.168.210.12
service_description check_disk_sda1
check_command check_nrpe!check_sda1
max_check_attempts 5
normal_check_interval 1
}
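The last two services use check_nrpe!check_load and check_nrpe!check_sda1, which only resolve if a command named check_nrpe is defined on the monitoring host. The original does not show that definition, and the sample commands.cfg may not contain one. A commonly used definition can be sketched as follows; the helper name print_check_nrpe_command is hypothetical, and the commands.cfg path assumes the default install prefix used above.

```shell
# Hypothetical helper: print a check_nrpe command definition for Nagios.
# The quoted heredoc keeps $USER1$, $HOSTADDRESS$ and $ARG1$ literal so that
# Nagios, not the shell, expands them.
print_check_nrpe_command() {
    cat <<'EOF'
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
EOF
}

# Example: append the definition, then re-run the syntax check.
# print_check_nrpe_command >> /usr/local/nagios/etc/objects/commands.cfg
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```

Here $ARG1$ receives the text after the exclamation mark in check_command, so check_nrpe!check_load asks the remote NRPE daemon to run the command it has defined as check_load.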
Original article: http://onelinux.blog.51cto.com/2179673/1678582