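Nagios is a free, open-source network monitoring tool. It monitors the state of Windows, Linux, and Unix hosts, network devices such as switches and routers, and peripherals such as printers. When a system or service enters an abnormal state, it alerts the operations staff by email or SMS right away, and it sends a recovery notification once the state returns to normal. Nagios runs on Linux/Unix platforms and provides an optional browser-based web interface where administrators can view network status, system problems, and logs.
Nagios exists to monitor hosts and services, but the core itself contains no checking logic: all monitoring and detection is done by plugins.
Once started, Nagios periodically invokes the plugins to check server state. It maintains a queue that receives every result the plugins return, reads results from the head of the queue, processes them, and displays the resulting state in the web interface.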
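Nagios comes with many plugins that make it easy to monitor a wide range of services. After installation, all of the bundled plugins are in the libexec directory under the Nagios home directory; for example, check_disk checks disk space and check_load checks CPU load. Run ./check_xxx -h to see what a plugin does and how to use it.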
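Nagios recognizes four return states: 0 (OK) means normal and is shown in green, 1 (WARNING) means a warning and is shown in yellow, 2 (CRITICAL) means a serious error and is shown in red, and 3 (UNKNOWN) means an unknown error and is shown in dark yellow. Nagios determines the state of a monitored object from the value the plugin returns and displays it in the web interface so that administrators can spot failures promptly.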
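The return-code convention above can be sketched as a small plugin-style shell function. This is only an illustration, not one of the bundled plugins: the name check_value and its "higher is worse" thresholds are hypothetical.

```shell
#!/bin/bash
# Hypothetical plugin-style check (not part of nagios-plugins).
# Prints one status line and returns the Nagios state code:
# 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN. Higher values are worse.
check_value() {
    local value=${1-} warn=${2-} crit=${3-} arg
    # Anything that is not a non-negative integer is reported as UNKNOWN.
    for arg in "$value" "$warn" "$crit"; do
        case "$arg" in
            ''|*[!0-9]*)
                echo "UNKNOWN - usage: check_value VALUE WARN CRIT"
                return 3 ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return 1
    fi
    echo "OK - value is $value"
    return 0
}

# Example call: value 3 with warning threshold 5 and critical threshold 10.
check_value 3 5 10
echo "exit code: $?"
```

A real plugin would end with exit instead of return, since Nagios reads the exit status of the plugin process.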
nagios-plugins is the official plugin collection for Nagios; the host monitoring that Nagios provides is in fact carried out by executing these plugin programs.
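Configuration files (under /usr/local/nagios/etc):
nagios.cfg  The main Nagios configuration file
resource.cfg  The variable definition file, also called the resource file; variables defined here, such as $USER1$, can be referenced from other configuration files
objects  A directory holding configuration file templates that define Nagios objects
objects/commands.cfg  Command definitions, which other configuration files can reference
objects/contacts.cfg  Contact and contact group definitions
objects/localhost.cfg  Definitions for monitoring the local host
objects/templates.cfg  Host and service templates, which other configuration files can reference
objects/timeperiods.cfg  Monitoring time period definitions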
Monitoring host: 172.25.85.2 server2.example.com
Monitored host: 172.25.85.3 server3.example.com
1. server2:
Install Nagios and add its user and group:
tar jxf nagios-cn-3.2.3.tar.bz2
yum install gd-devel-2.0.35-11.el6.x86_64.rpm
groupadd nagcmd
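useradd -M -d /usr/local/nagios -G nagcmd nagios ## -M: do not create the home directory (-m would create it)
usermod -G nagcmd apache ## add apache to the nagcmd group so that external commands can be submitted from the web interface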
cd nagios-cn-3.2.3
./configure --with-command-group=nagcmd
make all
make install
make install-init
make install-commandmode
make install-config
make install-webconf ## install the web (Apache) configuration file
vim /etc/httpd/conf.d/nagios.conf
cat /usr/local/nagios/etc/htpasswd.users
nagiosadmin:gCWSDnqEHR45c
htpasswd /usr/local/nagios/etc/htpasswd.users nagiosadmin
## (htpasswd creates and updates the file that stores the usernames and passwords used for HTTP authentication)
cat /usr/local/nagios/etc/htpasswd.users
nagiosadmin:AYGVkVYuX2mDs
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg ## verify the configuration files
/etc/init.d/httpd start
/etc/init.d/nagios start
http://172.25.85.2/nagios/
2. server2:
Install nagios-plugins:
tar zxf nagios-plugins-1.4.14.tar.gz
cd nagios-plugins-1.4.14
yum install mysql-devel openssl-devel
./configure
make
make install
cd /usr/local/nagios/libexec
chown nagios.nagios * -R
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios reload
http://172.25.85.2/nagios/
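3. server2:
Edit the configuration files:
vim /usr/local/nagios/etc/nagios.cfg
## comment out the cfg_file= line (presumably the one for localhost.cfg, which is copied to hosts.cfg and services.cfg below)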
cd /usr/local/nagios/etc/objects
cp localhost.cfg hosts.cfg -p
vim hosts.cfg
define host{
        use                     linux-server
        host_name               server2.example.com
        alias                   Manager
#       parents                 MainSwitch
        address                 172.25.85.2
        icon_image              server.gif
        statusmap_image         server.gd2
        2d_coords               500,200
        3d_coords               500,200,100
        }

# HOST GROUP DEFINITION
###############################################################################
# Define an optional hostgroup for Linux machines
define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         *             ; Comma separated list of hosts that belong to this group
        }
cp localhost.cfg services.cfg
chown nagios.nagios services.cfg
vim services.cfg
define servicegroup{
        servicegroup_name       系统负荷检查
        alias                   负荷检查
        members                 server2.example.com,进程总数,server2.example.com,登录用户数,server2.example.com,根分区,server2.example.com,交换空间利用率
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               *
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               server2.example.com
        service_description     根分区
        check_command           check_local_disk!20%!10%!/
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               server2.example.com
        service_description     登录用户数
        check_command           check_local_users!20!50
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               server2.example.com
        service_description     进程总数
        check_command           check_local_procs!250!400!RSZDT
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               server2.example.com
        service_description     系统负荷
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               server2.example.com
        service_description     交换空间利用率
        check_command           check_local_swap!20!10
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               server2.example.com
        service_description     SSH
        check_command           check_tcp!22!1.0!10.0
        notifications_enabled   0
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               server2.example.com
        service_description     HTTP
        check_command           check_http
        notifications_enabled   0
        }
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios reload
172.25.85.2/nagios/
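The PING service in services.cfg passes check_ping two thresholds of the form RTA,LOSS%: 100.0,20% for WARNING and 500.0,60% for CRITICAL. As a rough illustration of how to read such a value, the hypothetical helper below (parse_ping_threshold is not a Nagios tool) splits one threshold into its round-trip time in milliseconds and its packet-loss percentage.

```shell
#!/bin/bash
# Hypothetical helper: split a check_ping threshold such as "100.0,20%"
# into round-trip average (ms) and packet loss (%).
parse_ping_threshold() {
    local spec=${1-}
    local re='^([0-9]+(\.[0-9]+)?),([0-9]+)%$'
    if [[ $spec =~ $re ]]; then
        printf 'rta_ms=%s loss_pct=%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}"
    else
        echo "invalid threshold: $spec" >&2
        return 1
    fi
}

# The two thresholds used by the PING service definition.
parse_ping_threshold '100.0,20%'
parse_ping_threshold '500.0,60%'
```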
server2:
cd /usr/local/nagios/libexec
./check_disk -w 20 -c 10
./check_disk -w 20 -c 10 -p /
cd /usr/local/nagios/etc/objects
vim services.cfg
At the end of the 根分区 (root partition) service definition, add:
max_check_attempts 2
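With max_check_attempts 2, a problem must be seen on two consecutive checks before Nagios treats it as a hard state and notifies contacts; the first failure is only a soft state. The function below is a deliberately simplified model of that counting. The name state_types is hypothetical, and retry intervals and soft recoveries are ignored.

```shell
#!/bin/bash
# Simplified model of soft/hard states (hypothetical helper).
# Usage: state_types MAX_CHECK_ATTEMPTS EXIT_CODE...
# Prints one line per check result: OK, SOFT n/max, or HARD n/max.
state_types() {
    local max=$1
    shift
    local attempt=0 code
    for code in "$@"; do
        if [ "$code" -eq 0 ]; then
            # An OK result resets the attempt counter.
            attempt=0
            echo "OK"
        else
            # Count consecutive non-OK results, capped at max.
            if [ "$attempt" -lt "$max" ]; then
                attempt=$((attempt + 1))
            fi
            if [ "$attempt" -ge "$max" ]; then
                echo "HARD $attempt/$max"
            else
                echo "SOFT $attempt/$max"
            fi
        fi
    done
}

# One OK check, two CRITICAL checks, then a recovery.
state_types 2 0 2 2 0
```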
4. server3:
yum install mysql-server -y
/etc/init.d/mysqld start
mysql_secure_installation
mysql -p
mysql> create database nagdb;
mysql> grant select on nagdb.* to nagios@'172.25.85.2' identified by "redhat";
server2:
cd /usr/local/nagios/libexec
./check_mysql -H 172.25.85.3 -u nagios -p redhat
mysql -h 172.25.85.3 -u nagios -predhat ## the login succeeds
server3:
/etc/init.d/mysqld stop
server2:
mysql -h 172.25.85.3 -u nagios -predhat
server2:
cd /usr/local/nagios/etc/objects
vim commands.cfg
define command{
        command_name    check_mysql
        command_line    $USER1$/check_mysql -H $HOSTADDRESS$ -u $ARG1$ -p $ARG2$
        }
Append the service definition at the end (presumably of services.cfg):
define service{
        use                     local-service ; Name of service template to use
        host_name               server3.example.com
        service_description     MYSQL
        check_command           check_mysql!nagios!redhat
        notifications_enabled   0
        }
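The link between check_command check_mysql!nagios!redhat and the command_line above is macro substitution: the fields after each ! become $ARG1$, $ARG2$ and so on, $HOSTADDRESS$ comes from the host definition, and $USER1$ comes from resource.cfg. The sketch below imitates that expansion in plain shell so the resulting command line is visible. expand_check_command is a hypothetical helper, and the $USER1$ value /usr/local/nagios/libexec is an assumption based on the plugin directory used in this setup.

```shell
#!/bin/bash
# Hypothetical helper that imitates Nagios macro expansion.
# Usage: expand_check_command COMMAND_LINE CHECK_COMMAND HOSTADDRESS USER1
expand_check_command() {
    local command_line=$1 check_command=$2 host_address=$3 user1=$4
    local -a parts
    local expanded i
    # Split "name!arg1!arg2" on "!"; parts[0] is the command name.
    IFS='!' read -r -a parts <<< "$check_command"
    expanded=$command_line
    expanded=${expanded//'$USER1$'/"$user1"}
    expanded=${expanded//'$HOSTADDRESS$'/"$host_address"}
    for ((i = 1; i < ${#parts[@]}; i++)); do
        expanded=${expanded//"\$ARG$i\$"/"${parts[i]}"}
    done
    printf '%s\n' "$expanded"
}

# Values from the definitions above; the USER1 path is an assumption.
expand_check_command \
    '$USER1$/check_mysql -H $HOSTADDRESS$ -u $ARG1$ -p $ARG2$' \
    'check_mysql!nagios!redhat' \
    172.25.85.3 \
    /usr/local/nagios/libexec
```

The printed line has the same shape as the manual ./check_mysql test run from the libexec directory earlier in this step.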
vim hosts.cfg
Add a host:
define host{
        use                     linux-server
        host_name               server3.example.com
        alias                   Manager
#       parents                 MainSwitch
        address                 172.25.85.3
        icon_image              server.gif
        statusmap_image         server.gd2
        2d_coords               400,100
        3d_coords               400,100,100
        }
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios reload
http://172.25.85.2/nagios/
5. server3:
tar zxf nrpe-2.15.tar.gz
tar zxf nagios-plugins-2.1.1.tar.gz
useradd -M -d /usr/local/nagios nagios
cd /root/nagios-plugins-2.1.1
yum install openssl-devel xinetd -y
./configure
make
make install
cd /usr/local/nagios/
chown nagios.nagios . -R
cd /root/nrpe-2.15
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
make install-xinetd
cd /etc/xinetd.d/
vim nrpe
Change:
only_from = 172.25.85.2 ## the monitoring host
vim /etc/services
Append at the end:
nrpe 5666/tcp
vim /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=172.25.85.2
/etc/init.d/xinetd start
cd /usr/local/nagios/libexec/
scp check_nrpe root@172.25.85.2:/usr/local/nagios/libexec/
server2:
cd /usr/local/nagios/libexec/
chown nagios.nagios check_nrpe
./check_nrpe -H 172.25.85.3 ## test whether NRPE is reachable; on success it prints the NRPE version
./check_nrpe -H 172.25.85.3 -c check_disk
cd /usr/local/nagios/etc/objects
vim commands.cfg
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
Add after the MYSQL service:
define service{
        use                     local-service ; Name of service template to use
        host_name               server3.example.com
        service_description     根分区
        check_command           check_nrpe!check_disk
        }

define service{
        use                     local-service ; Name of service template to use
        host_name               server3.example.com
        service_description     登陆用户数
        check_command           check_nrpe!check_users
        }
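check_nrpe only sends a command name such as check_disk to server3; the NRPE daemon there looks the name up among the command[...] entries in its nrpe.cfg and runs the configured plugin locally. A check therefore works only if server3's nrpe.cfg has a matching entry, which these notes do not show. The sketch below mimics that lookup. nrpe_lookup is a hypothetical helper, and the two sample entries with their thresholds are assumptions rather than the contents of the real file.

```shell
#!/bin/bash
# Hypothetical helper: print the command line that an nrpe.cfg-style file
# maps to a given command name, or fail if the name is not defined.
nrpe_lookup() {
    local cfg=$1 name=$2 line
    while IFS= read -r line; do
        case "$line" in
            "command[$name]="*)
                # Everything after the first "=" is the local command line.
                printf '%s\n' "${line#*=}"
                return 0 ;;
        esac
    done < "$cfg"
    echo "command '$name' is not defined in $cfg" >&2
    return 1
}

# Sample entries (assumed, for illustration only).
sample_cfg=$(mktemp)
cat > "$sample_cfg" <<'EOF'
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
EOF
nrpe_lookup "$sample_cfg" check_disk
rm -f "$sample_cfg"
```

If ./check_nrpe -H 172.25.85.3 -c check_disk reports an undefined command, the entry in /usr/local/nagios/etc/nrpe.cfg on server3 is the first thing to check.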
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios reload
http://172.25.85.2/nagios/
6. Connecting the virtual machines to the network:
Bridged mode: suited to a wired broadband network
server2:
yum install mail -y
ip addr add 172.25.254.185/24 dev eth1
ip route show
route -n
cat /etc/resolv.conf
nameserver 192.168.11.182
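Physical host:
ip addr show
NAT mode: suited to a wireless network
First switch the virtual machines to dynamically assigned IP addresses ## on both server2 and server3
Then change each virtual machine's virtual network interface to NAT mode
Reboot the virtual machines
Note:
[After the reboot both virtual machines have new IP addresses, so every earlier step that depends on an IP address has to be redone.
After the reboot: server2 172.25.85.2 became 192.168.122.196
server3 172.25.85.3 became 192.168.122.202]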
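Redoing the IP-related steps mostly means editing the files that contain the old addresses, such as hosts.cfg, nrpe.cfg, and /etc/xinetd.d/nrpe. One possible shortcut is an in-place substitution like the sketch below. replace_ip is a hypothetical helper that relies on GNU sed (-i and \b), and the demo runs on a temporary file rather than a real configuration file.

```shell
#!/bin/bash
# Hypothetical helper: replace one IPv4 address with another in a file.
# Requires GNU sed. Word boundaries keep 172.25.85.2 from matching
# inside 172.25.85.20.
replace_ip() {
    local old=$1 new=$2 file=$3
    # Escape the dots so that sed matches them literally.
    local pattern=${old//./\\.}
    sed -i "s/\b$pattern\b/$new/g" "$file"
}

# Demo on a scratch file with made-up contents.
demo=$(mktemp)
printf 'address 172.25.85.2\nallowed_hosts=172.25.85.2\n' > "$demo"
replace_ip 172.25.85.2 192.168.122.196 "$demo"
cat "$demo"
rm -f "$demo"
```

Settings stored outside configuration files, such as the MySQL grant for nagios@'172.25.85.2', still have to be redone by hand.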
server2:
vim /usr/local/nagios/etc/objects/contacts.cfg
vim /usr/local/nagios/etc/objects/templates.cfg ## line 186
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios reload
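su - nagios ## try sending a message to a QQ mailbox to check whether the virtual machine is online
In the mailbox settings, add nagios@server2.example.com to the whitelist
Test: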
server3:
/etc/init.d/mysqld stop
172.25.85.2/nagios/
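7. 110 Cloud Alert (110云告警): ## see the official site for the steps
Log in to OneAlert -> Configuration -> choose the configuration type
A key is generated
Configuration -> Notification policy ->
Stop mysqld on server3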
server2:
cd /usr/local/nagios
tar zxf alert-agent-4.1.3.1-linux-x64.tar.gz -C /usr/local/nagios/libexec
su - nagios
-bash-4.1$ cd /usr/local/nagios/libexec/alert-agent
-bash-4.1$ cd nagios-plugin/
-bash-4.1$ cp nagios /usr/local/nagios/libexec/
-bash-4.1$ cp 110monitor.cfg /usr/local/nagios/etc/objects/
cd /usr/local/nagios/etc/objects
vim contacts.cfg
vim /usr/local/nagios/etc/nagios.cfg
date_format=iso8601
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/etc/init.d/nagios reload
Alerts:
cd /usr/local/nagios/var
tail -f nagios.log
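Each line in nagios.log starts with a Unix timestamp in square brackets, which is hard to read while watching alerts scroll by. As a convenience, the output can be piped through a small filter like the sketch below. humanize_log is a hypothetical helper that assumes GNU date, and the sample line is made up for illustration.

```shell
#!/bin/bash
# Hypothetical filter: rewrite a leading "[epoch]" as a UTC date and time.
# Lines without that prefix pass through unchanged. Requires GNU date.
humanize_log() {
    local line ts rest
    local re='^\[([0-9]+)\] (.*)$'
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            ts=${BASH_REMATCH[1]}
            rest=${BASH_REMATCH[2]}
            printf '%s %s\n' "$(date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S')" "$rest"
        else
            printf '%s\n' "$line"
        fi
    done
}

# Made-up sample line. On the monitoring host one would instead run:
#   tail -f /usr/local/nagios/var/nagios.log | humanize_log
printf '%s\n' '[86400] SERVICE ALERT: server3.example.com;MYSQL;CRITICAL;HARD;2;sample output' | humanize_log
```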
Original article: http://11713145.blog.51cto.com/11703145/1834150