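NRPE is a Nagios add-on that executes plugins on remote Linux/Unix hosts. With the NRPE daemon and the Nagios plugins installed on a remote server, that server can report its local state (CPU load, memory usage, disk usage, and so on) to the Nagios monitoring platform. In this article the monitoring side is called the Nagios server, and the remote monitored host is called the Nagios client.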
Nagios can monitor remote hosts in several ways, including SNMP, NRPE, SSH, and NSCA. This article covers monitoring a remote Linux host through NRPE.
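NRPE (Nagios Remote Plugin Executor) is a daemon that runs check commands on a remote server. It lets the Nagios server trigger check commands installed on the remote host and returns the results to the server. Its overhead is much lower than that of SSH-based checks, and because the checks need no system account credentials on the remote host, it is also safer than checking over SSH.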
NRPE consists of two parts:
check_nrpe plugin: lives on the monitoring host
nrpe daemon: runs on the remote host, usually as the agent on the monitored side
Note: the nrpe daemon relies on the Nagios plugins (nagios-plugins). Without them the daemon cannot perform any checks.
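When Nagios needs to check a service or resource on a remote Linux host:
First: Nagios runs the check_nrpe plugin and tells it what to check;
Second: check_nrpe connects to the remote NRPE daemon over SSL;
Third: the NRPE daemon runs the corresponding Nagios plugin to perform the check;
Finally: the NRPE daemon returns the result to check_nrpe, which hands it to Nagios for processing.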
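The result that travels back through this chain is the plugin's exit code plus its text output. The sketch below shows how that exit code maps to a status, using the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper names `status_name` and `run_check` are made up for illustration and are not part of Nagios or NRPE.

```shell
# Translate a Nagios plugin exit code into its status name.
# Codes outside 0-2 are reported as UNKNOWN in this sketch.
status_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Hypothetical wrapper: run any check command, capture its output and
# exit code, and print "<STATUS>: <output>".
run_check() {
    # The && / || form records the exit code without tripping "set -e".
    rc_output=$("$@" 2>&1) && rc_code=0 || rc_code=$?
    echo "$(status_name "$rc_code"): $rc_output"
    return "$rc_code"
}
```

On the Nagios server it could be tried by hand with, for example, `run_check /usr/local/nagios/libexec/check_nrpe -H 192.168.110.154 -c check_load`, assuming the client at that address is already configured as described in the sections that follow.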
I. Install nagios-plugins and NRPE on the monitored host
1. Add the nagios user
useradd -s /sbin/nologin nagios
2. Install nagios-plugins, because NRPE depends on these plugins
    yum -y install gcc gcc-c++ make openssl openssl-devel
    tar xf nagios-plugins-2.0.3.tar.gz
    cd nagios-plugins-2.0.3
    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make all && make install
    # Note: to monitor MySQL, add --with-mysql to the ./configure line
3. Install NRPE
    tar xf nrpe-2.15.tar.gz
    cd nrpe-2.15
    ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --with-nagios-user=nagios --with-nagios-group=nagios --enable-command-args --enable-ssl
    make all
    make install-plugin
    make install-daemon
    make install-daemon-config
4. Configure NRPE
    vim /usr/local/nagios/etc/nrpe.cfg

    log_facility=daemon
    pid_file=/var/run/nrpe.pid
    # Port the daemon listens on
    server_port=5666
    nrpe_user=nagios
    nrpe_group=nagios
    # Addresses allowed to connect, normally the Nagios server
    allowed_hosts=192.168.110.157
    dont_blame_nrpe=0
    allow_bash_command_substitution=0
    debug=0
    command_timeout=60
    connection_timeout=300
    command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
    command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
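Each `command[name]=...` entry defines a check that the Nagios server can later request by name. As a quick sanity check, the names can be pulled out of the config file with `sed`. This is a minimal sketch; `list_nrpe_commands` is a hypothetical helper name, not an NRPE tool.

```shell
# Print the name of every command[...] entry in an NRPE config file.
# The pattern is anchored at the start of the line, so commented-out
# entries (lines beginning with '#') are skipped.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}
```

For the configuration above, `list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg` would print check_users, check_load, check_hda1, check_zombie_procs, and check_total_procs, one per line.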
5. Start NRPE
    # Start as a standalone daemon
    /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

    netstat -tulpn | grep nrpe
    tcp        0      0 0.0.0.0:5666      0.0.0.0:*      LISTEN      22597/nrpe
    tcp        0      0 :::5666           :::*           LISTEN      22597/nrpe
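The listening check can also be scripted instead of read by eye. The sketch below reads netstat-style output on standard input and succeeds only if a TCP socket is in LISTEN state on the given port. `nrpe_listening` is a hypothetical helper name, and the column layout (local address in the fourth field) is assumed to match the netstat output shown in this article.

```shell
# Succeed if netstat-style output on stdin shows a TCP socket in LISTEN
# state on the given port (default 5666, the server_port set in nrpe.cfg).
nrpe_listening() {
    port="${1:-5666}"
    awk -v p="$port" '
        $1 ~ /^tcp/ && /LISTEN/ {
            # Field 4 is the local address; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == p) found = 1
        }
        END { exit !found }
    '
}
```

Typical use would be `netstat -tnlp | nrpe_listening && echo "NRPE is listening"`.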
NRPE has two run modes:
    -i    # Run as a service under inetd or xinetd
    -d    # Run as a standalone daemon
You can write an init script so that NRPE runs in standalone mode:
    vim /etc/init.d/nrped

    #!/bin/bash
    # chkconfig: 2345 88 12
    # description: NRPE DAEMON

    NRPE=/usr/local/nagios/bin/nrpe
    NRPECONF=/usr/local/nagios/etc/nrpe.cfg

    case "$1" in
        start)
            echo -n "Starting NRPE daemon..."
            $NRPE -c $NRPECONF -d
            echo " done."
            ;;
        stop)
            echo -n "Stopping NRPE daemon..."
            pkill -u nagios nrpe
            echo " done."
            ;;
        restart)
            $0 stop
            sleep 2
            $0 start
            ;;
        *)
            echo "Usage: $0 start|stop|restart"
            ;;
    esac
    exit 0

    chmod +x /etc/init.d/nrped
    chkconfig --add nrped
    chkconfig nrped on
    service nrped start
    Starting NRPE daemon... done.

    netstat -tnlp
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address      Foreign Address    State     PID/Program name
    tcp        0      0 0.0.0.0:22         0.0.0.0:*          LISTEN    1031/sshd
    tcp        0      0 127.0.0.1:25       0.0.0.0:*          LISTEN    1108/master
    tcp        0      0 0.0.0.0:5666       0.0.0.0:*          LISTEN    22597/nrpe
    tcp        0      0 :::22              :::*               LISTEN    1031/sshd
    tcp        0      0 ::1:25             :::*               LISTEN    1108/master
    tcp        0      0 :::5666            :::*               LISTEN    22597/nrpe
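Alternatively, add the start command to /etc/rc.local so that NRPE starts at boot. Use plain ASCII double quotes; typographic quotes would be written into the file literally.
    # echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.local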
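Running the `echo ... >>` command more than once leaves duplicate lines in /etc/rc.local, which would start NRPE several times at boot. A minimal sketch of an idempotent append follows; `append_once` is a hypothetical helper name.

```shell
# Append a line to a file only if that exact line is not already present.
# grep -x matches whole lines, -F treats the text as a fixed string.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example: `append_once "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" /etc/rc.local`.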
II. Install NRPE on the monitoring server
1. Install NRPE
    # tar xf nrpe-2.15.tar.gz
    # cd nrpe-2.15
    # ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --with-nagios-user=nagios --with-nagios-group=nagios --enable-command-args --enable-ssl
    # make all
    # make install-plugin

    After installation, the check_nrpe plugin appears under libexec in the Nagios installation directory:
    # cd /usr/local/nagios/libexec/
    # ll -d check_nrpe
    -rwxrwxr-x. 1 nagios nagios 76769 Sep 28 08:07 check_nrpe
2. Using check_nrpe
Remote Linux hosts are monitored through NRPE with the check_nrpe plugin. Its syntax is:
    check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]

    # ./check_nrpe -H 192.168.0.81
    NRPE v2.15
3. Define the command
    # cd /usr/local/nagios/etc/objects/
    # vim commands.cfg

    # Append at the end of the file
    define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$" -c "$ARG1$"
    }
4. Define the services
    # cp localhost.cfg linhost.cfg
    # vim linhost.cfg

    define host{
        use                     linux-server
        host_name               linhost
        alias                   My Linux Server
        address                 192.168.110.154
    }
    define service{
        use                     generic-service
        host_name               linhost
        service_description     CHECK USER
        check_command           check_nrpe!check_users
    }
    define service{
        use                     generic-service
        host_name               linhost
        service_description     Load
        check_command           check_nrpe!check_load
    }
    define service{
        use                     generic-service
        host_name               linhost
        service_description     SDA1
        check_command           check_nrpe!check_hda1
    }
    define service{
        use                     generic-service
        host_name               linhost
        service_description     Zombie
        check_command           check_nrpe!check_zombie_procs
    }
    define service{
        use                     generic-service
        host_name               linhost
        service_description     Total procs
        check_command           check_nrpe!check_total_procs
    }
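One point deserves emphasis: the services defined on the Nagios server depend entirely on the check commands defined in NRPE on the monitored host. The name after check_nrpe! must match a command[...] entry in that host's nrpe.cfg, as illustrated in the accompanying figure.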
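Because every service is just a host name paired with an NRPE command name, the definitions can be generated rather than typed by hand. The sketch below prints one service block per command name read from standard input. `gen_services` is a hypothetical helper name, and using the command name as the service_description is a simplification chosen for this example.

```shell
# Emit one Nagios service definition per NRPE command name read from stdin.
# $1 is the host_name the services are attached to.
gen_services() {
    host="$1"
    while IFS= read -r cmd; do
        [ -n "$cmd" ] || continue        # skip blank lines
        printf 'define service{\n'
        printf '    use                  generic-service\n'
        printf '    host_name            %s\n' "$host"
        printf '    service_description  %s\n' "$cmd"
        printf '    check_command        check_nrpe!%s\n' "$cmd"
        printf '}\n'
    done
}
```

For example, `printf 'check_users\ncheck_load\n' | gen_services linhost` prints two service blocks that could be appended to a host's object file.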
5. Enable the new commands and services
    # vim /usr/local/nagios/etc/nagios.cfg

    # Add this line
    cfg_file=/usr/local/nagios/etc/objects/linhost.cfg
6. Check the configuration syntax
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

    Total Warnings: 0
    Total Errors: 0
    Things look okay - No serious problems were detected during the pre-flight check
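To avoid restarting Nagios with a broken configuration, the restart can be gated on the pre-flight summary. The sketch below only inspects the "Total Errors" line of the output shown above; `preflight_ok` is a hypothetical helper name.

```shell
# Succeed only if the pre-flight summary on stdin reports zero errors.
# Any amount of whitespace between the label and the count is accepted.
preflight_ok() {
    grep -q '^Total Errors:[[:space:]]*0[[:space:]]*$'
}
```

A possible use is `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok && service nagios restart`.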
7. Restart the nagios service
# service nagios restart
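8. Open the Nagios web monitoring page
1) First click [Hosts] and confirm that the monitored host's status is UP
2) Then click [Services] and confirm that each monitored service's status is OK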