
Drawing Nagios graphs with a custom pnp4nagios template: a worked example

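Zabbix is popular these days, but if you only monitor a small number of servers, the veteran Nagios is still a very good choice. Its alerting is very powerful, and the program is small and light on resources. Nagios does not draw graphs out of the box. It can be paired with Cacti, but that setup is fairly complex, so I still prefer pnp4nagios.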

One-click install scripts for Nagios and pnp4nagios are available on my GitHub: https://github.com/zhangnq/nagios/tree/master/setup

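The default pnp4nagios graphs are not pretty: if a check returns several data series, pnp4nagios draws a separate graph for each one. This post uses a memory-monitoring script as an example to show how a custom pnp4nagios template can produce a cleaner graph.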

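Nagios client

The client needs a memory-monitoring script, because the default plugins do not provide one.

1. Add the memory-monitoring script, with content similar to the following: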

#!/bin/bash

#nagios exit code
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

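help () {
        local command
        command=$(basename "$0")
        echo "NAME
        ${command} -- check memory status
SYNOPSIS
        ${command} [OPTION]
DESCRIPTION
        -w warning threshold, in percent of memory used
        -c critical threshold, in percent of memory used
USAGE:
        $0 -w 50% -c 60%" 1>&2
        # Nagios convention: usage errors exit with UNKNOWN, not WARNING
        exit ${STATE_UNKNOWN}
}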

# Abort with UNKNOWN unless the argument is a non-negative integer.
check_num () {
        local num_str="$1"
        if ! echo "${num_str}" | grep -Eq '^[0-9]+$'; then
                echo "${num_str} is not a non-negative integer!" 1>&2
                exit ${STATE_UNKNOWN}
        fi
}

#input
while getopts w:c: opt
do
        case "$opt" in
        w) 
                warning=$OPTARG
                warning_num=`echo "${warning}"|sed  's/%//g'`
                check_num "${warning_num}"
        ;;
        c) 
                critical=$OPTARG
                critical_num=`echo "${critical}"|sed  's/%//g'`
                check_num "${critical_num}"
        ;;
        *) help;;
        esac
done
shift $(( OPTIND - 1 ))

[ $# -gt 0 -o -z "${warning_num}" -o -z "${critical_num}" ] && help

if [ -n "${warning_num}" -a -n "${critical_num}" ];then
        if [ ${warning_num} -ge ${critical_num} ];then
                echo "-w ${warning} must be lower than -c ${critical}!" 1>&2
                exit ${STATE_UNKNOWN}
        fi
fi

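# Turn each "Name:   value kB" line of /proc/meminfo into "Name=value".
# Names containing parentheses, such as Active(anon), are dropped because
# they are not valid shell variable names.
datas=`awk -F':|k' '$2~/[0-9]+/{datas[$1]=$2}END{for (data in datas) {print data"="datas[data]}}' /proc/meminfo | grep -Ev '[)|(]'`

# Strip the padding spaces, then eval the assignments so that MemTotal,
# MemFree, Cached and Buffers become shell variables (values in kB).
var=`echo "${datas}"|sed 's/ //g'`
eval "${var}"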

MemUsed=`echo ${MemTotal}-${MemFree}-${Cached}-${Buffers}|bc`
MemUsage=`echo "${MemUsed}/${MemTotal}*100"|bc -l`
MemUsage_num=`echo ${MemUsage}/1|bc`
#echo ${MemUsage_num}
MemTotal_MB=`echo ${MemTotal}/1024|bc`
MemUsed_MB=`echo ${MemUsed}/1024|bc`
MemFree_MB=`echo ${MemFree}/1024|bc`
Cached_MB=`echo ${Cached}/1024|bc`
Buffers_MB=`echo ${Buffers}/1024|bc`

message () {
        local stat="$1"
        echo "MEMORY is ${stat} - Usage: ${MemUsage_num}%. Total: ${MemTotal_MB} MB Used: ${MemUsed_MB} MB Free: ${MemFree_MB} MB | Used=${MemUsed_MB};; Cached=${Cached_MB};; Buffers=${Buffers_MB};; Free=${MemFree_MB};;"
}

[ ${MemUsage_num} -lt ${warning_num} ] && message "OK" && exit ${STATE_OK}
[ ${MemUsage_num} -ge ${critical_num} ] && message "Critical" && exit ${STATE_CRITICAL}
[ ${MemUsage_num} -ge ${warning_num} ] && message "Warning" && exit ${STATE_WARNING}

The script usually lives in /usr/local/nagios/libexec and is named check_mem.sh.
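Everything after the `|` in the plugin output is performance data, and those are the series pnp4nagios will graph. As a minimal sketch, the snippet below pulls the series labels out of one output line. The sample line uses made-up numbers, and `perfdata_labels` is a hypothetical helper written for this illustration, not part of Nagios or pnp4nagios.

```shell
# Sample plugin output; the numbers are made up for illustration.
sample='MEMORY is OK - Usage: 40%. Total: 8000 MB Used: 3200 MB Free: 500 MB | Used=3200;; Cached=4000;; Buffers=300;; Free=500;;'

# Hypothetical helper: print the perfdata labels (the part before "=")
# in the order they appear in the output line.
perfdata_labels() {
    perf=${1#*"|"}              # keep everything after the first "|"
    labels=''
    for item in $perf; do       # perfdata items are separated by spaces
        labels="${labels}${labels:+ }${item%%=*}"
    done
    echo "$labels"
}

perfdata_labels "$sample"       # prints: Used Cached Buffers Free
```

The order matters: the four labels arrive as data series 1 to 4, which is the order the template further down walks through.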

2. Then edit the nrpe.cfg configuration file and restart NRPE, with commands similar to the following.

# download into the plugin directory that nrpe.cfg points at below
cd /usr/local/nagios/libexec
wget http://download.chekiang.info/nagios/check_mem.sh
chmod +x check_mem.sh
chown nagios:nagios check_mem.sh

cat >>/usr/local/nagios/etc/nrpe.cfg<<"EOF"
command[check_mem]=/usr/local/nagios/libexec/check_mem.sh -w 80% -c 90%
EOF
sleep 3
/root/restart_nrpe.sh
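Because `cat >>` appends unconditionally, running those commands a second time leaves two `command[check_mem]` lines in nrpe.cfg. One way to guard against that is sketched below. `add_nrpe_command` is a hypothetical helper, and the demo writes to a scratch file instead of the real nrpe.cfg.

```shell
# Hypothetical helper: append "command[NAME]=CMDLINE" to an NRPE config file
# only if the file does not already contain a definition for NAME.
# Note: a commented-out definition also counts as "already present".
add_nrpe_command() {
    cfg=$1; name=$2; cmdline=$3
    if grep -Fq "command[${name}]=" "$cfg"; then
        echo "command[${name}] already defined in ${cfg}"
    else
        printf 'command[%s]=%s\n' "$name" "$cmdline" >> "$cfg"
        echo "added command[${name}] to ${cfg}"
    fi
}

# Demo against a scratch file rather than the real nrpe.cfg.
demo_cfg=$(mktemp)
add_nrpe_command "$demo_cfg" check_mem '/usr/local/nagios/libexec/check_mem.sh -w 80% -c 90%'
add_nrpe_command "$demo_cfg" check_mem '/usr/local/nagios/libexec/check_mem.sh -w 80% -c 90%'
grep -Fc 'command[check_mem]=' "$demo_cfg"   # prints: 1
rm -f "$demo_cfg"
```

The second call finds the existing line and leaves the file unchanged, so the snippet can be rerun safely.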

Nagios server

1. Once the check_mem.sh plugin is in place on the client, add the check_mem service on the server and restart Nagios.

define service{
        use                     local-service,srv-pnp
        host_name               blog.nbhao.org
        service_description     check memory usage
        check_command           check_nrpe!check_mem
        notification_options    w,c
}

2. Go to the pnp4nagios check_commands configuration directory, for example /usr/local/pnp4nagios/etc/check_commands/. It ships with a few sample files. Add check_nrpe.cfg with the following content.

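CUSTOM_TEMPLATE = 1   # use the command's first argument as the custom template name
DATATYPE = GAUGE       # store values as instantaneous readings
USE_MIN_ON_CREATE = 0    # do not take the RRD minimum from the perfdata MIN field (the script sends none)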
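With CUSTOM_TEMPLATE = 1, the template is chosen from the first argument of the check command rather than from the command name, so `check_nrpe!check_mem` maps to check_mem.php instead of check_nrpe.php. The sketch below only models that naming rule for the values 0 and 1. `template_for` is a hypothetical function written for this illustration, not pnp4nagios code.

```shell
# Sketch of the template naming rule (an illustration, not pnp4nagios code).
#   $1 = check_command as written in the service definition
#   $2 = value of CUSTOM_TEMPLATE (only 0 and 1 are modelled here)
template_for() {
    command_name=${1%%\!*}          # text before the first "!"
    case $1 in
        *\!*) args=${1#*\!} ;;      # text after the first "!"
        *)    args='' ;;
    esac
    first_arg=${args%%\!*}          # first "!"-separated argument
    if [ "$2" = "1" ] && [ -n "$first_arg" ]; then
        echo "${first_arg}.php"
    else
        echo "${command_name}.php"
    fi
}

template_for 'check_nrpe!check_mem' 1   # prints: check_mem.php
template_for 'check_nrpe!check_mem' 0   # prints: check_nrpe.php
```

This is why the template file in the next step has to be named after the NRPE command, check_mem.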

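3. Go to the pnp4nagios template directory, for example /usr/local/pnp4nagios/share/templates.dist, and add check_mem.php with content similar to the following. (pnp4nagios also reads the sibling templates directory, which is the one intended for custom templates.)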

<?php
$alpha = 'CC';
$colors = array(
    '#850707' . $alpha,
    '#FFDB87' . $alpha,
    '#25345C' . $alpha,
    '#88008A' . $alpha,
    '#4F7774' . $alpha,
);
$opt[1] = sprintf('-T 55 -l 0 --vertical-label "MB" --title "%s / Memory Usage"', $hostname);
$def[1] = '';
$count = 0;
foreach ($DS as $i) {
    $def[1] .= rrd::def("var$i", $rrdfile, $DS[$i], 'AVERAGE');
    if ($i == '1') {
        $def[1] .= rrd::area ("var$i", $colors[$count], rrd::cut(ucfirst($NAME[$i]), 15));
    } else {
        $def[1] .= rrd::area ("var$i", $colors[$count], rrd::cut(ucfirst($NAME[$i]), 15), 'STACK');
    }
    $def[1] .= rrd::gprint  ("var$i", array('LAST','MAX','AVERAGE'), "%4.2lf %s\\t");
    $count++;
}
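The template draws the first series as a plain area and stacks every later one on top of it. Since Used is computed as MemTotal minus Free, Cached and Buffers, the top of the stack equals total memory. The sketch below checks that arithmetic with made-up values and pairs each layer with the template colour in the same position. `stack_layers` is a hypothetical helper for this illustration only.

```shell
# Made-up /proc/meminfo style values in kB, chosen for round numbers.
MemTotal=8192000; MemFree=512000; Cached=4096000; Buffers=307200
MemUsed=$(( MemTotal - MemFree - Cached - Buffers ))

# Walk the layers in perfdata order, pair each with the template colour of
# the same position (the template also appends the alpha value "CC"), and
# print the running top of the stack in MB.
stack_layers() {
    top=0
    for layer in "Used:$MemUsed:#850707" "Cached:$Cached:#FFDB87" \
                 "Buffers:$Buffers:#25345C" "Free:$MemFree:#88008A"; do
        label=${layer%%:*}
        rest=${layer#*:}
        value=${rest%%:*}
        colour=${rest#*:}
        top=$(( top + value ))
        echo "$label $colour top=$(( top / 1024 ))MB"
    done
}

stack_layers
# prints:
# Used #850707 top=3200MB
# Cached #FFDB87 top=7200MB
# Buffers #25345C top=7500MB
# Free #88008A top=8000MB
```

The fifth colour in the template's array is not used by this check, because the script only emits four series.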

After adding the template, wait a few minutes for Nagios to collect data, and the graph drawn by the custom pnp4nagios template will appear.

[Image: memory usage graph drawn by the custom template]

The same approach works for similar graphs such as network interface traffic and disk reads and writes.

References:

https://github.com/June-Wang/NagiosPlugins

http://docs.pnp4nagios.org/pnp-0.6/tpl

http://www.itnms.info/discuz/forum.php?mod=viewthread&tid=2788&page=1
