Consul - alert if disk is full

The Consul demo has disk usage and memory usage checks.

http://demo.consul.io/ui/#/ams2/nodes/ams2-server-1

How would you write a configuration that does what the demo does, with a warning at 10% free and critical at 5% free?

This is what I am trying

{
  "check": {
     "name": "Disk Util",
     "script": "disk_util=$(df -k | grep '/dev/sda1' | awk '{print $5}' | sed 's/[^0-9]*//g' ) | if [ $disk_util >  90 ] ; then echo 'Disk /dev/sda above 90% full' && exit 1; elif [ $disk_util > 80 ] ; then echo 'Disk /dev/sda above 80%' && exit 3;  else exit 0; fi",
     "interval": "2m"
     }
}

      

Here is the same script but more human readable

disk_util=$(df -k | grep '/dev/sda1' | awk '{print $5}' | sed 's/[^0-9]*//g' ) | 
if [ $disk_util >  90 ] 
 then echo 'Disk /dev/sda above 90% full' && exit 1
elif [ $disk_util > 80 ] 
 then echo 'Disk /dev/sda above 80%' && exit 3
else exit 0; fi

      

The check seems to run, but it does not print any text. How can I verify that it works and see its output?


4 answers


Your check JSON file will look something like this:



{
  "check": {
    "name": "disks",
    "notes": "Critical 5%, warning 10% free",
    "script": "/path/to/check_disk -w 10% -c 5%",
    "interval": "2m"
  }
}

      


The exit code for your warning condition must be 1; for critical, it must be 2 or higher (see the section on check scripts at https://www.consul.io/docs/agent/checks.html ), so you probably want to change your exit lines.

Your "OK" branch (disk usage below 80%) prints nothing, which is most likely why you are seeing empty output.
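As a rough sketch of both points, the script from the question could be reshaped as below. It keeps the question's 80% / 90% usage thresholds, prints a line on every branch, and returns 1 for warning and 2 for critical. The helper names classify_usage and disk_usage_percent are made up for illustration, and checking the root filesystem with df -P / is an assumption; adjust the path to the filesystem you care about.

```shell
#!/bin/sh
# Sketch only: helper names and the "/" mount point are assumptions.

# Print the used percentage (digits only) of the filesystem holding $1.
disk_usage_percent() {
    # df -P prints one line per filesystem; column 5 is the usage, e.g. "42%".
    df -P "$1" | awk 'NR==2 {print $5}' | tr -d '%'
}

# Print a status line and return Consul's convention:
# 0 = passing, 1 = warning, anything else = critical.
classify_usage() {
    usage=$1
    case $usage in
        ''|*[!0-9]*)
            echo "UNKNOWN: could not read disk usage"
            return 3
            ;;
    esac
    if [ "$usage" -gt 90 ]; then
        echo "CRITICAL: disk usage at ${usage}%"
        return 2
    elif [ "$usage" -gt 80 ]; then
        echo "WARNING: disk usage at ${usage}%"
        return 1
    else
        echo "OK: disk usage at ${usage}%"
        return 0
    fi
}

status=0
classify_usage "$(disk_usage_percent /)" || status=$?
# In the real check script, finish with: exit "$status"
```

Note that the comparison uses -gt inside [ ]; a bare > there is a shell redirection, not a numeric comparison.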



I also second the idea of using Nagios plugins rather than rolling your own. Many operating systems provide nagios-plugins package(s) that can be installed with yum / apt.


The health checks depend on the check's exit code. To test whether the health checks are being picked up by the Consul server, you can write a script that always exits with 1, and you will see that the health check is no longer passing. Then replace it with a script that always returns 0, and you should see the health check pass.

If you want to return text to the UI, add an output field to the JSON.
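Before wiring a script into Consul, it can help to run it by hand and look at both its output and its exit code. A minimal sketch, where run_check is a hypothetical helper (not part of Consul) and the two commands stand in for the always-failing and always-passing scripts described above:

```shell
# run_check is a made-up helper: it runs a command string through sh
# and reports what the command printed and how it exited.
run_check() {
    rc=0
    output=$(sh -c "$1") || rc=$?
    echo "exit=$rc output=$output"
}

run_check "echo 'always failing'; exit 1"   # prints: exit=1 output=always failing
run_check "echo 'always passing'; exit 0"   # prints: exit=0 output=always passing
```

The same helper can be pointed at the real script string from the check definition to see what Consul would receive.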


It seems that Consul only reads stdout, not stderr. I tested with a redirect ( 2>&1 ) in the service check config file, and it appears to work!

JSON configuration

{
  "check": {
    "name": "disks",
    "notes": "Critical 5%, warning 10% free",
    "script": "/path/to/check_disk -w 10% -c 5% 2>&1",
    "interval": "2m"
  }
}
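To see why the redirect matters, here is a small sketch that assumes, as this answer suggests, that only stdout is captured. noisy_check is a hypothetical stand-in for a plugin that writes its message to stderr:

```shell
# noisy_check is a made-up stand-in for a plugin that reports on stderr.
noisy_check() {
    echo "DISK CRITICAL" >&2
    return 2
}

# Command substitution captures stdout only, so the message is lost
# (stderr is sent to /dev/null here just to keep the terminal quiet).
stdout_only=$(noisy_check 2>/dev/null) || true

# With 2>&1, stderr is merged into stdout and gets captured too.
merged=$(noisy_check 2>&1) || true

echo "stdout only: '$stdout_only'"   # prints: stdout only: ''
echo "with 2>&1:   '$merged'"        # prints: with 2>&1:   'DISK CRITICAL'
```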

      
