Monitoring


Monitoring

Collectd

Collectd is a simple data collector. It uses input plugins to collect data and output plugins to send that data to another tool (heka in our configuration).
Collectd collects the following metrics (compute node, simple cluster):

Metrics

For plugin details, see the collectd.conf man page: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#

  • cpu (CPU usage)
  • df (disk usage/free size)
  • disk (disk usage/IOPS)
  • interface (interface usage/bytes sent and received)
  • load (Linux load average)
  • memory (memory usage)
  • processes (detailed monitoring of collectd and hekad)
  • swap (swap usage)

Output

Collectd saves all data in RRD files and also sends it to heka using the write_http plugin (https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_write_http). The data is sent in JSON format to the local hekad. (Open question: why do we use a local heka on each node?)
Plugin configuration:

<LoadPlugin write_http>
  Globals false
</LoadPlugin>

<Plugin "write_http">
  <URL "http://127.0.0.1:8325">
    Format "JSON"
    StoreRates true
  </URL>
</Plugin>
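According to the write_http documentation, StoreRates true makes the plugin convert counter-type values (for example DERIVE) to per-second rates before sending them, instead of sending the raw, ever-increasing counter. A minimal sketch of that conversion is below. The function name derive_rate and the sample numbers are hypothetical, and the sketch ignores the counter wrap-around and missing-value handling that collectd itself performs.

```python
def derive_rate(prev_value, prev_time, value, time):
    """Per-second rate between two samples of an increasing counter.

    Illustration only: not collectd's actual code path.
    """
    elapsed = time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (value - prev_value) / elapsed


# Hypothetical raw counter readings taken 10 seconds apart.
print(derive_rate(1000, 0.0, 1018, 10.0))
```

This is why values of type "derive" in the JSON payload are usually fractional numbers rather than integers.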

Hekad listens on 127.0.0.1:8325:

# netstat -ntpl | grep 8325
tcp        0      0 127.0.0.1:8325          0.0.0.0:*               LISTEN      15368/hekad
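The same check can be scripted, for example from a health-check job. The sketch below only tests whether something accepts TCP connections on the address; unlike netstat -p it cannot tell which process owns the socket. The helper name is_listening is an assumption made for this example.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


# Address and port taken from the write_http configuration.
print(is_listening("127.0.0.1", 8325))
```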

Debug

It is possible to debug the data transfer from collectd to hekad, e.g. with tcpflow or your favorite tool for dumping HTTP traffic.
Run the dumping tool:

  • heka listens on port 8325 (taken from the write_http config)
  • heka listens on 127.0.0.1, so the traffic goes through the loopback interface lo; this is easy to confirm:

# ip ro get 127.0.0.1
local 127.0.0.1 dev lo src 127.0.0.1
    cache <local>

dev lo is the device you need.

# tcpflow -i lo port 8325

Example of output:
root@node-7:~/1# cat 127.000.000.001.45848-127.000.000.001.08325 | head -8
POST / HTTP/1.1
User-Agent: collectd/5.4.0.git
Host: 127.0.0.1:8325
Accept: */*
Content-Type: application/json
Content-Length: 4064

[{"values":[2160],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.259,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_stacksize","type_instance":""},{"values":[0,1999.74],"dstypes":["derive","derive"],"dsnames":["user","syst"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_cputime","type_instance":""},{"values":[1,11],"dstypes":["gauge","gauge"],"dsnames":["processes","threads"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_count","type_instance":""},{"values":[40.9947,0],"dstypes":["derive","derive"],"dsnames":["minflt","majflt"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_pagefaults","type_instance":""},{"values":[14424.2,0],"dstypes":["derive","derive"],"dsnames":["read","write"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_disk_octets","type_instance":""},{"values":[101.407,0],"dstypes":["derive","derive"],"dsnames":["read","write"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_disk_ops","type_instance":""},{"values":[5.18824e+08],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_vm","type_instance":""},{"values":[1.08454e+08],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_rss","type_instance":""},{"values":[4.85949e+08],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_data","type_instance":""},{"values":[1.36684e+07],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_code","type_instance":""},{"values":[1.39894e+14],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_stacksize","type_instance":""},{"values":[5000.36,1000.07],"dstypes":["derive","derive"],"dsnames":["user","syst"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_cputime","type_instance":""},{"values":[1,8],"dstypes":["gauge","gauge"],"dsnames":["processes","threads"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_count","type_instance":""},{"values":[0,0],"dstypes":["derive","derive"],"dsnames":["minflt","majflt"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_pagefaults","type_instance":""},{"values":[27103.5,4280.91],"dstypes":["derive","derive"],"dsnames":["read","write"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_disk_octets","type_instance":""},{"values":[146.411,2.10015],"dstypes":["derive","derive"],"dsnames":["read","write"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"hekad","type":"ps_disk_ops","type_instance":""},{"values":[1.90014],"dstypes":["derive"],"dsnames":["value"],"time":1453203196.262,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"","type":"fork_rate","type_instance":""},{"values":[1.79999],"dstypes":["derive"],"dsnames":["value"],"time":1453203206.225,"interval":10.000,"host":"node-7","plugin":"cpu","plugin_instance":"0","type":"cpu","type_instance":"user"},{"values":[0],"dstypes":["derive"],"dsnames":["value"],"time":1453203206.225,"interval":10.000,"host":"node-7","plugin":"cpu","plugin_instance":"0","type":"cpu","type_instance":"nice"}]POST / HTTP/1.1
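The captured body is a JSON array in which each element carries parallel values, dstypes and dsnames lists. To make a dump easier to read, the sketch below flattens such a body into one row per data source and labels it with a collectd-style identifier (host/plugin-instance/type-instance). The function names identifier and flatten are assumptions for this example; SAMPLE holds two entries copied from the capture above.

```python
import json


def identifier(entry):
    """Build host/plugin[-plugin_instance]/type[-type_instance]."""
    plugin = entry["plugin"]
    if entry.get("plugin_instance"):
        plugin += "-" + entry["plugin_instance"]
    type_ = entry["type"]
    if entry.get("type_instance"):
        type_ += "-" + entry["type_instance"]
    return "/".join((entry["host"], plugin, type_))


def flatten(body):
    """Yield (identifier, dsname, dstype, value) for each data source."""
    for entry in json.loads(body):
        rows = zip(entry["dsnames"], entry["dstypes"], entry["values"])
        for name, dstype, value in rows:
            yield identifier(entry), name, dstype, value


# Two entries copied from the tcpflow capture.
SAMPLE = (
    '[{"values":[5000.36,1000.07],"dstypes":["derive","derive"],'
    '"dsnames":["user","syst"],"time":1453203196.262,"interval":10.000,'
    '"host":"node-7","plugin":"processes","plugin_instance":"hekad",'
    '"type":"ps_cputime","type_instance":""},'
    '{"values":[1.79999],"dstypes":["derive"],"dsnames":["value"],'
    '"time":1453203206.225,"interval":10.000,"host":"node-7",'
    '"plugin":"cpu","plugin_instance":"0","type":"cpu","type_instance":"user"}]'
)

for row in flatten(SAMPLE):
    print(*row)
```

To use it on a real dump, pass only the JSON body (the part after the blank line that ends the HTTP headers) to flatten.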

Config Files

All config files are in /etc/collectd/.
/etc/collectd/conf.d stores the plugin configuration files:

# ls -lsa /etc/collectd/conf.d/
4 -rw-r----- 1 root root  169 Jan 18 16:38 05-logfile.conf
4 -rw-r----- 1 root root   71 Jan 18 16:38 10-cpu.conf
4 -rw-r----- 1 root root  289 Jan 18 16:38 10-df.conf
4 -rw-r----- 1 root root  145 Jan 18 16:38 10-disk.conf
4 -rw-r----- 1 root root  189 Jan 18 16:38 10-interface.conf
4 -rw-r----- 1 root root   72 Jan 18 16:38 10-load.conf
4 -rw-r----- 1 root root   74 Jan 18 16:38 10-memory.conf
4 -rw-r----- 1 root root   77 Jan 18 16:38 10-processes.conf
4 -rw-r----- 1 root root  138 Jan 18 16:38 10-swap.conf
4 -rw-r----- 1 root root   73 Jan 18 16:38 10-users.conf
4 -rw-r----- 1 root root  189 Jan 18 16:38 10-write_http.conf
4 -rw-r----- 1 root root   66 Jan 18 16:38 processes-config.conf
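To see which plugin each of these files loads, the directory can be scanned for LoadPlugin statements. This is a rough sketch, not a collectd config parser: it assumes each plugin is loaded with either a plain "LoadPlugin name" line or a "<LoadPlugin name>" block, as in the write_http snippet earlier, and the helper name loaded_plugins is made up for this example.

```python
import re
from pathlib import Path

# Matches 'LoadPlugin cpu' and '<LoadPlugin write_http>', but not
# commented-out lines or the closing '</LoadPlugin>' tag.
LOAD_PLUGIN = re.compile(r'^\s*<?LoadPlugin\s+"?([\w-]+)"?', re.MULTILINE)


def loaded_plugins(conf_dir):
    """Map plugin name -> first *.conf file (in sorted order) that loads it."""
    plugins = {}
    for path in sorted(Path(conf_dir).glob("*.conf")):
        for name in LOAD_PLUGIN.findall(path.read_text()):
            plugins.setdefault(name, path.name)
    return plugins
```

For example, loaded_plugins("/etc/collectd/conf.d") returns a dictionary of plugin names; since the files are readable by root only, it has to run as root.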

Heka