=Monitoring=

==Collectd==
 
Collectd is a simple data collector: it uses input plugins to collect data and output plugins to send the data to another tool (heka in our configuration).<BR>
 
Collectd collects the following metrics (compute node, simple cluster):
 
 
===Metrics===
 
Please see plugin details in the collectd man page: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#
 
* cpu (CPU usage)
 
* df (disk usage/free size)
 
* disk (disk usage/IOPS)
 
* interface (interface usage/bytes sent and received)
 
* load (Linux LA)
 
* memory (memory usage)
 
* processes (detailed monitoring of collectd and hekad)
 
* swap (swap usage)
 
* openstack metrics (python plugin)
 
* other metrics
 
* your custom metrics if added
 
 
===Output===
 
Collectd saves all data in RRD files and sends it to hekad using the write_http plugin (https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_write_http). It sends data in JSON format to the local hekad. <B>(BTW: why do we use a local heka on each node?)</B>
 
<BR>Plugin configuration:
 
<PRE>
 
<LoadPlugin write_http>
  Globals false
</LoadPlugin>

<Plugin "write_http">
  <URL "http://127.0.0.1:8325">
    Format "JSON"
    StoreRates true
  </URL>
</Plugin>
 
</PRE>
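Format "JSON" makes collectd encode each batch of values as a JSON list, and StoreRates true converts COUNTER and DERIVE values into rates before submitting, so hekad receives ready-to-use rates instead of raw counters.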
 
Hekad is listening on 127.0.0.1:8325:
 
<PRE>
 
# netstat -ntpl | grep 8325
 
tcp        0      0 127.0.0.1:8325          0.0.0.0:*               LISTEN      15368/hekad
 
</PRE>
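It is also possible to test this path without collectd at all: hand-craft a request in the same JSON format and POST it to the listener. A minimal sketch (Python 3; the payload fields are copied from a real write_http request shown in the debug section below, the values are made up):
<PRE>
# Send one hand-made metric to the local hekad in the write_http JSON format
import json
import urllib.request

sample = [{
    "values": [42], "dstypes": ["gauge"], "dsnames": ["value"],
    "time": 1453203196.259, "interval": 10.0,
    "host": "node-7", "plugin": "test", "plugin_instance": "",
    "type": "gauge", "type_instance": "debug",
}]

req = urllib.request.Request(
    "http://127.0.0.1:8325",
    data=json.dumps(sample).encode(),
    headers={"Content-Type": "application/json"})
print(urllib.request.urlopen(req).status)  # expect HTTP 200 from hekad
</PRE>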
 
====Chain====
 
Details: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#filter_configuration
 
<BR>
 
<PRE>
 
<Chain "PostCache">
 
<Rule>
 
<Match "regex">
 
Plugin "^pacemaker_resource$"
 
TypeInstance "^vip__management$"
 
</Match>
 
<Target "notification">
 
Message "{\"resource\":\"%{type_instance}\",\"value\":%{ds:value}}"
 
Severity "OKAY"
 
</Target>
 
</Rule>
 
Target "write"
 
</Chain>
 
</PRE>
 
 
This rule creates a notification in the log file if a message matches the regexp:
 
<PRE>
 
[2016-01-20 14:00:54] Notification: severity = OKAY, host = node-6, plugin = pacemaker_resource, type = gauge, type_instance = vip__management, message = {"resource":"vip__management","value":1}
 
</PRE>
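The rule logic is easy to model; a rough Python illustration of what the rule above does (not collectd code, just the match plus the %{type_instance}/%{ds:value} substitution):
<PRE>
import re

def post_cache(plugin, type_instance, value):
    # <Match "regex">: both regexes must match the value list
    if re.search("^pacemaker_resource$", plugin) and \
       re.search("^vip__management$", type_instance):
        # <Target "notification">: render the Message template
        message = '{"resource":"%s","value":%s}' % (type_instance, value)
        print('Notification: severity = OKAY, message = %s' % message)
    # the final Target "write" passes every value on to the write plugins

post_cache("pacemaker_resource", "vip__management", 1)
</PRE>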
 
====Debug====
 
=====Debug http traffic=====
 
It is possible to debug the data transfer from collectd to hekad, e.g. using tcpflow or your favorite tool for dumping HTTP traffic.

<BR>Run the dumping tool:

* hekad listens on port 8325 (taken from the write_http config)

* hekad listens on 127.0.0.1, so the traffic goes over the loopback interface; the interface is easy to find:
 
** {{Root|<nowiki>
 
# ip ro get 127.0.0.1
 
local 127.0.0.1 dev lo src 127.0.0.1
 
cache <local>
 
</nowiki>}}
 
**<B>dev lo</B> is the device you need.
 
* {{Root|<nowiki># tcpflow -i lo port 8325 </nowiki>}}
 
* Example of output: {{Root|<nowiki># cat 127.000.000.001.45848-127.000.000.001.08325 | head -8
 
POST / HTTP/1.1
 
User-Agent: collectd/5.4.0.git
 
Host: 127.0.0.1:8325
 
Accept: */*
 
Content-Type: application/json
 
Content-Length: 4064
 
 
[{"values":[2160],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.259,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_stacksize","type_instance":""},{"values":[0,1999.74],"dstypes":["derive","derive"],"dsnames":
 
...
 
skip
 
...
 
</nowiki>}}
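The captured flow is just HTTP requests glued together, so the JSON bodies are easy to pull out. A quick-and-dirty sketch (Python 3.6+; assumes the capture file name from the example above):
<PRE>
import json

with open("127.000.000.001.45848-127.000.000.001.08325", "rb") as f:
    raw = f.read()

# each request is "headers \r\n\r\n body"; the body is followed by the
# next request's headers, so cut at the next "POST" before parsing
for chunk in raw.split(b"\r\n\r\n")[1:]:
    try:
        body = json.loads(chunk.split(b"POST / HTTP/1.1")[0])
    except ValueError:
        continue
    for v in body:
        print(v["host"], v["plugin"], v["type"], v["values"])
</PRE>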
 
 
=====Debug with your own write plugin=====
 
One more way to debug is to create your own write plugin and log whatever you need.

<BR>For example, I created a simple write plugin in Python:
 
* Create plugin configuration, e.g. in /etc/collectd/conf.d/openstack.conf
 
<PRE>
 
<Plugin python>
  ModulePath "/usr/lib/collectd"
  Import "write_file"
  <Module "write_file">
    log_filename "/var/log/collectd_debug.log"
  </Module>
</Plugin>
 
</PRE>
 
 
Create the file /usr/lib/collectd/write_file.py (the path depends on your ModulePath; by default it is "/usr/lib/collectd"):
 
<PRE>
 
import collectd

# file handle shared between the configure and write callbacks
f_log_file = None


def configure_callback(conf):
    global f_log_file
    log_filename = '/var/log/collectd_debug.log'
    for c in conf.children:
        if c.key == 'log_filename':
            log_filename = c.values[0]
        else:
            collectd.warning('write_file plugin: Unknown config key: %s.' % c.key)
    collectd.info('write_file plugin: configured with log_filename=%s' % log_filename)
    f_log_file = open(log_filename, 'w')


def write_callback(vl, data=None):
    # vl is a collectd.Values object; dump every data source value
    for i in vl.values:
        f_log_file.write("%s (%s): %f\n" % (vl.plugin, vl.type, i))
    f_log_file.flush()


collectd.register_config(configure_callback)
collectd.register_write(write_callback)
 
</PRE>
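After restarting collectd the values should show up in /var/log/collectd_debug.log, one line per data source, e.g. <B>processes (ps_stacksize): 2160.000000</B>.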
 
 
=====Debug with unixsock plugin=====
 
One more way to get debug information is the unixsock plugin (https://collectd.org/documentation/manpages/collectd-unixsock.5.shtml).


Add the config and restart collectd:
 
<PRE>
 
# cat 98-unixsock.conf
 
<LoadPlugin unixsock>
  Globals false
</LoadPlugin>

<Plugin unixsock>
  SocketFile "/var/run/collectd-unixsock"
  SocketGroup "collectd"
  SocketPerms "0770"
  DeleteSocket true
</Plugin>
 
</PRE>
 
 
 
<PRE>
 
# collectdctl listval
 
node-6/apache-localhost/apache_bytes
 
node-6/apache-localhost/apache_connections
 
node-6/apache-localhost/apache_idle_workers
 
node-6/apache-localhost/apache_requests
 
node-6/apache-localhost/apache_scoreboard-closing
 
node-6/apache-localhost/apache_scoreboard-dnslookup
 
node-6/apache-localhost/apache_scoreboard-finishing
 
node-6/apache-localhost/apache_scoreboard-idle_cleanup
 
node-6/apache-localhost/apache_scoreboard-keepalive
 
node-6/apache-localhost/apache_scoreboard-logging
 
node-6/apache-localhost/apache_scoreboard-open
 
node-6/apache-localhost/apache_scoreboard-reading
 
node-6/apache-localhost/apache_scoreboard-sending
 
node-6/apache-localhost/apache_scoreboard-starting
 
node-6/apache-localhost/apache_scoreboard-waiting
 
node-6/check_openstack_api-cinder-api/gauge-RegionOne
 
...
 
Skip
 
...
 
</PRE>
 
 
 
<PRE>
 
# collectdctl getval node-6/swap/swap-free
 
value=1.923355e+09
 
</PRE>
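collectdctl is only a thin client for this socket; the same text protocol (one command per line, see collectd-unixsock(5)) can be spoken directly, e.g. from Python:
<PRE>
# Sketch: query the unixsock plugin directly, without collectdctl.
# The reply starts with a status line ("N Value(s) found"), then N lines.
import socket

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.connect("/var/run/collectd-unixsock")
f = s.makefile("rw")

f.write('GETVAL "node-6/swap/swap-free"\n')
f.flush()
n = int(f.readline().split()[0])   # number of values in the answer
for _ in range(n):
    print(f.readline().strip())    # e.g. value=1.923355e+09
</PRE>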
 
 
===Config Files===
 
All config files are in /etc/collectd/
 
<BR>
 
/etc/collectd/conf.d stores the plugin configuration files:
 
<PRE>
 
# ls -lsa /etc/collectd/conf.d/
 
4 -rw-r----- 1 root root 169 Jan 18 16:38 05-logfile.conf
 
4 -rw-r----- 1 root root 71 Jan 18 16:38 10-cpu.conf
 
4 -rw-r----- 1 root root 289 Jan 18 16:38 10-df.conf
 
4 -rw-r----- 1 root root 145 Jan 18 16:38 10-disk.conf
 
4 -rw-r----- 1 root root 189 Jan 18 16:38 10-interface.conf
 
4 -rw-r----- 1 root root 72 Jan 18 16:38 10-load.conf
 
4 -rw-r----- 1 root root 74 Jan 18 16:38 10-memory.conf
 
4 -rw-r----- 1 root root 77 Jan 18 16:38 10-processes.conf
 
4 -rw-r----- 1 root root 138 Jan 18 16:38 10-swap.conf
 
4 -rw-r----- 1 root root 73 Jan 18 16:38 10-users.conf
 
4 -rw-r----- 1 root root 189 Jan 18 16:38 10-write_http.conf
 
4 -rw-r----- 1 root root 66 Jan 18 16:38 processes-config.conf
 
</PRE>
 
On a controller node there are more plugins configured:
 
<PRE>
 
# ls -1
 
05-logfile.conf
 
10-apache.conf
 
10-cpu.conf
 
10-dbi.conf
 
10-df.conf
 
10-disk.conf
 
10-interface.conf
 
10-load.conf
 
10-match_regex.conf
 
10-memcached.conf
 
10-memory.conf
 
10-mysql.conf
 
10-processes.conf
 
10-swap.conf
 
10-target_notification.conf
 
10-users.conf
 
10-write_http.conf
 
99-chain-PostCache.conf
 
dbi_cinder_services.conf
 
dbi_mysql_status.conf
 
dbi_neutron_agents.conf
 
dbi_nova_services.conf
 
mysql-nova.conf
 
openstack.conf
 
processes-config.conf
 
</PRE>
 
   
 
==Heka==
 

Heka is an open source stream processing software system developed by Mozilla. Heka is a “Swiss Army Knife” type tool for data processing, useful for a wide variety of different tasks, such as:

* Loading and parsing log files from a file system.
* Accepting statsd type metrics data for aggregation and forwarding to upstream time series data stores such as graphite or InfluxDB.
* Launching external processes to gather operational data from the local system.
* Performing real time analysis, graphing, and anomaly detection on any data flowing through the Heka pipeline.
* Shipping data from one location to another via the use of an external transport (such as AMQP) or directly (via TCP).
* Delivering processed data to one or more persistent data stores.

===Inputs===

There are two types of input plugins used in heka (all file_match values are regular expressions; see the check after this list):

* HttpListenInput
** 127.0.0.1:8325; collectd_decoder
* LogstreamerInput
** /var/log/libvirt; libvirt_decoder
** file_match = '(?P<Service>nova|cinder|keystone|glance|heat|neutron|murano)-all\.log$', openstack_decoder
** "/var/log/dashboard\.log$'; decoder = "openstack_decoder"; splitter = "TokenSplitter"
** file_match = '(?P<Service>nova|cinder|keystone|glance|heat|neutron|murano)-all\.log$'; differentiator = [ 'openstack.', 'Service' ]; decoder = "openstack_decoder"; splitter = "openstack_splitter"
** file_match = '(?P<Service>ovs\-vswitchd|ovsdb\-server)\.log$'; differentiator = [ 'Service' ]; decoder = "ovs_decoder"; splitter = "TokenSplitter"
** file_match = '(?P<Service>daemon\.log|cron\.log|haproxy\.log|kern\.log|auth\.log|syslog|messages|debug)'; differentiator = [ 'system.', 'Service' ]; decoder = "system_decoder"
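The file_match values above are plain regular expressions; a small check of which log file names each pattern picks up (Python's re is close enough to the Go regexp syntax heka uses for these patterns):
<PRE>
import re

patterns = [
    r'(?P<Service>nova|cinder|keystone|glance|heat|neutron|murano)-all\.log$',
    r'(?P<Service>ovs\-vswitchd|ovsdb\-server)\.log$',
    r'(?P<Service>daemon\.log|cron\.log|haproxy\.log|kern\.log|auth\.log|syslog|messages|debug)',
]

for name in ["nova-all.log", "ovs-vswitchd.log", "syslog", "dashboard.log"]:
    for p in patterns:
        m = re.search(p, name)
        if m:
            print("%s -> Service=%s" % (name, m.group("Service")))
</PRE>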


===Splitters===

Splitter details: https://hekad.readthedocs.org/en/v0.10.0/config/splitters/index.html
<BR>There is only one custom splitter:
<PRE>
[openstack_splitter]
type = "RegexSplitter"
delimiter = '(<[0-9]+>)'
delimiter_eol = false
</PRE>
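What this splitter does: records in the stream are separated by a syslog-style priority tag "<NNN>", and because the delimiter is a capture group and delimiter_eol is false, the matched tag is kept and attached to the start of the following record. A Python illustration:
<PRE>
import re

stream = "<134>first openstack log line<135>second line"
parts = re.split(r'(<[0-9]+>)', stream)   # re.split keeps captured delimiters
# delimiter_eol = false: glue each delimiter to the record AFTER it
records = [parts[i] + parts[i + 1] for i in range(1, len(parts) - 1, 2)]
print(records)  # ['<134>first openstack log line', '<135>second line']
</PRE>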

===Decoders===

<PRE>
decoder-collectd.toml
decoder-libvirt.toml
decoder-openstack.toml
decoder-ovs.toml
decoder-system.toml
</PRE>