Monitoring: differences between revisions
From noname.com.ua
Sirmax (talk | contribs) |
Sirmax (talk | contribs) |
||
Line 1: | Line 1: | ||
=Monitoring= |
=Monitoring= |
||
==Collectd== |
==Collectd== |
||
− | |||
− | Collectd is a simple data collector: input plugins collect data and output plugins send it to another tool (Heka in our configuration).<BR> |
||
− | Collectd collects the following metrics (compute node, simple cluster): |
||
− | |||
− | ===Metrics=== |
||
− | See the collectd man page for plugin details: https://collectd.org/documentation/manpages/collectd.conf.5.shtml# |
||
− | * cpu (CPU usage) |
||
− | * df (disk usage/free size) |
||
− | * disk (disk usage/IOPS) |
||
− | * interface (interface usage/bytes sent and received) |
||
− | * load (Linux load average) |
||
− | * memory (memory usage) |
||
− | * processes (detailed monitoring of collectd and hekad) |
||
− | * swap (swap usage) |
||
− | * openstack metrics (python plugin) |
||
− | * other metrics |
||
− | * your custom metrics if added |
||
− | |||
− | ===Output=== |
||
− | Collectd saves all data in RRD files and sends it to Heka using the write_http plugin (https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_write_http). It sends the data in JSON format to the local hekad. <B>(BTW, why do we use a local Heka on each node?)</B> |
||
− | <BR>Plugin configuration: |
||
− | <PRE> |
||
− | <LoadPlugin write_http> |
||
− | Globals false |
||
− | </LoadPlugin> |
||
− | |||
− | <Plugin "write_http"> |
||
− | <URL "http://127.0.0.1:8325"> |
||
− | Format "JSON" |
||
− | StoreRates true |
||
− | </URL> |
||
− | </Plugin> |
||
− | </PRE> |
||
− | Hekad listens on 127.0.0.1:8325: |
||
− | <PRE> |
||
− | # netstat -ntpl | grep 8325 |
||
− | tcp 0 0 127.0.0.1:8325 0.0.0.0:* LISTEN 15368/hekad |
||
− | </PRE> |
||
− | ====Chain==== |
||
− | Details: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#filter_configuration |
||
− | <BR> |
||
− | <PRE> |
||
− | <Chain "PostCache"> |
||
− | <Rule> |
||
− | <Match "regex"> |
||
− | Plugin "^pacemaker_resource$" |
||
− | TypeInstance "^vip__management$" |
||
− | </Match> |
||
− | <Target "notification"> |
||
− | Message "{\"resource\":\"%{type_instance}\",\"value\":%{ds:value}}" |
||
− | Severity "OKAY" |
||
− | </Target> |
||
− | </Rule> |
||
− | Target "write" |
||
− | </Chain> |
||
− | </PRE> |
||
− | |||
− | This rule writes a notification to the log file whenever a value's plugin and type instance match the regular expressions: |
||
− | <PRE> |
||
− | [2016-01-20 14:00:54] Notification: severity = OKAY, host = node-6, plugin = pacemaker_resource, type = gauge, type_instance = vip__management, message = {"resource":"vip__management","value":1} |
||
− | </PRE> |
||
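To post-process such notifications, the log line can be split back into its fields. The following is a minimal sketch that assumes the line layout of the single sample above (it is not based on a documented grammar); the names `NOTIFICATION_RE` and `parse_notification` are illustrative only.

```python
import json
import re

# Pattern derived from the one sample line above; this is an assumption,
# not an official description of the collectd log format.
NOTIFICATION_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] Notification: "
    r"severity = (?P<severity>\w+), "
    r"host = (?P<host>[^,]+), "
    r"plugin = (?P<plugin>[^,]+), "
    r"type = (?P<type>[^,]+), "
    r"type_instance = (?P<type_instance>[^,]+), "
    r"message = (?P<message>.*)$"
)


def parse_notification(line):
    """Return a dict of notification fields, or None if the line does not match."""
    match = NOTIFICATION_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    try:
        # The rule above formats the message as JSON, so decode it when possible.
        fields["payload"] = json.loads(fields["message"])
    except ValueError:
        fields["payload"] = None
    return fields


sample = ('[2016-01-20 14:00:54] Notification: severity = OKAY, host = node-6, '
          'plugin = pacemaker_resource, type = gauge, '
          'type_instance = vip__management, '
          'message = {"resource":"vip__management","value":1}')
parsed = parse_notification(sample)
print(parsed["host"], parsed["severity"], parsed["payload"])
```

Lines that do not follow this layout simply return `None`, so the function can be run over a whole log file.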
− | ====Debug==== |
||
− | =====Debug http traffic===== |
||
− | It is possible to debug the data transfer from collectd to hekad, e.g. with tcpflow or your favorite tool for dumping HTTP traffic. |
||
− | <BR>Run the dumping tool: |
||
− | * Heka listens on port 8325 (taken from the write_http config) |
||
− | * Heka listens on 127.0.0.1, so the interface is the loopback (lo); this is easy to confirm: |
||
− | ** {{Root|<nowiki> |
||
− | # ip ro get 127.0.0.1 |
||
− | local 127.0.0.1 dev lo src 127.0.0.1 |
||
− | cache <local> |
||
− | </nowiki>}} |
||
− | **<B>dev lo</B> is the device you need. |
||
− | * {{Root|<nowiki># tcpflow -i lo port 8325 </nowiki>}} |
||
− | * Example of output: {{Root|<nowiki># cat 127.000.000.001.45848-127.000.000.001.08325 | head -8 |
||
− | POST / HTTP/1.1 |
||
− | User-Agent: collectd/5.4.0.git |
||
− | Host: 127.0.0.1:8325 |
||
− | Accept: */* |
||
− | Content-Type: application/json |
||
− | Content-Length: 4064 |
||
− | |||
− | [{"values":[2160],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.259,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_stacksize","type_instance":""},{"values":[0,1999.74],"dstypes":["derive","derive"],"dsnames": |
||
− | ... |
||
− | skip |
||
− | ... |
||
− | </nowiki>}} |
||
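A captured request body can be decoded to see which metrics were sent. The sketch below is a rough illustration that relies only on the JSON field names visible in the dump above; the function name `summarize` is hypothetical, and the sample is the first element of that dump closed off as a complete JSON array.

```python
import json


def summarize(payload):
    """Yield (identifier, dsname, value) for each data source in a write_http JSON body."""
    for item in json.loads(payload):
        plugin = item["plugin"]
        if item.get("plugin_instance"):
            plugin += "-" + item["plugin_instance"]
        type_ = item["type"]
        if item.get("type_instance"):
            type_ += "-" + item["type_instance"]
        # Same host/plugin-instance/type-instance layout that collectdctl prints.
        identifier = "/".join((item["host"], plugin, type_))
        for dsname, value in zip(item["dsnames"], item["values"]):
            yield identifier, dsname, value


# First element of the captured body, closed off so that it is valid JSON.
sample = ('[{"values":[2160],"dstypes":["gauge"],"dsnames":["value"],'
          '"time":1453203196.259,"interval":10.000,"host":"node-7",'
          '"plugin":"processes","plugin_instance":"collectd",'
          '"type":"ps_stacksize","type_instance":""}]')

for identifier, dsname, value in summarize(sample):
    print(identifier, dsname, value)
```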
− | |||
− | =====Debug with your own write plugin===== |
||
− | Another way to debug is to create your own write plugin and log whatever you need. |
||
− | <BR>For example, I created a simple write plugin in Python: |
||
− | * Create the plugin configuration, e.g. in /etc/collectd/conf.d/openstack.conf |
||
− | <PRE> |
||
− | Import write_file |
||
− | <Module "write_file"> |
||
− | log_filename "/var/log/collectd_debug.log" |
||
− | </Module> |
||
− | </PRE> |
||
− | |||
− | Create the file /usr/lib/collectd/write_file.py (the location depends on your ModulePath; by default it is "/usr/lib/collectd"): |
||
− | <PRE> |
||
− | import collectd |
||
− | |||
− | # File object opened in configure_callback and used by write_callback. |
||
− | f_log_file = None |
||
− | |||
− | |||
− | def configure_callback(conf): |
||
− |     """Read log_filename from the <Module> block and open the log file.""" |
||
− |     global f_log_file |
||
− |     log_filename = None |
||
− |     for c in conf.children: |
||
− |         if c.key == 'log_filename': |
||
− |             log_filename = c.values[0] |
||
− |         else: |
||
− |             collectd.warning('write_file plugin: Unknown config key: %s.' % c.key) |
||
− |     if log_filename is None: |
||
− |         collectd.error('write_file plugin: log_filename is not configured') |
||
− |         return |
||
− |     collectd.info('write_file plugin: configured with log_filename=%s' % log_filename) |
||
− |     f_log_file = open(log_filename, 'w') |
||
− | |||
− | |||
− | def write_callback(vl, data=None): |
||
− |     """Write every dispatched value to the log file.""" |
||
− |     if f_log_file is None: |
||
− |         return |
||
− |     for i in vl.values: |
||
− |         f_log_file.write("%s (%s): %f\n" % (vl.plugin, vl.type, i)) |
||
− |     f_log_file.flush() |
||
− | |||
− | |||
− | collectd.register_config(configure_callback) |
||
− | collectd.register_write(write_callback) |
||
− | </PRE> |
||
− | |||
− | =====Debug with unixsock plugin===== |
||
− | Another way to get debug information is the collectd unixsock plugin (https://collectd.org/documentation/manpages/collectd-unixsock.5.shtml) |
||
− | |||
− | Add the config and restart collectd: |
||
− | <PRE> |
||
− | # cat 98-unixsock.conf |
||
− | <LoadPlugin unixsock> |
||
− | Globals false |
||
− | </LoadPlugin> |
||
− | |||
− | <Plugin unixsock> |
||
− | SocketFile "/var/run/collectd-unixsock" |
||
− | SocketGroup "collectd" |
||
− | SocketPerms "0770" |
||
− | DeleteSocket true |
||
− | </Plugin> |
||
− | </PRE> |
||
− | |||
− | |||
− | <PRE> |
||
− | #collectdctl listval |
||
− | node-6/apache-localhost/apache_bytes |
||
− | node-6/apache-localhost/apache_connections |
||
− | node-6/apache-localhost/apache_idle_workers |
||
− | node-6/apache-localhost/apache_requests |
||
− | node-6/apache-localhost/apache_scoreboard-closing |
||
− | node-6/apache-localhost/apache_scoreboard-dnslookup |
||
− | node-6/apache-localhost/apache_scoreboard-finishing |
||
− | node-6/apache-localhost/apache_scoreboard-idle_cleanup |
||
− | node-6/apache-localhost/apache_scoreboard-keepalive |
||
− | node-6/apache-localhost/apache_scoreboard-logging |
||
− | node-6/apache-localhost/apache_scoreboard-open |
||
− | node-6/apache-localhost/apache_scoreboard-reading |
||
− | node-6/apache-localhost/apache_scoreboard-sending |
||
− | node-6/apache-localhost/apache_scoreboard-starting |
||
− | node-6/apache-localhost/apache_scoreboard-waiting |
||
− | node-6/check_openstack_api-cinder-api/gauge-RegionOne |
||
− | ... |
||
− | Skip |
||
− | ... |
||
− | </PRE> |
||
− | |||
− | |||
− | <PRE> |
||
− | # collectdctl getval node-6/swap/swap-free |
||
− | value=1.923355e+09 |
||
− | </PRE> |
||
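The same socket can also be queried from a script instead of collectdctl. The sketch below assumes the plain-text protocol as described in the collectd-unixsock man page: a status line whose leading number is the count of data lines that follow, or a negative number on error. The helper names `parse_response` and `query` are illustrative, and the reply format should be verified against your collectd version.

```python
import socket

SOCKET_PATH = "/var/run/collectd-unixsock"  # SocketFile from the config above


def parse_response(text):
    """Split a reply into (status, message, data_lines); raise on an error status."""
    lines = text.splitlines()
    status_text, _, message = lines[0].partition(" ")
    status = int(status_text)
    if status < 0:
        raise RuntimeError("collectd error: %s" % message)
    return status, message, lines[1:1 + status]


def query(command, path=SOCKET_PATH):
    """Send one command (e.g. 'LISTVAL') to the socket and return the parsed reply."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        stream = sock.makefile("rw")
        stream.write(command + "\n")
        stream.flush()
        status_line = stream.readline()
        # Assumed protocol: the status number tells how many lines to read next.
        count = max(int(status_line.split(" ", 1)[0]), 0)
        body = [stream.readline() for _ in range(count)]
        return parse_response(status_line + "".join(body))
    finally:
        sock.close()


# Offline demonstration with a canned reply shaped like the getval output above.
print(parse_response("1 Value found\nvalue=1.923355e+09\n"))
```

On a node where the plugin is loaded, `query('GETVAL "node-6/swap/swap-free"')` would be the scripted counterpart of the collectdctl call above; it needs read/write access to the socket (group "collectd" in the config shown).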
− | |||
− | ===Config Files=== |
||
− | All config files are in /etc/collectd/ |
||
− | <BR> |
||
− | /etc/collectd/conf.d stores plugin configuration files |
||
− | <PRE> |
||
− | # ls -lsa /etc/collectd/conf.d/ |
||
− | 4 -rw-r----- 1 root root 169 Jan 18 16:38 05-logfile.conf |
||
− | 4 -rw-r----- 1 root root 71 Jan 18 16:38 10-cpu.conf |
||
− | 4 -rw-r----- 1 root root 289 Jan 18 16:38 10-df.conf |
||
− | 4 -rw-r----- 1 root root 145 Jan 18 16:38 10-disk.conf |
||
− | 4 -rw-r----- 1 root root 189 Jan 18 16:38 10-interface.conf |
||
− | 4 -rw-r----- 1 root root 72 Jan 18 16:38 10-load.conf |
||
− | 4 -rw-r----- 1 root root 74 Jan 18 16:38 10-memory.conf |
||
− | 4 -rw-r----- 1 root root 77 Jan 18 16:38 10-processes.conf |
||
− | 4 -rw-r----- 1 root root 138 Jan 18 16:38 10-swap.conf |
||
− | 4 -rw-r----- 1 root root 73 Jan 18 16:38 10-users.conf |
||
− | 4 -rw-r----- 1 root root 189 Jan 18 16:38 10-write_http.conf |
||
− | 4 -rw-r----- 1 root root 66 Jan 18 16:38 processes-config.conf |
||
− | </PRE> |
||
− | On a controller node there are more plugin configuration files, and therefore more metrics: |
||
− | <PRE> |
||
− | d# ls -1 |
||
− | 05-logfile.conf |
||
− | 10-apache.conf |
||
− | 10-cpu.conf |
||
− | 10-dbi.conf |
||
− | 10-df.conf |
||
− | 10-disk.conf |
||
− | 10-interface.conf |
||
− | 10-load.conf |
||
− | 10-match_regex.conf |
||
− | 10-memcached.conf |
||
− | 10-memory.conf |
||
− | 10-mysql.conf |
||
− | 10-processes.conf |
||
− | 10-swap.conf |
||
− | 10-target_notification.conf |
||
− | 10-users.conf |
||
− | 10-write_http.conf |
||
− | 99-chain-PostCache.conf |
||
− | dbi_cinder_services.conf |
||
− | dbi_mysql_status.conf |
||
− | dbi_neutron_agents.conf |
||
− | dbi_nova_services.conf |
||
− | mysql-nova.conf |
||
− | openstack.conf |
||
− | processes-config.conf |
||
− | </PRE> |
||
==Heka== |
==Heka== |
Revision as of 18:59, 22 January 2016
Monitoring
Collectd
Heka
Heka is an open source stream processing software system developed by Mozilla. Heka is a “Swiss Army Knife” type tool for data processing, useful for a wide variety of different tasks, such as:
- Loading and parsing log files from a file system.
- Accepting statsd type metrics data for aggregation and forwarding to upstream time series data stores such as graphite or InfluxDB.
- Launching external processes to gather operational data from the local system.
- Performing real time analysis, graphing, and anomaly detection on any data flowing through the Heka pipeline.
- Shipping data from one location to another via the use of an external transport (such as AMQP) or directly (via TCP).
- Delivering processed data to one or more persistent data stores.
Inputs
Two types of input plugins are used in Heka:
- HttpListenInput
- 127.0.0.1:8325; collectd_decoder
- LogstreamerInput
- /var/log/libvirt; libvirt_decoder
- file_match = '(?P<Service>nova|cinder|keystone|glance|heat|neutron|murano)-all\.log$', openstack_decoder
- "/var/log/dashboard\.log$'; decoder = "openstack_decoder"; splitter = "TokenSplitter"
- file_match = '(?P<Service>nova|cinder|keystone|glance|heat|neutron|murano)-all\.log$'; differentiator = [ 'openstack.', 'Service' ]; decoder = "openstack_decoder"; splitter = "openstack_splitter"
- file_match = '(?P<Service>ovs\-vswitchd|ovsdb\-server)\.log$';differentiator = [ 'Service' ];decoder = "ovs_decoder";splitter = "TokenSplitter"
- file_match = '(?P<Service>daemon\.log|cron\.log|haproxy\.log|kern\.log|auth\.log|syslog|messages|debug)';differentiator = [ 'system.', 'Service' ];decoder = "system_decoder"
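The interplay of `file_match` and `differentiator` can be illustrated with a few lines of Python. This is only a sketch of the naming idea as I understand it (differentiator entries that name a capture group are replaced by the captured text, the rest are used literally, and the pieces are concatenated); Heka itself uses Go regular expressions, and the path `/var/log/nova-all.log` is a hypothetical example.

```python
import re

# Values copied from the LogstreamerInput settings listed above.
FILE_MATCH = r'(?P<Service>nova|cinder|keystone|glance|heat|neutron|murano)-all\.log$'
DIFFERENTIATOR = ['openstack.', 'Service']


def logger_name(path, file_match=FILE_MATCH, differentiator=DIFFERENTIATOR):
    """Return the stream name built from the differentiator, or None if the file is not matched."""
    match = re.search(file_match, path)
    if match is None:
        return None
    groups = match.groupdict()
    # Entries naming a capture group are substituted; others are kept literally.
    return "".join(groups.get(part, part) for part in differentiator)


print(logger_name("/var/log/nova-all.log"))  # hypothetical file name
print(logger_name("/var/log/syslog"))
```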
Splitters
Splitter details: https://hekad.readthedocs.org/en/v0.10.0/config/splitters/index.html
There is only one custom splitter:
[openstack_splitter]
type = "RegexSplitter"
delimiter = '(<[0-9]+>)'
delimiter_eol = false
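To make the effect of this splitter concrete, here is a small Python approximation. It assumes that `delimiter_eol = false` means the captured delimiter (here a syslog-style priority such as `<14>`) is prepended to the record that follows it; the function name `split_records` and the sample stream are made up for illustration, and Heka's real handling of buffering and partial records is not modeled.

```python
import re

# Same expression as the splitter's delimiter setting.
DELIMITER = re.compile(r'(<[0-9]+>)')


def split_records(stream):
    """Split text into records, each starting with its captured delimiter."""
    # With one capture group, re.split returns:
    # [text before first delimiter, delim1, text1, delim2, text2, ...]
    parts = DELIMITER.split(stream)
    records = []
    if parts[0]:
        records.append(parts[0])  # leading text without a delimiter
    for delim, body in zip(parts[1::2], parts[2::2]):
        records.append(delim + body)
    return records


sample = "<14>first message\n<11>second message\n"
for record in split_records(sample):
    print(repr(record))
```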
Decoders
- decoder-collectd.toml
- decoder-libvirt.toml
- decoder-openstack.toml
- decoder-ovs.toml
- decoder-system.toml