Collectd: difference between revisions

From noname.com.ua

Revision as of 17:55, 26 January 2016

Collectd

Collectd is a simple data collector: it uses input plugins to collect data and output plugins to send that data to another tool (Heka in our configuration).
Collectd collects the following metrics (compute node, simple cluster):

Metrics

See the collectd.conf man page for plugin details: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#

  • cpu (CPU usage)
  • df (disk usage/free size)
  • disk (disk usage/IOPS)
  • interface (interface usage/bytes sent and received)
  • load (Linux load average)
  • memory (memory usage)
  • processes (detailed monitoring of collectd and hekad)
  • swap (swap usage)
  • openstack metrics (python plugin)
  • other metrics
  • your custom metrics if added

Output

Collectd saves all data in RRD files and sends it to Heka using the write_http plugin (https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_write_http). The data is sent in JSON format to the local hekad. (Open question: why do we run a local Heka on each node?)
Plugin configuration:

<LoadPlugin write_http>
  Globals false
</LoadPlugin>

<Plugin "write_http">
  <URL "http://127.0.0.1:8325">
    Format "JSON"
    StoreRates true
  </URL>
</Plugin>

Hekad listens on 127.0.0.1:8325:

# netstat -ntpl | grep 8325
tcp        0      0 127.0.0.1:8325          0.0.0.0:*               LISTEN      15368/hekad

Custom read Plugin

To better understand collectd Python plugins, a simple Python plugin was created.

  • the plugin reads data from a data file if the value in the control file is > 0
  • the data file, the control (config) file and the resource name are configurable
  • notifications are used to show how plugins can notify each other

Plugin configuration

Add the following to the openstack.conf file:

    Import read_file
    <Module "read_file">
        DataFile = "/var/log/collectd_in_data"
        ConfigFile = "/var/log/collectd_in_data_config"
        DependsOnResource = "MyCustomTestResource"
    </Module>

Variables inside the <Module> section can be read by the plugin, so in this example we pass the data file, the control file and the resource name to the plugin.
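As a minimal sketch of how these keys reach the plugin: collectd hands the callback registered with register_config a Config object whose children each carry a key and a tuple of values. The FakeConfig class and the parse_module_options helper below are hypothetical stand-ins (not part of collectd or of the plugin on this page) so that the parsing logic can be run outside collectd.

```python
from collections import namedtuple

# Hypothetical stand-in for collectd's Config node; only the attributes
# used by the plugin (key, values, children) are modelled.
FakeConfig = namedtuple("FakeConfig", ["key", "values", "children"])

# Maps config keys from the <Module> block to option names.
KNOWN_KEYS = {
    "DataFile": "data_file",
    "ConfigFile": "config_file",
    "DependsOnResource": "depends_on_resource",
}


def parse_module_options(conf):
    """Return (options, unknown_keys) for the children of a <Module> block."""
    options = {}
    unknown = []
    for child in conf.children:
        name = KNOWN_KEYS.get(child.key)
        if name is None:
            unknown.append(child.key)
        else:
            # Each option carries a tuple of values; the plugin uses the first.
            options[name] = child.values[0]
    return options, unknown


module = FakeConfig("Module", ("read_file",), [
    FakeConfig("DataFile", ("/var/log/collectd_in_data",), []),
    FakeConfig("ConfigFile", ("/var/log/collectd_in_data_config",), []),
    FakeConfig("DependsOnResource", ("MyCustomTestResource",), []),
    FakeConfig("Typo", ("x",), []),
])
options, unknown = parse_module_options(module)
print(options)
print(unknown)
```

Returning the unknown keys instead of logging them keeps the helper free of collectd calls; a real plugin would pass them to collectd.warning.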

Plugin code

The plugin code is a .py file located in /usr/lib/collectd; in our case it is /usr/lib/collectd/read_file.py. Note that the file name without the .py extension is the module name used in the configuration above: Import read_file

<syntaxhighlight lang="python">
import collectd
import pprint
import json


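class pluginTest():

 # Whether read_callback should publish data; toggled by notifications.
 do_collect_data = False
 # Resource name used to filter incoming notifications.
 depends_on_resource = ""
 # Control file: a first line greater than zero enables collection.
 config_file=""
 plugin_description='read_file_demo_plugin'
 # File whose first line is published as a gauge value.
 data_file=""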


 def configure_callback(self, conf):
   for c in conf.children:
     if c.key  == 'DataFile':
       self.data_file = c.values[0]
     elif c.key == 'DependsOnResource':
       self.depends_on_resource = c.values[0]
     elif c.key == 'ConfigFile':
       self.config_file = c.values[0]
     else:
       collectd.warning ('%s Unknown config key: %s.' % (self.plugin_description, c.key))
   collectd.error('%s : Configured with data_file=%s config_file=%s' % (self.plugin_description, self.data_file, self.config_file))


 def notification_callback(self, notification):
   try:
     data = json.loads(notification.message)
   except ValueError:
     return
   if 'value' not in data:
     collectd.warning ('%s : READ NOTIFICATION: missing value %s.' % ( self.plugin_description, self.__class__.__name__) )
   elif 'resource' not in data:
     collectd.warning ('%s : READ NOTIFICATION: missing resource %s.' % (self.plugin_description, self.__class__.__name__ ) )
   elif data['resource'] == self.depends_on_resource:
     do_collect_data = data['value'] > 0
     collectd.warning ("%s: %s: do_collect_data=%s" % (self.plugin_description, self.__class__.__name__, do_collect_data))
     self.do_collect_data = do_collect_data


 def read_callback(self):
   collectd.warning ('READ NOTIFICATION____VAR: %s.' % (self.do_collect_data ) )
   if self.do_collect_data % 2 == 0:
     collectd.warning ('READ NOTIFICATION____VAR___EVEN: %s.' % (self.do_collect_data ) )
   self.check_config()
   if self.do_collect_data:
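     # Read the first line of the data file; "with" closes the file even on errors.
     with open(self.data_file,'r') as f_data_file:
       value=f_data_file.readline()
     vl = collectd.Values(type='gauge')
     vl.plugin='python.read_file'
     # Convert the text line to a number explicitly before dispatching it.
     vl.dispatch(values=[float(value)])


 def check_config(self):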
   f_config_file = open(self.config_file)
   config_value=f_config_file.readline()
   f_config_file.close()
   collectd.warning ("%s: config_value=%s" % (self.plugin_description, config_value))
   n = collectd.Notification()
   if int(config_value) > 0:
     n.dispatch(severity = 4, host = "TestNode", plugin = self.plugin_description, type = "gauge", type_instance = "read_file_plugin_test_instance", message = '{"resource":"'+str(self.depends_on_resource)+'","value":1}')
   else:
     n.dispatch(severity = 4, host = "TestNode", plugin = self.plugin_description, type = "gauge", type_instance = "read_file_plugin_test_instance", message = '{"resource":"'+str(self.depends_on_resource)+'","value":0}')



plugin = pluginTest()

collectd.register_notification(plugin.notification_callback)
collectd.register_config(plugin.configure_callback)
collectd.register_read(plugin.read_callback)
</syntaxhighlight>

Code description

How does it work?

  • The Python code is loaded once, when collectd starts, so you need to restart collectd after changing the code.
  • On load, collectd calls the function or method registered as the config callback:

<syntaxhighlight lang="python"> collectd.register_config(plugin.configure_callback) </syntaxhighlight>

  • This method reads the config and initializes the variables:
    • self.data_file - the file with data (must be created beforehand)
    • self.depends_on_resource - the name of a "resource". This is just a variable used to identify the plugin instance and to filter notifications (see the notification explanation below for details).
    • self.config_file - the control file. In the test plugin only two kinds of value are meaningful: zero and any positive number.


  • Every "interval", collectd calls the method registered as the read callback:

<syntaxhighlight lang="python"> collectd.register_read(plugin.read_callback) </syntaxhighlight>

Interval is a global collectd configuration parameter: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#global_options

  • The read function does the following:
    • writes self.do_collect_data to the log (do we need to collect data now?)
    • checks whether self.do_collect_data is even or odd and, if it is even, writes one more message. This code only demonstrates that we can work with variables passed via the notification mechanism and implement any logic we need.
    • calls the self.check_config() method (described below)
    • if self.do_collect_data is True, reads data from the data file (see the config section) and publishes it to collectd.


  • On each notification, collectd calls the method registered as the notification callback:

<syntaxhighlight lang="python"> collectd.register_notification(plugin.notification_callback) </syntaxhighlight>

  • This method does the following:
    • Loads the message from the notification, if possible: data = json.loads(notification.message)
    • If the message contains the resource we configured above, reads the value and updates do_collect_data. The resource is used to filter out messages we do not need. Any kind of logic can be implemented here, e.g. we can read notifications sent by other plugins.
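The filtering step can be sketched as a pure function that is easy to test outside collectd. This is only an illustration under stated assumptions: decide_collect is a hypothetical helper, not part of the plugin above, and it additionally ignores non-numeric values, which the original code does not check.

```python
import json


def decide_collect(message, resource, current):
    """Return the new do_collect_data flag after one notification message.

    message  - raw notification text (expected to be JSON)
    resource - the configured DependsOnResource value
    current  - the flag value before this notification
    """
    try:
        data = json.loads(message)
    except ValueError:
        return current  # not JSON: leave the flag unchanged
    if not isinstance(data, dict):
        return current
    if data.get("resource") != resource:
        return current  # notification about some other resource
    value = data.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return current  # missing or non-numeric value
    return value > 0


print(decide_collect('{"resource":"MyCustomTestResource","value":1}',
                     "MyCustomTestResource", False))
print(decide_collect('{"resource":"vip__management","value":1}',
                     "MyCustomTestResource", False))
```

The second call shows a notification from another plugin (pacemaker_resource in the logs on this page) being ignored because its resource does not match.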


  • The check_config(self) method is called on each read and does the following:
    • Reads the control file (only the first line)
    • If the value from the control file is greater than zero (if int(config_value) > 0), it sends a notification whose resource is the pre-configured DependsOnResource: message = '{"resource":"'+str(self.depends_on_resource)+'","value":1}'
    • Otherwise it sends a notification with "value":0
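The plugin builds the message by string concatenation. A possible alternative, shown here only as a sketch, is to let json.dumps produce the payload so that resource names containing quotes are escaped correctly. build_message is a hypothetical helper name, not something from the plugin or from collectd.

```python
import json


def build_message(resource, config_line):
    """Build the notification payload from the first line of the control file.

    Like the int() call in the plugin, this raises ValueError if the line
    is not an integer.
    """
    enabled = int(config_line.strip()) > 0
    return json.dumps({"resource": resource, "value": 1 if enabled else 0})


print(build_message("MyCustomTestResource", "99\n"))
print(build_message("MyCustomTestResource", "0\n"))
```

The resulting string would be passed as the message argument of the notification dispatch call, exactly where the concatenated string is used now.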


This notification is received by all plugins that registered a notification callback with register_notification, including this plugin itself.
So we get this notification in notification_callback, which sets self.do_collect_data to True if value > 0 and to False otherwise.
Now we are able to switch data collection in our plugin on or off:

echo 0 > /var/log/collectd_in_data_config

[2016-01-26 15:54:16] Notification: severity = OKAY, host = TestNode, plugin = read_file_demo_plugin, type = gauge, type_instance = read_file_plugin_test_instance, message = {"resource":"MyCustomTestResource","value":1}
[2016-01-26 15:54:16] read_file_demo_plugin: pluginTest: do_collect_data=True
[2016-01-26 15:54:26] Notification: severity = OKAY, host = node-6, plugin = pacemaker_resource, type = gauge, type_instance = vip__management, message = {"resource":"vip__management","value":1}
[2016-01-26 15:54:26] READ NOTIFICATION____VAR: True.
[2016-01-26 15:54:26] read_file_demo_plugin: config_value=0

echo 99 > /var/log/collectd_in_data_config



It looks a little tricky, but it is just an example.


https://collectd.org/documentation/manpages/collectd-python.5.shtml

Chain

Details: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#filter_configuration

<Chain "PostCache">
  <Rule>
    <Match "regex">
      Plugin "^pacemaker_resource$"
      TypeInstance "^vip__management$"
    </Match>
    <Target "notification">
      Message "{\"resource\":\"%{type_instance}\",\"value\":%{ds:value}}"
      Severity "OKAY"
    </Target>
  </Rule>
   Target "write"
</Chain>

This rule creates a notification (visible in the log file) whenever a value matches the regular expressions:

[2016-01-20 14:00:54] Notification: severity = OKAY, host = node-6, plugin = pacemaker_resource, type = gauge, type_instance = vip__management, message = {"resource":"vip__management","value":1}
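To make the rule easier to reason about, here is a rough emulation of it in Python. It is an illustration only, not collectd's implementation: the dictionary layout, the function names and the way %{ds:value} is formatted are assumptions chosen to reproduce the log line above.

```python
import re

# Patterns copied from the <Match "regex"> block above.
RULE = {
    "plugin": r"^pacemaker_resource$",
    "type_instance": r"^vip__management$",
}
# Message template copied from the <Target "notification"> block.
TEMPLATE = '{"resource":"%{type_instance}","value":%{ds:value}}'


def rule_matches(value_list, rule=RULE):
    """True only if every configured field matches its regex."""
    return all(re.search(pattern, value_list.get(field, ""))
               for field, pattern in rule.items())


def render_message(value_list, template=TEMPLATE):
    """Expand the two placeholders used by the rule (assumed formatting)."""
    message = template.replace("%{type_instance}", value_list["type_instance"])
    return message.replace("%{ds:value}", "%g" % value_list["ds"]["value"])


sample = {
    "host": "node-6",
    "plugin": "pacemaker_resource",
    "type": "gauge",
    "type_instance": "vip__management",
    "ds": {"value": 1},
}
if rule_matches(sample):
    print(render_message(sample))
```

Because both patterns are anchored with ^ and $, only the exact plugin and type instance names match; values from any other plugin fall through to Target "write" without producing a notification.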

Debug

Debug http traffic

It is possible to debug the data transfer from collectd to hekad, e.g. with tcpflow or your favorite tool for dumping HTTP traffic.
Run the dumping tool:

  • Heka listens on port 8325 (taken from the write_http config).
  • Heka listens on 127.0.0.1, so the interface to capture on is the loopback interface lo. This is easy to confirm:
    • # ip ro get 127.0.0.1
      local 127.0.0.1 dev lo src 127.0.0.1
          cache <local>
    • "dev lo" is the device you need.
  • # tcpflow -i lo port 8325
  • Example of output:
    # cat 127.000.000.001.45848-127.000.000.001.08325 | head -8
    POST / HTTP/1.1
    User-Agent: collectd/5.4.0.git
    Host: 127.0.0.1:8325
    Accept: */*
    Content-Type: application/json
    Content-Length: 4064

    [{"values":[2160],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.259,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_stacksize","type_instance":""},{"values":[0,1999.74],"dstypes":["derive","derive"],"dsnames": ... skip ...
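A captured request body can be inspected with a few lines of Python. The sketch below assumes the body has the shape shown in the dump above (a JSON list of objects with host, plugin, type, dsnames and values); summarize is a hypothetical helper name, and the sample is the first element of that dump.

```python
import json


def summarize(body):
    """Yield (identifier, dsname, value) for every data source in a
    write_http JSON body."""
    for metric in json.loads(body):
        plugin = metric["plugin"]
        if metric.get("plugin_instance"):
            plugin += "-" + metric["plugin_instance"]
        type_ = metric["type"]
        if metric.get("type_instance"):
            type_ += "-" + metric["type_instance"]
        # Same host/plugin-instance/type-instance layout that
        # "collectdctl listval" prints.
        identifier = "/".join((metric["host"], plugin, type_))
        for name, value in zip(metric["dsnames"], metric["values"]):
            yield identifier, name, value


sample = ('[{"values":[2160],"dstypes":["gauge"],"dsnames":["value"],'
          '"time":1453203196.259,"interval":10.000,"host":"node-7",'
          '"plugin":"processes","plugin_instance":"collectd",'
          '"type":"ps_stacksize","type_instance":""}]')
for row in summarize(sample):
    print(row)
```

This turns one long JSON line into one row per data source, which is usually easier to grep than the raw dump.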

Debug with your own write plugin

Another way to debug is to create your own write plugin and log whatever you need.
For example, here is a simple write plugin written in Python.

  • Create the plugin configuration, e.g. in /etc/collectd/conf.d/openstack.conf:
    Import write_file
    <Module "write_file">
        log_filename = "/var/log/collectd_debug.log"
    </Module>

Create the file /usr/lib/collectd/write_file.py (the directory depends on your ModulePath; by default it is "/usr/lib/collectd"):

import collectd


def configure_callback(conf):
  # The file handle is shared with write_callback through a module-level global.
  global f_log_file

  for c in conf.children:
    if c.key  == 'log_filename':
      log_filename = c.values[0]
    else:
      collectd.warning ('log_file_info plugin: Unknown config key: %s.' % c.key)
  collectd.error('Configured with log_filename=%s' % (log_filename))
  # Open the log file once; the duplicate open() of the same file was removed.
  f_log_file = open(log_filename,'w')



def write_callback(vl, data=None):
  for i in vl.values:
    #collectd.error("write_file: %s (%s): %f" % (vl.plugin, vl.type, i))
    f_log_file.write("%s (%s): %f \n" % (vl.plugin, vl.type, i))

collectd.register_config(configure_callback)
collectd.register_write(write_callback)

Debug with unixsock plugin

Another way to get debug information is the unixsock plugin (collectd-unixsock, https://collectd.org/documentation/manpages/collectd-unixsock.5.shtml).

Add the config and restart collectd:

# cat 98-unixsock.conf
<LoadPlugin  unixsock>
  Globals false
</LoadPlugin>

<Plugin unixsock>
  SocketFile "/var/run/collectd-unixsock"
  SocketGroup "collectd"
  SocketPerms "0770"
  DeleteSocket true
</Plugin>


# collectdctl listval
node-6/apache-localhost/apache_bytes
node-6/apache-localhost/apache_connections
node-6/apache-localhost/apache_idle_workers
node-6/apache-localhost/apache_requests
node-6/apache-localhost/apache_scoreboard-closing
node-6/apache-localhost/apache_scoreboard-dnslookup
node-6/apache-localhost/apache_scoreboard-finishing
node-6/apache-localhost/apache_scoreboard-idle_cleanup
node-6/apache-localhost/apache_scoreboard-keepalive
node-6/apache-localhost/apache_scoreboard-logging
node-6/apache-localhost/apache_scoreboard-open
node-6/apache-localhost/apache_scoreboard-reading
node-6/apache-localhost/apache_scoreboard-sending
node-6/apache-localhost/apache_scoreboard-starting
node-6/apache-localhost/apache_scoreboard-waiting
node-6/check_openstack_api-cinder-api/gauge-RegionOne
...
Skip
...


# collectdctl getval node-6/swap/swap-free
value=1.923355e+09
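The same socket can be queried without collectdctl. The sketch below assumes Python 3 and the plain-text unixsock protocol, in which the first reply line starts with the number of lines that follow (negative on error). parse_response and query are hypothetical helper names, and the status text in the example is illustrative.

```python
import socket


def parse_response(lines):
    """Split a unixsock reply into (status_message, payload_lines).

    lines[0] is the status line "<count> <message>"; a negative count
    signals an error.
    """
    count_text, _, message = lines[0].partition(" ")
    count = int(count_text)
    if count < 0:
        raise RuntimeError(message)
    return message, lines[1:1 + count]


def query(command, path="/var/run/collectd-unixsock"):
    """Send one command (e.g. "LISTVAL") and return its payload lines."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        with sock.makefile("r") as reader:
            sock.sendall((command + "\n").encode())
            status = reader.readline().rstrip("\n")
            count = int(status.split(" ", 1)[0])
            lines = [status]
            for _ in range(max(count, 0)):
                lines.append(reader.readline().rstrip("\n"))
    return parse_response(lines)[1]


# Parsing a reply shaped like the getval output above:
message, payload = parse_response(["1 Value found", "value=1.923355e+09"])
print(payload)
```

On a node with the plugin enabled, query("LISTVAL") or query('GETVAL "node-6/swap/swap-free"') should return the same data that collectdctl shows; the user running the script needs access to the socket (group collectd in the config above).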

Config Files

All config files are in /etc/collectd/.
/etc/collectd/conf.d stores the plugin configuration files:

# ls -lsa /etc/collectd/conf.d/
4 -rw-r----- 1 root root  169 Jan 18 16:38 05-logfile.conf
4 -rw-r----- 1 root root   71 Jan 18 16:38 10-cpu.conf
4 -rw-r----- 1 root root  289 Jan 18 16:38 10-df.conf
4 -rw-r----- 1 root root  145 Jan 18 16:38 10-disk.conf
4 -rw-r----- 1 root root  189 Jan 18 16:38 10-interface.conf
4 -rw-r----- 1 root root   72 Jan 18 16:38 10-load.conf
4 -rw-r----- 1 root root   74 Jan 18 16:38 10-memory.conf
4 -rw-r----- 1 root root   77 Jan 18 16:38 10-processes.conf
4 -rw-r----- 1 root root  138 Jan 18 16:38 10-swap.conf
4 -rw-r----- 1 root root   73 Jan 18 16:38 10-users.conf
4 -rw-r----- 1 root root  189 Jan 18 16:38 10-write_http.conf
4 -rw-r----- 1 root root   66 Jan 18 16:38 processes-config.conf

On a controller node more plugins are configured:

# ls -1
05-logfile.conf
10-apache.conf
10-cpu.conf
10-dbi.conf
10-df.conf
10-disk.conf
10-interface.conf
10-load.conf
10-match_regex.conf
10-memcached.conf
10-memory.conf
10-mysql.conf
10-processes.conf
10-swap.conf
10-target_notification.conf
10-users.conf
10-write_http.conf
99-chain-PostCache.conf
dbi_cinder_services.conf
dbi_mysql_status.conf
dbi_neutron_agents.conf
dbi_nova_services.conf
mysql-nova.conf
openstack.conf
processes-config.conf