Collectd: differences between revisions
Sirmax (talk | contribs)
Revision as of 16:43, 26 January 2016
Collectd
Collectd is a simple data collector that uses plugins to collect data and output plugins to send data to another tool (heka in our configuration).
Collectd collects the following metrics (compute node, simple cluster):
Metrics
See the collectd man page for plugin details: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#
- cpu (CPU usage)
- df (disk usage/free size)
- disk (disk usage/IOPS)
- interface (interface usage/bytes sent and received)
- load (Linux LA)
- memory (memory usage)
- processes (detailed monitoring of collectd and hekad)
- swap (swap usage)
- openstack metrics (python plugin)
- other metrics
- your custom metrics if added
Output
Collectd saves all data in rrd files and sends it to heka using the write_http plugin (https://collectd.org/documentation/manpages/collectd.conf.5.shtml#plugin_write_http). It sends data in JSON format to the local hekad (BTW, why do we use a local heka on each node?)
Plugin configuration:
    <LoadPlugin write_http>
      Globals false
    </LoadPlugin>
    <Plugin "write_http">
      <URL "http://127.0.0.1:8325">
        Format "JSON"
        StoreRates true
      </URL>
    </Plugin>
Hekad listens on 127.0.0.1:8325:
    # netstat -ntpl | grep 8325
    tcp 0 0 127.0.0.1:8325 0.0.0.0:* LISTEN 15368/hekad
Custom read Plugin
To better understand collectd Python plugins, a simple Python plugin was created.
- the plugin reads data from the data file only if the value in the control file is > 0
- data file, config file and resource name are configurable
- notifications are used to show how it is possible to notify plugins
Plugin configuration
Add the following code to openstack.conf file:
    Import read_file
    <Module "read_file">
      DataFile = "/var/log/collectd_in_data"
      ConfigFile = "/var/log/collectd_in_data_config"
      DependsOnResource = "MyCustomTestResource"
    </Module>
Variables inside the <Module> section can be read by the plugin, so in this example we pass the data file, the configuration file and the resource name to the plugin.
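As a rough illustration of how these variables reach the plugin: collectd hands the <Module> block to the config callback as a tree of objects, each with a key and a tuple of values. The sketch below mimics that shape with a hypothetical FakeConfig stand-in (it is not part of collectd), so it can be run outside the daemon. The helper name module_options is also made up for this example.

```python
from collections import namedtuple

# Hypothetical stand-in for the object collectd passes to a config callback
# (illustration only; the real object is created by collectd itself).
FakeConfig = namedtuple("FakeConfig", ["key", "values", "children"])


def module_options(conf):
    """Collect the first value of every option inside a <Module> block."""
    return {child.key: child.values[0] for child in conf.children}


conf = FakeConfig(
    key="Module",
    values=("read_file",),
    children=[
        FakeConfig("DataFile", ("/var/log/collectd_in_data",), []),
        FakeConfig("ConfigFile", ("/var/log/collectd_in_data_config",), []),
        FakeConfig("DependsOnResource", ("MyCustomTestResource",), []),
    ],
)
options = module_options(conf)
print(options["DataFile"])
```

The plugin's configure_callback walks conf.children in the same way, one key at a time.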
Plugin code
The plugin code is a .py file located in /usr/lib/collectd. In our case this is /usr/lib/collectd/read_file.py.
Note that the file name read_file.py gives the module name used in the configuration above: Import read_file
<syntaxhighlight lang="python">
import json

import collectd


class pluginTest():
    # Set by notification_callback(); read_callback() collects data only when True.
    do_collect_data = False
    depends_on_resource = ""
    config_file = ""
    plugin_description = 'read_file_demo_plugin'
    data_file = ""

    def configure_callback(self, conf):
        # Read the options from the <Module "read_file"> section.
        for c in conf.children:
            if c.key == 'DataFile':
                self.data_file = c.values[0]
            elif c.key == 'DependsOnResource':
                self.depends_on_resource = c.values[0]
            elif c.key == 'ConfigFile':
                self.config_file = c.values[0]
            else:
                collectd.warning('%s Unknown config key: %s.' % (self.plugin_description, c.key))
        collectd.error('%s : Configured with data_file=%s config_file=%s' %
                       (self.plugin_description, self.data_file, self.config_file))

    def notification_callback(self, notification):
        # Notifications that do not carry a JSON message are ignored.
        try:
            data = json.loads(notification.message)
        except ValueError:
            return
        if 'value' not in data:
            collectd.warning('%s : READ NOTIFICATION: missing value %s.' %
                             (self.plugin_description, self.__class__.__name__))
        elif 'resource' not in data:
            collectd.warning('%s : READ NOTIFICATION: missing resource %s.' %
                             (self.plugin_description, self.__class__.__name__))
        elif data['resource'] == self.depends_on_resource:
            do_collect_data = data['value'] > 0
            collectd.warning("%s: %s: do_collect_data=%s" %
                             (self.plugin_description, self.__class__.__name__, do_collect_data))
            self.do_collect_data = do_collect_data

    def read_callback(self):
        collectd.warning('READ NOTIFICATION____VAR: %s.' % (self.do_collect_data))
        if self.do_collect_data % 2 == 0:
            collectd.warning('READ NOTIFICATION____VAR___EVEN: %s.' % (self.do_collect_data))
        # Re-read the control file; the notification it sends updates do_collect_data.
        self.check_config()
        if self.do_collect_data:
            with open(self.data_file, 'r') as f_data_file:
                value = f_data_file.readline()
            vl = collectd.Values(type='gauge')
            vl.plugin = 'python.read_file'
            # The file contains text, while a gauge expects a number.
            vl.dispatch(values=[float(value)])

    def check_config(self):
        with open(self.config_file) as f_config_file:
            config_value = f_config_file.readline()
        collectd.warning("%s: config_value=%s" % (self.plugin_description, config_value))
        # Send 1 if the control value is positive, 0 otherwise.
        value = 1 if int(config_value) > 0 else 0
        n = collectd.Notification()
        n.dispatch(severity=4,  # 4 is the OKAY severity
                   host="TestNode",
                   plugin=self.plugin_description,
                   type="gauge",
                   type_instance="read_file_plugin_test_instance",
                   message=json.dumps({"resource": str(self.depends_on_resource),
                                       "value": value}))


plugin = pluginTest()

collectd.register_notification(plugin.notification_callback)
collectd.register_config(plugin.configure_callback)
collectd.register_read(plugin.read_callback)
</syntaxhighlight>
Code description
How does it work?
- The Python code is loaded once when collectd starts, so you need to restart collectd after changing the code.
- On load, collectd calls the function/method registered as the config callback:
    collectd.register_config(plugin.configure_callback)
https://collectd.org/documentation/manpages/collectd-python.5.shtml
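Because the code is loaded only at collectd start, it can be handy to try callbacks outside the daemon first. The following is a minimal sketch of that idea, assuming a hand-written stub in place of the real collectd module. The stub, its callbacks dictionary and the simulated call order are illustrative assumptions, not collectd's implementation.

```python
import sys
import types

# Hypothetical stub of the "collectd" module, for experiments outside collectd.
# It only records registered callbacks; it is NOT the real module.
stub = types.ModuleType("collectd")
stub.callbacks = {"config": [], "read": []}
stub.register_config = lambda cb: stub.callbacks["config"].append(cb)
stub.register_read = lambda cb: stub.callbacks["read"].append(cb)
sys.modules["collectd"] = stub

import collectd  # resolves to the stub registered above

calls = []


def configure_callback(conf):
    calls.append("config")


def read_callback():
    calls.append("read")


collectd.register_config(configure_callback)
collectd.register_read(read_callback)

# Simulated order: config callbacks once at load time,
# then read callbacks once per interval (two intervals here).
for cb in collectd.callbacks["config"]:
    cb(None)
for _ in range(2):
    for cb in collectd.callbacks["read"]:
        cb()

print(calls)
```

A real plugin still has to be checked inside collectd, since the stub knows nothing about Values, Notification or logging.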
Chain
Details: https://collectd.org/documentation/manpages/collectd.conf.5.shtml#filter_configuration
    <Chain "PostCache">
      <Rule>
        <Match "regex">
          Plugin "^pacemaker_resource$"
          TypeInstance "^vip__management$"
        </Match>
        <Target "notification">
          Message "{\"resource\":\"%{type_instance}\",\"value\":%{ds:value}}"
          Severity "OKAY"
        </Target>
      </Rule>
      Target "write"
    </Chain>
This rule creates a notification (visible in the log file) whenever a value matches the regular expressions:
[2016-01-20 14:00:54] Notification: severity = OKAY, host = node-6, plugin = pacemaker_resource, type = gauge, type_instance = vip__management, message = {"resource":"vip__management","value":1}
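The message of such a notification is the JSON document that the read plugin's notification_callback expects. A minimal sketch of that decision logic, runnable without collectd, could look like this. The function name should_collect is hypothetical and only mirrors the checks made in the plugin code above.

```python
import json


def should_collect(message, depends_on_resource):
    """Return True/False for a matching notification, None if it is ignored."""
    try:
        data = json.loads(message)
    except ValueError:
        return None  # not JSON, e.g. a notification from another source
    if not isinstance(data, dict):
        return None
    if data.get("resource") != depends_on_resource or "value" not in data:
        return None
    return data["value"] > 0


message = '{"resource":"vip__management","value":1}'
print(should_collect(message, "vip__management"))
```

With the message from the log line above, a plugin configured with DependsOnResource "vip__management" would start collecting data.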
Debug
Debug http traffic
It is possible to debug the data transfer from collectd to hekad, e.g. you can use tcpflow or your favorite tool to dump HTTP traffic.
Run the dumping tool:
- heka listens on port 8325 (taken from the write_http config)
- heka listens on 127.0.0.1, so it is easy to find the interface:
    # ip ro get 127.0.0.1
    local 127.0.0.1 dev lo src 127.0.0.1
        cache <local>
- dev lo is the device you need
- start the capture:
    # tcpflow -i lo port 8325
- Example of output:
    # cat 127.000.000.001.45848-127.000.000.001.08325 | head -8
    POST / HTTP/1.1
    User-Agent: collectd/5.4.0.git
    Host: 127.0.0.1:8325
    Accept: */*
    Content-Type: application/json
    Content-Length: 4064

    [{"values":[2160],"dstypes":["gauge"],"dsnames":["value"],"time":1453203196.259,"interval":10.000,"host":"node-7","plugin":"processes","plugin_instance":"collectd","type":"ps_stacksize","type_instance":""},{"values":[0,1999.74],"dstypes":["derive","derive"],"dsnames":
    ... skip ...
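A captured body can be hard to read by eye. As a small sketch, the JSON array can be flattened into one row per value. The function name flatten_payload is made up for this example, and the sample body is only the first element of the output above, closed into a valid list.

```python
import json


def flatten_payload(body):
    """Yield (identifier, dsname, value) for every value in a JSON payload."""
    for item in json.loads(body):
        # Identifier layout: host/plugin[-plugin_instance]/type[-type_instance]
        plugin = item["plugin"]
        if item.get("plugin_instance"):
            plugin += "-" + item["plugin_instance"]
        type_ = item["type"]
        if item.get("type_instance"):
            type_ += "-" + item["type_instance"]
        identifier = "/".join((item["host"], plugin, type_))
        for dsname, value in zip(item["dsnames"], item["values"]):
            yield identifier, dsname, value


# First element of the example output, wrapped into a complete JSON list.
body = (
    '[{"values":[2160],"dstypes":["gauge"],"dsnames":["value"],'
    '"time":1453203196.259,"interval":10.000,"host":"node-7",'
    '"plugin":"processes","plugin_instance":"collectd",'
    '"type":"ps_stacksize","type_instance":""}]'
)
for row in flatten_payload(body):
    print(row)
```

The identifier built here has the same host/plugin/type shape that collectdctl prints.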
Debug with your own write plugin
One more way to debug is to create your own write plugin and log everything you need.
For example, I created a simple write plugin (using Python):
- Create plugin configuration, e.g. in /etc/collectd/conf.d/openstack.conf
    Import write_file
    <Module "write_file">
      log_filename = "/var/log/collectd_debug.log"
    </Module>
- Create the file /usr/lib/collectd/write_file.py (the location depends on your ModulePath; by default it is "/usr/lib/collectd")
    import collectd

    # Opened in configure_callback(); stays None if log_filename is not set.
    f_log_file = None


    def configure_callback(conf):
        global f_log_file
        log_filename = None
        for c in conf.children:
            if c.key == 'log_filename':
                log_filename = c.values[0]
            else:
                collectd.warning('log_file_info plugin: Unknown config key: %s.' % c.key)
        if log_filename is None:
            collectd.error('write_file plugin: log_filename is not configured')
            return
        collectd.error('Configured with log_filename=%s' % (log_filename))
        f_log_file = open(log_filename, 'w')


    def write_callback(vl, data=None):
        if f_log_file is None:
            return
        for i in vl.values:
            #collectd.error("write_file: %s (%s): %f" % (vl.plugin, vl.type, i))
            f_log_file.write("%s (%s): %f \n" % (vl.plugin, vl.type, i))
        # Flush so that the log can be followed while collectd is running.
        f_log_file.flush()


    collectd.register_config(configure_callback)
    collectd.register_write(write_callback)
Debug with unixsock plugin
One more way to get some debug information is using collectd-unixsock (https://collectd.org/documentation/manpages/collectd-unixsock.5.shtml)
Add the config and restart collectd:
    # cat 98-unixsock.conf
    <LoadPlugin unixsock>
      Globals false
    </LoadPlugin>
    <Plugin unixsock>
      SocketFile "/var/run/collectd-unixsock"
      SocketGroup "collectd"
      SocketPerms "0770"
      DeleteSocket true
    </Plugin>
List the available values:
    # collectdctl listval
    node-6/apache-localhost/apache_bytes
    node-6/apache-localhost/apache_connections
    node-6/apache-localhost/apache_idle_workers
    node-6/apache-localhost/apache_requests
    node-6/apache-localhost/apache_scoreboard-closing
    node-6/apache-localhost/apache_scoreboard-dnslookup
    node-6/apache-localhost/apache_scoreboard-finishing
    node-6/apache-localhost/apache_scoreboard-idle_cleanup
    node-6/apache-localhost/apache_scoreboard-keepalive
    node-6/apache-localhost/apache_scoreboard-logging
    node-6/apache-localhost/apache_scoreboard-open
    node-6/apache-localhost/apache_scoreboard-reading
    node-6/apache-localhost/apache_scoreboard-sending
    node-6/apache-localhost/apache_scoreboard-starting
    node-6/apache-localhost/apache_scoreboard-waiting
    node-6/check_openstack_api-cinder-api/gauge-RegionOne
    ... Skip ...
Get a single value:
    # collectdctl getval node-6/swap/swap-free
    value=1.923355e+09
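The same socket can also be queried from a script. The sketch below assumes the plain-text protocol described in the collectd-unixsock man page: one command per line, and a status line whose leading number is the count of data lines that follow (negative on error). The helper names parse_response and query are hypothetical, and only the parsing part is exercised here. query() needs a running collectd and read access to the socket.

```python
import socket


def parse_response(text):
    """Split a unixsock reply into (status, message, data_lines)."""
    lines = text.splitlines()
    status_text, _, message = lines[0].partition(" ")
    status = int(status_text)
    # A non-negative status is the number of data lines that follow;
    # a negative status signals an error.
    return status, message, lines[1:1 + max(status, 0)]


def query(command, path="/var/run/collectd-unixsock"):
    """Send one command, e.g. 'GETVAL "node-6/swap/swap-free"', and parse the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        with sock.makefile("rw") as stream:
            stream.write(command + "\n")
            stream.flush()
            status_line = stream.readline()
            count = max(int(status_line.split(" ", 1)[0]), 0)
            data = "".join(stream.readline() for _ in range(count))
    return parse_response(status_line + data)


# Parsing a sample reply of the shape returned for GETVAL.
status, message, values = parse_response("1 Value found\nvalue=1.923355e+09\n")
print(status, message, values)
```

The socket path is the SocketFile from the configuration above. With a different SocketFile, pass it as the path argument.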
Config Files
All config files are in /etc/collectd/
/etc/collectd/conf.d stores plugin configuration files
    # ls -lsa /etc/collectd/conf.d/
    4 -rw-r----- 1 root root 169 Jan 18 16:38 05-logfile.conf
    4 -rw-r----- 1 root root  71 Jan 18 16:38 10-cpu.conf
    4 -rw-r----- 1 root root 289 Jan 18 16:38 10-df.conf
    4 -rw-r----- 1 root root 145 Jan 18 16:38 10-disk.conf
    4 -rw-r----- 1 root root 189 Jan 18 16:38 10-interface.conf
    4 -rw-r----- 1 root root  72 Jan 18 16:38 10-load.conf
    4 -rw-r----- 1 root root  74 Jan 18 16:38 10-memory.conf
    4 -rw-r----- 1 root root  77 Jan 18 16:38 10-processes.conf
    4 -rw-r----- 1 root root 138 Jan 18 16:38 10-swap.conf
    4 -rw-r----- 1 root root  73 Jan 18 16:38 10-users.conf
    4 -rw-r----- 1 root root 189 Jan 18 16:38 10-write_http.conf
    4 -rw-r----- 1 root root  66 Jan 18 16:38 processes-config.conf
On a controller node there are more metrics, so there are more configuration files:
    # ls -1
    05-logfile.conf
    10-apache.conf
    10-cpu.conf
    10-dbi.conf
    10-df.conf
    10-disk.conf
    10-interface.conf
    10-load.conf
    10-match_regex.conf
    10-memcached.conf
    10-memory.conf
    10-mysql.conf
    10-processes.conf
    10-swap.conf
    10-target_notification.conf
    10-users.conf
    10-write_http.conf
    99-chain-PostCache.conf
    dbi_cinder_services.conf
    dbi_mysql_status.conf
    dbi_neutron_agents.conf
    dbi_nova_services.conf
    mysql-nova.conf
    openstack.conf
    processes-config.conf