|
|
Строка 3: |
Строка 3: |
|
[[Категория:MOS FUEL]] |
|
[[Категория:MOS FUEL]] |
|
[[Категория:Linux]] |
|
[[Категория:Linux]] |
|
+ |
[[Категория:Monitoring]] |
|
=Heka Debugging= |
|
=Heka Debugging= |
|
From time to time something went wrong with any software so there are a few words about debugging. |
|
From time to time something went wrong with any software so there are a few words about debugging. |
Версия 15:41, 9 февраля 2016
Heka Debugging
From time to time something went wrong with any software so there are a few words about debugging.
Save data to file
The easiest way to debug is to configure FileOutput (https://hekad.readthedocs.org/en/v0.10.0/config/outputs/file.html) and save all data in debug file.
To encode output is used RstEncoder (https://hekad.readthedocs.org/en/v0.10.0/config/encoders/rst.html)
To enable such debugging it is necessary to add config file, e.g /etc/lma_collector/00_debug.toml
Attention: this configuration is dangerous because it will create I/O load. Please be careful with debugging on production!
[RstEncoder]
[output_file]
type = "FileOutput"
message_matcher = "Fields[aggregator] == NIL"
path = "/var/log/heka-debug.log"
perm = "666"
flush_count = 100
flush_operator = "OR"
encoder = "RstEncoder"
Example of output:
tail -100 /var/log/heka-debug.log
... <SKIP> ...
:Timestamp: 2016-01-31 16:58:42.232999936 +0000 UTC
:Type: metric
:Hostname: node-6
:Pid: 5505
:Uuid: 2a73fa6f-21fc-40b0-acf9-7fe96bcb9b80
:Logger: collectd
:Payload: {"type":"gauge","values":[0],"type_instance":"q-agent-notifier-dvr-update_fanout_57e5b132e51840ebb6.messages","meta":{"0":true},"dsnames":["value"],"plugin":"rabbitmq_info","time":1454259522.233,"interval":10,"host":"node-6","dstypes":["gauge"],"plugin_instance":""}
:EnvVersion:
:Severity: 6
:Fields:
| name:"type" type:string value:"gauge"
| name:"source" type:string value:"rabbitmq_info"
| name:"openstack_release" type:string value:"2015.1.0-7.0"
| name:"deployment_mode" type:string value:"ha_compact"
| name:"openstack_roles" type:string value:"primary-controller"
| name:"queue" type:string value:"q-agent-notifier-dvr-update_fanout_57e5b132e51840ebb6"
| name:"tag_fields" type:string value:"queue"
| name:"openstack_region" type:string value:"RegionOne"
| name:"name" type:string value:"rabbitmq_queue_messages"
| name:"hostname" type:string value:"node-6"
| name:"value" type:double value:0
| name:"deployment_id" type:string value:"3"
| name:"environment_label" type:string value:"test2"
| name:"interval" type:double value:10
:Timestamp: 2016-01-31 16:58:42 +0000 UTC
:Type: test_decoder
:Hostname: node-6
:Pid: 0
:Uuid: bbc94442-67c8-4445-acf8-341b14c2d43f
:Logger: test_input
:Payload:
:EnvVersion:
:Severity: 7
:Fields:
| name:"test_decoderData" type:string value:"888999888"
| name:"test_decoderCount" type:double value:1212
| name:"test_decoderDataToNumber" type:double value:8.88999888e+08
:Timestamp: 2016-01-31 16:58:42 +0000 UTC
:Type: log
:Hostname: node-6
:Pid: 11905
:Uuid: 36cfae57-cb3e-4777-a886-60a351fb230c
:Logger: pacemaker
:Payload: cib_process_ping: Reporting our current digest to node-6.domain.tld: 70954c66c71f324b3851af4d92ef93dd for 0.90.152 (0x1c12960 0)
:EnvVersion:
:Severity: 7
:Fields:
| name:"deployment_mode" type:string value:"ha_compact"
| name:"environment_label" type:string value:"test2"
| name:"openstack_region" type:string value:"RegionOne"
| name:"deployment_id" type:string value:"3"
| name:"severity_label" type:string value:"DEBUG"
| name:"programname" type:string value:"cib"
| name:"openstack_roles" type:string value:"primary-controller"
| name:"openstack_release" type:string value:"2015.1.0-7.0"
Debug data filtering
To avoid I/O overload if is possible to filter messages.
To filter data only from test_decoder and save it into the /var/log/heka-debug-test-decoder.log file is used the following configuration:
01_debug_test_decoder_data.toml
[output_file]
type = "FileOutput"
message_matcher = "Type == 'test_decoder'"
path = "/var/log/heka-debug-test-decoder.log"
perm = "666"
flush_count = 100
flush_operator = "OR"
encoder = "RstEncoder"
Remark: depends on your current Heka configuration you may need to enable RstEncoder. In this example it is already enabled in 00_debug.toml file.
Example of output:
:Timestamp: 2016-01-31 17:07:16 +0000 UTC
:Type: test_decoder
:Hostname: node-6
:Pid: 0
:Uuid: 38ffa77b-8f5c-4f73-aba5-f5de1f15f0b8
:Logger: test_input
:Payload:
:EnvVersion:
:Severity: 7
:Fields:
| name:"test_decoderData" type:string value:"888999888"
| name:"test_decoderCount" type:double value:29
| name:"test_decoderDataToNumber" type:double value:8.88999888e+08
:Timestamp: 2016-01-31 17:07:17 +0000 UTC
:Type: test_decoder
:Hostname: node-6
:Pid: 0
:Uuid: 1545163c-cc97-46aa-aba8-431ab6ca3338
:Logger: test_input
:Payload:
:EnvVersion:
:Severity: 7
:Fields:
| name:"test_decoderData" type:string value:"888999888"
| name:"test_decoderCount" type:double value:30
| name:"test_decoderDataToNumber" type:double value:8.88999888e+08
:Timestamp: 2016-01-31 17:07:18 +0000 UTC
:Type: test_decoder
:Hostname: node-6
:Pid: 0
:Uuid: 5da75795-63fd-48f8-ac52-44db368f54ed
:Logger: test_input
:Payload:
:EnvVersion:
:Severity: 7
:Fields:
| name:"test_decoderData" type:string value:"888999888"
| name:"test_decoderCount" type:double value:31
| name:"test_decoderDataToNumber" type:double value:8.88999888e+08
If Heka can't start
If Heka is not able to start it is possible to run it manually:
- Disable resorce in CRM
- Run crm resource command:
#crm resource
crm(live)resource# list
... <SKIP> ...
Clone Set: clone_lma_collector [lma_collector]
Started: [ node-6.domain.tld ]
- Stop lma_ollector resource:
#crm(live)resource# stop lma_collector
crm(live)resource# stop clone_lma_collector
- Check lma_collector resource status. It should be stopped.
#crm(live)resource# list
Clone Set: clone_lma_collector [lma_collector]
Stopped: [ node-6.domain.tld ]
Now it is safe to run Heka manually.
#screen
- In screeen section switch into heka user
#sudo -H -u heka /bin/bash
$hekad -config /etc/lma_collector/
2016/01/31 17:23:39 Error reading config: open /etc/lma_collector/01_debug_test_decoder_data.toml: permission denied
- As you can see in example above error is related to file permissions:
## ls -lsa /etc/lma_collector/01_debug_test_decoder_data.toml
4 -rwx------ 1 root root 199 Jan 31 17:06 /etc/lma_collector/01_debug_test_decoder_data.toml
- File is owned by root and can't be read by heka. To fix it it is nessesary to fix owner or permittions
#chown heka:heka /etc/lma_collector/01_debug_test_decoder_data.toml
$ hekad -config /etc/lma_collector/
2016/01/31 17:28:13 Pre-loading: [RstEncoder]
2016/01/31 17:28:13 Pre-loading: [output_file]
2016/01/31 17:28:13 Pre-loading: [output_file]
2016/01/31 17:28:13 Pre-loading: [openstack_error_amqp]
2016/01/31 17:28:13 Pre-loading: [openstack_info_amqp]
2016/01/31 17:28:13 Pre-loading: [openstack_warn_amqp]
2016/01/31 17:28:13 Pre-loading: [collectd_decoder]
2016/01/31 17:28:13 Pre-loading: [http-check_decoder]
2016/01/31 17:28:13 Pre-loading: [keystone_7_0_decoder]
2016/01/31 17:28:13 Pre-loading: [keystone_wsgi_decoder]
2016/01/31 17:28:13 Pre-loading: [mm_decoder]
2016/01/31 17:28:13 Pre-loading: [mysql_decoder]
2016/01/31 17:28:13 Pre-loading: [notification_decoder]
2016/01/31 17:28:13 Pre-loading: [openstack_decoder]
2016/01/31 17:28:13 Pre-loading: [ovs_decoder]
2016/01/31 17:28:13 Pre-loading: [pacemaker_decoder]
2016/01/31 17:28:13 Pre-loading: [rabbitmq_decoder]
2016/01/31 17:28:13 Pre-loading: [swift_decoder]
2016/01/31 17:28:13 Pre-loading: [system_decoder]
2016/01/31 17:28:13 Pre-loading: [elasticsearch_encoder]
2016/01/31 17:28:13 Pre-loading: [influxdb_encoder]
2016/01/31 17:28:13 Pre-loading: [nagios_afd_nodes_encoder]
2016/01/31 17:28:13 Pre-loading: [nagios_afd_nodes_encoder_debug]
2016/01/31 17:28:13 Pre-loading: [nagios_gse_global_clusters_encoder]
2016/01/31 17:28:13 Pre-loading: [nagios_gse_node_clusters_encoder]
2016/01/31 17:28:13 Pre-loading: [afd_api_backends_filter]
2016/01/31 17:28:13 Pre-loading: [afd_api_endpoints_filter]
2016/01/31 17:28:13 Pre-loading: [afd_node_controller_cpu_filter]
2016/01/31 17:28:13 Pre-loading: [afd_node_controller_log-fs_filter]
2016/01/31 17:28:13 Pre-loading: [afd_node_controller_root-fs_filter]
2016/01/31 17:28:13 Pre-loading: [afd_node_mysql-nodes_mysql-fs_filter]
2016/01/31 17:28:13 Pre-loading: [afd_service_apache_worker_filter]
2016/01/31 17:28:13 Pre-loading: [afd_service_cinder-api_http_errors_filter]
2016/01/31 17:28:13 Pre-loading: [afd_service_glance-api_http_errors_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_heat-api_http_errors_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_keystone-admin-api_http_errors_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_keystone-public-api_http_errors_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_mysql_node-status_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_neutron-api_http_errors_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_nova-api_http_errors_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_rabbitmq_disk_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_rabbitmq_memory_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_rabbitmq_queue_filter]
2016/01/31 17:28:14 Pre-loading: [afd_service_swift-api_http_errors_filter]
2016/01/31 17:28:14 Pre-loading: [afd_workers_filter]
2016/01/31 17:28:14 Pre-loading: [gse_global_filter]
2016/01/31 17:28:14 Pre-loading: [gse_node_filter]
2016/01/31 17:28:14 Pre-loading: [gse_service_filter]
2016/01/31 17:28:14 Pre-loading: [heka_monitoring_filter]
2016/01/31 17:28:14 Pre-loading: [http_metrics_filter]
2016/01/31 17:28:14 Pre-loading: [influxdb_accumulator_filter]
2016/01/31 17:28:14 Pre-loading: [influxdb_annotation_filter]
2016/01/31 17:28:14 Pre-loading: [instance_state_filter]
2016/01/31 17:28:14 Pre-loading: [resource_creation_time_filter]
2016/01/31 17:28:14 Pre-loading: [service_heartbeat_filter]
2016/01/31 17:28:14 Pre-loading: [collectd_httplisten]
2016/01/31 17:28:14 Pre-loading: [http-check_httplisten]
2016/01/31 17:28:14 Pre-loading: [aggregator_tcpinput]
2016/01/31 17:28:14 Pre-loading: [keystone_7_0_logstreamer]
2016/01/31 17:28:14 Pre-loading: [keystone_wsgi_logstreamer]
2016/01/31 17:28:14 Pre-loading: [test_input]
2016/01/31 17:28:14 Pre-loading: [mysql_logstreamer]
2016/01/31 17:28:14 Pre-loading: [openstack_7_0_logstreamer]
2016/01/31 17:28:14 Pre-loading: [openstack_dashboard_logstreamer]
2016/01/31 17:28:14 Pre-loading: [ovs_logstreamer]
2016/01/31 17:28:14 Pre-loading: [pacemaker_logstreamer]
2016/01/31 17:28:14 Pre-loading: [rabbitmq_logstreamer]
2016/01/31 17:28:14 Pre-loading: [swift_logstreamer]
2016/01/31 17:28:14 Pre-loading: [system_logstreamer]
2016/01/31 17:28:14 Pre-loading: [aggregator_decoder]
2016/01/31 17:28:14 Pre-loading: [aggregator_tcpoutput]
2016/01/31 17:28:14 Pre-loading: [DashboardOutput]
2016/01/31 17:28:14 Pre-loading: [elasticsearch_output]
2016/01/31 17:28:14 Pre-loading: [influxdb_output]
2016/01/31 17:28:14 Pre-loading: [nagios_afd_nodes_output]
2016/01/31 17:28:14 Pre-loading: [nagios_afd_nodes_output_file]
2016/01/31 17:28:14 Pre-loading: [nagios_gse_global_clusters_output]
2016/01/31 17:28:14 Pre-loading: [nagios_gse_node_clusters_output]
2016/01/31 17:28:14 Pre-loading: [aggregator_flag_decoder]
2016/01/31 17:28:14 Pre-loading: [openstack_splitter]
2016/01/31 17:28:14 Pre-loading: [rabbitmq_splitter]
2016/01/31 17:28:14 Pre-loading: [ProtobufDecoder]
2016/01/31 17:28:14 Loading: [ProtobufDecoder]
2016/01/31 17:28:14 Pre-loading: [ProtobufEncoder]
2016/01/31 17:28:14 Loading: [ProtobufEncoder]
2016/01/31 17:28:14 Pre-loading: [TokenSplitter]
2016/01/31 17:28:14 Loading: [TokenSplitter]
2016/01/31 17:28:14 Pre-loading: [HekaFramingSplitter]
2016/01/31 17:28:14 Loading: [HekaFramingSplitter]
2016/01/31 17:28:14 Pre-loading: [NullSplitter]
2016/01/31 17:28:14 Loading: [NullSplitter]
2016/01/31 17:28:14 Loading: [collectd_decoder]
2016/01/31 17:28:14 Loading: [http-check_decoder]
2016/01/31 17:28:14 Loading: [keystone_7_0_decoder]
2016/01/31 17:28:14 Loading: [keystone_wsgi_decoder]
2016/01/31 17:28:14 Loading: [mm_decoder]
2016/01/31 17:28:14 Loading: [mysql_decoder]
2016/01/31 17:28:14 Loading: [notification_decoder]
2016/01/31 17:28:14 Loading: [openstack_decoder]
2016/01/31 17:28:14 Loading: [ovs_decoder]
2016/01/31 17:28:14 Loading: [pacemaker_decoder]
2016/01/31 17:28:14 Loading: [rabbitmq_decoder]
2016/01/31 17:28:14 Loading: [swift_decoder]
2016/01/31 17:28:14 Loading: [system_decoder]
2016/01/31 17:28:14 Loading: [aggregator_flag_decoder]
2016/01/31 17:28:14 Loading: [aggregator_decoder]
2016/01/31 17:28:14 Loading: [RstEncoder]
2016/01/31 17:28:14 Loading: [elasticsearch_encoder]
2016/01/31 17:28:14 Loading: [influxdb_encoder]
2016/01/31 17:28:14 Loading: [nagios_afd_nodes_encoder]
2016/01/31 17:28:14 Loading: [nagios_afd_nodes_encoder_debug]
2016/01/31 17:28:14 Loading: [nagios_gse_global_clusters_encoder]
2016/01/31 17:28:14 Loading: [nagios_gse_node_clusters_encoder]
2016/01/31 17:28:14 Loading: [openstack_splitter]
2016/01/31 17:28:14 Loading: [rabbitmq_splitter]
2016/01/31 17:28:14 Loading: [openstack_error_amqp]
2016/01/31 17:28:14 Loading: [openstack_info_amqp]
2016/01/31 17:28:14 Loading: [openstack_warn_amqp]
2016/01/31 17:28:14 Loading: [collectd_httplisten]
2016/01/31 17:28:14 Loading: [http-check_httplisten]
2016/01/31 17:28:14 Loading: [aggregator_tcpinput]
2016/01/31 17:28:14 Loading: [keystone_7_0_logstreamer]
2016/01/31 17:28:14 Loading: [keystone_wsgi_logstreamer]
2016/01/31 17:28:14 Loading: [test_input]
2016/01/31 17:28:14 Loading: [mysql_logstreamer]
2016/01/31 17:28:14 Loading: [openstack_7_0_logstreamer]
2016/01/31 17:28:14 Loading: [openstack_dashboard_logstreamer]
2016/01/31 17:28:14 Loading: [ovs_logstreamer]
2016/01/31 17:28:14 Loading: [pacemaker_logstreamer]
2016/01/31 17:28:14 Loading: [rabbitmq_logstreamer]
2016/01/31 17:28:14 Loading: [swift_logstreamer]
2016/01/31 17:28:14 Loading: [system_logstreamer]
2016/01/31 17:28:14 Loading: [afd_api_backends_filter]
2016/01/31 17:28:14 Loading: [afd_api_endpoints_filter]
2016/01/31 17:28:14 Loading: [afd_node_controller_cpu_filter]
2016/01/31 17:28:14 Loading: [afd_node_controller_log-fs_filter]
2016/01/31 17:28:14 Loading: [afd_node_controller_root-fs_filter]
2016/01/31 17:28:14 Loading: [afd_node_mysql-nodes_mysql-fs_filter]
2016/01/31 17:28:14 Loading: [afd_service_apache_worker_filter]
2016/01/31 17:28:14 Loading: [afd_service_cinder-api_http_errors_filter]
2016/01/31 17:28:14 Loading: [afd_service_glance-api_http_errors_filter]
2016/01/31 17:28:14 Loading: [afd_service_heat-api_http_errors_filter]
2016/01/31 17:28:14 Loading: [afd_service_keystone-admin-api_http_errors_filter]
2016/01/31 17:28:14 Loading: [afd_service_keystone-public-api_http_errors_filter]
2016/01/31 17:28:14 Loading: [afd_service_mysql_node-status_filter]
2016/01/31 17:28:14 Loading: [afd_service_neutron-api_http_errors_filter]
2016/01/31 17:28:14 Loading: [afd_service_nova-api_http_errors_filter]
2016/01/31 17:28:14 Loading: [afd_service_rabbitmq_disk_filter]
2016/01/31 17:28:14 Loading: [afd_service_rabbitmq_memory_filter]
2016/01/31 17:28:14 Loading: [afd_service_rabbitmq_queue_filter]
2016/01/31 17:28:14 Loading: [afd_service_swift-api_http_errors_filter]
2016/01/31 17:28:14 Loading: [afd_workers_filter]
2016/01/31 17:28:14 Loading: [gse_global_filter]
2016/01/31 17:28:14 Loading: [gse_node_filter]
2016/01/31 17:28:14 Loading: [gse_service_filter]
2016/01/31 17:28:14 Loading: [heka_monitoring_filter]
2016/01/31 17:28:14 Loading: [http_metrics_filter]
2016/01/31 17:28:14 Loading: [influxdb_accumulator_filter]
2016/01/31 17:28:14 Loading: [influxdb_annotation_filter]
2016/01/31 17:28:14 Loading: [instance_state_filter]
2016/01/31 17:28:14 Loading: [resource_creation_time_filter]
2016/01/31 17:28:14 Loading: [service_heartbeat_filter]
2016/01/31 17:28:14 Loading: [output_file]
2016/01/31 17:28:14 Loading: [output_file]
2016/01/31 17:28:14 Loading: [aggregator_tcpoutput]
2016/01/31 17:28:14 Loading: [DashboardOutput]
2016/01/31 17:28:14 Loading: [elasticsearch_output]
2016/01/31 17:28:14 Loading: [influxdb_output]
2016/01/31 17:28:14 Loading: [nagios_afd_nodes_output]
2016/01/31 17:28:14 Loading: [nagios_afd_nodes_output_file]
2016/01/31 17:28:14 Loading: [nagios_gse_global_clusters_output]
2016/01/31 17:28:14 Loading: [nagios_gse_node_clusters_output]
2016/01/31 17:28:14 Starting hekad...
2016/01/31 17:28:14 Output started: output_file
2016/01/31 17:28:14 Output started: aggregator_tcpoutput
2016/01/31 17:28:14 Output started: DashboardOutput
2016/01/31 17:28:14 Output started: influxdb_output
2016/01/31 17:28:14 Output started: nagios_afd_nodes_output
2016/01/31 17:28:14 Output started: elasticsearch_output
2016/01/31 17:28:14 Output started: nagios_afd_nodes_output_file
2016/01/31 17:28:14 Output started: nagios_gse_global_clusters_output
2016/01/31 17:28:14 Output started: nagios_gse_node_clusters_output
2016/01/31 17:28:14 Filter started: afd_api_endpoints_filter
2016/01/31 17:28:14 Filter started: afd_node_controller_root-fs_filter
2016/01/31 17:28:14 Filter started: afd_service_cinder-api_http_errors_filter
2016/01/31 17:28:14 Filter started: afd_service_keystone-admin-api_http_errors_filter
2016/01/31 17:28:14 Filter started: afd_service_mysql_node-status_filter
2016/01/31 17:28:14 Filter started: afd_service_nova-api_http_errors_filter
2016/01/31 17:28:14 Filter started: afd_service_rabbitmq_memory_filter
2016/01/31 17:28:14 Filter started: gse_service_filter
2016/01/31 17:28:14 Filter started: afd_service_rabbitmq_disk_filter
2016/01/31 17:28:14 Filter started: influxdb_annotation_filter
2016/01/31 17:28:14 Filter started: influxdb_accumulator_filter
2016/01/31 17:28:14 Filter started: afd_api_backends_filter
2016/01/31 17:28:14 Filter started: afd_node_controller_cpu_filter
2016/01/31 17:28:14 Filter started: afd_service_swift-api_http_errors_filter
2016/01/31 17:28:14 Filter started: service_heartbeat_filter
2016/01/31 17:28:14 Filter started: afd_service_apache_worker_filter
2016/01/31 17:28:14 Filter started: afd_service_heat-api_http_errors_filter
2016/01/31 17:28:14 Filter started: afd_service_neutron-api_http_errors_filter
2016/01/31 17:28:14 Filter started: afd_service_rabbitmq_queue_filter
2016/01/31 17:28:14 Filter started: http_metrics_filter
2016/01/31 17:28:14 Filter started: afd_workers_filter
2016/01/31 17:28:14 Filter started: heka_monitoring_filter
2016/01/31 17:28:14 Filter started: instance_state_filter
2016/01/31 17:28:14 Filter started: resource_creation_time_filter
2016/01/31 17:28:14 Filter started: afd_node_controller_log-fs_filter
2016/01/31 17:28:14 Filter started: afd_node_mysql-nodes_mysql-fs_filter
2016/01/31 17:28:14 Filter started: gse_node_filter
2016/01/31 17:28:14 Filter started: afd_service_glance-api_http_errors_filter
2016/01/31 17:28:14 Filter started: afd_service_keystone-public-api_http_errors_filter
2016/01/31 17:28:14 Filter started: gse_global_filter
2016/01/31 17:28:14 MessageRouter started.
2016/01/31 17:28:14 Input started: test_input
2016/01/31 17:28:14 Input started: openstack_7_0_logstreamer
2016/01/31 17:28:14 Input started: pacemaker_logstreamer
2016/01/31 17:28:14 Input started: rabbitmq_logstreamer
... <SKIP> ...
So heka was successfully started and you can see debug messages.
To detach from screen use ctr-a d shortcut key combination.