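1. Use case
I have recently been evaluating solutions for a logging platform and settled on the popular ELK stack, a combination of three open-source tools: Elasticsearch, Logstash, and Kibana. Elasticsearch handles log search, Logstash handles log collection, filtering, and processing, and Kibana provides the web interface for viewing logs. The first step is to understand how Logstash works.
2. Introduction to Logstash
Logstash is a tool for receiving, processing, and outputting logs. It can handle many kinds of logs, including system logs, web server logs (Apache, Nginx, Tomcat), and application logs.
3. Basic Logstash usage
Logstash is written in Ruby and runs on JRuby, so the only prerequisite for running it is Java.
Install Java on CentOS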
yum -y install java-1.7.0-openjdk*
$ java -version
java version "1.7.0_75"
OpenJDK Runtime Environment (rhel-2.5.4.0.el6_6-x86_64 u75-b13)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)
wget
tar zxvf logstash-1.4.2.tar.gz
cd logstash-1.4.2
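Run bin/logstash agent --help to see the available options.
-e takes the configuration as a string directly on the command line instead of reading a configuration file given with -f. It is handy for quick tests.
Run the following on the command line: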
$ bin/logstash -e 'input {stdin {} } output {stdout {} }'
Then type some input:
$ bin/logstash -e 'input {stdin {} } output {stdout {} }'
hello world
2015-01-31T12:02:20.438+0000 xxxxx hello world
Here the event is read from stdin and written to stdout. After you type hello world, Logstash prints the processed event to the screen.
$ bin/logstash -e 'input {stdin {} } output {stdout { codec => rubydebug } }'
goodnight moon
{
    "message" => "goodnight moon",
    "@version" => "1",
    "@timestamp" => "2015-01-31T12:09:38.564Z",
    "host" => "xxxx-elk-log"
}
Storing logs in Elasticsearch
wget
unzip elasticsearch-1.4.2.zip
cd elasticsearch-1.4.2
./bin/elasticsearch
The Logstash and Elasticsearch versions need to match; both are 1.4.2 here.
$ bin/logstash -e 'input { stdin {} } output { elasticsearch { host => localhost }}'
you know,for logs
Here Logstash reads input from the terminal and sends the resulting event to Elasticsearch. Next, verify that Elasticsearch received the data from Logstash:
$ curl 'http://localhost:9200/_search?pretty'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "W6HMXGx2Tw25sTX7OwZPug",
      "_score" : 1.0,
      "_source":{"message":"you know,for logs","@version":"1","@timestamp":"2015-01-31T12:43:53.630Z","host":"jidong-elk-log"}
    } ]
  }
}
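When checking this by script rather than by eye, the response body can be reduced to the fields that matter. The following is a minimal sketch, not part of the original setup: the function name summarize_hits is hypothetical, and the sample string is a trimmed copy of the response above so that the example runs without a live Elasticsearch.

```python
import json


def summarize_hits(response_text):
    """Return (index, message) pairs from an Elasticsearch _search response body."""
    body = json.loads(response_text)
    return [
        (hit["_index"], hit["_source"]["message"])
        for hit in body["hits"]["hits"]
    ]


# Trimmed copy of the _search response shown in the text.
sample = """
{
  "took": 2,
  "timed_out": false,
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "logstash-2015.01.31",
        "_type": "logs",
        "_id": "W6HMXGx2Tw25sTX7OwZPug",
        "_score": 1.0,
        "_source": {
          "message": "you know,for logs",
          "@version": "1",
          "@timestamp": "2015-01-31T12:43:53.630Z",
          "host": "jidong-elk-log"
        }
      }
    ]
  }
}
"""

for index, message in summarize_hits(sample):
    print(index, message)
```

In a real check, the text passed to summarize_hits would be the body returned by the curl request.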
You can also browse the Logstash data with the elasticsearch-kopf plugin.
Install it as follows:
bin/plugin -install lmenezes/elasticsearch-kopf
Then open the plugin in a browser.
Using multiple outputs
$ bin/logstash -e 'input { stdin {} } output { elasticsearch { host => localhost } stdout {} }'
multiple outputs
2015-01-31T13:03:43.426+0000 jidong-elk-log multiple outputs
Here the text typed at the keyboard is sent to Elasticsearch and also printed to the screen.
$ curl 'http://localhost:9200/_search?pretty'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "W6HMXGx2Tw25sTX7OwZPug",
      "_score" : 1.0,
      "_source":{"message":"you know,for logs","@version":"1","@timestamp":"2015-01-31T12:43:53.630Z","host":"jidong-elk-log"}
    }, {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "kMXKoQglQNCDYOEyOmAnhg",
      "_score" : 1.0,
      "_source":{"message":"multiple outputs","@version":"1","@timestamp":"2015-01-31T13:03:43.426Z","host":"jidong-elk-log"}
    } ]
  }
}
By default, the Logstash elasticsearch output names the index by date, so one index is created per day, for example logstash-2015.01.31.
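The daily index name follows from the event's @timestamp, which is in UTC. As a rough illustration (a sketch, not Logstash's own code; the function name daily_index and its prefix parameter are made up for this example), the name can be derived like this:

```python
from datetime import datetime


def daily_index(timestamp, prefix="logstash"):
    """Build a per-day index name from a UTC timestamp such as
    2015-01-31T12:43:53.630Z (the format of the @timestamp field)."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return "{}-{}".format(prefix, parsed.strftime("%Y.%m.%d"))


print(daily_index("2015-01-31T12:43:53.630Z"))  # logstash-2015.01.31
print(daily_index("2013-12-11T08:01:45.000Z"))  # logstash-2013.12.11
```

Because the date comes from the event rather than from the wall clock, an old log line whose timestamp has been parsed by a date filter ends up in the index for the day it was originally written.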
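The life of a Logstash event
Inputs, outputs, codecs, and filters are the core of a Logstash configuration.
Inputs deliver log data to Logstash. The most commonly used input plugins are:
file reads log data from a file
syslog listens on port 514 by default, receives syslog messages, and parses them according to RFC3164
redis reads log data from redis. In a centralized Logstash deployment, redis usually acts as a broker that buffers logs sent by Logstash agents or other shippers.
lumberjack handles logs sent with the lumberjack protocol, now called logstash-forwarder
Filters process Logstash events according to matching conditions. The most commonly used filter plugins are:
grok parses arbitrary text and gives it structure
mutate changes events: adding, removing, renaming, replacing, and modifying fields
drop discards specific events
clone makes a copy of an event
geoip adds geographic location information for an IP address
Outputs are the final stage of the Logstash pipeline. An event can be sent to more than one output. The most commonly used output plugins are:
elasticsearch writes event data to Elasticsearch
file writes event data to a file on disk
Codecs are stream filters that can be attached to an input or an output. Common ones are plain and json.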
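To make the stages concrete, here is a toy model of the flow from input through codec and filters to outputs. It is only a sketch of the idea, not how Logstash is implemented; every function name in it (plain_codec_decode, drop_debug, add_type, run_pipeline) is hypothetical.

```python
import json


def plain_codec_decode(raw):
    """Codec stage: wrap a raw line of text in an event (a dict)."""
    return {"message": raw}


def drop_debug(event):
    """Filter stage, in the spirit of drop: discard events starting with DEBUG."""
    return None if event["message"].startswith("DEBUG") else event


def add_type(event):
    """Filter stage, in the spirit of mutate: set a field on the event."""
    event["type"] = "demo"
    return event


def run_pipeline(lines, decode, filters, outputs):
    """Push each input line through the codec, the filters, then every output."""
    for raw in lines:
        event = decode(raw)
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:  # a filter dropped the event
                break
        if event is None:
            continue
        for write in outputs:  # one event can go to several outputs
            write(event)


collected = []
run_pipeline(
    ["hello world", "DEBUG noisy line"],
    decode=plain_codec_decode,
    filters=[drop_debug, add_type],
    outputs=[collected.append, lambda e: print(json.dumps(e, sort_keys=True))],
)
```

The second input line is dropped by the first filter, so only the hello world event reaches the two outputs.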
Using a configuration file
conf/logstash-simple.conf
input { stdin {} }
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
$ sudo bin/logstash -f conf/logstash-simple.conf
config file
{
    "message" => "config file",
    "@version" => "1",
    "@timestamp" => "2015-02-01T02:38:15.347Z",
    "host" => "xxxxxx"
}
curl 'http://localhost:9200/_search?pretty'
      "_index" : "logstash-2015.02.01",
      "_type" : "logs",
      "_id" : "NW2e8LdWSwuNE-aJZNtd-w",
      "_score" : 1.0,
      "_source":{"message":"config file","@version":"1","@timestamp":"2015-02-01T02:38:15.347Z","host":"xxxxxx"}
    } ]
  }
Testing filters
$ cat conf/logstash-filter.conf
input { stdin {} }
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
Type a line of Apache log at the prompt:
$ sudo bin/logstash -f conf/logstash-filter.conf
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
{
    "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
    "@version" => "1",
    "@timestamp" => "2013-12-11T08:01:45.000Z",
    "host" => "xxxxxxx",
    "clientip" => "127.0.0.1",
    "ident" => "-",
    "auth" => "-",
    "timestamp" => "11/Dec/2013:00:01:45 -0800",
    "verb" => "GET",
    "request" => "/xampp/status.php",
    "httpversion" => "1.1",
    "response" => "200",
    "bytes" => "3891",
    "referrer" => "\"http://cadenza/xampp/navi.php\"",
    "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\""
}
The same event as stored in Elasticsearch:
      "_index" : "logstash-2013.12.11",
      "_type" : "logs",
      "_id" : "QusW5lY5T8a9wqgCcottnA",
      "_score" : 1.0,
      "_source":{"message":"127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"","@version":"1","@timestamp":"2013-12-11T08:01:45.000Z","host":"jidong-elk-log","clientip":"127.0.0.1","ident":"-","auth":"-","timestamp":"11/Dec/2013:00:01:45 -0800","verb":"GET","request":"/xampp/status.php","httpversion":"1.1","response":"200","bytes":"3891","referrer":"\"http://cadenza/xampp/navi.php\"","agent":"\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\""}
Example 1: processing Apache logs with Logstash
$ cat /tmp/access_log
71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"
98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
$ cat conf/logstash-apache.conf
input {
  file {
    path => "/tmp/access_log"
    start_position => beginning
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { "type" => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
After starting Logstash, you can see that it has processed the log data in /tmp/access_log:
$ sudo bin/logstash -f conf/logstash-apache.conf
Using milestone 2 input plugin 'file'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones {:level=>:warn}
{
    "message" => "71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] \"GET /admin HTTP/1.1\" 301 566 \"-\" \"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3\"",
    "@version" => "1",
    "@timestamp" => "2011-05-18T08:48:10.000Z",
    "host" => "jidong-elk-log",
    "path" => "/tmp/access_log",
    "type" => "apache_access",
    "clientip" => "71.141.244.242",
    "ident" => "-",
    "auth" => "kurt",
    "timestamp" => "18/May/2011:01:48:10 -0700",
    "verb" => "GET",
    "request" => "/admin",
    "httpversion" => "1.1",
    "response" => "301",
    "bytes" => "566",
    "referrer" => "\"-\"",
    "agent" => "\"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3\""
}
{
    "message" => "134.39.72.245 - - [18/May/2011:12:40:18 -0700] \"GET /favicon.ico HTTP/1.1\" 200 1189 \"-\" \"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)\"",
    "@version" => "1",
    "@timestamp" => "2011-05-18T19:40:18.000Z",
    "host" => "jidong-elk-log",
    "path" => "/tmp/access_log",
    "type" => "apache_access",
    "clientip" => "134.39.72.245",
    "ident" => "-",
    "auth" => "-",
    "timestamp" => "18/May/2011:12:40:18 -0700",
    "verb" => "GET",
    "request" => "/favicon.ico",
    "httpversion" => "1.1",
    "response" => "200",
    "bytes" => "1189",
    "referrer" => "\"-\"",
    "agent" => "\"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)\""
}
{
    "message" => "98.83.179.51 - - [18/May/2011:19:35:08 -0700] \"GET /css/main.css HTTP/1.1\" 200 1837 \"http://www.safesand.com/information.htm\" \"Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1\"",
    "@version" => "1",
    "@timestamp" => "2011-05-19T02:35:08.000Z",
    "host" => "jidong-elk-log",
    "path" => "/tmp/access_log",
    "type" => "apache_access",
    "clientip" => "98.83.179.51",
    "ident" => "-",
    "auth" => "-",
    "timestamp" => "18/May/2011:19:35:08 -0700",
    "verb" => "GET",
    "request" => "/css/main.css",
    "httpversion" => "1.1",
    "response" => "200",
    "bytes" => "1837",
    "referrer" => "\"http://www.safesand.com/information.htm\"",
    "agent" => "\"Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1\""
}
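For readers who want to see what the grok and date filters contribute, the sketch below approximates both steps in plain Python. It is a simplified stand-in, not the real COMBINEDAPACHELOG pattern: the regular expression COMBINED and the function parse_access_line are written for this illustration only, and unlike the output above the referrer and agent values are captured without their surrounding quotes. Month names are parsed with %b, which assumes an English or C locale.

```python
import re
from datetime import datetime, timezone

# Simplified approximation of the combined Apache log format.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Split one access-log line into named fields and add a UTC @timestamp."""
    match = COMBINED.match(line)
    if match is None:
        return None  # the real grok filter would tag the event instead
    event = match.groupdict()
    # Equivalent of the date pattern "dd/MMM/yyyy:HH:mm:ss Z".
    local = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    utc = local.astimezone(timezone.utc)
    # The log has one-second resolution, so the milliseconds are always 000.
    event["@timestamp"] = utc.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return event


line = (
    '71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] '
    '"GET /admin HTTP/1.1" 301 566 "-" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) '
    'Gecko/20100401 Firefox/3.6.3"'
)
event = parse_access_line(line)
print(event["clientip"], event["auth"], event["verb"], event["@timestamp"])
```

The date step explains why @timestamp reads 2011-05-18T08:48:10.000Z for a request logged at 01:48:10 -0700: the local time is converted to UTC, and the event's timestamp becomes the time of the request rather than the time Logstash read the line.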
Check Elasticsearch:
curl 'http://localhost:9200/_search?pretty'
Example 2: processing syslog messages with Logstash
$ cat conf/logstash-syslog.conf
input {
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "receieved_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
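As a rough guide to what the grok and syslog_pri filters in this configuration do, the sketch below imitates them in Python. It is an approximation under stated assumptions: the regular expression SYSLOG_LINE only mimics the grok pattern, the function names are hypothetical, and the default priority of 13 is assumed for messages that carry no PRI value. The split of a priority into facility and severity (facility = PRI div 8, severity = PRI mod 8) follows RFC 3164.

```python
import re

# Loose imitation of the grok pattern used in the configuration.
SYSLOG_LINE = re.compile(
    r'(?P<syslog_timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) '
    r'(?P<syslog_hostname>\S+) '
    r'(?P<syslog_program>[^\[:]+?)(?:\[(?P<syslog_pid>\d+)\])?: '
    r'(?P<syslog_message>.*)'
)


def decode_priority(pri):
    """Split a syslog PRI value into (facility_code, severity_code)."""
    return pri // 8, pri % 8


def parse_syslog_line(line, default_pri=13):
    """Extract the syslog_* fields; default_pri is an assumed fallback."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    # Optional groups that did not match (such as the pid) are left out.
    event = {k: v for k, v in match.groupdict().items() if v is not None}
    facility, severity = decode_priority(default_pri)
    event["syslog_facility_code"] = facility
    event["syslog_severity_code"] = severity
    return event


with_pid = parse_syslog_line(
    "Dec 23 12:11:43 louis postfix/smtpd[31499]: "
    "connect from unknown[95.75.93.154]"
)
without_pid = parse_syslog_line(
    "Dec 22 18:28:06 louis rsyslogd: rsyslogd was HUPed, type 'lightweight'."
)
print(with_pid["syslog_program"], with_pid["syslog_pid"])
print(without_pid["syslog_program"], "syslog_pid" in without_pid)
```

A priority of 13 decodes to facility 1 (user-level) and severity 5 (notice). The pid group is optional, so a program that logs without a pid simply produces an event with no syslog_pid field.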
Start Logstash:
$ sudo bin/logstash -f conf/logstash-syslog.conf
Connect to port 5000 with telnet and send some log lines to Logstash:
$ telnet localhost 5000
Trying ::1...
Connected to localhost.
Escape character is '^]'.
Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]
Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied
Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
Dec 22 18:28:06 louis rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="2253" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
The Logstash console output:
$ sudo bin/logstash -f conf/logstash-syslog.conf
{
    "message" => "Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]\r",
    "@version" => "1",
    "@timestamp" => "2015-12-23T04:11:43.000Z",
    "host" => "0:0:0:0:0:0:0:1:34337",
    "type" => "syslog",
    "syslog_timestamp" => "Dec 23 12:11:43",
    "syslog_hostname" => "louis",
    "syslog_program" => "postfix/smtpd",
    "syslog_pid" => "31499",
    "syslog_message" => "connect from unknown[95.75.93.154]\r",
    "received_at" => "2015-02-01 05:01:48 UTC",
    "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
    "syslog_facility" => "user-level",
    "syslog_severity" => "notice"
}
{
    "message" => "Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied\r",
    "@version" => "1",
    "@timestamp" => "2015-12-23T06:42:56.000Z",
    "host" => "0:0:0:0:0:0:0:1:34337",
    "type" => "syslog",
    "syslog_timestamp" => "Dec 23 14:42:56",
    "syslog_hostname" => "louis",
    "syslog_program" => "named",
    "syslog_pid" => "16000",
    "syslog_message" => "client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied\r",
    "received_at" => "2015-02-01 05:01:48 UTC",
    "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
    "syslog_facility" => "user-level",
    "syslog_severity" => "notice"
}
{
    "message" => "Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)\r",
    "@version" => "1",
    "@timestamp" => "2015-12-23T06:30:01.000Z",
    "host" => "0:0:0:0:0:0:0:1:34337",
    "type" => "syslog",
    "syslog_timestamp" => "Dec 23 14:30:01",
    "syslog_hostname" => "louis",
    "syslog_program" => "CRON",
    "syslog_pid" => "619",
    "syslog_message" => "(www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)\r",
    "received_at" => "2015-02-01 05:01:48 UTC",
    "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
    "syslog_facility" => "user-level",
    "syslog_severity" => "notice"
}
{
    "message" => "Dec 22 18:28:06 louis rsyslogd: [origin software=\"rsyslogd\" swVersion=\"4.2.0\" x-pid=\"2253\" x-info=\"http://www.rsyslog.com\"] rsyslogd was HUPed, type 'lightweight'.\r",
    "@version" => "1",
    "@timestamp" => "2015-12-22T10:28:06.000Z",
    "host" => "0:0:0:0:0:0:0:1:34337",
    "type" => "syslog",
    "syslog_timestamp" => "Dec 22 18:28:06",
    "syslog_hostname" => "louis",
    "syslog_program" => "rsyslogd",
    "syslog_message" => "[origin software=\"rsyslogd\" swVersion=\"4.2.0\" x-pid=\"2253\" x-info=\"http://www.rsyslog.com\"] rsyslogd was HUPed, type 'lightweight'.\r",
    "received_at" => "2015-02-01 05:01:53 UTC",
    "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
    "syslog_facility" => "user-level",
    "syslog_severity" => "notice"
}
References