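1 Use case

I have recently been evaluating log platform solutions and settled on the popular ELK stack, a combination of three open-source tools: Elasticsearch, Logstash, and Kibana. Elasticsearch handles log search, Logstash collects, filters, and processes the logs, and Kibana provides the web interface for viewing them. The first step is to understand how Logstash works.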

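2 Introduction to Logstash

Logstash is a tool for receiving, processing, and shipping logs. It can handle many kinds of logs, including system logs, web server and container logs such as Apache, Nginx, and Tomcat, and application logs.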

3 Basic Logstash usage

Logstash is written in Ruby and runs on JRuby, so the only prerequisite for running it is Java.

Installing Java on CentOS

yum -y install java-1.7.0-openjdk*

$ java -version

java version "1.7.0_75"

OpenJDK Runtime Environment (rhel-2.5.4.0.el6_6-x86_64 u75-b13)

OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)

wget

tar zxvf logstash-1.4.2.tar.gz

cd logstash-1.4.2

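Run bin/logstash agent --help to see the available options.

-e  takes the configuration as a string on the command line instead of reading a configuration file given with -f. It is handy for quick tests.

Run the following on the command line: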

$ bin/logstash -e 'input {stdin {} } output {stdout {} }'

Then type some input:

$ bin/logstash -e 'input {stdin {} } output {stdout {} }'

hello world

2015-01-31T12:02:20.438+0000 xxxxx hello world

Here the input comes from stdin and the output goes to stdout. After "hello world" is typed, Logstash prints the processed event to the screen.

$ bin/logstash -e 'input {stdin {} } output {stdout { codec => rubydebug  } }'
goodnight moon
{
       "message" => "goodnight moon",
      "@version" => "1",
    "@timestamp" => "2015-01-31T12:09:38.564Z",
          "host" => "xxxx-elk-log"
}
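The rubydebug output shows the four fields attached to the event: message, @version, @timestamp, and host. As a rough illustration only (this is not Logstash's own code, and make_event is a hypothetical helper), the following Python sketch wraps one line of input in a dictionary with the same shape.

```python
import socket
from datetime import datetime, timezone


def make_event(line, now=None, host=None):
    """Wrap one input line in the four fields seen in the rubydebug output."""
    now = now or datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)
    # ISO 8601 in UTC with millisecond precision, e.g. 2015-01-31T12:09:38.564Z
    stamp = now.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % (now.microsecond // 1000)
    return {
        "message": line.rstrip("\n"),
        "@version": "1",
        "@timestamp": stamp,
        "host": host or socket.gethostname(),
    }


if __name__ == "__main__":
    print(make_event("goodnight moon\n"))
```

The timestamp is always rendered in UTC, which matches the trailing Z in the output shown here.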

Storing logs in Elasticsearch

wget 

unzip elasticsearch-1.4.2.zip

cd elasticsearch-1.4.2

./bin/elasticsearch

The Logstash and Elasticsearch versions need to match; 1.4.2 is used for both here.

$ bin/logstash -e 'input { stdin {} } output { elasticsearch { host => localhost }}'
you know,for logs

Here Logstash reads input from the terminal and sends the resulting event to Elasticsearch. Next, verify that Elasticsearch actually received the data from Logstash:

$ curl 'http://localhost:9200/_search?pretty'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "W6HMXGx2Tw25sTX7OwZPug",
      "_score" : 1.0,
      "_source":{"message":"you know,for logs","@version":"1","@timestamp":"2015-01-31T12:43:53.630Z","host":"jidong-elk-log"}
    } ]
  }
}

The elasticsearch-kopf plugin can also be used to browse the Logstash data stored in Elasticsearch.

Install it as follows:

bin/plugin -install lmenezes/elasticsearch-kopf

The plugin can then be opened in a browser.

Using multiple outputs

$ bin/logstash -e 'input { stdin {} } output { elasticsearch { host => localhost } stdout {}   }'
multiple outputs
2015-01-31T13:03:43.426+0000 jidong-elk-log multiple outputs

Here the typed input is sent to Elasticsearch and also printed to the screen.

$ curl 'http://localhost:9200/_search?pretty'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "W6HMXGx2Tw25sTX7OwZPug",
      "_score" : 1.0,
      "_source":{"message":"you know,for logs","@version":"1","@timestamp":"2015-01-31T12:43:53.630Z","host":"jidong-elk-log"}
    }, {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "kMXKoQglQNCDYOEyOmAnhg",
      "_score" : 1.0,
      "_source":{"message":"multiple outputs","@version":"1","@timestamp":"2015-01-31T13:03:43.426Z","host":"jidong-elk-log"}
    } ]
  }
}

By default, the Logstash elasticsearch output names indices by date, so Elasticsearch ends up with one index per day, such as logstash-2015.01.31.
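The daily naming scheme can be sketched in a few lines of Python. This is only an illustration of the naming rule, not code taken from Logstash; daily_index is a hypothetical helper, and the assumption is that the date part comes from the event's @timestamp in UTC.

```python
from datetime import datetime, timezone


def daily_index(timestamp, prefix="logstash-"):
    """Return the daily index name for an ISO 8601 @timestamp given in UTC."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    parsed = parsed.replace(tzinfo=timezone.utc)
    return prefix + parsed.strftime("%Y.%m.%d")


if __name__ == "__main__":
    print(daily_index("2015-01-31T12:43:53.630Z"))
    print(daily_index("2013-12-11T08:01:45.000Z"))
```

Because the name follows the event's own timestamp rather than the time of arrival, an old log line that is replayed today still lands in the index for its original day.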

The life of a Logstash event

Inputs, outputs, codecs, and filters are the core of a Logstash configuration.

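Inputs bring log data into Logstash. The most commonly used input plugins are:

file  reads log data from a file

syslog  listens on port 514 by default for syslog messages and parses them according to RFC 3164

redis  reads log data from Redis. In a centralized Logstash deployment, Redis usually acts as a broker that buffers logs sent by Logstash agents or other shippers.

lumberjack  receives logs sent with the lumberjack protocol. The shipper is now called logstash-forwarder.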

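Filters process Logstash events according to matching conditions. The most commonly used filter plugins are:

grok  parses arbitrary text and gives it structure

mutate  changes events: adding, removing, renaming, replacing, and modifying fields

drop  discards specific events

clone  makes a copy of an event

geoip  adds geographic location information for an IP address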

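Outputs are the final stage of the Logstash pipeline. A single event can go to several outputs. The most commonly used output plugins are:

elasticsearch  writes event data to Elasticsearch

file  writes event data to a file on disk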

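Codecs are stream filters that can be attached to an input or an output to decode or encode the data passing through it. The main ones are plain and json.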
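The flow from inputs through filters to outputs can be sketched as a toy pipeline in Python. This is a simplified model under my own assumptions, not how Logstash is implemented: run_pipeline, drop_debug, add_length, and print_json are hypothetical names, filters run in order and may return None to drop an event, and every surviving event is handed to every output.

```python
import json


def run_pipeline(lines, filters, outputs):
    """Push each input line through every filter, then to every output."""
    for line in lines:                    # input stage: one event per line
        event = {"message": line}
        for apply_filter in filters:      # filter stage: modify or drop
            event = apply_filter(event)
            if event is None:             # the filter dropped the event
                break
        if event is None:
            continue
        for emit in outputs:              # output stage: fan out to all outputs
            emit(event)


def drop_debug(event):
    """Hypothetical filter, loosely like drop: discard DEBUG messages."""
    return None if "DEBUG" in event["message"] else event


def add_length(event):
    """Hypothetical filter, loosely like mutate: add a new field."""
    event["length"] = len(event["message"])
    return event


def print_json(event):
    """Hypothetical output that encodes the event as JSON, like a json codec."""
    print(json.dumps(event))


if __name__ == "__main__":
    collected = []
    run_pipeline(["hello world", "DEBUG noisy"],
                 [drop_debug, add_length],
                 [collected.append, print_json])
```

Passing two outputs mirrors the earlier command that wrote to both Elasticsearch and stdout: each event is delivered to both.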

Using a configuration file

conf/logstash-simple.conf

input {
  stdin {}
}
output {
  elasticsearch {
    host => localhost
  }
  stdout {
    codec => rubydebug
  }
}

$ sudo bin/logstash -f conf/logstash-simple.conf
config file
{
       "message" => "config file",
      "@version" => "1",
    "@timestamp" => "2015-02-01T02:38:15.347Z",
          "host" => "xxxxxx"
}

$ curl 'http://localhost:9200/_search?pretty'

      "_index" : "logstash-2015.02.01",
      "_type" : "logs",
      "_id" : "NW2e8LdWSwuNE-aJZNtd-w",
      "_score" : 1.0,
      "_source":{"message":"config file","@version":"1","@timestamp":"2015-02-01T02:38:15.347Z","host":"xxxxxx"}
    } ]
  }

Testing filters

$ cat conf/logstash-filter.conf
input {
  stdin {}
}
filter {
  grok {
    match => {
      "message" => "%{COMBINEDAPACHELOG}"
    }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch {
    host => localhost
  }
  stdout {
    codec => rubydebug
  }
}
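To make the two filters in this configuration more concrete, here is a minimal Python sketch of what they do to one Apache access line. It is an approximation under stated assumptions: the regular expression is a simplified stand-in for %{COMBINEDAPACHELOG} (the real grok pattern accepts more variants), parse_access_line is a hypothetical helper, and month names are parsed with the default C/English locale.

```python
import re
from datetime import datetime, timezone

# Simplified stand-in for %{COMBINEDAPACHELOG}; not the real grok pattern.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'(?P<referrer>"[^"]*") (?P<agent>"[^"]*")'
)


def parse_access_line(line):
    """Split an access-log line into fields, then set @timestamp from it."""
    match = COMBINED.match(line)
    if match is None:
        return None                       # the line did not match the pattern
    event = {"message": line}
    event.update(match.groupdict())       # grok step: named captures become fields
    # date step: dd/MMM/yyyy:HH:mm:ss Z corresponds to this strptime format
    local = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    utc = local.astimezone(timezone.utc)
    event["@timestamp"] = utc.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return event


if __name__ == "__main__":
    sample = ('127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] '
              '"GET /xampp/status.php HTTP/1.1" 200 3891 '
              '"http://cadenza/xampp/navi.php" "Mozilla/5.0"')
    for key, value in parse_access_line(sample).items():
        print(key, "=>", value)
```

The point of the date step is that @timestamp is taken from the log line itself and converted to UTC, so 00:01:45 -0800 becomes 08:01:45Z instead of the time the line was read.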

Type one line of an Apache log into the terminal:

$ sudo bin/logstash -f conf/logstash-filter.conf
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
{
        "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
       "@version" => "1",
     "@timestamp" => "2013-12-11T08:01:45.000Z",
           "host" => "xxxxxxx",
       "clientip" => "127.0.0.1",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "11/Dec/2013:00:01:45 -0800",
           "verb" => "GET",
        "request" => "/xampp/status.php",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "3891",
       "referrer" => "\"http://cadenza/xampp/navi.php\"",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\""
}

      "_index" : "logstash-2013.12.11",
      "_type" : "logs",
      "_id" : "QusW5lY5T8a9wqgCcottnA",
      "_score" : 1.0,
      "_source":{"message":"127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"","@version":"1","@timestamp":"2013-12-11T08:01:45.000Z","host":"jidong-elk-log","clientip":"127.0.0.1","ident":"-","auth":"-","timestamp":"11/Dec/2013:00:01:45 -0800","verb":"GET","request":"/xampp/status.php","httpversion":"1.1","response":"200","bytes":"3891","referrer":"\"http://cadenza/xampp/navi.php\"","agent":"\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\""}

Example 1: processing Apache logs with Logstash

$ cat /tmp/access_log
71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"
98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"

$ cat conf/logstash-apache.conf
input {
  file {
    path => "/tmp/access_log"
    start_position => beginning
  }
}
filter {
  if [path] =~ "access" {
    mutate {
      replace => {
        "type" => "apache_access"
      }
    }
    grok {
      match => {
        "message" => "%{COMBINEDAPACHELOG}"
      }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}

After Logstash starts, it reads and processes the log data in /tmp/access_log:

$ sudo bin/logstash  -f conf/logstash-apache.conf
Using milestone 2 input plugin 'file'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones {:level=>:warn}
{
        "message" => "71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] \"GET /admin HTTP/1.1\" 301 566 \"-\" \"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3\"",
       "@version" => "1",
     "@timestamp" => "2011-05-18T08:48:10.000Z",
           "host" => "jidong-elk-log",
           "path" => "/tmp/access_log",
           "type" => "apache_access",
       "clientip" => "71.141.244.242",
          "ident" => "-",
           "auth" => "kurt",
      "timestamp" => "18/May/2011:01:48:10 -0700",
           "verb" => "GET",
        "request" => "/admin",
    "httpversion" => "1.1",
       "response" => "301",
          "bytes" => "566",
       "referrer" => "\"-\"",
          "agent" => "\"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3\""
}
{
        "message" => "134.39.72.245 - - [18/May/2011:12:40:18 -0700] \"GET /favicon.ico HTTP/1.1\" 200 1189 \"-\" \"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)\"",
       "@version" => "1",
     "@timestamp" => "2011-05-18T19:40:18.000Z",
           "host" => "jidong-elk-log",
           "path" => "/tmp/access_log",
           "type" => "apache_access",
       "clientip" => "134.39.72.245",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "18/May/2011:12:40:18 -0700",
           "verb" => "GET",
        "request" => "/favicon.ico",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "1189",
       "referrer" => "\"-\"",
          "agent" => "\"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)\""
}
{
        "message" => "98.83.179.51 - - [18/May/2011:19:35:08 -0700] \"GET /css/main.css HTTP/1.1\" 200 1837 \"http://www.safesand.com/information.htm\" \"Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1\"",
       "@version" => "1",
     "@timestamp" => "2011-05-19T02:35:08.000Z",
           "host" => "jidong-elk-log",
           "path" => "/tmp/access_log",
           "type" => "apache_access",
       "clientip" => "98.83.179.51",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "18/May/2011:19:35:08 -0700",
           "verb" => "GET",
        "request" => "/css/main.css",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "1837",
       "referrer" => "\"http://www.safesand.com/information.htm\"",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1\""
}

Check Elasticsearch:

curl 'http://localhost:9200/_search?pretty'

Example 2: processing syslog messages with Logstash

$ cat conf/logstash-syslog.conf
input {
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "receieved_from", "%{host}" ]
    }
    syslog_pri {}
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
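The grok expression in this configuration can be read as an ordinary regular expression with named captures. The Python sketch below is a hand-written approximation, not the expansion grok itself produces: each sub-pattern is my own simplification of SYSLOGTIMESTAMP, SYSLOGHOST, DATA, POSINT, and GREEDYDATA, and parse_syslog_line is a hypothetical helper.

```python
import re

# Approximation of the grok expression; the sub-patterns are simplified.
SYSLOG_LINE = re.compile(
    r"(?P<syslog_timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<syslog_hostname>\S+) "
    r"(?P<syslog_program>.*?)(?:\[(?P<syslog_pid>\d+)\])?: "
    r"(?P<syslog_message>.*)"
)


def parse_syslog_line(line):
    """Return the captured fields, leaving out captures that did not match."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items()
            if value is not None}


if __name__ == "__main__":
    print(parse_syslog_line(
        "Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]"))
    print(parse_syslog_line(
        "Dec 22 18:28:06 louis rsyslogd: rsyslogd was HUPed, type 'lightweight'."))
```

The process ID group is optional, so a line such as the rsyslogd one simply yields no syslog_pid field.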

Start Logstash:

$ sudo bin/logstash  -f conf/logstash-syslog.conf

Connect to port 5000 with telnet and send some log lines to Logstash:

$ telnet localhost 5000
Trying ::1...
Connected to localhost.
Escape character is '^]'.
Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]
Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied
Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
Dec 22 18:28:06 louis rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="2253" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
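The lines typed into the telnet session carry no syslog priority value, and the grok expression in the configuration does not extract one, so syslog_pri {} has nothing to decode. As far as I understand the filter, it then falls back to a default priority of 13. A syslog priority encodes facility * 8 + severity, so 13 gives facility 1 and severity 5, which would explain the identical facility and severity fields in every event of the console output. The sketch below shows only that arithmetic; DEFAULT_PRI, the partial facility table, and decode_pri are assumptions for illustration, not the plugin's code.

```python
# Severity names for codes 0-7, in the conventional syslog order.
SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]

# Partial facility table, enough for this example (assumed labels).
FACILITIES = {0: "kernel", 1: "user-level", 2: "mail", 3: "daemon"}

DEFAULT_PRI = 13  # assumed fallback when the event has no priority field


def decode_pri(pri=DEFAULT_PRI):
    """Split a syslog priority into facility and severity codes and labels."""
    facility, severity = divmod(pri, 8)   # pri = facility * 8 + severity
    return {
        "syslog_facility_code": facility,
        "syslog_severity_code": severity,
        "syslog_facility": FACILITIES.get(facility, "facility-%d" % facility),
        "syslog_severity": SEVERITIES[severity],
    }


if __name__ == "__main__":
    print(decode_pri())
```

To get real values, the sender would have to include the priority prefix and the grok expression would have to capture it into the field that syslog_pri reads.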

Check the Logstash console output:

$ sudo bin/logstash  -f conf/logstash-syslog.conf
{
                 "message" => "Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]\r",
                "@version" => "1",
              "@timestamp" => "2015-12-23T04:11:43.000Z",
                    "host" => "0:0:0:0:0:0:0:1:34337",
                    "type" => "syslog",
        "syslog_timestamp" => "Dec 23 12:11:43",
         "syslog_hostname" => "louis",
          "syslog_program" => "postfix/smtpd",
              "syslog_pid" => "31499",
          "syslog_message" => "connect from unknown[95.75.93.154]\r",
             "received_at" => "2015-02-01 05:01:48 UTC",
          "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
{
                 "message" => "Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied\r",
                "@version" => "1",
              "@timestamp" => "2015-12-23T06:42:56.000Z",
                    "host" => "0:0:0:0:0:0:0:1:34337",
                    "type" => "syslog",
        "syslog_timestamp" => "Dec 23 14:42:56",
         "syslog_hostname" => "louis",
          "syslog_program" => "named",
              "syslog_pid" => "16000",
          "syslog_message" => "client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied\r",
             "received_at" => "2015-02-01 05:01:48 UTC",
          "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
{
                 "message" => "Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)\r",
                "@version" => "1",
              "@timestamp" => "2015-12-23T06:30:01.000Z",
                    "host" => "0:0:0:0:0:0:0:1:34337",
                    "type" => "syslog",
        "syslog_timestamp" => "Dec 23 14:30:01",
         "syslog_hostname" => "louis",
          "syslog_program" => "CRON",
              "syslog_pid" => "619",
          "syslog_message" => "(www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)\r",
             "received_at" => "2015-02-01 05:01:48 UTC",
          "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
{
                 "message" => "Dec 22 18:28:06 louis rsyslogd: [origin software=\"rsyslogd\" swVersion=\"4.2.0\" x-pid=\"2253\" x-info=\"http://www.rsyslog.com\"] rsyslogd was HUPed, type 'lightweight'.\r",
                "@version" => "1",
              "@timestamp" => "2015-12-22T10:28:06.000Z",
                    "host" => "0:0:0:0:0:0:0:1:34337",
                    "type" => "syslog",
        "syslog_timestamp" => "Dec 22 18:28:06",
         "syslog_hostname" => "louis",
          "syslog_program" => "rsyslogd",
          "syslog_message" => "[origin software=\"rsyslogd\" swVersion=\"4.2.0\" x-pid=\"2253\" x-info=\"http://www.rsyslog.com\"] rsyslogd was HUPed, type 'lightweight'.\r",
             "received_at" => "2015-02-01 05:01:53 UTC",
          "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
