Notes on a Distributed Log Analysis System
Logstash for log collection + Elasticsearch for distributed search + Kibana as the web front end
1. logstash 1.1.13 (Java-based; the official distribution is a single jar). It provides two roles, shipper and indexer: the shipper receives log entries from sources such as files or queues and pushes them onto a Redis list, and the indexer pops them off for processing.
2. redis 2.6.14 (used by Logstash as the message broker between shipper and indexer)
3. elasticsearch 0.90 (written in Java; the indexer sends log entries to Elasticsearch for storage. A data directory must be configured. It provides distributed storage and search.)
4. kibana 2.0 (the front end for displaying logs, written in Ruby; requires Ruby 1.9.3)
Logstash itself needs no installation step, so I defined my own layout: the official jar goes into /usr/local/logstash/lib, alongside
etc  - configuration files
logs - log output
Contents of shipper.conf:
input {
    file {
        type => "total"
        sincedb_path => "/"
        path => "/var/log/*.TOTAL.log"
    }
    file {
        type => "error"
        sincedb_path => "/"
        path => "/var/log/*.ERROR.log"
    }
}
filter {
    grok {
        patterns_dir => "/usr/local/logstash/etc/patterns"
        pattern => "%{SYSLOGTIMESTAMP:timestamp} %{IP:host_ip} %{DATE:date}: %{TIME:time} %{TYPE_WORD:module_type} %{LOG_LEVEL:log_level}: %{GREEDYDATA:contain}"
    }
}
output {
    #stdout { debug => true debug_format => "json" }
    redis {
        host => "127.0.0.1"
        port => "6379"
        data_type => "list"
        key => "logstash"
    }
}
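The matching indexer.conf is not shown in this post. A minimal sketch of what it might look like (my assumption, mirroring the shipper's redis output and the Elasticsearch setup described later; host and key values are illustrative):

```
input {
    redis {
        host => "127.0.0.1"
        port => "6379"
        data_type => "list"
        key => "logstash"
    }
}
output {
    elasticsearch {
        host => "127.0.0.1"
    }
}
```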
One configuration pitfall: the official Logstash documentation is misleading here.
When writing the shipper's filter you can load a custom regex (patterns) file, which is how the log line gets split into indexed fields. The official docs show:
Jan 1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>
filter {
grok {
patterns_dir => "./patterns"
pattern => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:message}"
}
}
But in my restructured Logstash directory layout, patterns_dir must be an absolute path: the init script may be invoked from any working directory, so a relative path like ./patterns cannot be resolved reliably.
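The custom patterns referenced in shipper.conf (%{TYPE_WORD} and %{LOG_LEVEL}) live in that patterns file. A sketch of what /usr/local/logstash/etc/patterns might contain (the exact regexes are my guesses, not from the original post; the format is "NAME regex" per line):

```
TYPE_WORD \b\w+\b
LOG_LEVEL (DEBUG|INFO|WARN|ERROR|FATAL)
```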
I finished writing the init script:
#!/bin/sh
#made by blackjack V1.0
#time:2013-08-20
. /etc/profile

LOGSTASH_PATH="/usr/local/logstash"
LOGSTASH_LIB_PATH="$LOGSTASH_PATH/lib"
LOGSTASH_ETC_PATH="$LOGSTASH_PATH/etc"
SHIPPER="$LOGSTASH_ETC_PATH/shipper.conf"
INDEXER="$LOGSTASH_ETC_PATH/indexer.conf"
LOGSTASH_LOG_PATH="$LOGSTASH_PATH/logs"

if [ -z "$JAVA_HOME" ];then
    echo "please set JAVA_HOME!"
    exit 1
fi
if [ ! -d "$LOGSTASH_PATH" ];then
    echo "please put logstash to /usr/local path"
    mkdir "$LOGSTASH_PATH"
    exit 1
fi
if [ ! -d "$LOGSTASH_LIB_PATH" ];then
    echo "please put jar to $LOGSTASH_LIB_PATH for logstash"
    mkdir "$LOGSTASH_LIB_PATH"
    exit 1
else
    # pick the newest logstash jar in the lib directory
    JAR_FILE=$(ls -t "$LOGSTASH_LIB_PATH"/logstash-*.jar | head -n 1)
fi
if [ ! -d "$LOGSTASH_LOG_PATH" ];then
    echo "LOGSTASH_LOG_PATH does not exist, creating it"
    mkdir "$LOGSTASH_LOG_PATH"
fi

start() {
    if [ -z "$1" ];then
        echo "starting shipper and indexer..."
        nohup "$JAVA_HOME/bin/java" -jar "$JAR_FILE" agent -f "$SHIPPER" >/dev/null 2>&1 &
        nohup "$JAVA_HOME/bin/java" -jar "$JAR_FILE" agent -f "$INDEXER" >/dev/null 2>&1 &
    elif [ "$1" = "shipper" ];then
        echo "starting shipper only..."
        nohup "$JAVA_HOME/bin/java" -jar "$JAR_FILE" agent -f "$SHIPPER" >/dev/null 2>&1 &
    elif [ "$1" = "indexer" ];then
        echo "starting indexer only..."
        nohup "$JAVA_HOME/bin/java" -jar "$JAR_FILE" agent -f "$INDEXER" >/dev/null 2>&1 &
    else
        echo "input error!"
    fi
}

stop() {
    java_pid=$(ps ax | grep "$JAR_FILE" | grep -v grep | awk '{print $1}')
    if [ -z "$java_pid" ];then
        echo "logstash is not running"
    else
        for pid in $java_pid
        do
            kill "$pid"
            if [ $? -eq 0 ];then
                echo "$pid is stopped!"
            else
                echo "$pid is not stopped!"
            fi
        done
    fi
}

help() {
    "$JAVA_HOME/bin/java" -jar "$JAR_FILE" agent --help
}

config_test() {
    if [ -z "$1" ];then
        for conf in "$SHIPPER" "$INDEXER"
        do
            echo "start check $conf"
            "$JAVA_HOME/bin/java" -jar "$JAR_FILE" agent -t -f "$conf"
        done
    elif [ "$1" = "shipper" ];then
        "$JAVA_HOME/bin/java" -jar "$JAR_FILE" agent -t -f "$SHIPPER"
    elif [ "$1" = "indexer" ];then
        "$JAVA_HOME/bin/java" -jar "$JAR_FILE" agent -t -f "$INDEXER"
    else
        echo "input error!"
    fi
}

case $1 in
    start)
        #config_test
        if [ $? -eq 0 ];then
            start $2
        else
            echo "config file error!"
            exit 1
        fi
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        sleep 2
        start
        ;;
    help)
        help
        ;;
    checkconf)
        config_test $2
        ;;
    *)
        echo "Usage:$0 (start|stop|restart|help|checkconf [shipper|indexer])"
        ;;
esac
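The jar-selection step above (ls -t piped to head) always picks the most recently modified jar, so dropping a new release into lib/ is enough to upgrade. A small self-contained demo of that behaviour, using a temp directory and fabricated file timestamps (the jar names here are hypothetical):

```shell
#!/bin/sh
# Demo of the init script's "pick the newest logstash jar" logic.
# Two fake jars are created with explicit modification times.
LIB=$(mktemp -d)
touch -t 202001010000 "$LIB/logstash-1.1.12-flatjar.jar"   # older
touch -t 202101010000 "$LIB/logstash-1.1.13-flatjar.jar"   # newer

# Same selection as the init script: newest by modification time.
JAR_FILE=$(ls -t "$LIB"/logstash-*.jar | head -n 1)
echo "selected: $(basename "$JAR_FILE")"

rm -rf "$LIB"
```

This prints the 1.1.13 jar, since ls -t sorts by modification time, newest first.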
————————————————————————————
Redis just needs to be compiled and installed; configure it according to your usual production habits.
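"Production habits" here means at minimum daemonizing and bounding memory and persistence. A hedged redis.conf fragment (these values are illustrative, not from the original post):

```
daemonize yes
port 6379
# Cap memory so a stalled indexer cannot exhaust RAM;
# the logstash key is a plain list, so it grows until consumed.
maxmemory 2gb
# RDB snapshots so the queue survives a restart (AOF is also an option)
save 900 1
save 300 10
```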
Elasticsearch can be used straight from the official tarball: download, extract, and copy the directory to /usr/local/elasticsearch.
Run /usr/local/elasticsearch/bin/elasticsearch to start it; that script is a shell wrapper that invokes Java.
The configuration files in /usr/local/elasticsearch/config need editing to change the data directory; I set it to /var/lib/elasticsearch.
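The change amounts to one line in config/elasticsearch.yml; the cluster name below is my own illustrative addition, not from the original post:

```
# config/elasticsearch.yml
path.data: /var/lib/elasticsearch
# optional: name the cluster so stray nodes on the LAN don't auto-join
cluster.name: logcluster
```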
I wrote an init script for it as well:
#!/bin/sh
#made by blackjack v1.0
#time:2013-08-20

elastic_home=/usr/local/elasticsearch
elastic_bin=$elastic_home/bin
elastic_pid=/tmp/elasticsearch.pid

start(){
    if [ -f "$elastic_pid" ];then
        echo "pid_file ($elastic_pid) already exists!"
    else
        "$elastic_bin/elasticsearch" -p "$elastic_pid"
    fi
}

stop(){
    cat "$elastic_pid" | xargs kill
    rm -f "$elastic_pid"
}

status(){
    if [ -z "$(ps ax | grep $elastic_home | grep -v grep)" ];then
        echo "elasticsearch is not running..."
    else
        echo "elasticsearch is running..."
    fi
}

case $1 in
    start)
        start
        status
        ;;
    stop)
        stop
        sleep 3
        status
        ;;
    restart)
        stop
        sleep 3
        status
        sleep 5
        start
        sleep 1
        status
        ;;
    status)
        status
        ;;
    *)
        echo "Usage:$0 (start|stop|restart|status)"
        ;;
esac
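The script relies on a pid-file guard: start is refused while the pid file exists. A self-contained demo of that guard (start_guard is a hypothetical helper, not part of the script above):

```shell
#!/bin/sh
# Demo of the pid-file guard pattern used in the elasticsearch init script.
start_guard() {
    if [ -f "$1" ]; then
        echo "pid_file ($1) already exists!"
        return 1
    fi
    return 0
}

pidfile=$(mktemp)    # simulate a leftover pid file
start_guard "$pidfile" && echo "would start" || echo "refused"

rm -f "$pidfile"     # pid file gone, start is allowed again
start_guard "$pidfile" && echo "would start" || echo "refused"
```

Note the flip side of this pattern: if the process dies without cleanup, the stale pid file blocks the next start until it is removed by hand.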
Kibana coverage to be continued; the Ruby/Rails app first needs to be switched to a different run mode before deploying.
It seems debts come due eventually: Kibana has since released 3.0. Installing Kibana is much like installing any Ruby/Rails application, and running it behind nginx as a front-end proxy works fine.
The core remaining problem is Logstash's handling of irregular log lines: when the grok regex fails to match cleanly, performance becomes extremely poor, sometimes to the point where it cannot run at all. This still needs improvement.
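One common mitigation (my suggestion, not from the original post) is to anchor the grok pattern, so a non-matching line fails immediately instead of being retried at every offset:

```
filter {
    grok {
        # ^ and $ anchor the match to the whole line; an irregular
        # line is rejected in one pass rather than backtracking.
        pattern => "^%{SYSLOGTIMESTAMP:timestamp} %{IP:host_ip} %{GREEDYDATA:contain}$"
    }
}
```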
If problems with the Kibana installation remain, I will add more later.
See also this article:
http://enable.blog.51cto.com/747951/1049411