Research on a Distributed Log Analysis System
logstash for log collection + elasticsearch for distributed search + kibana as the web front end
1. logstash 1.1.13 (Java-based; the official distribution is a single jar). It provides two roles, shipper and indexer: the shipper receives log entries from sources such as files or queues and pushes them onto a redis list, and the indexer then pops them off for processing.
2. redis 2.6.14 (used by logstash as the broker passing messages between shipper and indexer)
3. elasticsearch 0.90 (written in Java; the indexer sends log entries to elasticsearch for storage, a data directory must be configured, and it provides distributed storage and search)
4. kibana 2.0 (the front end that displays the logs, written in Ruby; requires Ruby 1.9.3)
logstash needs no installation as such, so I defined my own layout: the official jar is kept as a lib in /usr/local/logstash/lib, alongside:
etc ----- configuration files
logs ---- log output
The shipper.conf configuration:
input {
  file {
    type => "total"
    sincedb_path => "/"
    path => "/var/log/*.TOTAL.log"
  }
  file {
    type => "error"
    sincedb_path => "/"
    path => "/var/log/*.ERROR.log"
  }
}
filter {
  grok {
    patterns_dir => "/usr/local/logstash/etc/patterns"
    pattern => "%{SYSLOGTIMESTAMP:timestamp} %{IP:host_ip} %{DATE:date}: %{TIME:time} %{TYPE_WORD:module_type} %{LOG_LEVEL:log_level}: %{GREEDYDATA:contain}"
  }
}
output {
  #stdout { debug => true debug_format => "json" }
  redis { host => "127.0.0.1" port => "6379" data_type => "list" key => "logstash" }
}
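The indexer's configuration is not listed here. A minimal indexer.conf sketch, assuming the same redis list key as in shipper.conf above and an elasticsearch node on the same host (the 127.0.0.1 addresses are illustrative assumptions):

```
input {
  redis {
    host => "127.0.0.1"
    port => "6379"
    data_type => "list"
    key => "logstash"
  }
}
output {
  elasticsearch {
    host => "127.0.0.1"
  }
}
```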
One caveat: the official logstash documentation has a pitfall here. When writing the shipper's filter you can load a custom regex pattern file, which splits each log line into indexed fields. The official example reads:
Jan 1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>
filter {
  grok {
    patterns_dir => "./patterns"
    pattern => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:message}"
  }
}
But in my restructured logstash layout, patterns_dir must be an absolute path: the init script may be launched from any working directory, so a relative path like ./patterns cannot be located reliably.
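For illustration, the custom patterns file under patterns_dir (defining the TYPE_WORD and LOG_LEVEL patterns used in shipper.conf) could look like the following. The exact regexes are assumptions here, since they depend on the actual log format:

```
# /usr/local/logstash/etc/patterns/extra -- hypothetical file name
TYPE_WORD \b\w+\b
LOG_LEVEL (DEBUG|INFO|WARN|ERROR|FATAL)
```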
The init script I wrote:
#!/bin/bash
#made by blackjack V1.0
#time:2013-08-20
. /etc/profile
LOGSTASH_PATH="/usr/local/logstash"
LOGSTASH_LIB_PATH="$LOGSTASH_PATH/lib"
LOGSTASH_ETC_PATH="$LOGSTASH_PATH/etc"
SHIPPER="$LOGSTASH_ETC_PATH/shipper.conf"
INDEXER="$LOGSTASH_ETC_PATH/indexer.conf"
LOGSTASH_LOG_PATH="$LOGSTASH_PATH/logs"
if [ -z "$JAVA_HOME" ];then
echo "please set JAVA_HOME!"
exit
fi
if [ ! -d $LOGSTASH_PATH ];then
echo "please put logstash to /usr/local path"
mkdir $LOGSTASH_PATH
exit
fi
if [ ! -d $LOGSTASH_LIB_PATH ];then
echo "please put jar to $LOGSTASH_LIB_PATH for logstash"
mkdir $LOGSTASH_LIB_PATH
exit
else
jar_file=($(ls -t $LOGSTASH_LIB_PATH/logstash-*.jar))
JAR_FILE=${jar_file[0]}
fi
if [ ! -d $LOGSTASH_LOG_PATH ];then
echo "LOGSTASH_LOG_PATH does not exist, creating it..."
mkdir $LOGSTASH_LOG_PATH
fi
start() {
if [ -z "$1" ];then
echo "starting shipper and indexer..."
nohup $JAVA_HOME/bin/java -jar $JAR_FILE agent -f $SHIPPER >/dev/null 2>&1 &
nohup $JAVA_HOME/bin/java -jar $JAR_FILE agent -f $INDEXER >/dev/null 2>&1 &
elif [ "$1" = "shipper" ];then
echo "starting shipper only..."
nohup $JAVA_HOME/bin/java -jar $JAR_FILE agent -f $SHIPPER >/dev/null 2>&1 &
elif [ "$1" = "indexer" ];then
echo "starting indexer only..."
nohup $JAVA_HOME/bin/java -jar $JAR_FILE agent -f $INDEXER >/dev/null 2>&1 &
else
echo "input error!"
fi
}
stop() {
java_pid=($(ps ax|grep $JAR_FILE|grep -v grep|awk '{print $1}'|xargs))
if [ -z "$java_pid" ];then
echo "logstash is not running"
else
for pid in ${java_pid[*]}
do
kill $pid
if [ $? -eq 0 ];then
echo "$pid is stopped!"
else
echo "$pid is not stopped!"
fi
done
fi
}
help() {
$JAVA_HOME/bin/java -jar $JAR_FILE agent --help
}
config_test() {
if [ -z "$1" ];then
for conf in $SHIPPER $INDEXER
do
echo "start check $conf"
$JAVA_HOME/bin/java -jar $JAR_FILE agent -t -f $conf
done
elif [ "$1" = "shipper" ];then
$JAVA_HOME/bin/java -jar $JAR_FILE agent -t -f $SHIPPER
elif [ "$1" = "indexer" ];then
$JAVA_HOME/bin/java -jar $JAR_FILE agent -t -f $INDEXER
else
echo "input error!"
fi
}
case $1 in
start)
#config_test
if [ $? -eq 0 ];then
start $2
else
echo "config file error!"
exit
fi
;;
stop)
stop
;;
restart)
stop
sleep 2
start
;;
help)
help
;;
checkconf)
config_test $2
;;
*)
echo "Usage: $0 (start|stop|restart|help|checkconf) [shipper|indexer]"
esac
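The script selects the newest logstash jar by sorting on modification time. That selection logic can be checked in isolation (the /tmp paths below are only for the demo; note that the array syntax requires bash, which is why the script should not run under a plain sh):

```shell
# `ls -t` lists files newest-first, so element 0 of the array is the
# latest jar, mirroring the jar_file/JAR_FILE logic in the init script.
mkdir -p /tmp/logstash-demo
touch /tmp/logstash-demo/logstash-1.1.12.jar
sleep 1
touch /tmp/logstash-demo/logstash-1.1.13.jar
jar_file=($(ls -t /tmp/logstash-demo/logstash-*.jar))
JAR_FILE=${jar_file[0]}
echo "$JAR_FILE"
```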
————————————————————————————
redis only needs a standard compile-and-install; configure it according to your usual production conventions.
elasticsearch can be downloaded from the official site and unpacked; copy the directory to /usr/local/elasticsearch.
Run /usr/local/elasticsearch/bin/elasticsearch to start it; the script is a shell wrapper that invokes java.
The configuration under /usr/local/elasticsearch/config needs editing to change the data directory; I set it to /var/lib/elasticsearch.
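The data directory change goes into config/elasticsearch.yml. A minimal sketch (the cluster name is an assumption; nodes sharing the same name will discover and join each other):

```
# /usr/local/elasticsearch/config/elasticsearch.yml
cluster.name: log-cluster          # assumed name for illustration
path.data: /var/lib/elasticsearch
```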
The init script:
#!/bin/sh
#made by blackjack v1.0
#time:2013-08-20
elastic_home=/usr/local/elasticsearch
elastic_bin=$elastic_home/bin
elastic_pid=/tmp/elasticsearch.pid
start(){
if [ -f $elastic_pid ];then
echo "pid_file ($elastic_pid) is exist!"
else
$elastic_bin/elasticsearch -p $elastic_pid
fi
}
stop(){
if [ -f $elastic_pid ];then
cat $elastic_pid|xargs kill
rm -f $elastic_pid
else
echo "pid_file ($elastic_pid) not found!"
fi
}
status(){
if [[ -z $(ps ax|grep $elastic_home|grep -v grep) ]];then
echo "elasticsearch is not running..."
else
echo "elasticsearch is running..."
fi
}
case $1 in
start)
start
status
;;
stop)
stop
sleep 3
status
;;
restart)
stop
sleep 3
status
sleep 5
start
sleep 1
status
;;
status)
status
;;
*)
echo "Usage:$0 (start|stop|restart|status)"
;;
esac
The kibana part is to be continued; the Ruby/Rails app needs to be switched to another run mode before it can be deployed.
It seems I should not leave debts unpaid: kibana has since released 3.0. Installing kibana is much like installing any Ruby/Rails app; just run it behind nginx as a front-end proxy.
The real core problem is still logstash's handling of irregular log lines: when a grok regex mismatches on unstructured input, performance becomes dreadful and the process may not even run. This still needs improvement.
If problems come up with the kibana install I will add notes later.
This article is also worth a look:
http://enable.blog.51cto.com/747951/1049411
