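Processing security logs is part of my job, as the business requires it.
Workflow
Forward the port -> export the full logs -> extract (process) the logs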
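Port forwarding
Forward the Elasticsearch (ES) port 9200 to port 9913, which listens for external connections (the -g flag lets remote hosts connect to the forwarded port).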
ssh -g -L 9913:localhost:9200 root@localhost -p 60000
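Before starting the export, it can help to confirm that the forwarded port actually accepts connections. The following is a minimal sketch, not part of the original workflow: the helper name is_port_open is hypothetical, and 127.0.0.1 is a placeholder host to be replaced with the machine running the ssh forward.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9913 is the forwarded port from the ssh command; the host is a placeholder
    print(is_port_open("127.0.0.1", 9913))
```

This only checks that something is listening on the port; it does not verify that Elasticsearch itself is answering behind the tunnel.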
Extract the ES index logs with Logstash
./logstash.bat -f ../config/config.conf
input {
  elasticsearch {
    hosts => "xx.xx.xx.xx:9913"
    index => "las-e-2022-07-03"
    size => 10000
    query => '{"_source" : ["sourceEvent"]}'
  }
}
filter {
  mutate {
    remove_field => ["@timestamp"]
  }
}
output {
  file {
    codec => plain { format => "%{sourceEvent}" }
    path => "D:\703.txt"
  }
}
Extract the exported data with a Python regular expression
python3 geturl.py
# coding=utf-8
import re
from time import time

begin_time = time()

# Capture groups: source IP, destination IP:port, URL
pat = re.compile(r" WEB: IP (\d+\.\d+\.\d+\.\d+):\d+\(\d+\.\d+\.\d+\.\d+:\d+\)->(\d+\.\d+\.\d+\.\d+:\d+)\(\d+\.\d+\.\d+\.\d+:\d+\),.+?,.+?, URL (.+?),")

count = 0  # number of lines read so far
# The with statement closes both files even if an error occurs
with open('D:\\703.txt', 'r', encoding='UTF-8') as f, \
        open('D:\\703-1.txt', 'w', encoding='UTF-8') as save:
    for line in f:
        result = pat.findall(line)
        if result:
            save.write(str(result) + "\n")
        count += 1
        # Print progress every 100,000 lines
        if count % 100000 == 0:
            print(count)

end_time = time()
run_time = end_time - begin_time
print('time', run_time)
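To make the capture groups concrete, here is a small sketch that runs the same pattern against one log line. The sample line is invented for illustration (the real sourceEvent format may carry different fields), and the names PAT, SAMPLE and extract are hypothetical.

```python
import re

# Same pattern as in geturl.py, split over two raw strings for readability
PAT = re.compile(
    r" WEB: IP (\d+\.\d+\.\d+\.\d+):\d+\(\d+\.\d+\.\d+\.\d+:\d+\)"
    r"->(\d+\.\d+\.\d+\.\d+:\d+)\(\d+\.\d+\.\d+\.\d+:\d+\),.+?,.+?, URL (.+?),"
)

# Invented sample line, assumed to resemble one exported sourceEvent record
SAMPLE = (
    "2022-07-03 10:00:00 WEB: IP 10.0.0.5:51234(10.0.0.5:51234)"
    "->192.168.1.20:80(192.168.1.20:80), GET, 200, URL /index.php?id=1, UA test"
)


def extract(line):
    """Return a list of (source IP, destination IP:port, URL) tuples."""
    return PAT.findall(line)


if __name__ == "__main__":
    print(extract(SAMPLE))
    # → [('10.0.0.5', '192.168.1.20:80', '/index.php?id=1')]
```

Because the script writes str(result), each output line has exactly this list-of-tuples form.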
Results

Sort and deduplicate
cat 5-1-WEB.txt | sort | uniq > 5-1-WEB-final.txt
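On a Windows machine without sort and uniq, the same step can be sketched in Python. This is an assumed equivalent, not the original method: the function names are hypothetical, the whole file is held in memory, and Python orders lines by code point, which can differ from the locale-aware order that sort uses.

```python
def sort_unique(lines):
    """Return the distinct lines in sorted order, like `sort | uniq`."""
    return sorted(set(lines))


def dedupe_file(src, dst):
    """Read src, drop duplicate lines, and write the sorted result to dst."""
    with open(src, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]
    with open(dst, "w", encoding="utf-8") as out:
        for line in sort_unique(lines):
            out.write(line + "\n")


if __name__ == "__main__":
    print(sort_unique(["b", "a", "b"]))
    # → ['a', 'b']
```

For example, dedupe_file("5-1-WEB.txt", "5-1-WEB-final.txt") would process the same files as the shell command.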