
Changing the CDH NN and DN process log output to JSON format

程序员文章站 2022-07-14 16:15:13

As we know, SparkSQL can read JSON data directly. If we collect logs with Flume and process them with Spark before visualization, converting the logs to JSON makes the Spark processing stage much simpler.

This post explains how to change the logs of the two HDFS processes in CDH, the NameNode (NN) and the DataNode (DN), to JSON format.

I. The original DN log format

2018-01-15 11:48:28,916 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-971029-172.16.16.52-1501126925757:blk_1079359445_5619066, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2018-01-15 11:48:28,919 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 127.0.0.1, dest: 127.0.0.1, op: REQUEST_SHORT_CIRCUIT_FDS, blockid: 1079359445, srvID: d529dee8-b904-4afa-9c88-3b5205a89465, success: true
2018-01-15 11:49:30,354 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-971029-172.16.16.52-1501126925757:blk_1079359448_5619069 src: /172.16.16.54:42497 dest: /172.16.16.52:50010
2018-01-15 11:49:30,356 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.16.54:42497, dest: /172.16.16.52:50010, bytes: 84677, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_826126525_1, offset: 0, srvID: d529dee8-b904-4afa-9c88-3b5205a89465, blockid: BP-971029-172.16.16.52-1501126925757:blk_1079359448_5619069, duration: 1205695
2018-01-15 11:49:30,356 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-971029-172.16.16.52-1501126925757:blk_1079359448_5619069, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
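For comparison, extracting fields from the plain-text format above requires a hand-written regular expression. A minimal Python sketch (the group names `time`, `logtype`, `loginfo` are chosen here to mirror the JSON keys used later in this post):

```python
import re

# One DataNode log line in the original log4j plain-text format.
line = ("2018-01-15 11:49:30,354 INFO "
        "org.apache.hadoop.hdfs.server.datanode.DataNode: "
        "Receiving BP-971029-172.16.16.52-1501126925757:blk_1079359448_5619069 "
        "src: /172.16.16.54:42497 dest: /172.16.16.52:50010")

# Split the line into timestamp, level, and message by hand.
LOG_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<logtype>[A-Z]+)\s+"
    r"(?P<loginfo>.*)$"
)

record = LOG_RE.match(line).groupdict()
print(record["logtype"])   # INFO
```

Every consumer of the raw format has to maintain a regex like this; with JSON output, the parsing step disappears.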

II. Making the change
1. In the CM web UI, go to the HDFS service –> Configuration –> and search for "DataNode log".

Find "DataNode 日志记录高级配置代码段(安全阀)" (in the English UI: "DataNode Logging Advanced Configuration Snippet (Safety Valve)") and enter the following parameter:

log4j.appender.RFA.layout.ConversionPattern = {"time":"%d{yyyy-MM-dd HH:mm:ss,SSS}","logtype":"%p","loginfo":"%c:%m"}%n 
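A quick sanity check of this pattern: any line it emits should parse with a standard JSON library. A minimal Python sketch (the sample line below is hand-built to match the pattern's shape, abridged from real output; note this is an assumption worth testing against your own logs, since log4j 1.x does not escape the `%m` message, so a message containing double quotes or newlines would produce invalid JSON):

```python
import json

# A line shaped like the ConversionPattern output above (abridged).
sample = ('{"time":"2018-01-15 11:49:30,356","logtype":"INFO",'
          '"loginfo":"org.apache.hadoop.hdfs.server.datanode.DataNode:'
          'PacketResponder terminating"}')

record = json.loads(sample)   # parses directly, no custom regex needed
print(record["logtype"])      # INFO
```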

(Screenshot: the DataNode Safety Valve field in CM with the pattern above.)

2. Do the same for the NameNode:
In the CM web UI, go to the HDFS service –> Configuration –> and search for "NameNode log".

Find "NameNode 日志记录高级配置代码段(安全阀)" (in the English UI: "NameNode Logging Advanced Configuration Snippet (Safety Valve)") and enter the following parameter:

log4j.appender.RFA.layout.ConversionPattern = {"time":"%d{yyyy-MM-dd HH:mm:ss,SSS}","logtype":"%p","loginfo":"%c:%m"}%n

3. On the CM home page, click the yellow power button next to the HDFS service to restart it so the configuration takes effect.

III. Verify that the modified logs are now in JSON format

[root@hadoop003 hadoop-hdfs]# tail -F  hadoop-cmf-hdfs-DATANODE-hadoop003.log.out
{"time":"2018-01-15 12:03:53,549","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.DataNode:Receiving BP-1517073770-172.16.15.80-1508233672475:blk_1074110567_369761 src: /172.16.15.80:51852 dest: /172.16.15.82:50010"}
{"time":"2018-01-15 12:03:53,558","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:src: /172.16.15.80:51852, dest: /172.16.15.82:50010, bytes: 56, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_221506452_125617, offset: 0, srvID: c8d02836-fc45-40ec-92fe-120500ea70a9, blockid: BP-1517073770-172.16.15.80-1508233672475:blk_1074110567_369761, duration: 5428474"}
{"time":"2018-01-15 12:03:53,559","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.DataNode:PacketResponder: BP-1517073770-172.16.15.80-1508233672475:blk_1074110567_369761, type=HAS_DOWNSTREAM_IN_PIPELINE terminating"}
{"time":"2018-01-15 12:04:02,508","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:Scheduling blk_1074110567_369761 file /data/1/dn/current/BP-1517073770-172.16.15.80-1508233672475/current/finalized/subdir5/subdir160/blk_1074110567 for deletion"}
{"time":"2018-01-15 12:04:02,509","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:Deleted BP-1517073770-172.16.15.80-1508233672475 blk_1074110567_369761 file /data/1/dn/current/BP-1517073770-172.16.15.80-1508233672475/current/finalized/subdir5/subdir160/blk_1074110567"}
{"time":"2018-01-15 12:04:53,559","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.DataNode:Receiving BP-1517073770-172.16.15.80-1508233672475:blk_1074110568_369762 src: /172.16.15.80:51929 dest: /172.16.15.82:50010"}
{"time":"2018-01-15 12:04:53,571","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:src: /172.16.15.80:51929, dest: /172.16.15.82:50010, bytes: 56, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_68167622_125617, offset: 0, srvID: c8d02836-fc45-40ec-92fe-120500ea70a9, blockid: BP-1517073770-172.16.15.80-1508233672475:blk_1074110568_369762, duration: 7570382"}
{"time":"2018-01-15 12:04:53,573","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.DataNode:PacketResponder: BP-1517073770-172.16.15.80-1508233672475:blk_1074110568_369762, type=HAS_DOWNSTREAM_IN_PIPELINE terminating"}
{"time":"2018-01-15 12:04:56,510","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:Scheduling blk_1074110568_369762 file /data/1/dn/current/BP-1517073770-172.16.15.80-1508233672475/current/finalized/subdir5/subdir160/blk_1074110568 for deletion"}
{"time":"2018-01-15 12:04:56,510","logtype":"INFO","loginfo":"org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:Deleted BP-1517073770-172.16.15.80-1508233672475 blk_1074110568_369762 file /data/1/dn/current/BP-1517073770-172.16.15.80-1508233672475/current/finalized/subdir5/subdir160/blk_1074110568"}
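Downstream processing now becomes straightforward: each line is one self-contained JSON object, so any consumer (a Flume interceptor, a Spark job, or a plain script) can load it directly. A minimal Python sketch that totals the bytes written from the clienttrace lines above (the two sample lines are abridged copies of the output; the regex on `loginfo` is an illustration for this sketch, not part of CDH):

```python
import json
import re

# Two clienttrace lines copied (abridged) from the JSON log output above.
lines = [
    '{"time":"2018-01-15 12:03:53,558","logtype":"INFO",'
    '"loginfo":"...DataNode.clienttrace:src: /172.16.15.80:51852, '
    'dest: /172.16.15.82:50010, bytes: 56, op: HDFS_WRITE, ..."}',
    '{"time":"2018-01-15 12:04:53,571","logtype":"INFO",'
    '"loginfo":"...DataNode.clienttrace:src: /172.16.15.80:51929, '
    'dest: /172.16.15.82:50010, bytes: 56, op: HDFS_WRITE, ..."}',
]

BYTES_RE = re.compile(r"bytes: (\d+), op: HDFS_WRITE")

total = 0
for raw in lines:
    rec = json.loads(raw)                 # each line is a standalone JSON object
    m = BYTES_RE.search(rec["loginfo"])
    if m:
        total += int(m.group(1))

print(total)   # 112
```

In Spark the same data could be read with a one-liner (`spark.read.json(path)`), which is exactly the convenience the introduction refers to.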