Code:
# cat pv_hour.py
#!/usr/bin/env python
# coding=utf-8

from mrjob.job import MRJob
from nginx_accesslog_parser import NginxLineParser


class PvDay(MRJob):
    nginx_line_parser = NginxLineParser()

    def mapper(self, _, line):
        # Parse one access-log line, then take the time part of time_local
        self.nginx_line_parser.parse(line)
        _, tm = str(self.nginx_line_parser.time_local).split()
        h, m, s = tm.split(':')
        yield h, 1  # one page view, keyed by hour

    def reducer(self, key, values):
        # Sum the page views for each hour
        yield key, sum(values)


def main():
    PvDay.run()


if __name__ == '__main__':
    main()
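The mapper's hour extraction can be checked locally without Hadoop or mrjob. The following is a minimal sketch. It assumes `time_local` is a naive `datetime` (which the `str(...).split()` call in the mapper suggests) and that log lines carry nginx's usual `$time_local` layout, for example `[27/Dec/2016:14:03:21 +0800]`. The names `hour_of` and `count_pv_by_hour` and the sample lines are hypothetical and are not part of `nginx_accesslog_parser`.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout of the $time_local field, e.g. [27/Dec/2016:14:03:21 +0800]
TIME_LOCAL_RE = re.compile(r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]')


def hour_of(line):
    """Return the two-digit hour of a log line, or None if no timestamp is found."""
    match = TIME_LOCAL_RE.search(line)
    if match is None:
        return None
    time_local = datetime.strptime(match.group(1), '%d/%b/%Y:%H:%M:%S')
    # Same steps as the mapper: str(datetime) gives "2016-12-27 14:03:21"
    _, tm = str(time_local).split()
    h, m, s = tm.split(':')
    return h


def count_pv_by_hour(lines):
    """Local stand-in for the map and reduce steps: count requests per hour."""
    counts = Counter()
    for line in lines:
        h = hour_of(line)
        if h is not None:
            counts[h] += 1
    return dict(counts)


# Hypothetical sample lines, for illustration only
sample_lines = [
    '10.0.0.1 - - [27/Dec/2016:14:03:21 +0800] "GET / HTTP/1.1" 200 612',
    '10.0.0.2 - - [27/Dec/2016:14:59:59 +0800] "GET /a HTTP/1.1" 200 12',
    '10.0.0.1 - - [27/Dec/2016:15:00:00 +0800] "GET /b HTTP/1.1" 404 0',
]
print(count_pv_by_hour(sample_lines))
```

Note that splitting on `':'` into exactly three parts only works when the string form of the timestamp has no UTC offset. A timezone-aware `datetime` would print as `14:03:21+08:00` and the unpacking would fail.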
Execution result:
# python3 pv_hour.py access_all.log-20161227
No configs found; falling back on auto-configuration
Creating temp directory /tmp/pv_hour.root.20161228.025503.341576
Running step 1 of 1...
Streaming final output from /tmp/pv_hour.root.20161228.025503.341576/output...
"14" 21158
"15" 20958
"16" 16080
"17" 14194
"18" 13114
"19" 16898
"20" 18870
"21" 14067
"22" 14053
"23" 12683
"00" 13185
"01" 14785
"02" 12449
"03" 7364
"04" 3628
"05" 9074
"06" 9317
"07" 11887
"08" 13492
"09" 19564
"10" 18390
"11" 15697
"12" 17518
"13" 18785
Removing temp directory /tmp/pv_hour.root.20161228.025503.341576...
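The result lines are not in hour order (the listing starts at "14"), so a small post-processing step makes them easier to read. This is a sketch only, assuming the job's output has been captured as text. `parse_output` is a hypothetical helper, and the data is copied from the run shown above.

```python
def parse_output(text):
    """Parse result lines of the form '"14" 21158' into a {hour: count} dict."""
    counts = {}
    for raw in text.splitlines():
        parts = raw.split()
        # Result lines have exactly two fields and a quoted key; skip status messages
        if len(parts) != 2 or not parts[0].startswith('"'):
            continue
        counts[parts[0].strip('"')] = int(parts[1])
    return counts


# Result lines copied from the run above
job_output = '''
"14" 21158
"15" 20958
"16" 16080
"17" 14194
"18" 13114
"19" 16898
"20" 18870
"21" 14067
"22" 14053
"23" 12683
"00" 13185
"01" 14785
"02" 12449
"03" 7364
"04" 3628
"05" 9074
"06" 9317
"07" 11887
"08" 13492
"09" 19564
"10" 18390
"11" 15697
"12" 17518
"13" 18785
'''

counts = parse_output(job_output)

# Two-digit hour strings sort correctly as plain text
for hour in sorted(counts):
    print(hour, counts[hour])

peak_hour = max(counts, key=counts.get)
quiet_hour = min(counts, key=counts.get)
print('peak:', peak_hour, counts[peak_hour])
print('quietest:', quiet_hour, counts[quiet_hour])
```

For this log the busiest hour is 14 (21158 requests) and the quietest is 04 (3628 requests).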
III. Hadoop-based nginx access log analysis -- computing PV per hour
Original article: http://www.cnblogs.com/xiaoming279/p/6228622.html