[Linux] Deleting HDFS data older than a given date with awk

Date: 2015-08-11 21:29:39

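Business background

By convention, HDFS data older than five days counts as an expired version. The goal is a script that deletes expired versions automatically. Before the script runs, the data directory looks like this: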

$ hadoop fs -ls /user/pms/workspace/ouyangyewei/data
Found 9 items
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-01
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-02
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-03
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-04
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-05
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-06
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-07
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-08
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-09
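
The five-day rule comes down to a cutoff date: any directory whose name sorts before that date is expired. As a rough sketch, the cutoff can be computed on its own to check what would be affected. The function name `cutoff_date` is made up for illustration, and the sketch assumes GNU date (the `-d` option); BSD and macOS date use different flags.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): print the date
# N days before a reference date, in YYYY-MM-DD form.
# Assumes GNU date; -u keeps the arithmetic in UTC so daylight-saving
# transitions cannot shift the result.
cutoff_date() {
    local days=$1
    local ref=${2:-$(date +%F)}   # reference date defaults to today
    date -u -d "$ref $days days ago" +%F
}

# With the article's run date as the reference:
cutoff_date 5 2015-08-11   # prints 2015-08-06
```

With a cutoff of 2015-08-06, the directories 2015-08-01 through 2015-08-05 in the listing above are the expired ones.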

Script implementation

# ---------------------------------------------------------
#
# Delete historical versions (data older than five days is expired)
#
# ---------------------------------------------------------

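# strftime() and systime() are gawk extensions, so this needs GNU awk.
# NF>=8 skips the "Found N items" header; arr[7] is the date directory name at this path depth.
old_version=$(hadoop fs -ls /user/pms/workspace/ouyangyewei/data | awk 'BEGIN{ five_days_ago=strftime("%F", systime()-5*24*3600) } NF>=8{ split($8,arr,"/"); if(arr[7]<five_days_ago){printf "%s\n", $8} }')
# The date-named paths contain no whitespace, so word splitting yields one path per iteration.
for version in $old_version
do
    # -rm -r replaces -rmr, which is deprecated in Hadoop 2.x
    hadoop fs -rm -r "$version"
done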
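
Since the script deletes recursively, it may be worth trying the filtering step on its own before pointing it at a real cluster. The sketch below is one way to do that, not the original author's method. The names `expired_paths`, `print_delete_commands`, and `sample_listing` are hypothetical. The cutoff is passed in with `-v` instead of being computed with `strftime()`, so it does not depend on gawk, and the input is sample text in the `hadoop fs -ls` format rather than live HDFS output. It relies on YYYY-MM-DD strings sorting in date order when compared as strings.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `hadoop fs -ls`-style lines on stdin and print
# the paths whose last component, assumed to be a YYYY-MM-DD directory
# name, sorts before the cutoff date given as $1.
expired_paths() {
    awk -v cutoff="$1" '
        NF >= 8 {                      # skips the "Found N items" header line
            n = split($8, parts, "/")  # parts[n] is the last path component
            if (parts[n] < cutoff) print $8
        }'
}

# Hypothetical dry run: print the delete command for each path on stdin
# instead of executing it.
print_delete_commands() {
    local path
    while IFS= read -r path; do
        printf 'hadoop fs -rm -r %s\n' "$path"
    done
}

# Sample lines in the listing format shown earlier; no HDFS access is needed.
sample_listing() {
    printf '%s\n' \
        'Found 3 items' \
        'drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-05' \
        'drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-06' \
        'drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-07'
}

sample_listing | expired_paths 2015-08-06 | print_delete_commands
# prints: hadoop fs -rm -r /user/pms/workspace/ouyangyewei/data/2015-08-05
```

Taking the last path component instead of a fixed index such as `arr[7]` keeps the filter working if the data directory is moved to a different depth.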

After execution

$ hadoop fs -ls /user/pms/workspace/ouyangyewei/data
Found 4 items
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-06
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-07
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-08
drwxr-xr-x   - pms pms          0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-09

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.

Original article: http://blog.csdn.net/yeweiouyang/article/details/47426823
