Delete partition folders in hdfs older than N days
I want to delete partition folders older than N days.
The command below lists only the folder dated exactly 50 days ago. I want a list of all folders that are less than 50 days old.
hadoop fs -ls /data/publish/DMPD/VMCP/staging/tvmcpr_usr_prof/chgdt=`date --date '50 days ago' +\%Y-\%m-\%d`
2 answers
This can be done with a bash script:
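# Current time in seconds since the epoch
today=$(date +%s)
hdfs dfs -ls /data/publish/DMPD/VMCP/staging/tvmcpr_usr_prof/ | grep "^d" | while read -r line; do
    # Column 6 of the listing is the directory's modification date (YYYY-MM-DD)
    dir_date=$(echo "${line}" | awk '{print $6}')
    # Age of the directory in whole days (requires GNU date for -d)
    difference=$(( ( today - $(date -d "${dir_date}" +%s) ) / ( 24*60*60 ) ))
    # Column 8 is the full path of the directory
    filePath=$(echo "${line}" | awk '{print $8}')
    if [ "${difference}" -lt 50 ]; then
        echo "${filePath}"
    fi
done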
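Note that the script above compares against the directory's modification time as reported by `hdfs dfs -ls`, not against the date in the partition name. If the `chgdt=YYYY-MM-DD` value from the question is what should decide a partition's age, a filter along the following lines might be closer to the title's goal of finding partitions older than N days. This is only a sketch: the function name `older_than_days` is made up for illustration, it assumes GNU `date` and bash, and it assumes every partition path ends in `chgdt=YYYY-MM-DD`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read paths on stdin and print those whose
# chgdt=YYYY-MM-DD partition date is strictly older than N days.
# Assumes GNU date (-d) and paths ending in chgdt=YYYY-MM-DD.
older_than_days() {
    local days="$1"
    local cutoff path part_date
    # Date N days ago, in the same format as the partition name
    cutoff=$(date -d "${days} days ago" +%Y-%m-%d)
    while read -r path; do
        # Skip anything that does not end in chgdt=YYYY-MM-DD
        [[ "${path}" =~ chgdt=([0-9]{4}-[0-9]{2}-[0-9]{2})$ ]] || continue
        part_date="${BASH_REMATCH[1]}"
        # ISO dates sort lexicographically, so a string comparison is enough
        if [[ "${part_date}" < "${cutoff}" ]]; then
            echo "${path}"
        fi
    done
}

# Possible usage (dry run: prints the delete commands instead of running them):
#   hdfs dfs -ls /data/publish/DMPD/VMCP/staging/tvmcpr_usr_prof/ \
#     | awk '/^d/ {print $NF}' \
#     | older_than_days 50 \
#     | xargs -r -n1 echo hdfs dfs -rm -r
```

Keeping the `echo` in front of `hdfs dfs -rm -r` makes it possible to review the list first; removing it would perform the actual deletion. A partition dated exactly N days ago is not selected, since the comparison is strict.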