Copy all files to Hadoop directory except 1
I am writing a shell script to put all my files into the Hadoop directory.
I used the command:
hadoop dfs -put /opt/nikoo28/resources/conf ./
This copies the conf folder to my Hadoop home directory, overwriting everything.
However, there is one file "doNotCopy.txt" that I don't want to copy. Is there a way that I can skip a specific file?
Add these lines to your shell script:
mkdir /opt/copy
mv /opt/nikoo28/resources/conf/doNotCopy.txt /opt/copy/doNotCopy.txt
hadoop dfs -put /opt/nikoo28/resources/conf ./ && mv /opt/copy/doNotCopy.txt /opt/nikoo28/resources/conf/doNotCopy.txt
Just move the file you don't want to copy to another folder, run the hadoop dfs -put command, and then move the file back to its original location.
If you want to preserve file permissions, do the following:
mkdir /opt/copy
cp -p /opt/nikoo28/resources/conf/doNotCopy.txt /opt/copy/doNotCopy.txt
rm /opt/nikoo28/resources/conf/doNotCopy.txt
hadoop dfs -put /opt/nikoo28/resources/conf ./ && cp -p /opt/copy/doNotCopy.txt /opt/nikoo28/resources/conf/doNotCopy.txt
NOTE: Add sudo if you get permission errors when creating a directory, moving a file, or copying a file.
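For example, a minimal sketch of the permission-preserving variant with sudo (assuming your user has sudo rights and the same paths as above):
sudo mkdir /opt/copy
sudo cp -p /opt/nikoo28/resources/conf/doNotCopy.txt /opt/copy/doNotCopy.txt
sudo rm /opt/nikoo28/resources/conf/doNotCopy.txt
hadoop dfs -put /opt/nikoo28/resources/conf ./ && sudo cp -p /opt/copy/doNotCopy.txt /opt/nikoo28/resources/conf/doNotCopy.txt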
I see this in the Apache Hadoop docs for put:
Usage: hadoop fs -put ...
Copy a single src or multiple srcs from the local file system to the target file system. Also reads input from stdin and writes to the target filesystem.
And then a useful example:
hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile
Reads the input from stdin.
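For instance, a small sketch of that stdin form (reusing the host name from the docs example, not a real cluster):
echo "hello from stdin" | hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile
This writes whatever arrives on stdin into the target file, which is not the same thing as copying a directory tree.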
So maybe you can use find to list every file except the one you want to skip and then feed each path to hadoop:
find /opt/nikoo28/resources/conf ! -name "doNotCopy.txt" -type f | xargs -I{} hadoop dfs -put {} ./
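Afterwards you could list the target directory to confirm the excluded file never reached HDFS (assuming the same working directory as above):
hadoop dfs -ls ./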