How do I dump a pickled object into a Hadoop HDFS directory?
I am in a virtual machine, in the directory that contains my Python (2.7) class. I am trying to pickle an instance of my class to a directory in my HDFS.
I am trying to do something like:
import pickle
my_obj = MyClass() # the class instance that I want to pickle
with open('hdfs://domain.example.com/path/to/directory/') as hdfs_loc:
    pickle.dump(my_obj, hdfs_loc)
From the research I've done, I think something like snakebite might help ... but does anyone have more specific suggestions?
1 answer
This works if you are running a Jupyter notebook with sufficient permissions:
import pickle
my_obj = MyClass() # the class instance that I want to pickle
local_filename = "pickle.p"
hdfs_loc = "hdfs://domain.example.com/path/to/directory/"
with open(local_filename, 'wb') as f:
    pickle.dump(my_obj, f)
!hdfs dfs -copyFromLocal $local_filename $hdfs_loc
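If you need the same thing outside a notebook (where `!` shell magics are unavailable), the standard library's `subprocess` can invoke the `hdfs` CLI directly. A minimal sketch — the `save_to_hdfs` helper and its `runner` parameter are my own illustration, not part of the answer above:

```python
import pickle
import subprocess

def save_to_hdfs(obj, local_path, hdfs_dir, runner=subprocess.check_call):
    """Pickle obj to a local file, then push it to HDFS via the hdfs CLI."""
    with open(local_path, 'wb') as f:
        pickle.dump(obj, f)
    # -f overwrites an existing file at the destination
    cmd = ['hdfs', 'dfs', '-copyFromLocal', '-f', local_path, hdfs_dir]
    runner(cmd)
    return cmd
```

With the names from the snippet above, the call would be `save_to_hdfs(my_obj, local_filename, hdfs_loc)`. The `runner` argument exists so the command construction can be tested without a live cluster.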