How do I use external libraries with virtualenv?
I am trying to figure out how to use external libraries with PySpark. I have a program that runs successfully on Spark, and now I am trying to import external libraries into it. I use virtualenv, and every time I submit the job, it complains that it cannot find the module.
Here's one of the many submit commands I've tried:
/path/to/spark-1.1.0-bin-hadoop2.4/bin/spark-submit ua_analysis.py --py-files `pwd`/venv/lib/python2.7/site-packages
I also tried adding the files individually with the --py-files flag, and I tried pointing it at the following subdirectories:
venv/lib
venv/python2.7
venv/lib/python2.7/site-packages/<package_name>
They all throw the following error:
ImportError: ('No module named <module>', <function subimport at 0x7f287255dc80>, (<module>,))
org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:124)
org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:154)
org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:87)
....
I have also tried copying these files into the pyspark directory, with no success.
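For reference, here is a minimal sketch of the form I understand spark-submit to expect, in case it helps pin down what I'm doing wrong. As far as I can tell from the Spark documentation, --py-files takes a comma-separated list of .zip, .egg, or .py files rather than a directory, and options have to come before the script, since anything after the script is passed to the script itself as arguments. The helper names zip_site_packages and build_submit_command and the deps.zip file name are hypothetical, made up for illustration. This also assumes the packages are pure Python, because compiled extensions generally cannot be imported from a zip archive.

```python
import os
import zipfile


def zip_site_packages(site_packages_dir, zip_path):
    """Zip the contents of site_packages_dir with packages at the archive root.

    zip_path should live outside site_packages_dir so the archive
    does not end up containing itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(site_packages_dir):
            for name in files:
                if name.endswith(".pyc"):
                    continue  # skip compiled bytecode
                full_path = os.path.join(root, name)
                # Store paths relative to site-packages so that
                # "import <package_name>" resolves from the zip root.
                arcname = os.path.relpath(full_path, site_packages_dir)
                archive.write(full_path, arcname)
    return zip_path


def build_submit_command(spark_submit, script, py_files):
    """Build the argument list with --py-files placed before the script."""
    return [spark_submit, "--py-files", ",".join(py_files), script]
```

With the paths from my setup, this would be called as zip_site_packages("venv/lib/python2.7/site-packages", "deps.zip") followed by build_submit_command("/path/to/spark-1.1.0-bin-hadoop2.4/bin/spark-submit", "ua_analysis.py", ["deps.zip"]), and the resulting list could be handed to subprocess.check_call.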