Is there a way to run Nutch with different config files?
I was wondering whether it is possible to run the same Nutch instance with different sets of config files. I don't see any parameter in the argument list that allows this.
I only want to run Nutch on one computer, and I don't want to duplicate the Nutch installation.
Does anyone know an easy way to do this, or do I need to modify the bin/nutch script itself?
Thanks.
This Nutch question should be helpful. The answer describes how to create your own conf directory and point Nutch at it using the environment variable $NUTCH_CONF_DIR.
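As a rough sketch of the environment-variable approach: the wrapper below sets NUTCH_CONF_DIR for a single bin/nutch invocation, so the same installation can be run against different conf directories. The NUTCH_HOME default path, the function name run_with_conf, and the conf_A directory are assumptions made for illustration, not something stated in the linked answer.

```shell
#!/bin/sh
# Sketch only: NUTCH_HOME's default path is an assumed install location.
NUTCH_HOME="${NUTCH_HOME:-/opt/nutch}"

# run_with_conf CONF_DIR ARGS...
# (hypothetical helper) runs bin/nutch with NUTCH_CONF_DIR set to CONF_DIR.
run_with_conf() {
    conf_dir="$1"
    shift
    if [ ! -d "$conf_dir" ]; then
        echo "conf directory not found: $conf_dir" >&2
        return 1
    fi
    # The assignment applies to this one command only, so separate runs
    # can use separate conf directories without touching the install.
    NUTCH_CONF_DIR="$conf_dir" "$NUTCH_HOME/bin/nutch" "$@"
}
```

With a copy of the stock configuration made by something like cp -r "$NUTCH_HOME/conf" "$NUTCH_HOME/conf_A", a call such as run_with_conf "$NUTCH_HOME/conf_A" inject crawl/crawldb urls would use the files in conf_A instead of the default conf directory.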
You can use Unix symbolic links and change the link in your script. For example, suppose you have regex-urlfilter-conf_A.txt and regex-urlfilter-conf_B.txt. In a script, before running Nutch, repoint regex-urlfilter.txt at the file you want:
for conf A:
ln -sf $NUTCH_FOLDER/conf/regex-urlfilter-conf_A.txt $NUTCH_FOLDER/conf/regex-urlfilter.txt
for conf B:
ln -sf $NUTCH_FOLDER/conf/regex-urlfilter-conf_B.txt $NUTCH_FOLDER/conf/regex-urlfilter.txt
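The two commands above could be folded into one small helper, sketched here under some assumptions: the function name use_urlfilter and the default value of NUTCH_FOLDER are made up for illustration, and the file naming follows the regex-urlfilter-conf_A.txt pattern from the answer.

```shell
#!/bin/sh
# NUTCH_FOLDER is the variable used in the answer; the default is an assumption.
NUTCH_FOLDER="${NUTCH_FOLDER:-/opt/nutch}"

# use_urlfilter NAME
# (hypothetical helper) points regex-urlfilter.txt at regex-urlfilter-conf_NAME.txt.
use_urlfilter() {
    target="$NUTCH_FOLDER/conf/regex-urlfilter-conf_$1.txt"
    link="$NUTCH_FOLDER/conf/regex-urlfilter.txt"
    if [ ! -f "$target" ]; then
        echo "no such filter file: $target" >&2
        return 1
    fi
    # -s creates a symbolic link; -f replaces whatever currently sits at $link.
    ln -sf "$target" "$link"
}
```

Calling use_urlfilter A or use_urlfilter B before each run switches the active filter. Note that ln -sf replaces an existing regular regex-urlfilter.txt, so keep a copy of the original first, and that two runs sharing this conf directory at the same time would see the same link.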