Is there a way to run NUTCH with different config files?

I was wondering if it is possible to run the same NUTCH instance with a different set of config files? I don't see any parameters in the argument list to allow such a thing.

I only want to run NUTCH on one computer and I don't want to duplicate the nutch instance.

Does anyone know an easy way to do this, or do I need to modify the bin/nutch script itself?




2 answers

This Nutch question should be helpful. The answer describes how to create your own conf directory and point Nutch at it using the environment variable NUTCH_CONF_DIR.
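A minimal sketch of that approach, assuming each configuration lives in its own copy of the conf directory. The install path /opt/nutch, the conf_A / conf_B directory names, and the helper function names are illustrative assumptions, not part of Nutch:

```shell
#!/bin/sh
# Sketch: one Nutch install, several conf directories.
# NUTCH_HOME's default and the conf_<profile> layout are assumptions.
NUTCH_HOME="${NUTCH_HOME:-/opt/nutch}"

# Print the conf directory for a profile name (hypothetical layout).
conf_dir_for() {
    printf '%s/conf_%s\n' "$NUTCH_HOME" "$1"
}

# Run bin/nutch with NUTCH_CONF_DIR set for that one invocation only.
# All remaining arguments are passed through to bin/nutch unchanged.
run_nutch_with() {
    profile="$1"
    shift
    NUTCH_CONF_DIR="$(conf_dir_for "$profile")" "$NUTCH_HOME/bin/nutch" "$@"
}

# Example calls (commented out so the sketch has no side effects):
# run_nutch_with A inject crawl_A/crawldb urls_A
# run_nutch_with B inject crawl_B/crawldb urls_B
```

Because the variable is set per invocation rather than exported, each run sees only its own conf directory and nothing in the shared install has to change.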




You can use Unix symbolic links and repoint the link in your script. For example, suppose you have regex-urlfilter-conf_A.txt and regex-urlfilter-conf_B.txt. In a script, before running Nutch:

for conf A:

ln -sf $NUTCH_FOLDER/conf/regex-urlfilter-conf_A.txt $NUTCH_FOLDER/conf/regex-urlfilter.txt


for conf B:

ln -sf $NUTCH_FOLDER/conf/regex-urlfilter-conf_B.txt $NUTCH_FOLDER/conf/regex-urlfilter.txt
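The two commands above could be wrapped in a small helper, sketched here. NUTCH_FOLDER follows the commands above; its default path and the function name use_urlfilter are assumptions for illustration:

```shell
#!/bin/sh
# Sketch: switch the active URL filter by repointing a symlink.
# The default value of NUTCH_FOLDER is an assumption.
NUTCH_FOLDER="${NUTCH_FOLDER:-/opt/nutch}"

# Point conf/regex-urlfilter.txt at the variant named by $1 (e.g. A or B).
use_urlfilter() {
    target="$NUTCH_FOLDER/conf/regex-urlfilter-conf_$1.txt"
    link="$NUTCH_FOLDER/conf/regex-urlfilter.txt"
    # Refuse to create a dangling link if the variant does not exist.
    if [ ! -f "$target" ]; then
        echo "missing filter file: $target" >&2
        return 1
    fi
    # -s makes a symbolic link, -f replaces whatever is at $link.
    ln -sf "$target" "$link"
}

# Example calls (commented out so the sketch has no side effects):
# use_urlfilter A && "$NUTCH_FOLDER/bin/nutch" ...
# use_urlfilter B && "$NUTCH_FOLDER/bin/nutch" ...
```

Two caveats: ln -sf replaces an existing regular regex-urlfilter.txt, so back that file up first, and since the link is shared, two crawls running at the same time would both see whichever variant was linked last.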



