Integrate Solr-5.2.1 with Nutch crawl data?
The tutorial says:
Nutch has already generated crawl data from the seed url. Following are the steps to delegate search to Solr for searchable links:
Backing up the original Solr example schema.xml:
mv ${APACHE_SOLR_HOME}/example/solr/collection1/conf/schema.xml ${APACHE_SOLR_HOME}/example/solr/collection1/conf/schema.xml.org
But the problem is that no such directory exists; there is no /example/solr/collection1/conf in my installation.
In which directory will I find this schema.xml file? Or which schema.xml file should I replace?
AFAIK Solr 5.x uses a managed schema by default, which is generated on the fly based on the input documents. However, you can copy the contents of your schema.xml file to ${APACHE_SOLR_HOME}/solr/server/solr/$CORE_NAME/conf/managed-schema. Before copying your schema, make sure it is in the 5.x schema format (some of the old schema components may have changed).
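The copy step could be scripted roughly as follows. This is only a sketch: install_schema is a hypothetical helper name, not part of Solr or Nutch, and both paths passed to it are assumptions that depend on where your schema.xml lives and what your core is called. It keeps a backup of the existing managed-schema first, in the same spirit as the .org backup in the tutorial.

```shell
#!/bin/sh
# Sketch: install a schema.xml as a Solr 5.x core's managed-schema.
# install_schema is a hypothetical helper, not a Solr or Nutch command.
install_schema() {
    src="$1"        # path to the schema.xml you want Solr to use (assumption)
    conf_dir="$2"   # the core's conf directory (assumption)

    # Fail early if either path is wrong, rather than creating stray files.
    [ -f "$src" ] || { echo "schema not found: $src" >&2; return 1; }
    [ -d "$conf_dir" ] || { echo "conf dir not found: $conf_dir" >&2; return 1; }

    # Back up the current managed-schema, if there is one.
    if [ -f "$conf_dir/managed-schema" ]; then
        mv "$conf_dir/managed-schema" "$conf_dir/managed-schema.org"
    fi

    # Put the new schema in place under the name Solr 5.x reads.
    cp "$src" "$conf_dir/managed-schema"
}
```

For example, assuming the schema sits in a Nutch conf directory under a NUTCH_HOME variable (an assumption, adjust to your setup), the call would be install_schema "${NUTCH_HOME}/conf/schema.xml" "${APACHE_SOLR_HOME}/solr/server/solr/${CORE_NAME}/conf". Solr will most likely need a restart or a core reload before it picks up the change.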