RemoteSolrException: ERROR: [doc=2] unknown field 'firstName'

Question

I wrote a Spring project that uses SolrInputDocument to index data from database tables. I used the doc.addField() method:

doc.addField("actorId", a.getId());
doc.addField("firstName", a.getFirstName());

(only a few of the calls are shown) to add the data I read from MySQL.

When I try to add these values to the Solr index, I get the following error:

Exception in thread "main" org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: ERROR: [doc=2] unknown field 'firstName'
    at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)

Please help me figure out where, in which other file, I need to declare the "id" and "firstName" fields so that Solr knows I am using them as fields when adding data.

java spring mysql solr solrj

user3415038 May 7 '17 at 5:32

1 answer

freedev · Accepted Answer · 2017-05-08T00:38:30+0000

The RemoteSolrException message ERROR: [doc=2] unknown field ... clearly means that the field you are trying to insert is not defined in the schema of your index (core or collection).

You absolutely need to read the Solr documentation, because most of Solr's Information Retrieval (IR) logic lives in the Solr schema. I suggest reading Solr's Overview of Documents, Fields and Schema.

Anyway, I will try to give you some guidance and advice so that you can avoid what was hardest for me to understand.

First of all, you need to recognize the difference between Solr running as a standalone server and Solr running in SolrCloud mode. The first is a single server whose configuration is written locally to disk for each index (called a core). The latter is a cluster configuration where several Solr instances behave like a single server (distributed search, shards, replicas, failover, etc.) and the configuration is stored in a ZooKeeper ensemble.

I highly recommend starting with the standalone configuration. Apart from all the other differences, its configuration is readily available on your disk, and it has all the IR features found in SolrCloud.

You should also know the difference between an index that uses managed-schema and one that uses schema.xml:

managed-schema is the name of the schema file that Solr uses by default to support making schema changes at runtime through the Schema API or Schemaless Mode.

schema.xml is the traditional name of a schema file that users edit manually when using ClassicIndexSchemaFactory.
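
As a rough sketch of the runtime route described above, the field from the question could be declared through the Schema API instead of being renamed. The add-field command and the /schema endpoint belong to Solr's Schema API; the class name, the helper methods, the base URL http://localhost:8983/solr and the core name actors are assumptions for illustration only, and the sketch presumes the core uses a managed schema that already defines a string field type. It also needs Java 11 or later for java.net.http.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: declares a field through Solr's Schema API.
final class SchemaFieldAdder {

    // Builds the JSON body of an "add-field" command.
    // Assumes name and type contain no characters that need JSON escaping.
    static String addFieldCommand(String name, String type) {
        return "{\"add-field\":{\"name\":\"" + name + "\",\"type\":\"" + type
                + "\",\"stored\":true,\"indexed\":true}}";
    }

    // Builds the POST request; solrUrl and core are placeholders for your setup.
    static HttpRequest buildRequest(String solrUrl, String core, String name, String type) {
        return HttpRequest.newBuilder(URI.create(solrUrl + "/" + core + "/schema"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(addFieldCommand(name, type)))
                .build();
    }

    // Sends the command to a running Solr and returns the HTTP status code.
    static int addField(String solrUrl, String core, String name, String type) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(solrUrl, core, name, type), HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }
}
```

Calling SchemaFieldAdder.addField("http://localhost:8983/solr", "actors", "firstName", "string") once before indexing should let a doc.addField("firstName", ...) call go through, provided those assumptions hold for your installation.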

Here it is important to understand that in Solr you can define a class of fields, for example every field whose name ends in _s (string) or _i (integer). In Solr these classes are called dynamic fields.

In the managed-schema configuration (aka schemaless), the most important field types are ready to use (e.g. strings, integers, booleans, dates, currency, text_general, etc.). This lets you load your data right away: all you have to do is add the correct suffix to the end of each field name:

productName becomes productName_s
manufacturer becomes manufacturer_s
quantity becomes quantity_i
dateInvoice becomes dateInvoice_dt
price becomes price_c
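
The suffix convention in the list above can be sketched as a small helper that picks a dynamic-field suffix from the Java type of a value. The class and method names are hypothetical, and the suffixes assume the dynamic-field rules shipped in Solr's default configset, where _d is a double and _dt is a date; check the dynamicField entries of your own schema before relying on them.

```java
import java.time.Instant;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps plain field names to dynamic-field names.
final class DynamicFieldNames {

    // Picks a suffix from the value's Java type; anything else falls back to _s (string).
    static String suffixFor(Object value) {
        if (value instanceof Integer) return "_i";
        if (value instanceof Long) return "_l";
        if (value instanceof Float) return "_f";
        if (value instanceof Double) return "_d";
        if (value instanceof Boolean) return "_b";
        if (value instanceof Date || value instanceof Instant) return "_dt";
        return "_s";
    }

    // Assumes "id" is the unique key, which keeps its name without a suffix.
    static String dynamicName(String field, Object value) {
        return "id".equals(field) ? field : field + suffixFor(value);
    }

    // Renames every key of a row, preserving insertion order.
    static Map<String, Object> rename(Map<String, Object> row) {
        Map<String, Object> renamed = new LinkedHashMap<>();
        row.forEach((field, value) -> renamed.put(dynamicName(field, value), value));
        return renamed;
    }
}
```

With SolrJ, each entry of the renamed map could then be passed to doc.addField(name, value), so a firstName value would be sent as firstName_s.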

Dynamic fields are available in both schemaless mode and classic schema mode.

So why the difference? Partly for historical reasons: I think the Solr engineers were trying to make it easier for users to load their data into Solr indexes. But once you start writing your own schema.xml, you finally get access to the IR power that has made Solr and the Lucene engine famous as one of the best open source full-text search servers.

Most likely you are already using schemaless mode in your index, so just change your field name to firstName_s and try loading the data again.

Regarding the id field: in schemaless mode, id is a special field used as the primary key and is a kind of "reserved name", so you don't need to add a suffix.

The id field is of type string.
