Basic UIMA with SOLR

I am trying to connect UIMA to Solr. I downloaded the Solr 3.5 dist and have it working successfully with Nutch and Tika on Windows 7, using Solr Cell and curl via Cygwin. To begin with, I copied the 6 jars from solr/contrib/uima/lib into the working /lib directory in Solr. Then I read the readme.txt file in solr/contrib/uima/lib and edited both solrconfig.xml and schema.xml, to no avail.

Then I found a page that seemed more applicable, since I didn't want to use Alchemy or OpenCalais. However, when I run curl to import a PDF file via Solr Cell, I do not get the extra UIMA fields, and nothing shows up in my logs. test.pdf is parsed, though, and I can see the PDF in Solr using:

curl 'http://localhost:8080/solr/update/extract?fmap.content=content&' -F "file=@test.pdf"



<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <lst name="runtimeParameters">
        <str name="host">http://localhost</str>
        <str name="port">8080</str>
      </lst>
      <str name="analysisEngine">C:\uima\desc\com\rondhuit\uima\desc\NextAnnotatorDescriptor.xml</str>
      <bool name="ignoreErrors">true</bool>
      <str name="logField">id</str>
      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        <arr name="fields">
        </arr>
      </lst>
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name"></str>
          <lst name="mapping">
            <str name="feature">entity</str>
            <str name="fieldNameFeature">uname</str>
            <str name="dynamicField">*_sm</str>
          </lst>
        </lst>
      </lst>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

<requestHandler name="/update/uima" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">uima</str>
  </lst>
</requestHandler>


I also adjusted my /update request handler by hand:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">uima</str>
  </lst>
</requestHandler>



<!-- fields for UIMA -->
<field name="uname" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<dynamicField name="*_sm"  type="string"  indexed="true"  stored="true"/>


All I am trying to do is get UIMA to pull names out of text (just as a starting demo), and I cannot figure out what I am doing wrong. Thank you in advance.



1 answer

Not sure if this was ever resolved, but in case anyone else is looking: I had the same issue yesterday. It turned out that I was calling /update/extract to use Solr Cell, and that handler does not go through UIMA, because the UIMA chain is only wired into /update.
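
One way to act on this, as a rough sketch rather than a verified fix, is to point the extract handler at the same chain. This assumes /update/extract is registered with the class used in the stock example solrconfig.xml (solr.extraction.ExtractingRequestHandler) and that the chain is named "uima" as in the question; adjust both if your setup differs.

```xml
<!-- Sketch only: route Solr Cell uploads through the "uima" update chain. -->
<!-- The handler class below is an assumption taken from the stock example config. -->
<requestHandler name="/update/extract" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- Name of the updateRequestProcessorChain defined in the question -->
    <str name="update.chain">uima</str>
  </lst>
</requestHandler>
```

Passing update.chain=uima as a request parameter on the existing curl call should have the same effect for a single request, without touching the handler defaults.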


