Solr Tomcat org.apache.solr.common.SolrException: lazy loading error

Ubuntu 14.04

I installed using sudo apt-get install solr-tomcat.

It seems the "core" functionality was installed and is working, but none of the plugins were (or I just don't know where to look).

I am trying to use the extract function (the /update/extract handler), which is a plugin.

When I try, I get this:

org.apache.solr.common.SolrException: lazy loading error
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:260)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.extraction.ExtractingRequestHandler'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:414)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:419)
    at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:455)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:251)
    ... 16 more
Caused by: java.lang.ClassNotFoundException: solr.extraction.ExtractingRequestHandler
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:274)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:398)
    ... 19 more

I manually downloaded the latest version just to investigate, and did find an "extraction" directory, and, within, a file called ExtractingRequestHandler.java.

I'm not a Java guy. I looked for a similar path to what the error indicated and found "/usr/share/maven-repo/org/apache/solr/solr-core", which does not strictly correspond with "org.apache.solr.core". So, I'm hesitant to assume anything here.

Solr is searching for the "extraction" directory - I'm just not sure where.

My solrconfig.xml file has this

<requestHandler name="/update/extract" 
                  startup="lazy"
                  class="solr.extraction.ExtractingRequestHandler" >
    <lst name="defaults">
      <!-- All the main content goes into "text"... if you need to return
           the extracted text or do highlighting, use a stored field. -->
      <str name="fmap.content">text</str>
      <str name="lowernames">true</str>
      <str name="uprefix">ignored_</str>

      <!-- capture link hrefs but ignore div attributes -->
      <str name="captureAttr">true</str>
      <str name="fmap.a">links</str>
      <str name="fmap.div">ignored_</str>
    </lst>
  </requestHandler>

I tried copying that extraction directory into both /usr/share/maven-repo/org/apache/solr/ and /etc/solr, hoping that one of them was the Solr home directory. No luck.

I added

<lib dir="/var/lib/solr/contrib/extraction/src/java/org/apache/solr/handler/extraction" />

to my solrconfig.xml and observed this in tomcat logs:

INFO: Adding 'file:/var/lib/solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ExtractingRequestHandler.java' to classloader

Still, no luck.

Thanks in advance.

Edit:

I checked my tomcat logs and I see the following:

Nov 23, 2014 12:53:44 AM org.apache.solr.core.RequestHandlers initHandlersFromConfig
INFO: adding lazy requestHandler: solr.extraction.ExtractingRequestHandler

It's odd because it indicates that a "lazy requestHandler" was created at solr.extraction.ExtractingRequestHandler, which contradicts the error:

Error loading class 'solr.extraction.ExtractingRequestHandler'

If I remove the startup="lazy" attribute from the config above and restart Tomcat, I get this error at startup:

SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.extraction.ExtractingRequestHandler'

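It seems you're adding a source directory to the classpath: the log line shows ExtractingRequestHandler.java, a source file, being added to the classloader. Java can only load compiled classes, so you'll need to add the compiled version (the JAR) with its dependencies to the classpath instead.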
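A quick way to check what a lib directory actually contains is sketched below. It is only an illustration: the function name check_lib_dir and its messages are made up for this example, and it assumes that only files at the top level of the directory matter.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): report whether a directory that is
# referenced from a <lib dir="..."/> entry holds compiled jars or only sources.
# Assumption: only files directly inside the directory are relevant.
check_lib_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    # Count jars and .java sources; tr strips padding some wc versions add.
    jars=$(find "$dir" -maxdepth 1 -type f -name '*.jar' | wc -l | tr -d ' ')
    sources=$(find "$dir" -maxdepth 1 -type f -name '*.java' | wc -l | tr -d ' ')
    if [ "$jars" -gt 0 ]; then
        echo "ok: $jars jar(s) in $dir"
        return 0
    fi
    echo "no jars in $dir ($sources .java source file(s) found)"
    return 1
}
```

Run against the directory from the question, for example check_lib_dir /var/lib/solr/contrib/extraction/src/java/org/apache/solr/handler/extraction, it would be expected to report that no jars are present, which matches the ClassNotFoundException.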

In the normal distribution (from the Solr homepage) the dependencies for the extraction module are bundled in contrib/extraction/lib; the handler JAR itself (solr-cell-<version>.jar) is in dist/.

From the README.txt in that directory:

Getting Started

You will need Solr up and running. Then, simply add the extraction JAR file, plus the Tika dependencies (in the ./lib folder) to your Solr Home lib directory. See http://wiki.apache.org/solr/ExtractingRequestHandler for more details on hooking it in and configuring.

It also seems you're following an older version of the instructions (judging by your request handler name). The current configuration is detailed in the community wiki, and there's a complete example bundled in the distribution as well.

Retrieving and extracting the files:

$ wget http://<mirror>/lucene/solr/4.10.2/solr-4.10.2.tgz
$ tar xvzf solr-4.10.2.tgz
$ cd solr-4.10.2/contrib/extraction/lib/
$ ls
apache-mime4j-core-0.7.2.jar    pdfbox-1.8.4.jar
apache-mime4j-dom-0.7.2.jar     poi-3.10.1.jar
aspectjrt-1.6.11.jar            poi-ooxml-3.10.1.jar
bcmail-jdk15-1.45.jar           poi-ooxml-schemas-3.10.1.jar
bcprov-jdk15-1.45.jar           poi-scratchpad-3.10.1.jar
boilerpipe-1.1.0.jar            rome-0.9.jar
commons-compress-1.7.jar        tagsoup-1.2.1.jar
dom4j-1.6.1.jar                 tika-core-1.5.jar
fontbox-1.8.4.jar               tika-parsers-1.5.jar
icu4j-53.1.jar                  tika-xmp-1.5.jar
isoparser-1.0-RC-1.jar          vorbis-java-core-0.1.jar
jdom-1.0.jar                    vorbis-java-tika-0.1.jar
jempbox-1.8.4.jar               xercesImpl-2.9.1.jar
jhighlight-1.0.jar              xmlbeans-2.6.0.jar
juniversalchardet-1.0.3.jar     xmpcore-5.1.2.jar
metadata-extractor-2.6.2.jar    xz-1.4.jar
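
With the archive unpacked as above, the copy step described in the README could look roughly like the sketch below. Several things here are assumptions rather than facts from your setup: the function name install_extraction_jars is made up, the Solr home path must be supplied by you (the Ubuntu package may place it somewhere other than where you expect), and the handler JAR is assumed to be dist/solr-cell-*.jar inside the unpacked distribution.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): copy the extraction handler jar and
# the Tika dependencies from an unpacked Solr distribution into <solr home>/lib.
install_extraction_jars() {
    dist_root="$1"   # unpacked distribution, e.g. ./solr-4.10.2
    solr_home="$2"   # ASSUMPTION: your Solr home directory; verify it first
    lib_dir="$solr_home/lib"

    if [ ! -d "$dist_root/contrib/extraction/lib" ]; then
        echo "not a Solr distribution: $dist_root" >&2
        return 1
    fi
    mkdir -p "$lib_dir" || return 1
    # ASSUMPTION: solr-cell-*.jar in dist/ holds ExtractingRequestHandler.
    cp "$dist_root"/dist/solr-cell-*.jar "$lib_dir"/ || return 1
    # Tika and the other dependencies listed above.
    cp "$dist_root"/contrib/extraction/lib/*.jar "$lib_dir"/ || return 1
    echo "copied jars to $lib_dir"
}
```

Something like install_extraction_jars ./solr-4.10.2 /path/to/your/solr/home, followed by a Tomcat restart, would be the intended use. The jars should match the Solr version that is actually installed; the version shipped by the Ubuntu package may differ from 4.10.2, in which case download the matching release instead.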