Update
I've configured both xms (initial memory) and xmx (maximum memory allocation jvm paramters, after a restart i've hooked up Visual VM to monitor the Tomcat memory usage. While the indexing process is running, the memory usage of Tomcat seems ok, memory consumption is in range of defined jvm params. (see image)
So it seems that filesystem buffers are consuming all the leftover memory, and does not drop memory? Is there a way handle this behaviour, like change nGram size or directoryFactory?
I'm pretty new to Solr and Tomcat, but here we go:
OS Windows server 2008
Tomcat Service version 7.0 (64 bit)
- Only running Solr
- No optional JVM parameters set, but Solr config through GUI
Solr version 4.5.0.
- One Core instance (both for querying and indexing)
Schema config:
- minGramSize="2" maxGramSize="20"
- most of the fields are stored = "true" (required)
Solr config:
- ramBufferSizeMB: 100
- maxIndexingThreads: 8
- directoryFactory: MMapDirectory
- autocommit: maxdocs 10000, maxtime 15000, opensearcher false
- cache (defaults):
filtercache initialsize:512 size: 512 autowarm: 0
queryresultcache initialsize:512 size: 512 autowarm: 0
documentcache initialsize:512 size: 512 autowarm: 0
We're using a .Net Service (based on Solr.Net) for updating and inserting documents on a single Solr Core instance. The size of documents sent to Solr vary from 1 Kb up to 8Mb, we're sending the documents in batches, using one or multiple threads. The current size of the Solr Index is about 15GB.
The indexing service is running around 3 a 4 hours to complete all inserts and updates to Solr. While the indexing process is running the Tomcat process memory usage keeps growing up to > 7GB Ram and does not reduce, even after 24 hours.
After a restart of Tomcat, or a Reload Core in the Solr Admin the memory drops back to 1 a 2 GB Ram. Memory leak?
Is it possible to configure the max memory usage for the Solr process on Tomcat?
Are there other alternatives? Best practices?
Thanks
You can setup JVM memory setting on tomcat. I usually do this with setenv.bat file in bin directory of tomcat (same directory as the catalina.bat/.sh files are).
Adjust following values as per your needs:
set JAVA_OPTS=%JAVA_OPTS% -Xms256m -Xmx512m"
Here are clear instruction on it:
http://wiki.razuna.com/display/ecp/Adjusting+Memory+Settings+for+Tomcat
At first you have to set XMX parameter to limit maximum memory that can be used by Tomcat. But in case of SOLR you have to remember that it uses a lot of memory outside of JVM to handle filesystem buffers. So never use more than 50% of available memory for Tomcat in this case.
I have the following setup (albeit a much smaller problem)...
5000 documents, document sizes range from 1MB to 30MB.
We have a requirement to run under 1GB for the Tomcat process on a 2 CPU / 2GB system
After bit of experimentation I came up with these settings for JAVA.
-Xms448m
-Xmx768m
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:ParallelCMSThreads=4
-XX:PermSize=64m
-XX:MaxPermSize=64m
-XX:NewSize=384m
-XX:MaxNewSize=384m
-XX:TargetSurvivorRatio=90
-XX:SurvivorRatio=6
-XX:+CMSParallelRemarkEnabled
-XX:CMSInitiatingOccupancyFraction=55
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+OptimizeStringConcat
-XX:+UseCompressedOops
-XX:MinHeapFreeRatio=5
-XX:MaxHeapFreeRatio=5
These helped but I encountered issues with the OutOfMemory and Tomcat using too much memory even with such a small dataset.
Solution Or things/configuration I have set so far that seem to hold well are as follows:
- Disable all caches other than QueryResultCache
- Do not include text/content fields in your query only include the id
- Do not use row size greater than 10 and do not include highlighting.
- If you are using highlighting (this is the biggest culprit), get the document identifiers from the query first and then do the query again with highlighting and the search terms with the id field included.
Finally for the memory problem. I had to grudgingly implement an unorthodox approach to solve the tomcat/java memory hogging issue (as java never gives back memory to the OS).
I created a memory governor service that runs with debug privilege and calls windows API to force tomcat process to release memory. I also have a global mutex to prevent access to tomcat while this happens when a call comes in.
Surprisingly this approach is working out well but not without its own perils if you do not have the option to control access to Tomcat.
If you find a better solution/configuration changes please let us know.