When using the Solr client in your app, what is the maximum size of a
multi-line text field?
Can I send huge XML documents as text?
E.g.:
SolrInputDocument document = new SolrInputDocument();
document.addField("id", rec.getId());
document.addField("hugeTextFile_txt", hugeTextFile);
UpdateResponse response = solr.add(document);
solr.commit();
Update
I ran the same unit test with the text fieldType. Below is the declaration I used. Please note that I removed the analyzer section from the declaration.
<fieldType name="text" class="solr.TextField"/>
I was able to add 500,000,000 characters and index them successfully. With larger values I got a Java heap space error, which is not related to Solr.
I tried a simple test that adds a large value to a field. The limit I found is 32,766 bytes. Beyond that it throws IllegalArgumentException. The fieldType for email was string.
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
@Test
public void test() throws IOException, SolrServerException {
    SolrInputDocument document = new SolrInputDocument();
    document.addField("profileId", TestConstants.PROFILE_ID);
    // Build a 32,767-character ASCII value, one byte over the limit.
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < 32767; i++) {
        builder.append((char) ((i % 26) + 'a'));
    }
    document.addField("email", builder.toString());
    solrClient.add(document);
    solrClient.commit();
}
Exception thrown by the test above for values of 32,767 characters or more:
Caused by: java.lang.IllegalArgumentException: Document contains at least one immense term in field="email" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 97, 98, 99, 100]...', original message: bytes can be at most 32766 in length; got 32767
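The limit in that message is counted in UTF-8 bytes, not characters, and a string field hits it because the whole value is indexed as a single term. A minimal sketch of a pre-check you could run before calling addField is below. TermLengthCheck and its methods are hypothetical helper names, not part of SolrJ, and the 32766 constant is simply copied from the exception message above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of SolrJ): checks whether a value would
// stay within the single-term limit quoted in the exception message.
final class TermLengthCheck {
    // Taken from the message: "bytes can be at most 32766 in length".
    static final int MAX_TERM_BYTES = 32766;

    // Length of the value once encoded as UTF-8, which is what the limit measures.
    static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the value can be indexed as one term without exceeding the limit.
    static boolean fitsInSingleTerm(String value) {
        return utf8Length(value) <= MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < 32766; i++) {
            builder.append((char) ((i % 26) + 'a'));
        }
        System.out.println(fitsInSingleTerm(builder.toString())); // prints "true"

        builder.append('a'); // 32,767 ASCII characters = 32,767 bytes
        System.out.println(fitsInSingleTerm(builder.toString())); // prints "false"
    }
}
```

Because non-ASCII characters take more than one byte in UTF-8, a value with fewer than 32,766 characters can still exceed the limit, which is why the check encodes the string instead of using String.length().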
I hope this helps.
Changing the Solr field's type to "text_general" and updating the Solr schema helped.
Commands to update the Solr schema:
solrctl instancedir --update "directory that contains the schema file with the edited solr field"
solrctl collection --update "collection-name to update"