Handling multiple documents from Elastic Search us

Posted 2019-08-31 05:27

Question:

I am fetching documents from Elasticsearch using the Java API. I am able to extract only one document from the responseBody properly.

How can I handle a response that contains multiple documents?

Earlier I used RestHighLevelClient; with that API I was able to handle multiple documents with the help of SearchHit[] searchHits = searchResponse.getHits().getHits();.

With the RestClient API, I am not able to do that.

Please find my code below. It fetches documents from Elasticsearch and parses the response as a JSON object (it works properly for a single document).

    private final static String ATTACHMENT = "document_attachment";
    private final static String TYPE = "doc";
    static long BUFFER_SIZE = 520 * 1024 * 1024;   //  <---- set buffer to 520MB instead of 100MB


    public static void main(String args[])
    {
        RestClient restClient = null;
        Response contentSearchResponse=null;
        String responseBody = null;
        JSONObject source = null;
        String path = null;
        String filename = null;
        int id = 0;
        ResponseHits responseHits = null;

        RestClientBuilder builder =  null; 

        try {

        restClient = RestClient.builder(
                        new HttpHost("localhost", 9200, "http"),
                        new HttpHost("localhost", 9201, "http")).build();

        } catch (Exception e) {
            System.out.println(e.getMessage());
        }

        SearchRequest contentSearchRequest = new SearchRequest(ATTACHMENT); 
        SearchSourceBuilder contentSearchSourceBuilder = new SearchSourceBuilder();
        contentSearchRequest.types(TYPE);
        QueryBuilder attachmentQB = QueryBuilders.matchQuery("attachment.content", "activa");
        contentSearchSourceBuilder.query(attachmentQB);
        contentSearchSourceBuilder.size(50);
        contentSearchRequest.source(contentSearchSourceBuilder);
        System.out.println("Request --->"+contentSearchRequest.toString());

        Map<String, String> params = Collections.emptyMap();
        HttpEntity entity = new NStringEntity(contentSearchSourceBuilder.toString(), ContentType.APPLICATION_JSON);
        HttpAsyncResponseConsumerFactory.HeapBufferedResponseConsumerFactory consumerFactory =
                new HttpAsyncResponseConsumerFactory.HeapBufferedResponseConsumerFactory((int) BUFFER_SIZE);


        try {
            contentSearchResponse = restClient.performRequest("GET", "/document_attachment/doc/_search", params, entity, consumerFactory);
        } catch (IOException e1) {
            e1.printStackTrace();
        } 
        try {
            responseBody = EntityUtils.toString(contentSearchResponse.getEntity());
        } catch (ParseException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("Converting to JSON");
        JSONObject jsonObject = new JSONObject(responseBody);
        JSONObject  hits = jsonObject.getJSONObject("hits");
        JSONArray hitsArray=hits.getJSONArray("hits");
        for(int i=0;i<hitsArray.length();i++) {
            JSONObject obj= hitsArray.getJSONObject(i);
            source = obj.getJSONObject("_source");
            id = Integer.parseInt(source.opt("id").toString());
            path = source.optString("path");
            filename = source.optString("filename");

        }

        JSONObject jsonBody = new JSONObject();
        jsonBody.put("id", id);
        jsonBody.put("path", path);
        jsonBody.put("filename", filename);
        System.out.println("Response --->"+jsonBody.toString());

        }

Answer 1:

If you use

RestClientBuilder builder = RestClient.builder(
            new HttpHost("localhost", 
            9200, 
            "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);

You can fetch multiple results like this:

SearchResponse searchResponse = restHighLevelClient.search(searchRequest);
for (SearchHit hit : searchResponse.getHits()) {
        try {
            // _source of this hit, without the Elasticsearch metadata
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            JSONObject jo = new JSONObject(sourceAsMap);
         } catch (JSONException e) {
            //TODO do something useful here
            //e.printStackTrace();
         }
}

This way you can iterate over all the hits of your request, and you don't get the Elasticsearch-related metadata in your result set.
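
To keep every hit rather than only the last one, each hit needs its own result object that is added to a collection inside the loop. Here is a minimal sketch of that idea using only the JDK. It assumes each input map is what hit.getSourceAsMap() would return for one hit; the class name HitCollector and the sample values in main are made up for illustration, while the field names id, path and filename are taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any Elasticsearch client.
class HitCollector {

    /** Copies the wanted fields of every hit into its own map, so no hit overwrites another. */
    static List<Map<String, Object>> collect(List<Map<String, Object>> sources) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (Map<String, Object> source : sources) {
            // A fresh map per hit: this is the step the single-variable version is missing
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("id", source.get("id"));
            doc.put("path", source.get("path"));
            doc.put("filename", source.get("filename"));
            results.add(doc);
        }
        return results;
    }

    public static void main(String[] args) {
        // Sample data standing in for the _source of two hits (assumed values)
        Map<String, Object> first = new HashMap<>();
        first.put("id", 1);
        first.put("path", "/tmp/a.pdf");
        first.put("filename", "a.pdf");

        Map<String, Object> second = new HashMap<>();
        second.put("id", 2);
        second.put("path", "/tmp/b.pdf");
        second.put("filename", "b.pdf");

        for (Map<String, Object> doc : collect(Arrays.asList(first, second))) {
            System.out.println("Response --->" + doc);
        }
    }
}
```

The same pattern should carry over to the JSON version in the question: create the JSONObject inside the for loop and put it into a JSONArray instead of assigning to variables declared outside the loop.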



Answer 2:

Use the scroll API, which is useful when the result set is large.

From the documentation:

While a search request returns a single “page” of results, the scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database.
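
The cursor-like loop described in that quote can be sketched roughly as follows. This is only an outline of the control flow, not client code: PageFetcher is a hypothetical interface standing in for the two HTTP calls (the initial search with a scroll parameter, then repeated calls to /_search/scroll with the returned scroll_id), and the in-memory fetcher exists only so the sketch runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScrollSketch {

    /** One page of a scrolled search: the hits plus the id used to request the next page. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Hypothetical stand-in for the HTTP calls a real client would make. */
    interface PageFetcher {
        Page firstPage();               // e.g. POST /document_attachment/_search?scroll=1m
        Page nextPage(String scrollId); // e.g. POST /_search/scroll with the scroll_id in the body
    }

    /** Keeps requesting pages until one comes back empty, collecting every hit on the way. */
    static List<String> collectAll(PageFetcher fetcher) {
        List<String> all = new ArrayList<>();
        Page page = fetcher.firstPage();
        while (!page.hits.isEmpty()) {
            all.addAll(page.hits);
            page = fetcher.nextPage(page.scrollId);
        }
        return all;
    }

    /** In-memory fetcher for demonstration: serves the given pages in order, then an empty page. */
    static PageFetcher inMemory(List<List<String>> pages) {
        return new PageFetcher() {
            private int next = 0;

            private Page serve() {
                List<String> hits = next < pages.size() ? pages.get(next) : new ArrayList<String>();
                next++;
                return new Page("scroll-" + next, hits);
            }

            @Override public Page firstPage() { return serve(); }
            @Override public Page nextPage(String scrollId) { return serve(); }
        };
    }

    public static void main(String[] args) {
        PageFetcher fetcher = inMemory(Arrays.asList(
                Arrays.asList("doc1", "doc2"),
                Arrays.asList("doc3")));
        System.out.println(collectAll(fetcher)); // prints [doc1, doc2, doc3]
    }
}
```

In a real implementation the hits would be the parsed entries of hits.hits from each response, and the scroll context should be cleared once the loop finishes.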

Similar links

Elastic Search Scroll Behaviour

Documentation

Parallel Scan & Scroll an Elasticsearch Index