Why is Azure Search taking 1400 milliseconds to re

Posted 2019-06-14 19:16

Question:

I have an index in Azure Search whose documents look like the following JSON:

        "id": "1847234520751",
        "orderNo": "1847234520751",
        "orderType": "ONLINE",
        "orderState": "OPROCESSING",
        "orderDate": "2018-10-02T18:28:07Z",
        "lastModified": "2018-11-01T19:13:46Z",
        "docType": "SALES_ORDER",
        "paymentType": "PREPAID",
        "buyerInfo_primaryContact_name_firstName": "",
        "buyerInfo_primaryContact_name_lastName": "",
        "buyerInfo_primaryContact_email_emailAddress": "test@gmail.com"

I have indexed almost 0.8 million documents and have written the following Java code to query Azure Search:

        // Restrict the search to the orderNo field.
        IndexSearchOptions options = new IndexSearchOptions();
        options.setSearchFields("orderNo");
        // Client-side timing: this includes the network round trip.
        // (`filter` and `indexClient` are defined elsewhere.)
        long startTime1 = System.currentTimeMillis();
        IndexSearchResult result = indexClient.search(filter, options);
        long stopTime1 = System.currentTimeMillis();
        long elapsedTime1 = stopTime1 - startTime1;
        System.out.println("elapsed time " + elapsedTime1);

The query takes about 1400 milliseconds. If anyone can help me reduce this time, it would be really helpful.

Answer 1:

If you are simply trying to return a document based on an orderNo, rather than doing a full-text search, I would recommend using the "Lookup" API, which retrieves a single document by its key:

https://docs.microsoft.com/en-us/rest/api/searchservice/lookup-document
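
As a rough sketch of what a Lookup call could look like from Java over plain REST: this assumes Java 11+ (java.net.http) and that "id" is the key field of the index, which in your sample document holds the same value as orderNo. The class name OrderLookup, the service and index names, and the api-version value are placeholders I made up for illustration, so substitute your own.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any SDK.
final class OrderLookup {

    // Builds the Lookup Document URL:
    // https://{service}.search.windows.net/indexes/{index}/docs/{key}?api-version={version}
    static URI lookupUri(String service, String index, String key, String apiVersion) {
        // URLEncoder targets form encoding, so turn '+' back into '%20' for a path segment.
        String encodedKey = URLEncoder.encode(key, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create("https://" + service + ".search.windows.net/indexes/" + index
                + "/docs/" + encodedKey + "?api-version=" + apiVersion);
    }

    // Sends the GET request and returns the raw JSON body of the response.
    static String fetch(HttpClient client, URI uri, String apiKey) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("api-key", apiKey)            // query or admin key of the service
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

For example, OrderLookup.lookupUri("myservice", "orders", "1847234520751", "2019-05-06") builds the URL for the document shown in the question, and fetch(...) would then return that document as JSON.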

Also, using a client-side timer to calculate elapsed time will not give you accurate results. The measured time is affected by many factors, including your client machine configuration and your network performance. If you are interested in how long the server took to process your request, I would suggest experimenting with the REST API and inspecting the "elapsed-time" value in the response headers of your search query. This is more useful for monitoring search performance because it omits any time spent on the network. If you do so, I would suggest running multiple queries and taking the average elapsed time as your metric.
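
To make the averaging concrete, here is a minimal sketch, again assuming Java 11+ and that the "elapsed-time" header carries a whole number of milliseconds. ElapsedTimeStats is a hypothetical helper name, not an SDK class.

```java
import java.net.http.HttpResponse;
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical helper for collecting server-side timings.
final class ElapsedTimeStats {

    // Reads the "elapsed-time" response header; returns -1 if the header is absent.
    static long serverMillis(HttpResponse<?> response) {
        return response.headers().firstValue("elapsed-time")
                .map(String::trim)
                .map(Long::parseLong)
                .orElse(-1L);
    }

    // Averages the collected samples, ignoring responses where the header was missing (-1).
    static OptionalDouble average(List<Long> samples) {
        return samples.stream()
                .filter(ms -> ms >= 0)
                .mapToLong(Long::longValue)
                .average();
    }
}
```

You would call serverMillis(...) on each response, collect the values in a list, and compare average(...) against your client-side timings to see how much of the 1400 milliseconds is spent outside the service.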

If you see that the elapsed time is low but the search query is still relatively slow due to network performance, make sure to re-use the Search Client object between calls rather than creating a new one for each call. Creating a new client every time is a common reason queries do not get optimal latency.
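
A minimal sketch of the re-use pattern is below. It uses java.net.http.HttpClient (Java 11+) as a stand-in for whichever search client you actually use, and SharedSearchClient is a hypothetical name. The idea is the same either way: build the client once and share it, so that connections can be kept alive between queries.

```java
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical holder: the client is created once and shared by every caller.
final class SharedSearchClient {

    // HttpClient is immutable once built and can be used to send many requests.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))
            .build();

    private SharedSearchClient() {
    }

    // Always returns the same instance instead of constructing a new client per query.
    static HttpClient get() {
        return CLIENT;
    }
}
```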

Finally, here's a full article about tuning performance for your Azure Search service.

https://docs.microsoft.com/en-us/azure/search/search-performance-optimization

In your case, it seems like you are trying to speed up single-query performance rather than increase how many queries can be handled at once. If your query were particularly complex (e.g. returning a lot of documents while using sorting and faceting), increasing the number of partitions could help: your 0.8 million documents would be spread across multiple machines, allowing each of them to execute the search over a smaller set of documents in parallel rather than relying on a single machine to process the full load. However, your query looks relatively simple, so my suggestion is the one I mentioned above: collect accurate metrics first to understand whether the bottleneck is in the processing of the request or is network related.

Hope this helps