Quotas on the App Engine Search API for Java

Posted 2019-02-12 21:48

Question:

I am testing the new App Engine Search API for Java, and I have the following code that tries to add ~3000 documents to an index:

    List<Document> documents = new ArrayList<Document>();
    for (FacebookAlbum album : user.listAllAlbums()) {
        Document doc = Document.newBuilder()
                .setId(album.getId())
                .addField(Field.newBuilder().setName("name").setText(album.getFullName()))
                .addField(Field.newBuilder().setName("albumId").setText(album.getAlbumId()))
                .addField(Field.newBuilder().setName("createdTime").setDate(Field.date(album.getCreatedTime())))
                .addField(Field.newBuilder().setName("updatedTime").setDate(Field.date(album.getUpdatedTime())))
                .build();
        documents.add(doc);
    }

    try {
        // Add all the documents.
        getIndex(facebookId).add(documents);
    } catch (AddException e) {
        if (StatusCode.TRANSIENT_ERROR.equals(e.getOperationResult().getCode())) {
            // retry adding document
        }
    }

However, I am getting the following exception:

Uncaught exception from servlet
java.lang.IllegalArgumentException: number of documents, 3433, exceeds maximum 200
at com.google.appengine.api.search.IndexImpl.addAsync(IndexImpl.java:196)
at com.google.appengine.api.search.IndexImpl.add(IndexImpl.java:380)
at photomemories.buildIndexServlet.doGet(buildIndexServlet.java:47)

Is the number of documents I can insert with a single add call limited to 200?

If I instead insert one document at a time into the index with the following code:

    for (FacebookAlbum album : user.listAllAlbums()) {
        Document doc = Document.newBuilder()
                .setId(album.getId())
                .addField(Field.newBuilder().setName("name").setText(album.getFullName()))
                .addField(Field.newBuilder().setName("albumId").setText(album.getAlbumId()))
                .addField(Field.newBuilder().setName("createdTime").setDate(Field.date(album.getCreatedTime())))
                .addField(Field.newBuilder().setName("updatedTime").setDate(Field.date(album.getUpdatedTime())))
                .build();

        try {
            // Add the document.
            getIndex(facebookId).add(doc);
        } catch (AddException e) {
            if (StatusCode.TRANSIENT_ERROR.equals(e.getOperationResult().getCode())) {
                // retry adding document
            }
        }
    }

I am getting the following exception:

com.google.apphosting.api.ApiProxy$OverQuotaException: The API call search.IndexDocument() required more quota than is available.
at com.google.apphosting.runtime.ApiProxyImpl$AsyncApiFuture.success(ApiProxyImpl.java:479)
at com.google.apphosting.runtime.ApiProxyImpl$AsyncApiFuture.success(ApiProxyImpl.java:382)
at com.google.net.rpc3.client.RpcStub$RpcCallbackDispatcher$1.runInContext(RpcStub.java:786)
at com.google.tracing.TraceContext$TraceContextRunnable$1.run(TraceContext.java:455)

I thought the quota on API calls was 20k/day (see https://developers.google.com/appengine/docs/java/search/overview#Quotas).

Any ideas on what is going on?

Answer 1:

There are a few things going on here. Most importantly, and this is something that will be clarified in the documentation very soon, the Search API Call quota also accounts for the number of documents being added/updated. So a single Add call that inserts 10 documents will reduce your daily Search API Call quota by 10.

Yes, the maximum number of documents that may be indexed in a single add call is 200. However, at this stage there is also a short-term burst quota in place that limits you to about 100 API calls per minute.

All of the above means that, for now at least, it's safest not to add more than 100 documents per Add request. Doing so via Task Queue, as recommended by Shay, is also a very good idea.
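
As a minimal sketch of the batching suggested here, the document list can be split into chunks of at most 100 before any add call is made. The class and method names below (BatchSplitter, partition) are hypothetical and not part of the App Engine SDK; the helper is plain Java so it can be tried without the SDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the App Engine SDK) that splits a list into bounded batches. */
class BatchSplitter {

    /** Returns consecutive sub-lists of items, each holding at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

With the question's code, each element of partition(documents, 100) would presumably be passed to getIndex(facebookId).add(batch) in place of the single call with the full list. For 3433 documents that gives 35 batches, the last holding 33 documents.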



Answer 2:

I think (though I can't find confirmation of it) that there is a per-minute quota limit. You should index your documents using a queue so that they are indexed gradually.
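
One way to picture "gradual" indexing is to give every batch a start delay so that the average rate stays under a chosen documents-per-minute budget. The sketch below only computes those delays; the names (IndexingSchedule, delaysMillis) are hypothetical, and the rate passed in is an assumption the caller has to supply, since the exact per-minute limit is not confirmed in this thread.

```java
/** Hypothetical pacing helper; the rate limit is supplied by the caller, not read from any API. */
class IndexingSchedule {

    /**
     * Returns, for each batch, the delay in milliseconds before it should be submitted
     * so that on average no more than docsPerMinute documents are sent per minute.
     */
    static long[] delaysMillis(int totalDocs, int batchSize, int docsPerMinute) {
        if (totalDocs < 0 || batchSize <= 0 || docsPerMinute <= 0) {
            throw new IllegalArgumentException("totalDocs must be >= 0; batchSize and docsPerMinute must be > 0");
        }
        int batchCount = (totalDocs + batchSize - 1) / batchSize; // ceiling division
        long[] delays = new long[batchCount];
        for (int k = 0; k < batchCount; k++) {
            // Batch k waits until the k batches before it have used up their share of the budget.
            delays[k] = (long) k * batchSize * 60_000L / docsPerMinute;
        }
        return delays;
    }
}
```

On App Engine, each delay could then serve as the countdown of a Task Queue task that indexes one batch (the SDK's TaskOptions has a countdownMillis setting that looks suited to this, but check the current documentation before relying on it). For 3433 documents in batches of 100 at 100 documents per minute, the batches are spaced one minute apart.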



Answer 3:

The docs also mention a per-minute quota. Spread evenly over a day (1,440 minutes), 20k calls is only about 13.9 per minute.

https://developers.google.com/appengine/docs/quotas