How to insert into documentDB from Excel file cont

2019-05-17 09:52发布

问题:

I have an Excel file that originally had about 200 rows, and I was able to convert the excel file to a data table and everything got inserted into the documentdb correctly.

The Excel file now has 5000 rows and it is not inserting after 30-40 records insertion and rest of all the rows are not inserted into the documentdb

I found some exception as below.

Microsoft.Azure.Documents.DocumentClientException: Exception: Microsoft.Azure.Documents.RequestRateTooLargeException, message: {"Errors":["Request rate is large"]}

My code is :

    Service service = new Service();
    foreach(data in exceldata) //exceldata contains set of rows
    {
    var student = new Student();
    student.id= "";
    student.name = data.name;
    student.age = data.age;
    student.class = data.class;
    student.id = service.savetoDocumentDB(collectionLink,student); //collectionlink is a string stored in web.config
    students.add(student);
    }

Class Service
{
 public async Task<string> AddDocument(string collectionLink, Student data)
        {
            this.DeserializePayload(data);
            var result = await Client.CreateDocumentAsync(collectionLink, data);
            return result.Resource.Id;
        }
}

Am I doing anything wrong? Any help would be greatly appreciable.

回答1:

Update:

As of 4/8/15, DocumentDB has released a data import tool, which supports JSON files, MongoDB, SQL Server, and CSV files. You can find it here: http://www.microsoft.com/en-us/download/details.aspx?id=46436

In this case, you can save your Excel file as a CSV and then bulk-import records using the data import tool.

Original Answer:

DocumentDB Collections are provisioned 2,000 request-units per second. It's important to note - the limits are expressed in terms of request-units and not requests; so writing larger documents costs more than smaller documents, and scanning is more expensive than index seeks.

You can measure the overhead of any operations (CRUD) by inspecting the x-ms-request-charge HTTP response header or the RequestCharge property in the ResourceResponse/FeedResponse objects returned by the SDK.

A RequestRateTooLargeException is thrown when you exhaust the provisioned throughput. Some solutions include:

  • Back off w/ a short delay and retry whenever you encounter the exception. A recommended retry delay is included in the x-ms-retry-after-ms HTTP response header. Alternatively, you could simply batch requests with a short delay
  • Use lazy indexing for faster ingestion rate. DocumentDB allows you to specify indexing policies at the collection level. By default, the index is updated synchronously on each write to the collection. This enables the queries to honor the same consistency level as that of the document reads without any delay for the index to “catch up”. Lazy indexing can be used to amortize the work required to index content over a longer period of time. It is important to note, however, that when lazy indexing is enabled, query results will be eventually consistent regardless of the consistency level configured for the DocumentDB account.
  • As mentioned, each collection has a limit of 2,000 RUs - you can increase throughput by sharding / partitioning your data across multiple collections and capacity units.
  • Delete empty collections to utilize all provisioned throughput - every document collection created in a DocumentDB account is allocated reserved throughput capacity based on the number of Capacity Units (CUs) provisioned, and the number of collections created. A single CU makes available 2,000 request units (RUs) and supports up to 3 collections. If only one collection is created for the CU, the entire CU throughput will be available for the collection. Once a second collection is created, the throughput of the first collection will be halved and given to the second collection, and so on. To maximize throughput available per collection, I'd recommend the number of capacity units to collections is 1:1.

References:

  • DocumentDB Performance Tips: http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
  • DocumentDB Limits: http://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/