Efficiency of searching using whereArrayContains

Posted 2019-01-15 20:07

I am curious about the efficiency of searching for documents in a collection using the code below. As the number of documents in the collection and the number of items in the array grow, will this search become very inefficient? Is there a better way of doing this, or a schema change I can make to the database to optimize it? Is there somewhere, perhaps in the Firestore documentation, where I can find the time complexity of these functions?

Query query = db.collection("groups").whereArrayContains("members", userid);


ALTERNATIVE SOLUTION

I originally wanted to store the group ids under the user, so as to grab only the groups for the current user, but I ran into issues and never found a way to set up a FirestoreRecyclerOptions that queries by multiple ids.

Example:

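// Builds one FirestoreRecyclerOptions per group id; each iteration replaces the previous one.
for (String groupid : list) {
    // document(groupid) returns a DocumentReference, not a Query, so filter on the document ID instead.
    Query query = db.collection("test-groups").whereEqualTo(FieldPath.documentId(), groupid);

    FirestoreRecyclerOptions<GroupResponse> response = new FirestoreRecyclerOptions.Builder<GroupResponse>()
            .setQuery(query, GroupResponse.class)
            .build();
}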

Is there a way to add multiple queries to the FirestoreRecyclerOptions?

1 answer
The star
Reply #2 · 2019-01-15 20:26

As the number of documents in the collection grows and the number of items in the array grows will this search become very inefficient?

The problem isn't that the search will become very inefficient; the problem is that documents have limits on how much data they can hold. According to the official documentation on usage and limits:

Maximum size for a document: 1 MiB (1,048,576 bytes)

As you can see, a single document is limited to 1 MiB of data in total. For plain text that is quite a lot, but as your array gets bigger, keep this limit in mind.
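To get a feel for how many member ids fit under that limit, here is a rough back-of-the-envelope sketch in plain Java. The sizing rules are assumptions taken from my reading of Firestore's storage size documentation: a string costs its UTF-8 byte length plus 1, an array costs the sum of its elements, and each document carries 32 bytes of overhead. The class name, the helper names and the 28-character id length are hypothetical and only there for illustration. The size of the document name and of any other fields is passed in as `otherBytes`.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Rough size estimate for a document holding one array of string ids. */
class DocumentSizeEstimate {
    static final long MAX_DOCUMENT_BYTES = 1_048_576L; // the 1 MiB limit quoted above
    static final long DOCUMENT_OVERHEAD_BYTES = 32L;   // assumed fixed per-document overhead

    /** Assumed rule: a string costs its UTF-8 byte length plus 1. */
    static long stringBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + 1L;
    }

    /** Field name plus the sum of the element sizes. */
    static long arrayFieldBytes(String fieldName, List<String> values) {
        long total = stringBytes(fieldName);
        for (String value : values) {
            total += stringBytes(value);
        }
        return total;
    }

    /** How many ids of idLength bytes fit once otherBytes are already used. */
    static long maxIds(String fieldName, int idLength, long otherBytes) {
        long available = MAX_DOCUMENT_BYTES - DOCUMENT_OVERHEAD_BYTES
                - otherBytes - stringBytes(fieldName);
        return Math.max(0L, available / (idLength + 1L));
    }

    public static void main(String[] args) {
        // Assumes 28-byte ASCII ids and nothing else stored in the document.
        System.out.println(maxIds("members", 28, 0)); // prints 36156
    }
}
```

Under these assumptions a group document could hold a few tens of thousands of ids before hitting the limit, so the limit only matters for very large groups.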

If you are storing a large amount of data in arrays and those arrays are updated by lots of users, there is another limitation to take care of: sustained writes to a single document are limited to about 1 per second. So if a lot of users are all trying to write or update data in the same document at once, you might start to see some of these writes fail. Be careful about this limitation too.

As you have probably noticed, queries in Cloud Firestore are very fast, and this is because Firestore automatically creates an index for every field in your documents.
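To see why an index keeps a "contains" lookup cheap as the collection grows, here is a toy inverted index in plain Java. It is only a conceptual model and not how Firestore is implemented internally; the class and method names are hypothetical. The idea is that each write adds one index entry per array element, and a lookup reads only the entries for the requested member instead of scanning every group.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index: member id -> ids of groups whose "members" array contains it. */
class MembershipIndex {
    private final Map<String, Set<String>> groupsByMember = new HashMap<>();

    /** Called on every write: adds one index entry per array element. */
    void putGroup(String groupId, List<String> members) {
        for (String member : members) {
            groupsByMember.computeIfAbsent(member, k -> new TreeSet<>()).add(groupId);
        }
    }

    /** Reads only the entries for this member; other groups are never visited. */
    Set<String> groupsContaining(String member) {
        return groupsByMember.getOrDefault(member, Collections.emptySet());
    }

    public static void main(String[] args) {
        MembershipIndex index = new MembershipIndex();
        index.putGroup("g1", List.of("alice", "bob"));
        index.putGroup("g2", List.of("bob"));
        System.out.println(index.groupsContaining("bob")); // prints [g1, g2]
    }
}
```

The sketch leaves out removing stale entries when a group's member list changes, which a real index has to handle. The trade-off it illustrates is that the cost moves to write time: larger arrays mean more index entries per write, while reads stay proportional to the number of matching groups.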

If you think that you'll be querying for a parent based on whether it contains a specific member of a collection, then use maps and not arrays.

There are many posts out there that say arrays don't work well on Cloud Firestore because, when data can be altered by multiple clients, it is very easy to get confused: you cannot know what is happening and on which field. With a map, if users want to edit several different fields, or even the exact same field, we generally know what is happening. With arrays, things are different. Think about what might happen if one user wants to edit the value at index 0 while another user wants to delete the value at index 0: you'll end up with very different results, and possibly array out of bounds exceptions. So Firestore handles arrays a little differently: you cannot insert, update or delete at a specific index. But if you don't care about the exact order in which elements are stored, then arrays are fine. Firestore recently added features to add or remove specific elements, but only if you don't care about their exact position. See the official documentation.
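The difference between index-based and value-based edits can be shown with a small plain-Java simulation. The add and remove features mentioned above are presumably `FieldValue.arrayUnion` and `FieldValue.arrayRemove`; the helpers `addIfAbsent` and `removeAll` below are hypothetical stand-ins that mimic their set-like behavior on a local list and are not Firestore calls. Each method applies the same two client edits in both possible orders.

```java
import java.util.ArrayList;
import java.util.List;

/** Compares index-based edits with value-based (set-like) edits on an array. */
class ArrayEditOrder {
    /** Value-based add: append only if the element is not already present. */
    static void addIfAbsent(List<String> list, String value) {
        if (!list.contains(value)) {
            list.add(value);
        }
    }

    /** Value-based remove: drop every occurrence of the element. */
    static void removeAll(List<String> list, String value) {
        list.removeIf(value::equals);
    }

    /** Index-based: client A overwrites index 0, client B deletes index 0. */
    static List<String> indexEdits(boolean aFirst) {
        List<String> list = new ArrayList<>(List.of("u1", "u2"));
        if (aFirst) {
            list.set(0, "u9");
            list.remove(0);
        } else {
            list.remove(0);
            list.set(0, "u9");
        }
        return list;
    }

    /** Value-based: client A adds "u9", client B removes "u1". */
    static List<String> valueEdits(boolean aFirst) {
        List<String> list = new ArrayList<>(List.of("u1", "u2"));
        if (aFirst) {
            addIfAbsent(list, "u9");
            removeAll(list, "u1");
        } else {
            removeAll(list, "u1");
            addIfAbsent(list, "u9");
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(indexEdits(true) + " vs " + indexEdits(false)); // prints [u2] vs [u9]
        System.out.println(valueEdits(true) + " vs " + valueEdits(false)); // prints [u2, u9] vs [u2, u9]
    }
}
```

The index-based edits give a different array depending on which client gets there first, while the value-based edits converge on the same result in either order. That is why position-independent operations are the safer fit for a member list.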

In conclusion, put data in the same document only if you need to display it together. Also, don't make documents so big that you end up downloading more data than you actually need. Put data in a collection when you want to search on individual fields of that data or when you want your data to have room to grow. Leave your data as a map field if you want to search for the parent object based on that data. And if you have items that you generally use as flags, go ahead with arrays.

Also, don't worry about slow queries in Firestore: query performance depends on the size of the result set, not on the size of the collection.
