Firebase Database Bandwidth Calculation

Published 2019-03-11 16:37

Question:

I published an Android app called MyPetrol two weeks ago, and within three days it reached roughly 90k users in Malaysia. After that, I took the app down because of its huge Firebase Database bandwidth consumption (117GB over the 3 days). I'm a self-taught hobbyist who does not come from an IT-related background, so I'm really troubled by this. I hope that someone can help.

The app is a crowd-sourcing app for petrol prices. A user can enter the price of petrol at a particular station, and other users who agree with the stated price can "like" it. Once a price is updated, the "like" count is reset.

Upon opening, the app queries Google Places API Web Services for nearby petrol stations (max 20 stations). It then attaches listeners to those stations' data in Firebase Database. The data structure for each station looks like this.

prices{
    ChIJpXJ4phI4zDERJqFTBzawpXk={
        placeID='ChIJpXJ4phI4zDERJqFTBzawpXk',
        company=500,
        lat=3.2095573,
        lng=101.7185698,
        name='Shell Malaysia (Iznora Enterprise)',
        firebaseID='xxx',
        userName='xxx',
        time=1491833181946,
        ron95=2.0,
        ron97=1.7,
        diesel=2.0,
        isValid=true
    }
}

To keep track of the likes, there is a separate section of data:

likes{
    ChIJpXJ4phI4zDERJqFTBzawpXk={
        firebaseID1=true,
        firebaseID2=true,
        firebaseID3=true
    }
}

I read in "Firebase database bandwidth usage when read with Query" that we can use Firebase.getDefaultConfig().setLogLevel(Level.DEBUG) to check the traffic, but I didn't see anything about upload or download bandwidth in logcat, only something like this:

04-10 22:39:19.250 3015-3192/? D/RepoOperation: onDataUpdate: /prices/ChIJpXJ4phI4zDERJqFTBzawpXk
04-10 22:39:19.250 3015-3192/? D/RepoOperation: onDataUpdate: /prices/ChIJpXJ4phI4zDERJqFTBzawpXk {time=1491833181946, firebaseID=xxx, valid=true, diesel=2, ron97=1.7000000476837158, ron95=2, placeID=ChIJpXJ4phI4zDERJqFTBzawpXk, name=Shell Malaysia (Iznora Enterprise), userName=xxx, company=500, lat=3.2095573, lng=101.7185698}
04-10 22:39:19.268 3015-3015/? D/EventRaiser: Raising /prices/ChIJpXJ4phI4zDERJqFTBzawpXk: VALUE: {time=1491833181946, firebaseID=xxx, valid=true, diesel=2, ron97=1.7000000476837158, ron95=2, placeID=ChIJpXJ4phI4zDERJqFTBzawpXk, name=Shell Malaysia (Iznora Enterprise), userName=xxx, company=500, lat=3.2095573, lng=101.7185698}
04-10 22:39:19.273 3015-3015/? D/EventRaiser: Raising /likes/ChIJpXJ4phI4zDERJqFTBzawpXk: VALUE: null

In the end, I used Android Device Monitor to check the traffic for each action. Below are the average results for Firebase only. Google Maps and other HTTP queries are not included.

+----------------------------+-----------+-----------+-----------------------------------------+
|           Action           | Rx(bytes) | Tx(bytes) |                  Notes                  |
+----------------------------+-----------+-----------+-----------------------------------------+
| onPause                    |      2942 |      4680 | detach all listeners                    |
| onResume                   |     10143 |      5204 | reattach all listeners for 15 stations  |
| click "like" by self       |       620 |       535 | write action + download /likes/placeID  |
| update price by self       |      1642 |      1783 | write action + download /prices/placeID |
| click "like" by other user |       382 |       112 | download /likes/placeID                 |
| update price by other user |       423 |       104 | download /prices/placeID                |
+----------------------------+-----------+-----------+-----------------------------------------+

Right before I took the app down, I had 5.7MB of data in the database. I can guarantee that I attached each listener directly to a single station at /prices/placeID, so I did not retrieve the whole "Station" data, only the data for that specific station. The same goes for the likes. The listeners are also detached in onPause.

I do not have any log of user actions, so it is difficult for me to trace back what happened. However, the Google Places API is queried every time a user opens the app, so I know that I had 245k queries during the 3 days. Thus, for each user session:

117GB / 245k session = ~480kB/session
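The same division can be sketched in Java as a sanity check. This is only a minimal illustration: it assumes decimal units (1 GB = 10^9 bytes) and that each Places API query corresponds to exactly one session; the class and method names are made up for this example.

```java
final class SessionBandwidth {
    /** Average number of bytes downloaded per session. */
    static double bytesPerSession(double totalBytes, long sessions) {
        return totalBytes / sessions;
    }

    public static void main(String[] args) {
        double totalBytes = 117e9; // 117 GB, assuming decimal gigabytes
        long sessions = 245_000;   // Places API queries, assumed one per app open
        double perSession = bytesPerSession(totalBytes, sessions);
        // Convert to kB (1 kB = 1000 bytes) and round for display.
        System.out.println(Math.round(perSession / 1000) + " kB per session");
    }
}
```

Under these assumptions the result is about 478 kB per session, which matches the rounded ~480kB figure.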

That seems huge. I have zero experience with bandwidth, so I might be wrong. But even if I assume that every user performed the extremely unlikely set of actions below, I still can't account for that much bandwidth.

+----------------------------+-----------+-------+--------------+----------------------------------------------------------+
|           Action           | Rx(bytes) | Times | Total(bytes) |                          Notes                           |
+----------------------------+-----------+-------+--------------+----------------------------------------------------------+
| onPause                    |      2942 |    10 |        29420 | Pause and resume 10 times, this does not update the map. |
| onResume                   |     10143 |    10 |       101430 |                                                          |
| click "like" by self       |       620 |    15 |         9300 | Click like on all 15 stations                            |
| update price by self       |      1642 |    15 |        24630 | Update price on all 15 stations                          |
| click "like" by other user |       382 |   100 |        38200 | 100 other users clicked per session                      |
| update price by other user |       423 |   100 |        42300 | 100 other users updated the price per session            |
| Total                      |           |       |       245280 |                                                          |
+----------------------------+-----------+-------+--------------+----------------------------------------------------------+
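The total in the table can be reproduced with a short Java sketch. The byte counts and repetition counts are copied from the table above; the class and method names are hypothetical and exist only for this illustration.

```java
final class WorstCaseSession {
    // Average received bytes per action, in table order:
    // onPause, onResume, like (self), update (self), like (other), update (other).
    static final int[] RX_BYTES = {2942, 10143, 620, 1642, 382, 423};
    // Assumed number of times each action happens in one session.
    static final int[] TIMES = {10, 10, 15, 15, 100, 100};

    /** Sums rx * times over all actions. */
    static long totalBytes(int[] rxBytes, int[] times) {
        long total = 0;
        for (int i = 0; i < rxBytes.length; i++) {
            total += (long) rxBytes[i] * times[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalBytes(RX_BYTES, TIMES) + " bytes"); // 245280 bytes
    }
}
```

Even this exaggerated scenario comes to about 245 kB, roughly half of the ~480kB per session computed earlier.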

For a normal user, I'd expect around 50kB per session at most, so Firebase seems to be consuming roughly 10 times that amount of bandwidth. So my questions:

  1. Did I do the calculation right for bandwidth per session?
  2. Is the traffic determined using Android Device Monitor correct? Am I missing something? Is there a better way to check?
  3. How does Firebase calculate the bandwidth? Does it include uploads as well? Is there any hidden bandwidth?

Sorry for the long post. Appreciate if someone can help. Thank you.

Answer 1:

So, back to the high bandwidth consumption. It has to do with a Firebase Database Query. I have the query below, which runs when a new user signs up.

Query priceQuery = getPricesRef().orderByChild("firebaseID").equalTo(myFirebaseID);

This was the beginning of my nightmare because I did not set .indexOn for that database. Read the official Firebase documentation here. The documentation on Query is very poor. It only states that performance will be poor without an index, but never mentions bandwidth:

Based on the statement above, my assumption was that without .indexOn a query would simply take longer to return, since the Firebase server would need more time to produce the search result.

However, as answered in "Firebase database bandwidth usage when read with Query", all the data at the queried location (here, the whole prices node) is downloaded from Firebase first and then sorted and filtered on the client side! I guess that's what they meant by "realtime client libraries can execute ad-hoc queries without specifying indexes".

As my database grew, each user sign-up consumed more and more bandwidth by downloading the whole prices database. Towards the end, it was consuming ~800kB per user just for signing up. So make sure to always use .indexOn!
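For illustration, an index for the query above would be declared in the Realtime Database rules roughly as follows. This is a fragment only, assuming the data lives under a top-level prices node as in the question; it would need to be merged with the project's existing .read and .write rules.

```json
{
  "rules": {
    "prices": {
      ".indexOn": ["firebaseID"]
    }
  }
}
```

With such an index in place, the orderByChild("firebaseID").equalTo(...) query can be evaluated on the server, so only the matching children should be sent to the client.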



Answer 2:

I found the bug. It is completely unrelated to the above question, but I'm documenting it here to help other users. So first, to answer my own questions.

Question (1) - Yes. That estimate of 480kB per session should be quite accurate.

Questions (2) and (3) - The bandwidth billed by Firebase should be lower than that recorded by Android Device Monitor. I contacted the support team and got the following reply.

For your bandwidth questions, the GB downloaded only measures the amount of data being sent by the Firebase database to your client app. This means data retrieval to your application. You might want to check the following links for more information:

  • https://groups.google.com/forum/#!topic/firebase-talk/T1sADDJYuIc
  • https://groups.google.com/forum/#!topic/firebase-talk/cM9N476RDfE

Hence, deducting some overhead, the actual bandwidth should be slightly lower.