Should one average Firebase Active User Metrics (D

Posted 2020-06-22 05:18

Question:

I am trying to understand whether it is better to report month-over-month on the current Firebase "Active" User metrics report (view graph below), or instead to calculate and report the average of each metric's values over a specific period myself.

At first glance the dashboard appears to show the 1-day, 7-day, and 28-day active users for the whole month of December 2018, but the values shown (on the right) are in fact only those for the last day of the selected date range. This is good to know, but comparing only the last day's values is somewhat misleading for my month-over-month analysis. An alternative approach would be to calculate the average over the selected period myself:

Applied to the Firebase Demo data set, I got the numbers below:

Firebase Dashboard:

  • 28-day Active users: 8661
  • 7-day Active users: 3874
  • 1-day Active users: 1111

My Calculated Average:

  • 28-day Active users: 8762
  • 7-day Active users: 3663
  • 1-day Active users: 1112

The difference is small here, but I am seeing significant differences on our application, which has millions of active users per month.
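To make the gap between the two approaches concrete, here is a minimal Python sketch using invented daily values (they are not taken from the Firebase demo data set). It compares the last-day value, which is what the dashboard card shows according to the description above, with the mean over the whole range. The names `daily_values`, `last_day_snapshot`, and `period_average` are hypothetical.

```python
from statistics import mean

# Hypothetical 1-day active user counts for seven consecutive days.
# These numbers are invented purely for illustration.
daily_values = [1000, 1040, 980, 1010, 1100, 1150, 700]


def last_day_snapshot(values):
    """Value on the final day of the range (what the dashboard card reports)."""
    return values[-1]


def period_average(values):
    """Arithmetic mean over every day in the range, rounded to whole users."""
    return round(mean(values))


print("last day:", last_day_snapshot(daily_values))
print("average: ", period_average(daily_values))
```

With these made-up values a single quiet final day pulls the snapshot well below the period average, which is the kind of distortion described above.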

Questions:

  • If you are using Firebase currently, how do you report on it?
  • Do you copy and paste the last day of the selected period and report on that for a month, or do you also average each of the 1/7/28-day metrics to get a better representation of the month?
  • If you average your metrics, could you please explain why?

Answer 1:

To answer my own question, I would first like to revisit the definitions and then walk through the calculations.

Based on the supporting Firebase documents, I have summarized the definition of each metric below. It is important to note that only unique users should be counted for each metric (given the selected date range).

  • 1-day active users: A 1-day unique active user has engaged with an app in the device foreground AND has logged a user_engagement event within the last 1-day period (given selected date range).
  • 7-day active users: A 7-day unique active user has engaged with an app in the device foreground AND has logged a user_engagement event within the last 7-day period (given selected date range).
  • 28-day active users: A 28-day unique active user has engaged with an app in the device foreground AND has logged a user_engagement event within the last 28-day period (given selected date range).
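The three definitions differ only in the length of the trailing window. A minimal Python sketch, assuming a toy in-memory event log rather than the real BigQuery export, could look like this. The `events` data and the `active_users` helper are hypothetical, and the window is assumed to include the end day itself.

```python
from datetime import date, timedelta

# Toy event log: (event_date, user_pseudo_id, event_name). Invented data.
events = [
    (date(2018, 12, 31), "u1", "user_engagement"),
    (date(2018, 12, 31), "u1", "user_engagement"),  # same user twice, counted once
    (date(2018, 12, 31), "u4", "first_open"),       # not an engagement event
    (date(2018, 12, 28), "u2", "user_engagement"),
    (date(2018, 12, 10), "u3", "user_engagement"),
    (date(2018, 11, 30), "u5", "user_engagement"),  # outside the 28-day window
]


def active_users(events, end_day, window_days):
    """Count unique users with a user_engagement event in the
    window_days-long window that ends on end_day (inclusive)."""
    start_day = end_day - timedelta(days=window_days - 1)
    users = {
        user
        for day, user, name in events
        if name == "user_engagement" and start_day <= day <= end_day
    }
    return len(users)


end = date(2018, 12, 31)
for n in (1, 7, 28):
    print(f"{n}-day active users: {active_users(events, end, n)}")
```

Using a set makes the "unique users" requirement explicit: a user who logs several events inside the window is still counted once.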

In the cells below you can see how the metrics are calculated for December:

Methodology to Calculate Each Metric / Audience:

  • Calculate DAUs for a specific month by using: Average 1-day active user metric.
  • Calculate WAUs for a specific month by using: Average 7-day active user metric. I calculated this by averaging the snapshots at 7, 14, 21, 28 December.
  • Calculate MAUs for a specific month by using: Non-averaged 28-day active user metric. The main reason for not averaging this metric's value is that I want only one snapshot of the entire month. If I had used averages here, I would also be counting users who were active in a previous month.
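The three rules above can be sketched in Python on toy data. This is an illustration under stated assumptions, not the queries themselves: `active_by_day` (a mapping from day to the set of engaged users), `unique_users`, and `monthly_metrics` are hypothetical names, and the four 7-day snapshots are counted back from the end date in 7-day steps, which is how the 7-day query in this answer selects its windows.

```python
from datetime import date, timedelta
from statistics import mean


def unique_users(active_by_day, end_day, window_days):
    """Unique users seen in the window_days days ending on end_day (inclusive)."""
    users = set()
    for offset in range(window_days):
        users |= active_by_day.get(end_day - timedelta(days=offset), set())
    return len(users)


def monthly_metrics(active_by_day, start_day, end_day):
    """Return (DAU, WAU, MAU) for the period start_day..end_day."""
    n_days = (end_day - start_day).days + 1
    all_days = [start_day + timedelta(days=i) for i in range(n_days)]
    # DAU: average of the 1-day metric over every day of the month.
    dau = mean(unique_users(active_by_day, d, 1) for d in all_days)
    # WAU: average of four back-to-back 7-day snapshots ending on end_day.
    wau = mean(
        unique_users(active_by_day, end_day - timedelta(days=7 * k), 7)
        for k in range(4)
    )
    # MAU: one non-averaged 28-day snapshot ending on end_day.
    mau = unique_users(active_by_day, end_day, 28)
    return round(dau), round(wau), mau


# Invented data: "a" is active every day, "b" on three days, "c" only on 1 December.
december = [date(2018, 12, d) for d in range(1, 32)]
active_by_day = {d: {"a"} for d in december}
for d in (5, 20, 27):
    active_by_day[date(2018, 12, d)].add("b")
active_by_day[date(2018, 12, 1)].add("c")

print(monthly_metrics(active_by_day, date(2018, 12, 1), date(2018, 12, 31)))
```

In this toy data user "c" is missed by the MAU figure because 1 December falls outside the 28 days ending on 31 December.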

AVG 1-day Unique Active User Metric (Android, Dec 2018)

# StandardSQL
SELECT
  ROUND(AVG(users), 0) AS users
FROM (
  -- Unique engaged users per day
  SELECT
    event_date,
    COUNT(DISTINCT user_pseudo_id) AS users
  FROM `<id>.events_*`
  WHERE
    event_name = 'user_engagement'
    AND _TABLE_SUFFIX BETWEEN '20181201' AND '20181231'
    AND platform = "ANDROID"
  GROUP BY 1
) AS daily_users

# Alternatively, you could use the code below, but you will have to add the code for the remaining days to query against the entire month.

-- Set your variables here
WITH timeframe AS (SELECT DATE("2018-12-01") AS start_date, DATE("2018-12-31") AS end_date)

-- Query your variables here
SELECT ROUND(AVG(users),0) AS users
FROM
(
SELECT event_date, COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
  event_name = 'user_engagement'
  AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 1 DAY))
  AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL 0 DAY))
  AND platform = "ANDROID"
GROUP BY 1

UNION ALL 

SELECT event_date, COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
  event_name = 'user_engagement'
  AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 2 DAY))
  AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 1 DAY))
  AND platform = "ANDROID"
GROUP BY 1
... 
...
...
...
) avg_1_day_active_users
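Writing one UNION ALL branch per day by hand is tedious. As a hedged sketch, the `_TABLE_SUFFIX` bounds for each branch could be generated with a small Python helper. `suffix_windows` is a hypothetical name, and the bounds follow the same pattern as the branches above: an exclusive lower bound and an inclusive upper bound.

```python
from datetime import date, timedelta


def suffix_windows(end_day, window_days, count):
    """Return (exclusive_lower, inclusive_upper) _TABLE_SUFFIX strings for
    `count` back-to-back windows of `window_days` days ending on end_day."""
    windows = []
    for k in range(count):
        upper = end_day - timedelta(days=window_days * k)
        lower = upper - timedelta(days=window_days)
        windows.append((lower.strftime("%Y%m%d"), upper.strftime("%Y%m%d")))
    return windows


# Bounds for the three most recent 1-day branches of December 2018.
for lower, upper in suffix_windows(date(2018, 12, 31), 1, 3):
    print(f"_TABLE_SUFFIX > '{lower}' AND _TABLE_SUFFIX <= '{upper}'")
```

Calling `suffix_windows(date(2018, 12, 31), 7, 4)` would produce the four windows used for the 7-day metric in the same way.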

AVG 7-day Unique Active User Metric (Android, Dec 2018)

-- Set your variables here
WITH timeframe AS (SELECT DATE("2018-12-01") AS start_date, DATE("2018-12-31") AS end_date)

-- Query your variables here
SELECT ROUND(AVG(users),0) AS users
FROM
(
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
  event_name = 'user_engagement'
  AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 7 DAY))
  AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL 0 DAY))
  AND platform = "ANDROID"

UNION ALL

SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
  event_name = 'user_engagement'
  AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 14 DAY))
  AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 7 DAY))
  AND platform = "ANDROID"

UNION ALL

SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
  event_name = 'user_engagement'
  AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 21 DAY))
  AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 14 DAY))
  AND platform = "ANDROID"

UNION ALL

SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
  event_name = 'user_engagement'
  AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 28 DAY))
  AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 21 DAY))
  AND platform = "ANDROID"
) avg_7_day_active_users

Non-averaged 28-day Unique Active User Metric (Android, Dec 2018)

# StandardSQL
-- Set your variables here
WITH timeframe AS (SELECT DATE("2018-12-01") AS start_date, DATE("2018-12-31") AS end_date)

-- Query your variables here
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
  event_name = 'user_engagement'
  AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 28 DAY))
  AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL 0 DAY))
  AND platform = "ANDROID"

Side Notes:

  • I know some companies still calculate their MAUs over a 30-day period, so you will have to test and see what works best for your company.
  • The only problem I have with the MAU calculation is that it does not yet take into account the starting days of each month. Perhaps one could take the average of Day31 - 28days, Day30 - 28days, Day29 - 28days, Day28 - 28days ...
  • I also found the Firebase team's sample queries helpful, but their active-user query only returns the active user count at the time the query is executed (see the example below):
SELECT
  COUNT(DISTINCT user_id)
FROM
  /* PLEASE REPLACE WITH YOUR TABLE NAME */
  `YOUR_TABLE.events_*`
WHERE
  event_name = 'user_engagement'
  /* Pick events in the last N = 20 days */
  AND event_timestamp > UNIX_MICROS(TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 20 DAY))
  /* PLEASE REPLACE WITH YOUR DESIRED DATE RANGE */
  AND _TABLE_SUFFIX BETWEEN '20180521' AND '20240131';
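The side note about the MAU calculation missing the first days of the month suggests averaging several 28-day snapshots with different end days. A minimal Python sketch of that idea on invented data follows. `trailing_unique_users` and `averaged_mau` are hypothetical names, and which end days to include is left to the caller, since the side note does not settle it.

```python
from datetime import date, timedelta
from statistics import mean


def trailing_unique_users(active_by_day, end_day, window_days=28):
    """Unique users seen in the window_days days ending on end_day (inclusive)."""
    seen = set()
    for offset in range(window_days):
        seen |= active_by_day.get(end_day - timedelta(days=offset), set())
    return len(seen)


def averaged_mau(active_by_day, snapshot_days, window_days=28):
    """Mean of the trailing-window unique-user counts, one per snapshot day."""
    return mean(
        trailing_unique_users(active_by_day, d, window_days) for d in snapshot_days
    )


# Invented data: one user early, one mid-month, one on the last day.
active_by_day = {
    date(2018, 12, 1): {"a"},
    date(2018, 12, 15): {"c"},
    date(2018, 12, 31): {"b"},
}
snapshot_days = [date(2018, 12, d) for d in (28, 29, 30, 31)]
print(averaged_mau(active_by_day, snapshot_days))  # prints 1.5
```

The snapshot ending on 28 December reaches back to 1 December and so picks up user "a", who is invisible to the single snapshot ending on 31 December.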