SQL Query to find users that install and uninstall

2020-06-29 03:38发布

问题:

I am trying to find the users that install and uninstall the App on the same day using the data from Firebase Analytics in Google BigQuery

This is where I got so far. I have a query that gives me users (or app_instance_id) who install or uninstall the App:

SELECT event.date,
user_dim.app_info.app_instance_id,
event.name
FROM `app_name.app_events_20180303`,
UNNEST(event_dim) AS event
WHERE (event.name = "app_remove" OR event.name = "first_open")
ORDER BY app_instance_id, event.date

It gives me the following result where I can see that row 1 and 2 are the same user that installs and uninstalls the App:

I´ve tried to modify the previous query by using

WHERE (event.name = "app_remove" AND event.name = "first_open")

which gives: Query returned zero records.

Do you have any suggestions on how to achieve this? Thanks.

回答1:

Try this, although I did not test it;

SELECT date,
       app_instance_id
FROM 
(SELECT event.date,
user_dim.app_info.app_instance_id,
event.name
FROM `app_name.app_events_20180303`,
UNNEST(event_dim) AS event
WHERE (event.name = "app_remove" OR event.name = "first_open"))
GROUP BY app_instance_id, date
HAVING COUNT(*) = 2
ORDER BY app_instance_id, date


回答2:

Usually we can join the table by itself to find out such result, something like:

SELECT t1.date, t1.app_instance_id
FROM event as t1, event as t2
WHERE t1.date = t2.date and t1.app_instance_id = t2.app_instance_id and t1.name = "app_remove" and t2.name = "first_open"
ORDER by t1.app_instance_id, t1.date



回答3:

To start, it's worth noting that iOS does not yield app_remove, so this query only counts Android users who go through the install/uninstall pattern.

I created a sub-set of users who emitted first_open and app_remove, and counted those entries grouped by the date. I only kept instances where users installed and removed the app the same number of times in a day (greater than zero).

Then I tallied the distinct users.

SELECT COUNT(DISTINCT(user_id)) as transient_user_count
FROM (
    SELECT event_date,
    user_id,
    COUNT(if(event_name = "first_open", user_id, NULL)) as user_first_open,
    COUNT(if(event_name = "app_remove", user_id, NULL)) as user_app_remove
    FROM `your_app.analytics_123456.events_*`
--     WHERE (_TABLE_SUFFIX between '20191201' and '20191211')
    GROUP BY user_id, event_date
    HAVING user_first_open > 0 AND user_first_open = user_app_remove
)

If you're not able to rely on user_id, then the documentation suggests that you may be able to rely on the user_pseudo_id