Trying to get a list:
visitorid, time first visit, time of hit where transaction occurred.
What've I've written is only grabbing rows that have transaction revenue. I am also trying to convert visitStartTime which is a unix date, to a regular date via Date(visitStartTime) but that's failing in the group by because of the outputted date.
Any direction super helpful.
SELECT
fullvisitorID,
visitNumber,
visitStartTime,
hits.transaction.transactionRevenue
FROM
[75718103.ga_sessions_20150310],
[75718103.ga_sessions_20150309],
[75718103.ga_sessions_20150308],
[75718103.ga_sessions_20150307],
[75718103.ga_sessions_20150306],
[75718103.ga_sessions_20150305],
[75718103.ga_sessions_20150304],
[75718103.ga_sessions_20150303],
[75718103.ga_sessions_20150302],
WHERE totals.transactions >=1
GROUP BY
fullvisitorID, visitNumber, visitStartTime, hits.transaction.transactionRevenue;
visitStartTime is defined as POSIX time in Google Analytics schema, which means number of seconds since epoch. BigQuery TIMESTAMP is encoded as number of microseconds since epoch. Therefore, to get start time as TIMESTAMP, I used
TIMESTAMP(INTEGERvisitStartTime*1000000))
. hits.time contains number of milliseconds since first hit, therefore to get time of transactions, they needed to be multiplied by 1000 to get to microsecond granularity, henceTIMESTAMP(INTEGER(visitStartTime*1000000 + hits.time*1000))
. Since hits is repeated RECORD, no GROUP BY is necessary, the data model already has all the hits grouped together. Putting it all together:Mosha's solution is simple and elegant, but is too simple, actually it calculates the time between the first pageview and each transaction inside one visit, so it does not calculate the time between the first visit and the first transaction of one visitor. So if you calculate the average time using Mosha's query it will be 1.33 minute. But if you use the query I created it will be 9.91 minutes. My SQL skills are quite rusted, so it probably can be improved.
Masha's query (avg. time between the first pageview and each transaction inside one visit):
My query (avg. time between the first visit and the first transaction of one visitor):
I left some extra fields so if you use it without the first SELECT you can see some interesting data.
Ps: Thanks Mosha for showing how to calculate the time.