I am working in BigQuery with Google Analytics data on a topic to understand the page sequence paths. The pages are scoped to the custom dimension (hit level) with is grouping a set of pages. Furthermore, TimeStamp is included to gain more insights about the sequence.
Query (legacy sql):
SELECT
date,
STRFTIME_UTC_USEC(SEC_TO_TIMESTAMP(visitStartTime + hits.time/1000),"%Y-%m-%d %H:%M:%S") AS TimeStamp,
concat(fullvisitorid,".", string(visitId)) as sessionsid,
MAX(IF(hits.customDimensions.index=3, hits.customDimensions.value, NULL))
WITHIN hits AS CustomDimension,
hits.hitNumber as hits.hitNumber,
visitNumber,
FROM
TABLE_DATE_RANGE([XXXXXXXX.ga_sessions_], TIMESTAMP('2017-08-01'),
TIMESTAMP('2017-08-05'))
having sessionsid = "3712616268105648149.1501799276"
ORDER BY
hits.hitNumber AS
Outcome (bases on 1 sessionid): Image outcome query
I’ve got two questions:
1) In the example below the CustomDimension “Thankyou” is the “Thank you” page. Strange fact is that CustomDimension “thankYou” is counted more than once. In this case 2 seconds after each other, but in other cases sometimes at the same moment. GA tracking seems to be fine and can only can send 1 hit (not taking a refresh into consideration). Any ideas if I missing something in the query that is effecting this behaviour?
2) The page sequence is starting CustomDimension “ThankYou”. This is the Thank you page of the order process and then the users continues to the Homepage. It is not logic that a user starts at this page. Do you know if this is caused by the query?
Hope that somebody can give me some advice to put me in the right direction.
Thanks.
Related information and handy examples are to be found on the "BigQuery cookbook" on-line document, "Sequence of hits" sub-chapter. The general pattern is given as:
On the first line, you identify the unique combination of visitor ID and visit IDs. Every hit will belong to a unique combination of these two dimensions. On the first line, you also list the elements of a page path. The FROM statement lists the data source(s). In the WHERE statement, you limit the query to specific types of hits to control what kinds of interactions you observe.
A directly relevant example can be read on the "LunaMetrics Google BigQuery Recipes" page, more precisely as the "All Related Pages in a Session" query: