I am working in BigQuery with Google Analytics data on a topic to understand the page sequence paths. The pages are scoped to the custom dimension (hit level) with is grouping a set of pages. Furthermore, TimeStamp is included to gain more insights about the sequence.
Query (legacy sql):
SELECT
date,
STRFTIME_UTC_USEC(SEC_TO_TIMESTAMP(visitStartTime + hits.time/1000),"%Y-%m-%d %H:%M:%S") AS TimeStamp,
concat(fullvisitorid,".", string(visitId)) as sessionsid,
MAX(IF(hits.customDimensions.index=3, hits.customDimensions.value, NULL))
WITHIN hits AS CustomDimension,
hits.hitNumber as hits.hitNumber,
visitNumber,
FROM
TABLE_DATE_RANGE([XXXXXXXX.ga_sessions_], TIMESTAMP('2017-08-01'),
TIMESTAMP('2017-08-05'))
having sessionsid = "3712616268105648149.1501799276"
ORDER BY
hits.hitNumber AS
Outcome (bases on 1 sessionid): Image outcome query
I’ve got two questions:
1) In the example below the CustomDimension “Thankyou” is the “Thank you” page. Strange fact is that CustomDimension “thankYou” is counted more than once. In this case 2 seconds after each other, but in other cases sometimes at the same moment. GA tracking seems to be fine and can only can send 1 hit (not taking a refresh into consideration). Any ideas if I missing something in the query that is effecting this behaviour?
2) The page sequence is starting CustomDimension “ThankYou”. This is the Thank you page of the order process and then the users continues to the Homepage. It is not logic that a user starts at this page. Do you know if this is caused by the query?
Hope that somebody can give me some advice to put me in the right direction.
Thanks.