Assume my table is this
I am trying to modify my table with this information
I have added two columns where column WhenWasLastBasicSubjectDone
will let you know when in which semester the student completed his latest Basic Course (sorted by Semester). The other column TotalBasicSubjectsDoneTillNow
explains how many times had the student completed Basic Course(Subject) till now (sorted by Semester) ?
I think this is easy to solve with Joins as well as with UDFs but I want to use the power of existing analytical functions in BigQuery and solve it without joins.
You can use window functions for this -- assuming you have a column that specifies ordering. Let me assume that column is semester
:
select t.*,
max( case when subject = 'Basic' then semester end ) over (partition by student order by semester end) as lastbasic,
sum( case when subject = 'Basic' then 1 else 0 end ) over (partition by student order by semester end) as numbasictillnow
from t
Below is for BigQuery Standard SQL
#standardSQL
SELECT *,
LAST_VALUE(IF(subject='Basic',semester,NULL) IGNORE NULLS) OVER(win) AS WhenWasLastBasicSubjectDone ,
COUNTIF(subject='Basic') OVER(win) AS TotalBasicSubjectsDoneTillNow
FROM `project.dataset.table`
WINDOW win AS (PARTITION BY student ORDER BY semester)
You can test, play with above using dummy data from your question as below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 Student, 'Sub1' Subject, 'Sem1' Semester UNION ALL
SELECT 1, 'Sub2', 'Sem2' UNION ALL
SELECT 1, 'Basic', 'Sem3' UNION ALL
SELECT 1, 'Basic', 'Sem4' UNION ALL
SELECT 1, 'Sub3', 'Sem5' UNION ALL
SELECT 1, 'Sub2', 'Sem6' UNION ALL
SELECT 1, 'Sub3', 'Sem7' UNION ALL
SELECT 1, 'Sub4', 'Sem8'
)
SELECT *,
LAST_VALUE(IF(subject='Basic',semester,NULL) IGNORE NULLS) OVER(win) AS WhenWasLastBasicSubjectDone ,
COUNTIF(subject='Basic') OVER(win) AS TotalBasicSubjectsDoneTillNow
FROM `project.dataset.table`
WINDOW win AS (PARTITION BY student ORDER BY semester)
-- ORDER BY Semester