Lead & Analytical Functions in BigQuery

2019-07-10 04:43发布

问题:

Assume my table is this

I am trying to modify my table with this information

I have added two columns where column WhenWasLastBasicSubjectDone will let you know when in which semester the student completed his latest Basic Course (sorted by Semester). The other column TotalBasicSubjectsDoneTillNow explains how many times had the student completed Basic Course(Subject) till now (sorted by Semester) ?

I think this is easy to solve with Joins as well as with UDFs but I want to use the power of existing analytical functions in BigQuery and solve it without joins.

回答1:

You can use window functions for this -- assuming you have a column that specifies ordering. Let me assume that column is semester:

select t.*,
       max( case when subject = 'Basic' then semester end ) over (partition by student order by semester end) as lastbasic,
       sum( case when subject = 'Basic' then 1 else 0 end ) over (partition by student order by semester end) as numbasictillnow    
from t


回答2:

Below is for BigQuery Standard SQL

#standardSQL
SELECT *,
  LAST_VALUE(IF(subject='Basic',semester,NULL) IGNORE NULLS) OVER(win) AS WhenWasLastBasicSubjectDone ,
  COUNTIF(subject='Basic') OVER(win) AS TotalBasicSubjectsDoneTillNow     
FROM `project.dataset.table`
WINDOW win AS (PARTITION BY student ORDER BY semester)

You can test, play with above using dummy data from your question as below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 Student, 'Sub1' Subject, 'Sem1' Semester UNION ALL
  SELECT 1, 'Sub2', 'Sem2' UNION ALL
  SELECT 1, 'Basic', 'Sem3' UNION ALL
  SELECT 1, 'Basic', 'Sem4' UNION ALL
  SELECT 1, 'Sub3', 'Sem5' UNION ALL
  SELECT 1, 'Sub2', 'Sem6' UNION ALL
  SELECT 1, 'Sub3', 'Sem7' UNION ALL
  SELECT 1, 'Sub4', 'Sem8' 
)
SELECT *,
  LAST_VALUE(IF(subject='Basic',semester,NULL) IGNORE NULLS) OVER(win) AS WhenWasLastBasicSubjectDone ,
  COUNTIF(subject='Basic') OVER(win) AS TotalBasicSubjectsDoneTillNow     
FROM `project.dataset.table`
WINDOW win AS (PARTITION BY student ORDER BY semester)
-- ORDER BY Semester