BigQuery to find the Subsequence

Assuming my table is

WITH `sample_project.sample_dataset.table` AS (
  SELECT 'user1' user, 2 sequence, 'T1' ts UNION ALL
  SELECT 'user1', 2, 'T2' UNION ALL
  SELECT 'user1', 1, 'T3' UNION ALL
  SELECT 'user1', 1, 'T4' UNION ALL
  SELECT 'user1', 3, 'T5' UNION ALL
  SELECT 'user1', 2, 'T6' UNION ALL
  SELECT 'user1', 3, 'T7' UNION ALL
  SELECT 'user1', 3, 'T8' 
)

Can I find Subsequence of Integers available in sequence column without using STRING_AGG and REGEX OR JOIN operations ? This is to make query more efficient.

A subsequence is a part of String. For example consider String "banana", A sample subsequence is "anna" as each index character of "anna" from banana is strictly increasing. Characters in a subsequence need not be contiguous.

Say for the above table when order by timestamp (INCREASING), I would get STRING_AGG for sequence column as 22113233. In the String 22113233 subsequence 1 2 3 is available whereas subsequence 3 2 1 is not available. Given a subsequence 213, How can I say if this subsequence is available or not (in 22113233 which sorted by timestamp) ?

Given a subsequence 213, How can I say if this subsequence is available or not (in 22113233 ...

Below example is for BigQuery SQL

#standardSQL
WITH `sequences` AS (
  SELECT '22113233' sequence_list 
), `subsequenses` AS (
  SELECT '123' subsequence UNION ALL
  SELECT '321' UNION ALL
  SELECT '213'
)
SELECT sequence_list, subsequence, 
  REGEXP_CONTAINS(sequence_list, REGEXP_REPLACE(subsequence, '', '.*')) available
FROM `sequences` l
CROSS JOIN `subsequenses` s

with result as below

sequence_list   subsequence     available    
22113233        321             false    
22113233        123             true     
22113233        213             true

if you are looking for specific subsequence - this can be further simplified as

#standardSQL
WITH `sequences` AS (
  SELECT '22113233' sequence_list UNION ALL
  SELECT '11223322'
)
SELECT sequence_list,  
  REGEXP_CONTAINS(sequence_list, REGEXP_REPLACE('213', '', '.*')) available
FROM `sequences`

with result as

sequence_list   available    
22113233        true     
11223322        false

BigQuery to find the Subsequence

问题:

回答1:

收藏的人(0)

BigQuery to find the Subsequence

问题:

回答1:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮