Extracting data using regexp_extract in Google Big

Posted 2019-02-19 21:55

Question:

I am trying to extract data from a column whose values contain many characters, but I am only interested in one specific substring of each input string. My sample input and expected output are below. How can I implement this using the regexp_extract function? If you have worked with GBQ, please share your thoughts. Thanks.

**SQL:**

   SELECT request.url AS url 
    FROM [xyz.abc]
    WHERE regexp_extract(input,r'he=(.{32})') 

**Input:**

http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234355;he=5e3152eafc50ed0346df7f10095d07c4;catname=Horoscope
http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234335;he=5e3152eafc50ed0346df7f10095d07c4;catname=High+Speed+Internet

**Output:**

5e3152eafc50ed0346df7f10095d07c4
5e3152eafc50ed0346df7f10095d07c4

Answer 1:

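It's simple: call regexp_extract in the SELECT list rather than in the WHERE clause, since it returns the captured string and not a boolean:

select regexp_extract(input,r'he=(.{32})') from [xyz.abc];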

or, as a standalone example with a literal URL:

select regexp_extract('http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234355;he=5e3152eafc50ed0346df7f10095d07c4;catname=Horoscope',r'he=(.{32})')
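To sanity-check the pattern outside BigQuery, here is a minimal Python sketch that applies the same regular expression to the two sample URLs from the question. Note the assumptions: Python's `re` module is a different engine from the RE2 engine BigQuery uses, although this simple pattern behaves the same in both, and the helper name `extract_he` is made up for illustration.

```python
import re

# Same pattern as in the answer: capture the 32 characters that follow "he=".
HE_PATTERN = re.compile(r"he=(.{32})")


def extract_he(url):
    """Return the 32-character value after 'he=', or None if there is no match.

    Hypothetical helper for local testing; returning None mirrors
    regexp_extract returning NULL when the pattern does not match.
    """
    match = HE_PATTERN.search(url)
    return match.group(1) if match else None


# The two sample inputs from the question.
urls = [
    "http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234355;he=5e3152eafc50ed0346df7f10095d07c4;catname=Horoscope",
    "http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234335;he=5e3152eafc50ed0346df7f10095d07c4;catname=High+Speed+Internet",
]

results = [extract_he(u) for u in urls]
for value in results:
    print(value)
```

Because `.{32}` accepts any 32 characters, a stricter variant such as `he=([0-9a-f]{32})` may be preferable if the value is always a lowercase hex hash, as it is in the sample data.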