In Apache drill, query to find the Nth occurrence

2019-08-24 03:59发布

问题:

I want the position of '/' from the string where '/' has repeated multiple times and the length of the string varies.

In Apache drill, the INSTR() will not work to find the Nth occurrence of a character in a string. i.e INSTR('/This/looks/like/homework', '/',1,3) to get the output as 12, this query will not work in Apache drill.

I want the position of '/' on Nth Occurrence in Apache drill.

Please help me with the solution.

回答1:

0: jdbc:drill:> SELECT STRPOS(a, REGEXP_REPLACE(a, '^\/.*?\/.*?\/.*?\/',''))-1 from (VALUES('/This/is/apache/drill/queries')) t(a);
+---------+
| EXPR$0  |
+---------+
| 16      |
+---------+
1 row selected (0.15 seconds)


回答2:

This is from drill-user mailing list. Hope it will help you: http://mail-archives.apache.org/mod_mbox/drill-user/201509.mbox/%3CB6A39848-AACB-4AD2-BD62-58C395D6CC9E@maprtech.com%3E

You can use POSTION, STRPOS or LOCATE also to find substring in a string and return the position.

LOCATE is the most useful IMO as it allows you to specify where to starter the search in the string, however none of these have the INSTR or substring-index ability to specify which occurrence of the substring to look for.

A good case for a UDF.

Below is a workaround with subqueries to break down an IP address in the different portions.

select p2.ip_address, p2.part1, p2.part2, substr(p2.rest2, 1, locate('.',p2.rest2)-1) as part3,
substr(rest2, locate('.',rest2)+1) as part4
from
(select p1.ip_address, p1.part1, substr(rest1, 1, locate('.',rest1)-1) as part2, 
substr(rest1, locate('.',rest1)+1) as rest2
from
(select ip_address, substr(ip_address, 1, locate('.',ip_address)-1) as part1,
substr(ip_address, locate('.',ip_address)+1) as rest1 from `/ip`) as p1) as p2
+---------------+--------+--------+--------+--------+
|  ip_address   | part1  | part2  | part3  | part4  |
+---------------+--------+--------+--------+--------+
| 172.16.254.1  | 172    | 16     | 254    | 1      |
+---------------+--------+--------+--------+--------+

You can find all supported Drill string functions here: https://drill.apache.org/docs/string-manipulation/#strpos