How to use regexp on the results of a sub query?

2019-06-02 11:07发布

I have two tables.

User which has id and phone number


id phone_no

1 ---- 9912678

2 ---- 9912323

3 ---- 9912366


Admission Table , which has id phone number

id phone_no

6 --- 991267823

7 --- 991236621

8 --- 435443455

9 --- 243344333

I want to find all the phone number of Admission's table which has same pattern as users table and update it in users table.

So i am trying this

select phone_no  from admission where phone_no REGEXP (SELECT phone_no
FROM  `users` AS user
WHERE user.phone_no REGEXP  '^(99)+[0-9]{8}')

But i am getting this error Subquery returns more than 1 row

Looking for help.

3条回答
We Are One
2楼-- · 2019-06-02 11:51

Try one of these queries:

SELECT a.phone_no
FROM admission a
JOIN users u on a.phone_no LIKE concat(u.phone_no, '__')
WHERE u.phone_no REGEXP  '^(99)+[0-9]+$'

or

SELECT a.phone_no
FROM admission a
JOIN users u on a.phone_no REGEXP concat('^', u.phone_no, '[0-9]{2}$')
WHERE u.phone_no REGEXP  '^(99)+[0-9]+$'

If the number of "trailing digits" is not fixed, you can also use:

LIKE concat(u.phone_no, '%')

or

REGEXP concat('^', u.phone_no, '[0-9]*$')

But in this case you might need to use SELECT DISTICT a.phone_no if it is possible that a users.phone_no is a subsequence of an other users.phone_no (e.g. 99123 and 991234).

Update

After running some tests with 10K rows for users table and 100K rows for admission table i came to the following query:

SELECT a.phone_no
FROM admission a
JOIN users u 
    ON  a.phone_no >= u.phone_no
    AND a.phone_no < CONCAT(u.phone_no, 'z')
    AND a.phone_no LIKE CONCAT(u.phone_no, '%')
    AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]*$')
WHERE   u.phone_no LIKE  '99%'
    AND u.phone_no REGEXP  '^(99)+[0-9]*$'
UNION SELECT 0 FROM (SELECT 0) dummy WHERE 0

fiddle

This way you can use REGEXP and still have great performance. This query executes almost instantly in my test case.

Logically you only need the REGEXP conditions. But on bigger tables the query might time out. Using a LIKE condition will filter the result set before REGEXP check. But even using LIKE the query doesn't perform very well. For some reason MySQL doesn't use a range check for the join. So i added an explicit range check:

    ON  a.phone_no >= u.phone_no
    AND a.phone_no < CONCAT(u.phone_no, 'z')

With this check you can remove the LIKE condition from the JOIN part.

The UNION part is a replacement for DISTICT. MySQL seems to translate DISTINCT into a GROUP BY statement, which doesn't perform well. Using UNION with an empty result set i force MySQL to remove duplicates after the SELECT. You can remove that line, if you use a fixed number of trailing digits.

You can adjust the REGEXP patterns to your needs:

...
    AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]{2}$')
...
    AND u.phone_no REGEXP  '^(99)+[0-9]{8}$'
...

If you only need REGEXP to check the length of the phone_no, you can also use a LIKE condition with the '_' placeholder.

    AND a.phone_no LIKE CONCAT(u.phone_no, '__')
...
    AND u.phone_no LIKE '99________$'

or combine a LIKE condition with a STR_LENGTH check.

查看更多
祖国的老花朵
3楼-- · 2019-06-02 12:03

I think this does what you want, I did some improvements (SQLfiddle):

select * from admission a where exists (
  select * from (
     select substr(phone_no, 1, 7) pn from users where phone_no REGEXP '^99[0-9]{5}'
  ) o where a.phone_no like concat(o.pn, '%')
)

I had to modify the regex to get any matches. If the length is fixed the second check can easily be done with like. We look in the user table to see if there exists any phone_no that matches the criteria for the admission number we are currently looking at.

查看更多
孤傲高冷的网名
4楼-- · 2019-06-02 12:03

Never mind regex. Do a simple join using like

select distinct a.phone_no
from user u
join admission a on a.phone_no like concat(u.phone_no, '%')
where u.phone_no like '99%'

The distinct keyword is only needed if there are either duplicate numbers in the admission table, and/or in the user table. Otherwise, it can be omitted.

查看更多
登录 后发表回答