how to convert an sqllite statement from '=

2019-07-25 09:07发布

问题:

I'm trying to create a dbase solution for email threading. When I receive an email.. I want to know if this email belongs to an existing thread.

So I match the subject ie subject = 'cool bro' matches 're: cool bro'

and i also want to match the sender and receiver pair

ie

(sender = 'A@gmail.com' and receiver = 'B@gmail.com')

OR

(sender ='B@gmail.com' and receiver = 'A@gmail.com')

for those exact cases.. this query worked fine (see more details here):

(SELECT COUNT(search_email.threadID) FROM search_email 
                                                        WHERE search_email.subject MATCH '%query%'  AND 
                                                        (
                                                            (search_email.sender = '%sender' AND search_email.tos = '%receiver%')
                                                            OR 
                                                            (search_email.sender = '%receiver%' AND search_email.tos = '%sender%')
                                                        )

problem is when you have a case like this:

(sender = 'Amanda Collins A@gmail.com' and receiver = 'B@gmail.com')

OR

(sender ='Billy Bob B@gmail.com' and receiver = 'A@gmail.com')

clearly.. a MATCH clause (or some regex or somthing) is in order here..

problem is that sqllite doesn't allow MATCH to work with OR: a UNION is needed instead..

but I couldn't figure out a way to convert the above statement into a statement that has UNION/INTERSECT instead:

SELECT * FROM search_email WHERE 
                    search_email.subject MATCH '%query%'  INTERSECT SELECT * FROM 
                                        (SELECT * FROM  (SELECT * FROM search_email WHERE 
                                                                                    search_email.sender MATCH '%sender%' 
                                                                                 INTERSECT SELECT * FROM search_email WHERE search_email.tos MATCH '%receiver%' )
                                        UNION 
                                                                            SELECT * FROM(      SELECT * FROM search_email WHERE search_email.sender MATCH '%sender%' 
                                                                                 INTERSECT SELECT * FROM search_email WHERE search_email.tos MATCH '%receiver%'))

any ideas?

update: it seems the answer is just by changing the first intersect into a union:

SELECT * FROM search_email WHERE 
                    search_email.subject MATCH '%query%'  UNION SELECT * FROM 
                                        (SELECT * FROM  (SELECT * FROM search_email WHERE 
                                                                                    search_email.sender MATCH '%sender%' 
                                                                                 INTERSECT SELECT * FROM search_email WHERE search_email.tos MATCH '%receiver%' )
                                        UNION 
                                                                            SELECT * FROM(      SELECT * FROM search_email WHERE search_email.sender MATCH '%sender%' 
                                                                                 INTERSECT SELECT * FROM search_email WHERE search_email.tos MATCH '%receiver%'))

but this solution seems very inefficient.. mysql compiler takes a while to figure it out.. is there a more efficient way?

回答1:

Using MATCH or LIKE '%string%' will be always very slow. You might be able to improve that using full text search, but you still will not get very far.

There is another way to do this, which is more efficient and is widely supported.

According to RFC2822 (and older RFC822) mail client programs should create and maintain fields in mail headers that facilitate threading support. In practice, all known mail clients actually support that. Those fields are:

  • Message-ID: typically looks like <randomstring@sender.com>
  • In-Reply-To: message-id of the message this one was replied to
  • References: list of message-id's that this message might be linked to

If you extract these headers and save them in separate columms and/or tables in your database, you should be able to easily create threads by using exact comparison (no MATCH '%string%'), so it will be very fast.