How to use min() and max() in an efficient way?

Posted 2019-07-24 13:35

Question:

I have an SQL query that checks whether a value lies between the min and max values in a table. I have currently implemented it as follows:

SELECT spectrum_id, feature_table_id
FROM `spectrum`, `feature`
WHERE `spectrum`.msrun_msrun_id = 1
AND `feature`.msrun_msrun_id = 1
AND (SELECT min(rt) FROM `convexhull` WHERE `convexhull`.feature_feature_table_id = `feature`.feature_table_id) <= scan_start_time
AND scan_start_time <= (SELECT max(rt) FROM `convexhull` WHERE `convexhull`.feature_feature_table_id = `feature`.feature_table_id)
AND (SELECT min(mz) FROM `convexhull` WHERE `convexhull`.feature_feature_table_id = `feature`.feature_table_id) <= base_peak_mz
AND base_peak_mz <= (SELECT max(mz) FROM `convexhull` WHERE `convexhull`.feature_feature_table_id = `feature`.feature_table_id)

This is running very slowly, because I'm selecting from convexhull 4 times every time I run this query, so I tried to improve it using an inner join:

SELECT spectrum_id, feature_table_id
FROM `spectrum`, `feature`
INNER JOIN `convexhull` ON `convexhull`.feature_feature_table_id = `feature`.feature_table_id
WHERE `spectrum`.msrun_msrun_id = ?
AND `feature`.msrun_msrun_id = ?
AND min(`convexhull`.rt) <= scan_start_time
AND scan_start_time <= max(`convexhull`.rt)
AND min(`convexhull`.mz) <= base_peak_mz
AND base_peak_mz <= max(`convexhull`.mz)

However, min() and max() are aggregate functions and cannot be used directly in a WHERE clause like this. How can I make the first query more efficient, so that I get the min and max rt and mz without running four subqueries?

Answer 1:

EDIT: I had a few more minutes, looked again, and realised that all the data comes from that one table, so something like this should work:

SELECT
    spectrum_id
    ,feature_table_id
FROM
    spectrum AS s
    INNER JOIN feature AS f
        ON f.msrun_msrun_id = s.msrun_msrun_id
    INNER JOIN (SELECT
        feature_feature_table_id
        ,min(rt) AS rtMin
        ,max(rt) AS rtMax
        ,min(mz) AS mzMin
        ,max(mz) AS mzMax
     FROM
        convexhull
     GROUP BY
        feature_feature_table_id
     ) AS t
        ON t.feature_feature_table_id = f.feature_table_id
WHERE
    s.msrun_msrun_id = 1
    AND s.scan_start_time >= t.rtMin
    AND s.scan_start_time <= t.rtMax
    AND base_peak_mz >= t.mzMin
    AND base_peak_mz <= t.mzMax

I think you want to select from the convexhull table and group by feature_feature_table_id, getting the min and max rt and mz within each group.

You can then wrap that select in brackets, give it a name (AS t), and join to it.
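To sanity-check the idea, here is a minimal sketch that runs both versions and compares the results. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL. The table and column names come from the question, but the column types, the assumption that base_peak_mz lives in spectrum, and all sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE spectrum (spectrum_id INTEGER, msrun_msrun_id INTEGER,
                       scan_start_time REAL, base_peak_mz REAL);
CREATE TABLE feature (feature_table_id INTEGER, msrun_msrun_id INTEGER);
CREATE TABLE convexhull (feature_feature_table_id INTEGER, rt REAL, mz REAL);

-- Hypothetical sample rows, invented for this sketch.
INSERT INTO spectrum VALUES (1, 1, 10.0, 500.0), (2, 1, 50.0, 500.0),
                            (3, 2, 10.0, 500.0), (4, 1, 45.0, 150.0);
INSERT INTO feature VALUES (100, 1), (200, 1);
INSERT INTO convexhull VALUES (100, 5.0, 400.0), (100, 20.0, 600.0),
                              (200, 40.0, 100.0), (200, 60.0, 200.0);
""")

# Original approach: four correlated subqueries against convexhull.
correlated_sql = """
SELECT s.spectrum_id, f.feature_table_id
FROM spectrum AS s, feature AS f
WHERE s.msrun_msrun_id = ? AND f.msrun_msrun_id = ?
  AND (SELECT min(rt) FROM convexhull c
       WHERE c.feature_feature_table_id = f.feature_table_id) <= s.scan_start_time
  AND s.scan_start_time <= (SELECT max(rt) FROM convexhull c
       WHERE c.feature_feature_table_id = f.feature_table_id)
  AND (SELECT min(mz) FROM convexhull c
       WHERE c.feature_feature_table_id = f.feature_table_id) <= s.base_peak_mz
  AND s.base_peak_mz <= (SELECT max(mz) FROM convexhull c
       WHERE c.feature_feature_table_id = f.feature_table_id)
ORDER BY s.spectrum_id, f.feature_table_id
"""

# Derived-table approach: aggregate convexhull once per feature, then join.
derived_sql = """
SELECT s.spectrum_id, f.feature_table_id
FROM spectrum AS s
INNER JOIN feature AS f ON f.msrun_msrun_id = s.msrun_msrun_id
INNER JOIN (SELECT feature_feature_table_id,
                   min(rt) AS rtMin, max(rt) AS rtMax,
                   min(mz) AS mzMin, max(mz) AS mzMax
            FROM convexhull
            GROUP BY feature_feature_table_id) AS t
        ON t.feature_feature_table_id = f.feature_table_id
WHERE s.msrun_msrun_id = ?
  AND s.scan_start_time BETWEEN t.rtMin AND t.rtMax
  AND s.base_peak_mz BETWEEN t.mzMin AND t.mzMax
ORDER BY s.spectrum_id, f.feature_table_id
"""

correlated_rows = conn.execute(correlated_sql, (1, 1)).fetchall()
derived_rows = conn.execute(derived_sql, (1,)).fetchall()
print(correlated_rows)
print(derived_rows)
```

Both queries should return the same spectrum/feature pairs on this data. Whether the derived table is actually faster on the real tables depends on the MySQL version and on indexes (an index on convexhull.feature_feature_table_id would likely help either way), so it is worth checking with EXPLAIN.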

Hope this is enough to get you on the road. If not, create a sample schema at http://sqlfiddle.com/, put in your query, and I can modify it.

As a side note, I think you want to join these tables on a particular field rather than select from both and filter in the WHERE clause. Compare:

SELECT spectrum_id, feature_table_id
FROM `spectrum`, `feature`
WHERE `spectrum`.msrun_msrun_id = 1
AND `feature`.msrun_msrun_id = 1

and:

SELECT 
    spectrum_id
    ,feature_table_id
FROM 
    spectrum AS s
    INNER JOIN feature AS f
        on f.msrun_msrun_id = s.msrun_msrun_id
WHERE
    s.msrun_msrun_id = 1
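The two forms should return the same rows; the explicit join mainly makes the relationship between the tables easier to read. A small sketch to check this, again using Python's sqlite3 with an in-memory database in place of MySQL and invented sample rows (only the table and column names are taken from the question):

```python
import sqlite3

# In-memory SQLite database with hypothetical rows (assumption, not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE spectrum (spectrum_id INTEGER, msrun_msrun_id INTEGER);
CREATE TABLE feature (feature_table_id INTEGER, msrun_msrun_id INTEGER);
INSERT INTO spectrum VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO feature VALUES (100, 1), (200, 2);
""")

# Comma (implicit cross) join, filtered on both tables in the WHERE clause.
comma_rows = conn.execute("""
    SELECT spectrum_id, feature_table_id
    FROM spectrum, feature
    WHERE spectrum.msrun_msrun_id = 1
      AND feature.msrun_msrun_id = 1
    ORDER BY spectrum_id, feature_table_id
""").fetchall()

# Explicit INNER JOIN on the shared msrun id, filtered on one side only.
join_rows = conn.execute("""
    SELECT spectrum_id, feature_table_id
    FROM spectrum AS s
    INNER JOIN feature AS f
        ON f.msrun_msrun_id = s.msrun_msrun_id
    WHERE s.msrun_msrun_id = 1
    ORDER BY spectrum_id, feature_table_id
""").fetchall()

print(comma_rows)
print(join_rows)
```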

If I have got something wrong there, let me know.