I wrote a statement that takes almost an hour to run so I am asking help so I can get to do this faster. So here we go:
I am making an inner join of two tables :
I have many time intervals represented by intervals and i want to get measure datas from measures only within those intervals.
intervals
: has two columns, one is the starting time, the other the ending time of the interval (number of rows = 1295)
measures
: has two columns, one with the measure, the other with the time the measure has been made (number of rows = one million)
The result I want to get is a table with in the first column the measure, then the time the measure has been done, the begin/end time of the considered interval (it would be repeated for row with a time within the considered range)
Here is my code:
select measures.measure as measure, measures.time as time, intervals.entry_time as entry_time, intervals.exit_time as exit_time
from
intervals
inner join
measures
on intervals.entry_time<=measures.time and measures.time <=intervals.exit_time
order by time asc
Thanks
Your SQL is equivalent to:
The only thing I might suggest is making sure there's an index on m.Time. Then if that doesn't improve performance enough, try adding indices on i.Start_Time and i.End_Time as well
There may be a very efficient way of writing this query if the intervals are deterministic because the query could be converted to an equi-join that would be amenable to more efficient hash joining.
For example if the intervals are all hourly:
Then the join can be written as:
This would reduce the cost of everything up to and including the join pretty much to a full scan of each of the tables.
However, since you have the ORDER BY operation in there, I think that a sort-merge might still beat it as the query is written right now because the optimiser will sort a smaller data set for the sort-merge than it would for the hash join (because in the latter case it would have to sort more columns of data). you could get round this by structuring the query as:
This gives a lower cost estimate than a sort-merge on my 10.2.0.4 test instance but I'd regard it as a little risky.
So, I'd look for a sort-merge or rewrite it to allow the use of a hash join if possible.
The first thing I do is have your database tool generate an execution plan that you can view (this is "Control-L" in MSSQL, but I'm not sure how to do it in Oracle) - that will try to point out the slow parts and, depending on your Server/Editor, it may even recommend some basic indexes. Once you have an execution plan, you can look for any table scans of inner loop joins, both of which are really slow - indexes can help with table scans, and you can add additional join predicates to help alleviate loop joins.
My guess would be the MEASURES needs an index on the TIME column, and you can include the MEASURE column as well to speed lookups. Try this:
Also, though this won't change your execution plan or speed up your query, it may make your join clause a bit easier read:
This just combines your two <= and >= into a single statement.