I am having a problem to group data. I have used the group by clause all the while, but this time I want to group similar valued data together. The need is I have some jobs which have sequence numbers associated. If I can group the jobs together which have more or less the same sequence numbers then I can run a lesser number of jobs.
The data I have is like this.
JobID Sequence
A01 8
A01 6
A01 10
A02 5
A02 10
A02 4
A02 2
A03 8
A03 3
A03 6
A03 10
A04 5
A04 4
A04 2
A04 9
A04 10
From the above data it can be seen that A02 and A04 have more in common and A01 and A03 have more in common.
What query should be done to get similar data group together as below to group A02 and A04 together and A01 and A03 together, the order on the sequence column being immaterial.
JobID Sequence
A01 8
A01 6
A01 10
A03 8
A03 3
A03 6
A03 10
A02 5
A02 10
A02 4
A02 2
A04 5
A04 4
A04 2
A04 9
A04 10
Thanks for the time
david
PS - More explanation.
In the above list JobID A01 contains sequence (8,6,10) A02 contains sequence (5,10,4,2) A03 contains sequence (8,3,6,10) A04 contains sequence (5,4,2,9,10)
So Job A01 and Job A03 have similar sequence numbers and Job A02 and Job A03 have similar sequence numbers. I want to group them based on similar sequence numbers There are many other jobs which have sequences which might match some other job's sequence numbers. I just included 4 jobs to keep the list small.
This is a little more complex of a problem then I feel like thinking all the way through right now, but I'll give you an idea to start with and maybe someone else can help you complete it...
Join the table to itself like so:
I haven't tested that so there could be typos, but you get the idea hopefully. Notice you're joining where the job is not the same, but the sequence is.
Just inferring from other answers... something that may help.
Here you have for every two jobids how simmilar are:
http://sqlfiddle.com/#!3/c28be/9
Building on Brandon Moores answer:
Data setup:
Find totals of all sequences each JobID has in common, order those by most to least, output all the data from each JobId depending on that order:
Gives:
Should do it :)