I'm trying to do a recursive CTE in SQL Server with the following example data
Class Student
------ ------
English Sally <- Sally is what were searching for
English Peter <- Peter's on same Class as Sally
Swedish Peter <- Found because Peter's on this class
Dutch Peter <- Found because Peter's on this class
Finnish Harry <- Not found, no relation to class or student
Swedish Tim <- Found because Peter's on Swedish class
Spanish Lauren <- Not found, no relation to class or student
Spanish Colin <- Not found, no relation to class or student
So I need a CTE, to which I give 'Sally' as parameter, and it will find out all different classes related to Sally, then all Students related to Classes that Sally is in, then all classes related to students in the same classes as Sally, and so on until no more rows found.
But I just cannot figure out how to write the joins, this is what I tried but failed miserably:
WITH myCTE (Class, Student) AS
(
SELECT Class, Student FROM TABLE1 WHERE TABLE1.Student= 'Sally'
UNION ALL
SELECT t.Class, t.Student FROM TABLE1 t
JOIN myCTE t2 ON t2.Class = t.Class
)
SELECT * FROM myCTE
The first problem is that you've got infinite recursion: Sally takes English with Peter, who takes English with Sally, who takes English with Peter...
Once you've sorted that out, you'll need an additional query in your recursive CTE. You're currently joining on Class
to get the other students in the same class, but you also need to join on Student
to get the other classes for the students.
Something like this should work:
WITH cteSource As
(
SELECT
Class,
Student,
-- Create a unique ID for each record:
ROW_NUMBER() OVER (ORDER BY Student, Class) As ID
FROM
TABLE1
),
cteRecursive (Class, Student, IDPath) As
(
SELECT
Class,
Student,
-- Used to exclude records we've already visited:
Convert(varchar(max), '/' + Convert(varchar(10), ID) + '/')
FROM
cteSource
WHERE
Student = 'Sally'
UNION ALL
-- Students in the same class:
SELECT
T.Class,
T.Student,
R.IDPath + Convert(varchar(10), T.ID) + '/'
FROM
cteSource As T
INNER JOIN cteRecursive As R
ON T.Class = R.Class
WHERE
CharIndex('/' + Convert(varchar(10), t.ID) + '/', R.IDPath) = 0
UNION ALL
-- Other classes for the students:
SELECT
T.Class,
T.Student,
R.IDPath + Convert(varchar(10), T.ID) + '/'
FROM
cteSource As T
INNER JOIN cteRecursive As R
ON T.Student = R.Student
WHERE
CharIndex('/' + Convert(varchar(10), t.ID) + '/', R.IDPath) = 0
)
SELECT
Class,
Student,
IDPath
FROM
cteRecursive
;
With your test data, you'll get the following results:
English Sally /7/
English Peter /7/5/
Dutch Peter /7/5/4/
Swedish Peter /7/5/6/
Swedish Tim /7/5/6/8/
Dutch Peter /7/5/6/4/
Swedish Peter /7/5/4/6/
Swedish Tim /7/5/4/6/8/
If you're using SQL 2008 or higher, you might get better performance if you make the IDPath
a HierarchyID
, but you'd need to test with your real data.
EDIT
You might need to change the final select to:
SELECT DISTINCT
Class,
Student
FROM
cteRecursive
to deal with cases where there are multiple paths to the same record. For example, "Dutch/Peter", "Swedish/Peter" and "Swedish/Tim" all appear twice.