Avoid Nested Loops in PostgreSQL

See request below

Select count(*) FROM
(Select distinct Student_ID, Name, Student_Age, CourseID from student) a1
JOIN
(Select distinct CourseID, CourseName, TeacherID from courses) a2
ON a1.CourseID=a2.CourseID
JOIN 
(Select distinct TeacherID, TeacherName, Teacher_Age from teachers) a3
ON a2.TeacherID=a3.TeacherID

      

Subqueries should be used for deduplication purposes.

This query works fine in PostgreSQL. However, if I add a condition between the student and teacher table according to the execution plan, Postgres will mistakenly nest a loop, join the student, and examine tables that have no direct relationship. For example:

Select count(*) FROM
(Select distinct Student_ID, Name, Student_Age, CourseID from student) a1
JOIN
(Select distinct CourseID, CourseName, TeacherID from courses) a2
ON a1.CourseID=a2.CourseID
JOIN 
(Select distinct TeacherID, TeacherName, Teacher_Age from teachers) a3 ON
 a2.TeacherID=a3.TeacherID
WHERE Teacher_Age>=Student_Age

      

This request will be executed forever. However, if I replace the subqueries with tables, it works very quickly. Without using temporary tables to store the deduplication result, is there a way to avoid the nested loop in this situation?

Thank you for your help.

+3


source to share


1 answer


You are making a database to do a lot of unnecessary work to achieve your goal. Instead of doing 3 different SELECT DISTINCT subqueries all concatenated with each other, try connecting the base tables directly to each other and letting it handle the DISTINCT part only once. If you have tables with appropriate indexes on the ID fields, this should be pretty quick.



SELECT COUNT(1)
    FROM (
    SELECT DISTINCT s.Student_ID, c.CourseID, t.TeacherID
        FROM student s
        JOIN courses c ON s.CourseID = c.CourseID
        JOIN teachers t ON c.TeacherID = t.TeacherID
        WHERE t.Teacher_Age >= s.StudentAge
     ) a

      

0


source







All Articles