SELECT .. FROM (SELECT .. FROM ..). How can I improve this query?

I am learning SQL and looking through the chapters of a GalaXQL application.

I wrote the following query in response to the question "Hilight star (s) that has the highest orbiting planet in the galaxy. Remember to clear old hilights before starting."

This query returns the result starid: 23334, which when inserted into the hilight table allows me to progress. However, the program hangs for a long time before passing this result to me, so I suspect there is a much more efficient way to get this information from the database.

This request works but it takes a long time to process, how can I improve this request?

INSERT INTO hilight 
SELECT DISTINCT s.starid 
FROM planets AS p, stars AS s, moons AS m 
WHERE s.starid=(SELECT starid 
                FROM planets 
                WHERE orbitdistance=(SELECT MAX(orbitdistance) 
                                     FROM planets));

      

My logic for this query structure is

First, find the planet with the longest orbiting distance from the "planets" in the table.

The second is to compare the "highest orbital distance" value with the "orbit" field of the "planets" table and return the "stellar" value associated with that field.

The third is to compare the value of the "starid" field in the "planets" table with the "starid" field in the "stars" table, and then insert the "starid" into the "HiLight" table.

Data structure:

enter image description here

+3


source to share


3 answers


Ok, let's start by looking at your basic query. I know you already have an answer that works, but I feel like I need to explain to you what's going on in your request.

INSERT INTO hilight 
SELECT DISTINCT 
    s.starid 
FROM planets AS p, stars AS s, moons AS m 
WHERE s.starid=(
    SELECT 
        starid 
    FROM planets 
    WHERE orbitdistance=(
            SELECT 
                MAX(orbitdistance) 
            FROM planets
            )
        );

      

So, here in your proposal FROM

, you are also using a few old style unions. You are also missing a sentence ON

. This is called a CROSS JOIN , which will produce what we call a Cartesian product. The result set will show the number of rows in the first table multiplied by the number of rows in the second table, and so on. Etc.

So we can fix it simply by setting the join syntax and joining the relationships you have outlined in the diagram.

INSERT INTO hilight 
SELECT --DISTINCT probably no longer needed as we aren't artifficially creating duplicate rows
    s.starid 
FROM planets AS p
inner join stars AS s
    on s.StarID = p.starid 
inner join moons AS m 
    m.planetID = p.planetID
WHERE s.starid=(
    SELECT 
        starid 
    FROM planets 
    WHERE orbitdistance=(
            SELECT 
                MAX(orbitdistance) 
            FROM planets
            )
        );

      

In further analysis, you join the table wells, but do not use any of the data or limit your result set. This means that you do not get any benefit from this in your request and can be cut straight out.



INSERT INTO hilight 
SELECT --DISTINCT probably no longer needed as we aren't artifficially creating duplicate rows
    s.starid 
FROM planets AS p
inner join stars AS s
    on s.StarID = p.starid 
WHERE s.starid=(
    SELECT 
        starid 
    FROM planets 
    WHERE orbitdistance=(
            SELECT 
                MAX(orbitdistance) 
            FROM planets
            )
        );

      

Now that we look at your point WHERE

, it appears to be rather redundant. It seems to me that there is no reason to go to a planet twice to get a predicate when you can just match the maximum orbital distance to the table of planets. It also removes any reason to join the stars of the table.

INSERT INTO hilight 
SELECT 
    p.starid 
FROM planets AS p
WHERE p.orbitdistance=
    (
    SELECT 
        MAX(orbitdistance) 
    FROM planets
    )

      

The resulting query is much simpler and should run much faster if we don't create as many duplicate rows. I hope to shed some light on what's going on in your request.

UPDATE: On further inspection, this GalaSQL looks pretty awful and has massive outdated information, and I highly recommend not using it.

+7


source


You can remove the first selection so that it looks like ...

INSERT INTO hilight
SELECT DISTINCT p.starid 
FROM planets p
WHERE orbitdistance=(
    SELECT MAX(orbitdistance) 
    FROM planets)
 ;

      



You can also delete a report if you don't have a specific reason to enable it.

+1


source


you can also use CROSS Apply.

INSERT INTO hilight 
SELECT P.startid
FROM planets P
CROSS APPLY ( SELECT MAX(orbitdistance) as maxVal FROM planets ) T
WHERE P.orbitdistance = T.maxVal

      

0


source







All Articles