Find how many times a table must be traversed to reach a match in an Oracle database

I have this table in my Oracle 9i database:

CREATE TABLE involved_in
(
    rid number,
    aid number NOT NULL,
    fid number NOT NULL,
    role varchar(80),
    note varchar(80), 
    job varchar(35),
    PRIMARY KEY(rid),
    FOREIGN KEY(fid) REFERENCES production(pr_id),
    FOREIGN KEY(aid) REFERENCES person(pid)
);

      

It contains data about actors (aid) who worked in films (fid).

What I want to do is similar to working out a Bacon number, except that my Kevin Bacon is the person with aid 517635.

I have no idea how to work out (using SQL statements only) the number of people a given actor has to be linked through (via other aids) to reach a connection with 517635.

The result of the query would be either a list of all the people this actor has to be linked through to reach my target person, or just the number.

To do this, I figured I would first need to collect everyone who worked with 517635; they would all get the number 1, because they worked with him directly. The table is not huge, but it is big enough to make that approach impractical.

For example, say Brad Pitt is my 517635. He worked with Angelina Jolie in Mr. & Mrs. Smith, which makes her number 1. If Brad Pitt had never worked on any movie with Bruce Willis (let's say that is the case), but Angelina Jolie was in one with him, then Bruce Willis's number relative to Brad Pitt would be 2.

In my query, if the given actor was Angelina, the result would be "Brad Pitt 1" or just "1". If the given actor was Willis, the result would be "Angelina Jolie Brad Pitt 2" or just "2".

Example of what is in the table:

INSERT INTO involved_in(rid, aid, fid, role, note, job) VALUES(1, 33, 1584953, 'Himself', 'NULL', 'actor');
INSERT INTO involved_in(rid, aid, fid, role, note, job) VALUES(2, 1135, 1999660, 'Himself', 'NULL', 'actor');
INSERT INTO involved_in(rid, aid, fid, role, note, job) VALUES(3, 1135, 2465724, 'Himself', 'NULL', 'actor');
INSERT INTO involved_in(rid, aid, fid, role, note, job) VALUES(4, 6003, 2387806, 'Himself', '(archive footage)', 'actor');
INSERT INTO involved_in(rid, aid, fid, role, note, job) VALUES(5, 13011, 1935123, 'Himself', 'NULL', 'actor');

      

I am drawing a blank. I am completely new to SQL, and everything I can think of involves endless loops with a variable counting the iterations. Any ideas on where to start and, with luck, where to end up?


2 answers


This is difficult to verify without sample data, but the following answer is at least syntactically correct.

What you want here is an explicitly recursive query. Oracle has two ways to do this: recursive common table expressions (CTEs, i.e. a with clause), or the connect by clause. CTEs are standard SQL and connect by is proprietary, but I find connect by less confusing, so that's what I will use (also, recursive CTEs are not supported in 9i).

SELECT   aid, MIN (lvl) AS distance
FROM     (SELECT     ii2.aid, LEVEL AS lvl
          FROM       involved_in ii1
                     JOIN involved_in ii2
                        ON ii1.fid = ii2.fid AND ii1.aid <> ii2.aid
          START WITH ii1.aid = 517635
          CONNECT BY NOCYCLE ii1.aid = PRIOR ii2.aid)
GROUP BY aid

      

  • The join pairs every aid with each other aid that shares a fid with it.
  • The connect by clause links each set of joined aids to the next set of aids.
  • The nocycle clause prevents infinite recursion.
  • level is a pseudocolumn that gives us the number of times the recursion has occurred. We need the minimum, because any given aid can be connected to the starting aid through several paths.
  • If this query performs very poorly, you can restrict the results to a certain distance. The best way to do that is to add and level <= 100 to the connect by clause (in this case, limiting the results to a distance of 100 or less).
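
To make the points above concrete, here is a minimal sketch of the same idea outside Oracle. It is an illustration only, not the query above: it uses Python's built-in sqlite3 module and the standard recursive CTE form (which 9i lacks), on a hypothetical three-person data set where aid 1 stands for Brad Pitt, 2 for Angelina Jolie and 3 for Bruce Willis. The names costar, walk, distances and max_level are made up for the example.

```python
import sqlite3

# Hypothetical toy data as (rid, aid, fid):
# film 10 has aids 1 and 2, film 20 has aids 2 and 3.
ROWS = [(1, 1, 10), (2, 2, 10), (3, 2, 20), (4, 3, 20)]

QUERY = """
WITH RECURSIVE
  costar(a, b) AS (              -- self-join: pairs of aids sharing a fid
    SELECT ii1.aid, ii2.aid
    FROM involved_in ii1
    JOIN involved_in ii2 ON ii1.fid = ii2.fid AND ii1.aid <> ii2.aid
  ),
  walk(aid, lvl) AS (
    SELECT b, 1 FROM costar WHERE a = :start
    UNION                        -- UNION drops duplicate (aid, lvl) rows
    SELECT c.b, w.lvl + 1
    FROM walk w JOIN costar c ON c.a = w.aid
    WHERE w.lvl < :max_level     -- depth bound, in the spirit of level <= n
  )
SELECT aid, MIN(lvl) AS distance -- several paths may reach the same aid
FROM walk
WHERE aid <> :start
GROUP BY aid
ORDER BY aid
"""

def distances(rows, start, max_level):
    """Return [(aid, distance)] for everyone within max_level steps of start."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE involved_in ("
        "rid INTEGER PRIMARY KEY, aid INTEGER NOT NULL, fid INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO involved_in (rid, aid, fid) VALUES (?, ?, ?)", rows
    )
    result = conn.execute(
        QUERY, {"start": start, "max_level": max_level}
    ).fetchall()
    conn.close()
    return result

print(distances(ROWS, start=1, max_level=5))  # prints [(2, 1), (3, 2)]
```

In this sketch the w.lvl < :max_level condition is what stops the recursion: co-star pairs are symmetric, so without a bound the walk would keep bouncing between the same people at ever higher levels.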



As pointed out in the comments, 9i doesn't support nocycle. Also, when running this query, the OP ends up running out of temporary space. The two are not actually related: if a loop occurs, Oracle will throw an error, so this means the server is running out of temporary space before the loop is found. I don't see any good way to get around these problems.
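
A rough illustration of why temporary space can run out, under the assumption that a hierarchical walk produces one row per path rather than one row per person. The sketch below is plain Python, not Oracle, and the data is a hypothetical worst case: 8 people who all appeared in the same film. It counts the cycle-free paths that start at one person and compares that with the number of distinct people reachable.

```python
from itertools import combinations

def count_simple_paths(neighbours, start):
    """Count every cycle-free path that begins at `start`."""
    def walk(node, seen):
        total = 0
        for nxt in neighbours[node]:
            if nxt not in seen:
                # One path ends at nxt, plus all paths that continue from it.
                total += 1 + walk(nxt, seen | {nxt})
        return total
    return walk(start, {start})

# Hypothetical worst case: everyone worked with everyone else.
people = range(8)
neighbours = {p: set() for p in people}
for a, b in combinations(people, 2):
    neighbours[a].add(b)
    neighbours[b].add(a)

print(len(neighbours) - 1)                # prints 7 (distinct co-stars)
print(count_simple_paths(neighbours, 0))  # prints 13699 (paths to them)
```

Seven people are reached through thousands of paths, and the gap widens quickly as casts get larger, which is consistent with the intermediate result growing far beyond the size of the table.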

You can specify the end point (to some extent). You can add AND ii1.aid <> 2, where 2 is the aid of the end point, to the on clause. This will cause the query to stop traversing a branch when it encounters this value. However, this may not help with the aforementioned problems, as it will only short-circuit those branches where the supplied value is found. The query still has to evaluate all other branches in case the value is somewhere in them.
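
For comparison, here is a hedged sketch of a search that does stop at the end point. It is a different technique from connect by: a plain breadth-first search in Python, run outside the database on (aid, fid) pairs that are assumed to have been fetched from involved_in. The function name shortest_chain is made up, the names stand in for aid and fid values, and "film X" is a placeholder for the film from the question's example that Angelina Jolie and Bruce Willis share.

```python
from collections import deque

def shortest_chain(credits, start, target):
    """Return the chain from target back to start, or None if unconnected.

    credits is an iterable of (aid, fid) pairs.
    """
    cast, films = {}, {}
    for aid, fid in credits:
        cast.setdefault(fid, set()).add(aid)
        films.setdefault(aid, set()).add(fid)

    parent = {start: None}   # doubles as the "already visited" set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:            # stop as soon as the target is reached
            chain = []
            while current is not None:
                chain.append(current)
                current = parent[current]
            return chain
        for fid in films.get(current, ()):
            for costar in cast[fid]:
                if costar not in parent:  # each person is expanded only once
                    parent[costar] = current
                    queue.append(costar)
    return None

credits = [
    ("Brad Pitt", "Mr. & Mrs. Smith"),
    ("Angelina Jolie", "Mr. & Mrs. Smith"),
    ("Angelina Jolie", "film X"),
    ("Bruce Willis", "film X"),
]
chain = shortest_chain(credits, "Brad Pitt", "Bruce Willis")
print(chain[1:], len(chain) - 1)  # prints ['Angelina Jolie', 'Brad Pitt'] 2
```

Because every person is visited at most once and the search proceeds one distance at a time, the first time the target is reached is also the shortest connection, which matches the "Angelina Jolie Brad Pitt 2" output asked for in the question.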



The best approach is to use hierarchical queries.

Oracle (even in version 9i) has a CONNECT BY clause. http://docs.oracle.com/cd/B19306_01/server.102/b14200/queries003.htm

Combined with START WITH and LEVEL, this becomes ridiculously easy.



Example:

SELECT last_name, employee_id, manager_id, LEVEL
FROM employees
START WITH employee_id = 100
CONNECT BY PRIOR employee_id = manager_id
ORDER SIBLINGS BY last_name;

      
