Creating all possible combinations of a schedule using SQL Query

I have an awkward SQL conundrum that has got the better of me.

I am trying to create a list of possible student block configurations so that I can fit their choices into the schedule. The list of possible qualifications and blocks for a student may be as follows:

Biology A
Biology C 
Biology D 
Biology E 

Chemistry B 
Chemistry C 
Chemistry D 
Chemistry E 
Chemistry F 

Computing D
Computing F 

Tutorial A 
Tutorial B 
Tutorial E 

      

A possible set of blocks for the student could be:

Biology D
Chemistry C 
Computing F 
Tutorial E 

      

How can I query the above dataset to generate all possible combinations of lessons and blocks for a student? I could then prune the list by deleting the combinations that collide and pick one that works. I believe that in this case there will be about 120 combinations in total.

I imagine it would be something like a cross join. I've tried all sorts of solutions using windowing functions, CROSS APPLY and so on, but they all had some kind of flaw. They all tend to get confused because each student has a different number of courses and each course has a different number of blocks.

Cheers for any help you can offer! I can post the gnarled mess of a query I have, if needed!

Alex

+3




5 answers


With a fixed number of qualifications, the answer is relatively simple: the CROSS JOIN option from the other answers will work just fine.

However, if the number of qualifications is unknown or may change in the future, hard-coding four CROSS JOIN operators will not work. In this case, the answer becomes more complex.

For a small number of rows, you can use a variation of this answer on the DBA site that uses powers of two and bitwise comparisons to generate combinations. However, this will be limited to a very small number of rows.
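As a rough illustration of the powers-of-two idea (this is not the linked answer's code, and it is written in Python rather than SQL), every integer from 0 to 2^N - 1 can be read as a bitmask over the N rows, keeping only the masks with exactly M bits set. The name `combinations_by_bitmask` and the sample rows are made up for this sketch. The work grows as 2^N, which is why the approach only suits a very small number of rows.

```python
def combinations_by_bitmask(items, pick):
    """Yield every `pick`-sized combination of `items` using bitmasks."""
    n = len(items)
    for mask in range(1 << n):              # one mask per subset of the n rows
        if bin(mask).count("1") != pick:    # keep only subsets of the right size
            continue
        # Bit i of the mask decides whether row i is part of the combination.
        yield tuple(items[i] for i in range(n) if mask & (1 << i))


rows = ["Biology A", "Biology C", "Chemistry B", "Computing D"]
pairs = list(combinations_by_bitmask(rows, 2))
print(len(pairs))  # 6, i.e. "4 choose 2"
```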

For more rows, you can use a function that generates every combination of "M" numbers from "N" rows. You can then join these values to a ROW_NUMBER calculated over the source data to get back the original rows.

The function for generating combinations can be written in TSQL, but it would be wiser to use SQLCLR if possible:

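using System.Collections;
using System.Collections.Generic;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

// Both methods below must be members of a public class in the SQLCLR assembly.
[SqlFunction(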
    DataAccess = DataAccessKind.None,
    SystemDataAccess = SystemDataAccessKind.None,
    IsDeterministic = true,
    IsPrecise = true,
    FillRowMethodName = "FillRow",
    TableDefinition = "CombinationId bigint, Value int"
)]
public static IEnumerable Combinations(SqlInt32 TotalCount, SqlInt32 ItemsToPick)
{
    if (TotalCount.IsNull || ItemsToPick.IsNull) yield break;

    int totalCount = TotalCount.Value;
    int itemsToPick = ItemsToPick.Value;
    if (0 >= totalCount || 0 >= itemsToPick) yield break;

    long combinationId = 1;
    var result = new int[itemsToPick];
    var stack = new Stack<int>();
    stack.Push(0);

    while (stack.Count > 0)
    {
        int index = stack.Count - 1;
        int value = stack.Pop();

        while (value < totalCount)
        {
            result[index++] = value++;
            stack.Push(value);

            if (index == itemsToPick)
            {
                for (int i = 0; i < result.Length; i++)
                {
                    yield return new KeyValuePair<long, int>(
                        combinationId, result[i]);
                }

                combinationId++;
                break;
            }
        }
    }
}

public static void FillRow(object row, out long CombinationId, out int Value)
{
    var pair = (KeyValuePair<long, int>)row;
    CombinationId = pair.Key;
    Value = pair.Value;
}

      

(Based on this function.)
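The stack-based enumeration in the SQLCLR function above may be easier to follow as a short Python port. This is only a sketch for tracing the logic, not something SQL Server can call; the name `combinations` and the sample arguments are chosen here for illustration. It yields the same (CombinationId, Value) pairs, with zero-based values.

```python
def combinations(total_count, items_to_pick):
    """Yield (combination_id, value) pairs for every combination of
    `items_to_pick` numbers drawn from range(total_count)."""
    if total_count <= 0 or items_to_pick <= 0:
        return

    combination_id = 1
    result = [0] * items_to_pick   # the combination currently being built
    stack = [0]                    # next candidate value for each position

    while stack:
        index = len(stack) - 1     # position in `result` to fill next
        value = stack.pop()

        while value < total_count:
            result[index] = value
            index += 1
            value += 1
            stack.append(value)    # remember where to resume this position

            if index == items_to_pick:
                # A full combination is ready: emit one row per member.
                for member in result:
                    yield combination_id, member
                combination_id += 1
                break


pairs = list(combinations(4, 2))
print(len(pairs))  # 12 rows: 6 combinations of 2 values each
```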

Once the function is in place, creating a list of valid combinations is pretty straightforward:



DECLARE @Blocks TABLE 
(
    Qualification varchar(10) NOT NULL, 
    Block char(1) NOT NULL, 
    UNIQUE (Qualification, Block)
);

INSERT INTO @Blocks 
VALUES
    ('Biology', 'A'),
    ('Biology', 'C'), 
    ('Biology', 'D'), 
    ('Biology', 'E'),
    ('Chemistry', 'B'), 
    ('Chemistry', 'C'), 
    ('Chemistry', 'D'), 
    ('Chemistry', 'E'), 
    ('Chemistry', 'F'), 
    ('Computing', 'D'),
    ('Computing', 'F'), 
    ('Tutorial', 'A'), 
    ('Tutorial', 'B'), 
    ('Tutorial', 'E') 
;

DECLARE @Count int, @QualificationCount int;

SELECT
    @Count = Count(1),
    @QualificationCount = Count(DISTINCT Qualification)
FROM
    @Blocks
;

WITH cteNumberedBlocks As
(
    SELECT
        ROW_NUMBER() OVER (ORDER BY Qualification, Block) - 1 As RowNumber,
        Qualification,
        Block
    FROM
        @Blocks
),
cteAllCombinations As
(
    SELECT
        C.CombinationId,
        B.Qualification,
        B.Block
    FROM
        dbo.Combinations(@Count, @QualificationCount) As C
        INNER JOIN cteNumberedBlocks As B
        ON B.RowNumber = C.Value
),
cteMatchingCombinations As
(
    SELECT
        CombinationId
    FROM
        cteAllCombinations
    GROUP BY
        CombinationId
    HAVING
        Count(DISTINCT Qualification) = @QualificationCount
    And
        Count(DISTINCT Block) = @QualificationCount
)
SELECT
    DENSE_RANK() OVER(ORDER BY C.CombinationId) As CombinationNumber,
    C.Qualification,
    C.Block
FROM
    cteAllCombinations As C
    INNER JOIN cteMatchingCombinations As MC
    ON MC.CombinationId = C.CombinationId
ORDER BY
    CombinationNumber,
    Qualification
;

      

This query will generate a list of 172 rows representing 43 valid combinations:

1  Biology    A
1  Chemistry  B
1  Computing  D
1  Tutorial   E

2  Biology    A
2  Chemistry  B
2  Computing  F
2  Tutorial   E
...
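The same "pick M of N rows, then filter" logic can be cross-checked outside the database. The sketch below uses Python's `itertools.combinations` in place of `dbo.Combinations` and set sizes in place of the two `Count(DISTINCT ...)` conditions; the variable names are invented for this example.

```python
from itertools import combinations

# Same rows as @Blocks, sorted like ORDER BY Qualification, Block.
blocks = sorted([
    ("Biology", "A"), ("Biology", "C"), ("Biology", "D"), ("Biology", "E"),
    ("Chemistry", "B"), ("Chemistry", "C"), ("Chemistry", "D"),
    ("Chemistry", "E"), ("Chemistry", "F"),
    ("Computing", "D"), ("Computing", "F"),
    ("Tutorial", "A"), ("Tutorial", "B"), ("Tutorial", "E"),
])
qualification_count = len({q for q, _ in blocks})

# Keep a combination only if it covers every qualification exactly once
# and no block letter is used twice (the HAVING clause in the query).
valid = [
    combo
    for combo in combinations(blocks, qualification_count)
    if len({q for q, _ in combo}) == qualification_count
    and len({b for _, b in combo}) == qualification_count
]

print(len(valid), len(valid) * qualification_count)  # 43 172
```

Out of the 1001 ways to pick 4 of the 14 rows, only 43 survive the filter, which matches the row count quoted above.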

      


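If you want the TSQL version of the Combinations function, it is below. Note that this version names its second column ItemNumber rather than Value, so the join in the query above becomes B.RowNumber = C.ItemNumber: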

CREATE FUNCTION dbo.Combinations
(
    @TotalCount int,
    @ItemsToPick int
)
Returns @Result TABLE
(
    CombinationId bigint NOT NULL,
    ItemNumber int NOT NULL,
    Unique (CombinationId, ItemNumber)
)
As
BEGIN
DECLARE @CombinationId bigint;
DECLARE @StackPointer int, @Index int, @Value int;
DECLARE @Stack TABLE 
( 
    ID int NOT NULL Primary Key,
    Value int NOT NULL
);
DECLARE @Temp TABLE
(
    ID int NOT NULL Primary Key,
    Value int NOT NULL Unique
);

    SET @CombinationId = 1;

    SET @StackPointer = 1;
    INSERT INTO @Stack (ID, Value) VALUES (1, 0);

    WHILE @StackPointer > 0
    BEGIN
        SET @Index = @StackPointer - 1;
        DELETE FROM @Temp WHERE ID >= @Index;

        -- Pop:
        SELECT @Value = Value FROM @Stack WHERE ID = @StackPointer;
        DELETE FROM @Stack WHERE ID = @StackPointer;
        SET @StackPointer -= 1;

        WHILE @Value < @TotalCount
        BEGIN
            INSERT INTO @Temp (ID, Value) VALUES (@Index, @Value);
            SET @Index += 1;
            SET @Value += 1;

            -- Push:
            SET @StackPointer += 1;
            INSERT INTO @Stack (ID, Value) VALUES (@StackPointer, @Value);

            If @Index = @ItemsToPick
            BEGIN
                INSERT INTO @Result (CombinationId, ItemNumber)
                SELECT @CombinationId, Value
                FROM @Temp;

                SET @CombinationId += 1;
                SET @Value = @TotalCount;
            END;
        END;
    END;

    Return;
END

      

It's pretty much the same as the SQLCLR version, except for the fact that TSQL doesn't have stacks or arrays, so I had to fake them with table variables.

+5




One giant cross join?

select * from tablea,tableb,tablec,tabled

      

This actually works for what you need, where tablea holds the biology rows, tableb chemistry, tablec computing, and tabled tutorial. You can make the joins explicit:

select * from tablea cross join tableb cross join tablec cross join tabled

      



Technically both statements are the same: each one is a cross join. The comma version above is simpler, but in more complex queries you will want the second form so that it is very clear where you are cross joining as opposed to inner or left joining.

You can replace the tables with SELECT ... UNION ALL subqueries to specify the values you are looking for inline in the query:

select * from
(select 'Biology' as course, 'a' as class union all select 'Biology', 'c' union all select 'Biology', 'd' union all select 'Biology', 'e') a cross join
(select 'Chemistry' as course, 'b' as class union all select 'Chemistry', 'c' union all select 'Chemistry', 'd' union all select 'Chemistry', 'e' union all select 'Chemistry', 'f') b cross join
(select 'Computing' as course, 'd' as class union all select 'Computing', 'f') c cross join
(select 'Tutorial' as course, 'a' as class union all select 'Tutorial', 'b' union all select 'Tutorial', 'e') d

      

There are 120 results (4 * 5 * 2 * 3), before removing any combinations that use the same block twice.
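The row count can be checked with a small script. This sketch runs the cross join in an in-memory SQLite database through Python's `sqlite3` module rather than on SQL Server, and it assumes a single hypothetical table `blocks(course, class)` holding the question's data instead of four separate tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocks (course TEXT NOT NULL, class TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO blocks VALUES (?, ?)",
    [("Biology", b) for b in "ACDE"]
    + [("Chemistry", b) for b in "BCDEF"]
    + [("Computing", b) for b in "DF"]
    + [("Tutorial", b) for b in "ABE"],
)

# One derived table per course, cross joined exactly as in the answer.
total = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT class FROM blocks WHERE course = 'Biology') a
    CROSS JOIN (SELECT class FROM blocks WHERE course = 'Chemistry') b
    CROSS JOIN (SELECT class FROM blocks WHERE course = 'Computing') c
    CROSS JOIN (SELECT class FROM blocks WHERE course = 'Tutorial') d
""").fetchone()[0]

print(total)  # 120
```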

0




This doesn't sound like a problem; does this SQL Fiddle work?

0




You should be able to do this by simply joining the table to itself, with each alias filtered to a single course type, so you don't get BIO, BIO, BIO, BIO or BIO, CHEM, BIO, BIO, etc.

select
      b.course as BioCourse,
      c.course as ChemCourse,
      co.course as CompCourse,
      t.course as Tutorial
  from
      YourTable b,
      YourTable c,
      YourTable co,
      YourTable t
  where
          b.course like 'Biology%'
      AND c.course like 'Chemistry%'
      AND co.course like 'Computing%'
      AND t.course like 'Tutorial%'

      

0




Let's use a model where Table1 is Biology, Table2 is Chemistry, Table3 is Computing, and Table4 is Tutorial. Each table has one column, which holds the possible blocks for that course. To get all the possible combinations, we take the Cartesian product of all the tables and then filter out the rows that have duplicate letters.

Each column in the final result represents its respective course. This means that column 1 in the final result will be the block letter for Biology, which is Table1.

So the SQL for the answer will look something like this.

SELECT * FROM Table1,Table2,Table3,Table4 
WHERE col1 != col2
AND col1 != col3
AND col1 != col4
AND col2 != col3
AND col2 != col4
AND col3 != col4;

      

Note: it is trivial to extend this to the case where each table has two columns, the first being the subject and the second the block. Only the WHERE clause needs to change, but ignoring that case makes the code much easier to follow.

It's a little verbose, but it works if each student needs to have a class from each of the tables and the maximum number of classes is 4. This solution breaks down if the student is not supposed to have exactly 4 classes.

The exact SQL query may differ slightly depending on the database used. For example, != may need to be written as <>.
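Since the number of pairwise conditions grows quickly with the number of courses, one option is to generate the WHERE clause instead of typing it. The sketch below does that in Python and runs the result against an in-memory SQLite database rather than SQL Server; the `choices` dictionary and the loop that creates Table1..Table4 are assumptions made so the example is self-contained.

```python
import sqlite3
from itertools import combinations

# Possible blocks per course, taken from the question.
choices = {
    "Biology": "ACDE",
    "Chemistry": "BCDEF",
    "Computing": "DF",
    "Tutorial": "ABE",
}

conn = sqlite3.connect(":memory:")
for n, blocks in enumerate(choices.values(), start=1):
    conn.execute(f"CREATE TABLE Table{n} (col{n} TEXT NOT NULL)")
    conn.executemany(f"INSERT INTO Table{n} VALUES (?)", [(b,) for b in blocks])

count = len(choices)
tables = ", ".join(f"Table{n}" for n in range(1, count + 1))
# One "colI <> colJ" test per unordered pair of tables.
conditions = " AND ".join(
    f"col{i} <> col{j}" for i, j in combinations(range(1, count + 1), 2)
)
query = f"SELECT * FROM {tables} WHERE {conditions}"

rows = conn.execute(query).fetchall()
print(len(rows))  # 43 collision-free schedules
```

Adding or removing an entry in `choices` changes the number of tables and conditions without any further edits, which is the part the hand-written query cannot do.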

Hope this helps!

0

