SQL Server - replace OR when JOINING on non-indexed field

History:

One of our clients has a system in which they have records of telephone conversations between their agents and their clients, with whom they create various contracts. The records are stored on the server with their locations stored in the records table in the database. Agents can then "attach" the entry to the contract, which creates an entry in the ContractRecordings table. I need to create a report that shows which records are not contract bound, but the way the tables are created makes this more complicated than expected.

Recording
------------------------------
recordingId : INT PK, IDENTITY
agentId : INT, FK
filename : NVARCHAR(255)


ContractRecording
----------------------------------
recordingId : INT PK, IDENTITY
contractNumber : INT
created : DATETIME
username : NVARCHAR(20)
note : NVARCHAR(MAX)
fileLocation : NVARCHAR(max)

      

It would be easy if ContractRecording.recordingId was a foreign key reference for Recording.recordingId, but it isn't. This is his own identification key. The only link between the tables is the location of the file, but Recording.filename only stores the file name, while ContractRecording.fileLocation stores the full path. Yes, I know, but I did not design these tables. Fortunately, there is a pattern, and the full path is derived from the agent name and record date, which we can learn from the data in the record table. But of course there is another problem: the file path format changed a year ago, and some records are stored in the old format and some in the new format.

Old format: C: \ John-Recordings \ 2015 \ 06 \ 15-0811.wav

New format: C: \ Recordings \ John Smith \ 2015 \ 06 \ 15-0811.wav

Problem:

To link two tables, I have to join them in the full path of records, which must be manually created in the Recording table and can be in one of two formats. I first tried using OR in the JOIN clause, but it takes about 8 minutes to go back about 15k lines, which is not acceptable. Then I tried to use two LEFT OUTER JOINs - one for each condition, but what seemed to be the same took ten minutes. I am assuming that since I am joining a custom field that is not indexed. Splitting it into two SELECTs and using UNION resulted in duplicate rows, each query would return one row for each record. Do I have any other options to get this request up to a few seconds? Here's my original query using the OR clause.

SELECT * FROM
    (SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathnew"
        ,'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathold"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathold OR cr.filelocation = rec.fullpathnew) main
ORDER BY main.name, main.recordtime

      

The report should display one row for all records in the record table (unless only one record is tied to multiple contracts, in which case it should show one row per pair), while data from the Contract record is displayed if there are any rows which correspond to one of the file formats.

If absolutely necessary, I don't mind just fetching all the data from both tables and linking them using code, but that would be the last resort.

UPDATE:

As requested, here is the parse version of the UNION query. As mentioned, it returns two strings for each pairing - one with data and one without. This is because at least one of the two JOINs will not always have a match, but I just want to ignore them when the other JOIN has a match. If neither JOIN matches, I also want to display it once. I'm less sure that I can achieve the desired result with UNION than with the other capabilities, so I didn't pursue this approach.

SELECT * FROM
    ((SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathold"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathold)
    UNION
    (SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathnew"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathnew)) main
ORDER BY main.name, main.recordtime

      

+3


source to share


1 answer


You can try using LIKE

WITH AgentRecordings AS
(
    SELECT  
        a.name,
        r.recordingId AS rawrecordingid,
        r.filename,
        r.recordtime,
        CONCAT(
            'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + FILENAME,
            'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
        ) AS filepaths
    FROM
        Agents a
        JOIN Recording r ON a.agentId = r.agentId
)
SELECT
    cr.recordingid AS "attachedrecordingid"
    ,rec.recordingid AS "rawrecordingid"
    ,cr.contractnumber
    ,cr.created
    ,rec.name
    ,cr.note
    ,cr.filelocation
    ,rec.filename
    ,rec.recordtime
FROM
    AgentRecordings rec
    LEFT JOIN ContractRecording cr ON rec.filepaths LIKE '%' + cr.filelocation + '%'

      

If it helps anyone. I will also try to create a temp table instead of using cte and see if that helps more.

You can also try splitting the two OR operators by 2 cte and using concatenation to concatenate the found record id



WITH fullpathnew AS
(
    SELECT  cr.recordingid AS "attachedrecordingid",
            rec.recordingid AS "rawrecordingid",
            cr.contractnumber,
            cr.created,
            cr.note,
            cr.filelocation
    FROM    Agents a
            JOIN Recording r ON a.agentId = r.agentId
            JOIN ContractRecording cr ON cr.filelocation = 'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
),
fullpathold AS
(
    SELECT  cr.recordingid AS "attachedrecordingid",
            rec.recordingid AS "rawrecordingid",
            cr.contractnumber,
            cr.created,
            cr.note,
            cr.filelocation
    FROM    Agents a
            JOIN Recording r ON a.agentId = r.agentId
            JOIN ContractRecording cr ON cr.filelocation = 'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
)
combinedCtes AS
(
    SELECT attachedrecordingid, rawrecordingid, contractnumber, created, note, filelocation FROM fullpathnew
    UNION SELECT attachedrecordingid, rawrecordingid, contractnumber, created, note, filelocation FROM fullpathold
)
SELECT  cte.attachedrecordingid
        ,r.recordingid AS "rawrecordingid"
        ,cte.contractnumber
        ,cte.created
        ,a.name
        ,cte.note
        ,cte.filelocation
        ,r.filename
        ,r.recordtime
FROM    Agents a
        JOIN Recording r ON r.agentId = a.agentId
        LEFT JOIN combinedCtes cte ON r.recordingid = cte.rawrecordingid

      

Yours UNION

should be in the pick and then you can join that subquery

SELECT  j.attachedrecordingid
        ,r.recordingid AS rawrecordingid
        ,j.contractnumber
        ,j.created
        ,a.NAME
        ,j.note
        ,j.filelocation
        ,r.filename      
        ,r.recordtime
FROM    Agents a
        JOIN Recording r ON a.agentId = r.agentId
        LEFT JOIN(
            SELECT  cr.recordingid AS "attachedrecordingid"
                    ,rec.recordingid AS "rawrecordingid"
                    ,cr.contractnumber
                    ,cr.created
                    ,cr.note
                    ,cr.filelocation
            FROM    Agents a 
                    JOIN Recording r
                    JOIN ContractRecording cr 
                        ON cr1.filelocation = 'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
            UNION
            SELECT  cr.recordingid AS "attachedrecordingid"
                    ,rec.recordingid AS "rawrecordingid"
                    ,cr.contractnumber
                    ,cr.created
                    ,cr.note
                    ,cr.filelocation
            FROM    Agents a 
                    JOIN Recording r
                    JOIN ContractRecording cr 
                        ON cr1.filelocation = 'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename


        ) j ON r.recordingId = j.rawrecordingid
ORDER BY a.name, r.recordtime

      

+2


source







All Articles