Need help with SQL query to retrieve data

The SQL query I wrote returns 3677 rows, and the CustomerID field contains a lot of duplicates. I want a query that returns all the required fields with each CustomerID appearing only once. We cannot use DISTINCT on CustomerID alone, because the other fields hold different data. Please help me with this query:

  SELECT TimeMark,
        CustomerID,
        AccountId,
        TargetURL
    FROM BTILog
    WHERE timemark BETWEEN '20140926 00:00:00'
            AND '20141020 23:59:59'
        AND TargetURL LIKE '%/api/v1/cust/details%'
        AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
    ORDER BY TimeMark DESC

      

Please have a look at the data I am getting from the query above. There you can see a duplicate CustomerID, highlighted by the blue line. I want customer 57155299 to appear only once, with its last TimeMark. Similarly, any other customer who appears twice, three times, and so on should appear only once in my data extract.

+3




13 replies


You have to use GROUP BY together with an aggregate function on TimeMark. The following returns the unique customer ID entries, each with its latest timestamp:



SELECT max(TimeMark) TimeMark,
        CustomerID,
        AccountId,
        TargetURL
    FROM BTILog
    WHERE timemark BETWEEN '20140926 00:00:00'
            AND '20141020 23:59:59'
        AND TargetURL LIKE '%/api/v1/cust/details%'
        AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
    GROUP BY CustomerID, AccountId,TargetURL
    ORDER BY TimeMark DESC
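To see what this grouping returns, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. Only the table and column names come from the thread; the sample rows, the ISO-style date strings, and the `LastTimeMark` alias are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample rows; ISO-formatted text dates sort correctly as strings.
sample_rows = [
    ("2014-10-01 09:00:00", 57155299, 1, "/api/v1/cust/details"),
    ("2014-10-05 12:30:00", 57155299, 1, "/api/v1/cust/details"),
    ("2014-10-03 08:15:00", 40000001, 2, "/api/v1/cust/details"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BTILog ("
    "TimeMark TEXT, CustomerID INTEGER, AccountId INTEGER, TargetURL TEXT)"
)
conn.executemany("INSERT INTO BTILog VALUES (?, ?, ?, ?)", sample_rows)

# One row per (CustomerID, AccountId, TargetURL) group, carrying the latest TimeMark.
latest = conn.execute(
    """
    SELECT MAX(TimeMark) AS LastTimeMark, CustomerID, AccountId, TargetURL
    FROM BTILog
    GROUP BY CustomerID, AccountId, TargetURL
    ORDER BY LastTimeMark DESC
    """
).fetchall()

for row in latest:
    print(row)
```

With this sample data, customer 57155299 comes back once, with the 2014-10-05 timestamp. Note that the grouping is per combination of CustomerID, AccountId and TargetURL, so a customer whose rows differ in AccountId or TargetURL would still come back more than once.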

      

+1




I believe that ranking your results will give you the desired output. Something like this:

select TimeMark,
        CustomerID,
        AccountId,
        TargetURL
    from (
        select TimeMark,
            CustomerID,
            AccountId,
            TargetURL,
            rank() over (
                partition by CustomerID order by TimeMark desc
                ) rank_
        from BTILog
        where timemark between '20140926 00:00:00'
                and '20141020 23:59:59'
            and TargetURL like '%/api/v1/cust/details%'
            and Class like 'com.btfin.security.sso.SSODetailsFactory%'
        ) ranked
    where rank_ = 1;

      

You can also include AccountId in the partition by clause if that's your desired output.



rank() over (
                    partition by CustomerID, AccountId order by TimeMark desc
                    ) rank_

      

Additionally, if there are rows with the same CustomerID, AccountId, and TimeMark, you can use row_number() instead of rank(), which will arbitrarily assign the higher rank to one of the tied rows.
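The difference between the two functions on tied rows can be checked with a small sketch like the one below. It uses Python's sqlite3 module rather than the poster's database, and assumes an SQLite build with window function support (3.25 or later). The sample rows and the `top_rows` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BTILog (TimeMark TEXT, CustomerID INTEGER, AccountId INTEGER)"
)
conn.executemany(
    "INSERT INTO BTILog VALUES (?, ?, ?)",
    [
        ("2014-10-05 12:30:00", 57155299, 1),
        ("2014-10-05 12:30:00", 57155299, 1),  # exact tie on the latest TimeMark
        ("2014-10-01 09:00:00", 57155299, 1),
    ],
)


def top_rows(window_function):
    """Return the rows ranked first per (CustomerID, AccountId).

    window_function is expected to be either 'rank' or 'row_number'.
    """
    sql = f"""
        SELECT TimeMark, CustomerID, AccountId
        FROM (
            SELECT TimeMark, CustomerID, AccountId,
                   {window_function}() OVER (
                       PARTITION BY CustomerID, AccountId
                       ORDER BY TimeMark DESC
                   ) AS rank_
            FROM BTILog
        ) AS ranked
        WHERE rank_ = 1
    """
    return conn.execute(sql).fetchall()


with_rank = top_rows("rank")
with_row_number = top_rows("row_number")
print(len(with_rank), len(with_row_number))  # prints "2 1"
```

rank() gives both tied rows the value 1, so both survive the filter, while row_number() numbers them 1 and 2 and keeps only one.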

0




Try this:

Select distinct CustomerId, AccountId, TargetURL,
                Max(TimeMark) over (partition by CustomerId) as MaxTM
  FROM BTILog
    WHERE timemark BETWEEN '20140926 00:00:00'
            AND '20141020 23:59:59'
        AND TargetURL LIKE '%/api/v1/cust/details%'
        AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
    -- no GROUP BY here: it is not needed, and the windowed Max(TimeMark) would not be valid with it
    ORDER BY MaxTM DESC

      

0




Every selected field must either appear in the GROUP BY clause or be aggregated (as with MAX, SUM, COUNT, etc.). In your case it looks like you need to group by CustomerID, AccountId and possibly TargetURL (group by it if it is always the same for a customer; if not, maybe use MAX(TargetURL)), and decide what you want to do with TimeMark. MAX(TimeMark), possibly?

You can try one of the following, depending on the nature of the TargetURL:

SELECT MAX(TimeMark) AS TimeMark, CustomerID, AccountId, TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
    AND '20141020 23:59:59'
    AND TargetURL LIKE '%/api/v1/cust/details%'
    AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
GROUP BY CustomerID, AccountId, TargetURL
ORDER BY TimeMark DESC

or

SELECT MAX(TimeMark) AS TimeMark, CustomerID, AccountId, MAX(TargetURL) AS TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
    AND '20141020 23:59:59'
    AND TargetURL LIKE '%/api/v1/cust/details%'
    AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
GROUP BY CustomerID, AccountId
ORDER BY TimeMark DESC
0




You need to use a subquery that returns the maximum TimeMark per CustomerID based on your criteria, then join back to the table on CustomerID and TimeMark. This limits the result to the records you want.

SELECT BTILog.TimeMark, BTILog.CustomerID, BTILog.AccountID, BTILog.TargetUrl
FROM BTILog
INNER JOIN (
  SELECT Max(BTILog.TimeMark) AS MaxOfTimeMark, BTILog.CustomerID
  FROM BTILog
  WHERE (((BTILog.TargetUrl) Like '%/api/v1/cust/details%')
    AND ((BTILog.Class) Like  'com.btfin.security.sso.SSODetailsFactory%')
    AND ((BTILog.TimeMark) BETWEEN '20140926 00:00:00' AND '20141020 23:59:59'))
  GROUP BY BTILog.CustomerID) AS T1
ON (BTILog.CustomerID = T1.CustomerID)
  AND (BTILog.TimeMark = T1.MaxOfTimeMark)
ORDER by BTILog.TimeMark DESC

      

I usually work in the MS Access query editor, so the syntax there is different. I think I changed everything that could cause it to "crash".
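As a rough check of the join-on-maximum idea, the sketch below runs a trimmed version of the query (without the WHERE filters) against an in-memory SQLite database from Python. The sample rows, including the differing AccountId and TargetURL values, are made up for illustration; only the table and column names are taken from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BTILog ("
    "TimeMark TEXT, CustomerID INTEGER, AccountId INTEGER, TargetURL TEXT)"
)
conn.executemany(
    "INSERT INTO BTILog VALUES (?, ?, ?, ?)",
    [
        ("2014-10-01 09:00:00", 57155299, 1, "/api/v1/cust/details?a=1"),
        ("2014-10-05 12:30:00", 57155299, 2, "/api/v1/cust/details?a=2"),
        ("2014-10-03 08:15:00", 40000001, 3, "/api/v1/cust/details"),
    ],
)

# The derived table finds the latest TimeMark per customer; joining back on
# both CustomerID and TimeMark recovers the complete row that carries it.
latest_rows = conn.execute(
    """
    SELECT l.TimeMark, l.CustomerID, l.AccountId, l.TargetURL
    FROM BTILog AS l
    INNER JOIN (
        SELECT CustomerID, MAX(TimeMark) AS MaxOfTimeMark
        FROM BTILog
        GROUP BY CustomerID
    ) AS t1
        ON l.CustomerID = t1.CustomerID
       AND l.TimeMark = t1.MaxOfTimeMark
    ORDER BY l.TimeMark DESC
    """
).fetchall()

for row in latest_rows:
    print(row)
```

Because the grouping is on CustomerID only, each customer comes back once even when AccountId or TargetURL differ between rows, and the other columns are taken from the latest row. If two rows for a customer share the same maximum TimeMark, both are returned.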

0




Based on your sample data, and assuming your query is correct, try the following:

Select * from 
(SELECT TimeMark,
        CustomerID,
        AccountId,
        TargetURL,
        ROW_NUMBER() over (partition by CustomerID order by TimeMark desc) rownum
    FROM BTILog
    WHERE timemark BETWEEN '20140926 00:00:00'
            AND '20141020 23:59:59'
        AND TargetURL LIKE '%/api/v1/cust/details%'
        AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
    ) tbl where rownum = 1
    ORDER BY TimeMark DESC

      

0




I don't think the rank function or the rownum tricks will help, since you are not dealing with a sorted list with the same unique keys. Rather, for each CustomerID, AccountId pair you are looking for the entry within a specified time period that has the maximum timestamp.

You can try this:

  SELECT DISTINCT
        A.TimeMark,
        A.CustomerID,
        A.AccountId,
        A.TargetURL
    FROM BTILog A
    WHERE A.timemark =
    (
          SELECT MAX(B.TimeMark)
          FROM BTILog B
          WHERE B.timemark BETWEEN '20140926 00:00:00' AND '20141020 23:59:59'
            AND B.TargetURL LIKE '%/api/v1/cust/details%'
            AND B.Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'

            AND B.CustomerID = A.CustomerID
            AND B.AccountId = A.AccountId
    )
    AND A.TargetURL LIKE '%/api/v1/cust/details%'
    AND A.Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
    ORDER BY A.TimeMark DESC

      

0




In SQL Server, for the date comparison I use >= CONVERT(nchar(10), first_date_value, 103) ............

0




How about a subquery that counts the rows for the current customer:

  SELECT TimeMark,
        CustomerID,
        AccountId,
        TargetURL
    FROM BTILog AS o
    WHERE timemark BETWEEN '20140926 00:00:00'
            AND '20141020 23:59:59'
        AND TargetURL LIKE '%/api/v1/cust/details%'
        AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
        AND (
                SELECT count(*) 
                FROM BTILog AS i
                WHERE i.timemark BETWEEN '20140926 00:00:00'
                AND '20141020 23:59:59'
                AND i.TargetURL LIKE '%/api/v1/cust/details%'
                AND i.Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
                AND i.CustomerID = o.CustomerID
        ) = 1

    ORDER BY TimeMark DESC

      

0




Group by CustomerID and add a duplicate column so you can keep track of the number of duplicates. You don't need the individual TimeMark values, because the duplicates differ by time.

SELECT MAX(TimeMark) AS TimeMark,
        CustomerID,
        COUNT(CustomerID) AS duplicate,
        MAX(AccountId) AS AccountId,
        MAX(TargetURL) AS TargetURL
    FROM BTILog
    WHERE timemark BETWEEN '20140926 00:00:00'
            AND '20141020 23:59:59'
        AND TargetURL LIKE '%/api/v1/cust/details%'
        AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
   GROUP BY CustomerID
    ORDER BY TimeMark DESC

      

-1




You can write a subquery that selects only the DISTINCT customer IDs.

 SELECT TimeMark, 
        (SELECT DISTINCT CustomerID 
        FROM BTILog
        WHERE timemark BETWEEN '20140926 00:00:00'
            AND '20141020 23:59:59'
        AND TargetURL LIKE '%/api/v1/cust/details%'
        AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%') 
        AS CustyID, 
        AccountId,
        TargetURL
    FROM BTILog
    WHERE timemark BETWEEN '20140926 00:00:00'
            AND '20141020 23:59:59'
        AND TargetURL LIKE '%/api/v1/cust/details%'
        AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
    ORDER BY TimeMark DESC

      

-1




You will most likely need to redesign your data structure and split the data into two tables, one for the customer and the other for the timestamps. Each customer record will then be unique, with or without DISTINCT. The TimeStamp table references the Customer table through a foreign key, a matching CustomerID field, and holds the timestamp along with any other fields that need to be recorded for each periodic event. Schematically:

Customer: CustomerID, Integer, Unique.
AccountID, Integer.
TargetURL text 250.

TimeStamp: TimeID, Integer, Unique.
TimeMark, DateTime.
CustomerID, Integer

CustomerID in Customer and TimeID in TimeStamp should probably be AutoIncrement 'Crash Number' fields. The CustomerID fields in the two tables act as the reference. With this streamlined schema, your queries will fall into place easily, and you can retrieve the data you want without complexities like DISTINCT and UNIQUE ROWS.
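A minimal sketch of that two-table layout, assuming SQLite (driven from Python) rather than the poster's actual database. The table and column names follow the schema above; the sample rows and the `LastTimeMark` alias are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        AccountID  INTEGER,
        TargetURL  TEXT
    );
    CREATE TABLE TimeStamp (
        TimeID     INTEGER PRIMARY KEY AUTOINCREMENT,
        TimeMark   TEXT,
        CustomerID INTEGER REFERENCES Customer (CustomerID)
    );
    """
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        (57155299, 1, "/api/v1/cust/details"),
        (40000001, 2, "/api/v1/cust/details"),
    ],
)
conn.executemany(
    "INSERT INTO TimeStamp (TimeMark, CustomerID) VALUES (?, ?)",
    [
        ("2014-10-01 09:00:00", 57155299),
        ("2014-10-05 12:30:00", 57155299),
        ("2014-10-03 08:15:00", 40000001),
    ],
)

# Each customer exists once in Customer, so the latest event per customer
# is a join plus MAX over the event table.
last_seen = conn.execute(
    """
    SELECT c.CustomerID, c.AccountID, c.TargetURL, MAX(t.TimeMark) AS LastTimeMark
    FROM Customer AS c
    JOIN TimeStamp AS t ON t.CustomerID = c.CustomerID
    GROUP BY c.CustomerID, c.AccountID, c.TargetURL
    ORDER BY LastTimeMark DESC
    """
).fetchall()

for row in last_seen:
    print(row)
```

This sketch assumes AccountID and TargetURL are fixed per customer, as the schema above implies. If they vary per event, they would belong in the TimeStamp table instead.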

-1




You should group by CustomerID and retrieve the records HAVING COUNT(CustomerID) = 1. This should give you all the records where the CustomerID is unique.

SELECT MAX(TimeMark) AS TimeMark,
    CustomerID,
    MAX(AccountId) AS AccountId,
    MAX(TargetURL) AS TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
        AND '20141020 23:59:59'
    AND TargetURL LIKE '%/api/v1/cust/details%'
    AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
GROUP BY CustomerID
HAVING COUNT(CustomerID) = 1
ORDER BY TimeMark DESC;
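To make clear what HAVING COUNT(CustomerID) = 1 selects, here is a small sketch using an in-memory SQLite database from Python. The sample rows are hypothetical, and the query is trimmed to the grouping part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BTILog (TimeMark TEXT, CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO BTILog VALUES (?, ?)",
    [
        ("2014-10-01 09:00:00", 57155299),
        ("2014-10-05 12:30:00", 57155299),  # second row for the same customer
        ("2014-10-03 08:15:00", 40000001),
    ],
)

# Keep only the groups that contain exactly one row.
single_visit = conn.execute(
    """
    SELECT CustomerID, MAX(TimeMark) AS TimeMark
    FROM BTILog
    GROUP BY CustomerID
    HAVING COUNT(CustomerID) = 1
    """
).fetchall()

print(single_visit)  # [(40000001, '2014-10-03 08:15:00')]
```

Customer 57155299 has two rows, so it is filtered out entirely rather than reduced to its latest row. The HAVING filter keeps only customers that appear exactly once.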

      

-1








