Need help with SQL query to retrieve data
The SQL query I wrote returns 3677 rows, and the CustomerID field contains a lot of duplicate values. I want to write a query that returns all the required fields with each CustomerID appearing only once. We cannot use DISTINCT on CustomerID alone, because the other fields are of different data types. Please help me with this query:
SELECT TimeMark,
CustomerID,
AccountId,
TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
ORDER BY TimeMark DESC
You have to use GROUP BY with an aggregate function on TimeMark. The following returns one row per CustomerID, AccountId and TargetURL combination, with the latest timestamp for each:
SELECT max(TimeMark) TimeMark,
CustomerID,
AccountId,
TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
GROUP BY CustomerID, AccountId,TargetURL
ORDER BY TimeMark DESC
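To see what the GROUP BY version returns, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The sample rows, the `LastSeen` alias and the `latest_per_group` helper are made up for illustration, and the WHERE filters are left out to keep it short; the real BTILog table and database engine will differ.

```python
import sqlite3

# Hypothetical sample rows: (TimeMark, CustomerID, AccountId, TargetURL).
ROWS = [
    ("2014-10-01 09:00:00", 1, 10, "/api/v1/cust/details"),
    ("2014-10-05 12:30:00", 1, 10, "/api/v1/cust/details"),
    ("2014-10-03 08:15:00", 2, 20, "/api/v1/cust/details"),
    ("2014-10-07 17:45:00", 1, 11, "/api/v1/cust/details"),
]


def latest_per_group(rows):
    """Return the latest TimeMark per (CustomerID, AccountId, TargetURL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE BTILog ("
        "TimeMark TEXT, CustomerID INTEGER, AccountId INTEGER, TargetURL TEXT)"
    )
    conn.executemany("INSERT INTO BTILog VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT MAX(TimeMark) AS LastSeen, CustomerID, AccountId, TargetURL
        FROM BTILog
        GROUP BY CustomerID, AccountId, TargetURL
        ORDER BY LastSeen DESC
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_per_group(ROWS):
        print(row)
```

Note that customer 1 still shows up twice in this sample, because it has two different AccountId values. Grouping only guarantees one row per combination of the grouped columns, not one row per CustomerID.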
I believe ranking your results will give you the desired output, similar to this:
select TimeMark,
CustomerID,
AccountId,
TargetURL
from (
select TimeMark,
CustomerID,
AccountId,
TargetURL,
rank() over (
partition by CustomerID order by TimeMark desc
) rank_
from BTILog
where timemark between '20140926 00:00:00'
and '20141020 23:59:59'
and TargetURL like '%/api/v1/cust/details%'
and Class like 'com.btfin.security.sso.SSODetailsFactory%'
) ranked
where rank_ = 1;
You can also include AccountId in the partition clause if that's your desired output.
rank() over (
partition by CustomerID, AccountId order by TimeMark desc
) rank_
Additionally, if there are rows with the same CustomerID, AccountId and TimeMark, you can use row_number instead of rank, which will arbitrarily assign the higher rank to one of the tied rows.
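The difference between rank and row_number on ties can be sketched like this, again with Python's sqlite3 as a stand-in database. The sample rows and the `top_rows` helper are hypothetical, and window functions need SQLite 3.25 or newer, so treat it as an illustration rather than something tested against the original table.

```python
import sqlite3

# Hypothetical rows: customer 1 has two rows tied on the latest TimeMark.
ROWS = [
    ("2014-10-05 12:30:00", 1, 10),
    ("2014-10-05 12:30:00", 1, 11),
    ("2014-10-01 09:00:00", 1, 10),
    ("2014-10-03 08:15:00", 2, 20),
]


def top_rows(window_function, rows=ROWS):
    """Return the rows ranked first per CustomerID using RANK or ROW_NUMBER."""
    if window_function not in ("RANK", "ROW_NUMBER"):
        raise ValueError("window_function must be 'RANK' or 'ROW_NUMBER'")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE BTILog (TimeMark TEXT, CustomerID INTEGER, AccountId INTEGER)"
    )
    conn.executemany("INSERT INTO BTILog VALUES (?, ?, ?)", rows)
    # The function name is checked against a whitelist above, so
    # interpolating it into the statement is safe here.
    query = f"""
        SELECT CustomerID, AccountId, TimeMark
        FROM (
            SELECT CustomerID, AccountId, TimeMark,
                   {window_function}() OVER (
                       PARTITION BY CustomerID ORDER BY TimeMark DESC
                   ) AS rank_
            FROM BTILog
        ) AS ranked
        WHERE rank_ = 1
        ORDER BY CustomerID, AccountId
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print("RANK:      ", top_rows("RANK"))
    print("ROW_NUMBER:", top_rows("ROW_NUMBER"))
```

With RANK both tied rows of customer 1 come back, so the CustomerID is still duplicated. With ROW_NUMBER exactly one row per customer comes back, but which of the tied rows you get is not defined unless you add a tie-breaker to the ORDER BY.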
Try this:
Select distinct CustomerId, AccountId, TargetURL,
Max(TimeMark) over (partition by CustomerId) as MaxTM
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
-- no GROUP BY needed: DISTINCT already collapses the duplicate rows
ORDER BY MaxTM DESC
Every selected field must either appear in the GROUP BY clause or be aggregated (with MAX, SUM, COUNT, etc.). In your case it looks like you need to group by CustomerID, AccountId and possibly TargetURL (group by it if it is always the same for a given customer; if not, maybe use MAX(TargetURL)), and decide what you want to do with TimeMark. MAX(TimeMark), possibly?
You can try one of the following, depending on the nature of the TargetURL:
SELECT MAX(TimeMark) AS TimeMark,
CustomerID,
AccountId,
TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
GROUP BY CustomerID, AccountId, TargetURL
ORDER BY TimeMark DESC
or
SELECT MAX(TimeMark) AS TimeMark,
CustomerID,
AccountId,
MAX(TargetURL) AS TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
GROUP BY CustomerID, AccountId
ORDER BY TimeMark DESC
You need a subquery that returns the maximum TimeMark per CustomerID for your criteria, and then join back to the table on CustomerID and TimeMark. This limits the result to the record you want:
SELECT BTILog.TimeMark, BTILog.CustomerID, BTILog.AccountID, BTILog.TargetUrl
FROM BTILog
INNER JOIN (
SELECT Max(BTILog.TimeMark) AS MaxOfTimeMark, BTILog.CustomerID
FROM BTILog
WHERE (((BTILog.TargetUrl) Like '%/api/v1/cust/details%')
AND ((BTILog.Class) Like 'com.btfin.security.sso.SSODetailsFactory%')
AND ((BTILog.TimeMark) BETWEEN '20140926 00:00:00' AND '20141020 23:59:59'))
GROUP BY BTILog.CustomerID) AS T1
ON (BTILog.CustomerID = T1.CustomerID)
AND (BTILog.TimeMark = T1.MaxOfTimeMark)
ORDER by BTILog.TimeMark DESC
I usually work in the MS Access query editor, where the syntax is slightly different. I think I have changed everything that could cause it to fail.
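The join-back pattern can be tried out with a small sketch like the one below, using Python's sqlite3 in place of the real database. The sample rows and the `latest_row_per_customer` helper are invented for the example, and the filters on TargetURL, Class and the date range are omitted for brevity.

```python
import sqlite3

# Hypothetical sample rows: (TimeMark, CustomerID, AccountId, TargetURL).
ROWS = [
    ("2014-10-01 09:00:00", 1, 10, "/api/v1/cust/details"),
    ("2014-10-05 12:30:00", 1, 10, "/api/v1/cust/details"),
    ("2014-10-03 08:15:00", 2, 20, "/api/v1/cust/details"),
    ("2014-10-07 17:45:00", 1, 11, "/api/v1/cust/details"),
]


def latest_row_per_customer(rows):
    """Join each customer's rows to that customer's maximum TimeMark."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE BTILog ("
        "TimeMark TEXT, CustomerID INTEGER, AccountId INTEGER, TargetURL TEXT)"
    )
    conn.executemany("INSERT INTO BTILog VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT b.TimeMark, b.CustomerID, b.AccountId, b.TargetURL
        FROM BTILog AS b
        INNER JOIN (
            SELECT CustomerID, MAX(TimeMark) AS MaxOfTimeMark
            FROM BTILog
            GROUP BY CustomerID
        ) AS t1
          ON b.CustomerID = t1.CustomerID
         AND b.TimeMark = t1.MaxOfTimeMark
        ORDER BY b.TimeMark DESC
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_row_per_customer(ROWS):
        print(row)
```

Because the subquery groups by CustomerID only, each customer comes back once, together with the AccountId and TargetURL of its most recent row. If two rows of the same customer share the maximum TimeMark, both will be returned.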
Based on your sample, and assuming your query is correct, try the following:
Select * from
(SELECT TimeMark,
CustomerID,
AccountId,
TargetURL,
ROW_NUMBER() over (partition by CustomerID order by TimeMark desc) rownum
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
) tbl where rownum=1
ORDER BY TimeMark DESC
I don't think the rank function or the rownum tricks will help, since you are not dealing with a sorted list with the same unique keys. Instead, for each CustomerID and AccountId you are looking for the entry within the specified time period that has the maximum timestamp.
You can try this:
SELECT DISTINCT
A.TimeMark,
A.CustomerID,
A.AccountId,
A.TargetURL
FROM BTILog A
WHERE A.timemark =
(
SELECT MAX(B.TimeMark)
FROM BTILog B
WHERE B.timemark BETWEEN '20140926 00:00:00' AND '20141020 23:59:59'
AND B.TargetURL LIKE '%/api/v1/cust/details%'
AND B.Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
AND B.CustomerID = A.CustomerID
AND B.AccountId = A.AccountId
)
AND A.TargetURL LIKE '%/api/v1/cust/details%'
AND A.Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
ORDER BY A.TimeMark DESC
How about a subquery that counts the rows for the current customer:
SELECT TimeMark,
CustomerID,
AccountId,
TargetURL
FROM BTILog AS o
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
AND (
SELECT count(*)
FROM BTILog AS i
WHERE i.timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND i.TargetURL LIKE '%/api/v1/cust/details%'
AND i.Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
AND i.CustomerID = o.CustomerID
) = 1
ORDER BY TimeMark DESC
Use GROUP BY CustomerID together with a duplicate count column so you can keep track of the number of duplicates. You don't need the individual TimeMark values, because they are what differs between the duplicates, so the other columns are aggregated here:
SELECT MAX(TimeMark) AS TimeMark,
CustomerID,
COUNT(CustomerID) AS duplicate,
MAX(AccountId) AS AccountId,
MAX(TargetURL) AS TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
GROUP BY CustomerID
ORDER BY TimeMark DESC
You can write a subquery that selects only the DISTINCT customer IDs.
SELECT TimeMark,
(SELECT DISTINCT CustomerID
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%')
AS CustyID,
AccountId,
TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
ORDER BY TimeMark DESC
You will most likely need to redesign your data structure and split the data into two tables, one for the customer and the other for the timestamps. Each customer record will then be unique, with or without DISTINCT. The TimeStamp table references the Customer table through a foreign key (a matching CustomerID field) and holds the timestamp along with any other fields that need to be recorded on each periodic event. Schematically:
Customer: CustomerID, Integer, Unique.
AccountID, Integer.
TargetURL text 250.
TimeStamp: TimeID, Integer, Unique.
TimeMark, DateTime.
CustomerID, Integer
CustomerID in Customer and TimeID in TimeStamp should probably be auto-increment fields. The CustomerID fields in each table act as the reference between them. With this streamlined schema, your queries will start to fall into place easily and you can retrieve the data you want without complexities like DISTINCT and UNIQUE ROWS.
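A rough sketch of that two-table layout, built in an in-memory SQLite database from Python, could look like the following. The column types, the sample data, the `LastSeen` alias and the `last_seen_per_customer` helper are assumptions made for the example; only the table and column names come from the outline above.

```python
import sqlite3


def last_seen_per_customer():
    """Build the two-table schema and return each customer's latest TimeMark."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Customer (
            CustomerID INTEGER PRIMARY KEY,  -- unique per customer
            AccountID  INTEGER,
            TargetURL  TEXT
        );
        CREATE TABLE "TimeStamp" (
            TimeID     INTEGER PRIMARY KEY AUTOINCREMENT,
            TimeMark   TEXT NOT NULL,
            CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID)
        );
        """
    )
    # Hypothetical sample data: two customers, three logged events.
    conn.executemany(
        "INSERT INTO Customer VALUES (?, ?, ?)",
        [(1, 10, "/api/v1/cust/details"), (2, 20, "/api/v1/cust/details")],
    )
    conn.executemany(
        'INSERT INTO "TimeStamp" (TimeMark, CustomerID) VALUES (?, ?)',
        [
            ("2014-10-01 09:00:00", 1),
            ("2014-10-05 12:30:00", 1),
            ("2014-10-03 08:15:00", 2),
        ],
    )
    result = conn.execute(
        """
        SELECT c.CustomerID, c.AccountID, c.TargetURL,
               MAX(t.TimeMark) AS LastSeen
        FROM Customer AS c
        JOIN "TimeStamp" AS t ON t.CustomerID = c.CustomerID
        GROUP BY c.CustomerID, c.AccountID, c.TargetURL
        ORDER BY LastSeen DESC
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in last_seen_per_customer():
        print(row)
```

Since CustomerID is the primary key of Customer, grouping by the customer columns cannot produce more than one row per customer, which is the point of the redesign.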
You should group by CustomerID and retrieve the records HAVING COUNT(CustomerID) = 1. This should give you all the records where the CustomerID is unique.
-- each group has exactly one row, so MAX() simply returns that row's value
SELECT MAX(TimeMark) AS TimeMark,
CustomerID,
MAX(AccountId) AS AccountId,
MAX(TargetURL) AS TargetURL
FROM BTILog
WHERE timemark BETWEEN '20140926 00:00:00'
AND '20141020 23:59:59'
AND TargetURL LIKE '%/api/v1/cust/details%'
AND Class LIKE 'com.btfin.security.sso.SSODetailsFactory%'
GROUP BY CustomerID
HAVING COUNT(CustomerID) = 1
ORDER BY TimeMark DESC;
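To check what HAVING COUNT(CustomerID) = 1 actually keeps, here is a small sketch using Python's sqlite3 with invented sample rows. The `single_row_customers` helper is hypothetical and the WHERE filters are dropped to keep the example short.

```python
import sqlite3

# Hypothetical rows: customer 1 has three log entries, customer 2 has one.
ROWS = [
    ("2014-10-01 09:00:00", 1, 10, "/api/v1/cust/details"),
    ("2014-10-05 12:30:00", 1, 10, "/api/v1/cust/details"),
    ("2014-10-03 08:15:00", 2, 20, "/api/v1/cust/details"),
    ("2014-10-07 17:45:00", 1, 11, "/api/v1/cust/details"),
]


def single_row_customers(rows):
    """Return only the customers that have exactly one row in BTILog."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE BTILog ("
        "TimeMark TEXT, CustomerID INTEGER, AccountId INTEGER, TargetURL TEXT)"
    )
    conn.executemany("INSERT INTO BTILog VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT MAX(TimeMark) AS TimeMark,
               CustomerID,
               MAX(AccountId) AS AccountId,
               MAX(TargetURL) AS TargetURL
        FROM BTILog
        GROUP BY CustomerID
        HAVING COUNT(CustomerID) = 1
        ORDER BY TimeMark DESC
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(single_row_customers(ROWS))
```

In this sample only customer 2 is returned. Customers that appear more than once are dropped entirely rather than reduced to one row, so this fits only if that is the intended meaning of "unique".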