SQL: number of values in one column relative to another column
I have the following table
id  date        time_stamp  licenseid  storeid  deviceid  value
1   2015-06-12  17:36:15    lic0001    1        0add      52
2   2015-06-12  17:36:15    lic0002    1        0add      54
3   2015-06-12  17:36:15    lic0003    1        0add      53
4   2015-06-12  17:36:21    lic0001    1        0add      54
5   2015-06-12  17:36:21    lic0002    1        0add      59
6   2015-06-12  17:36:21    lic0003    1        0add      62
7   2015-06-12  17:36:21    lic0004    1        0add      55
8   2015-06-12  17:36:15    lic0001    1        0bdd      53
9   2015-06-12  17:36:15    lic0002    1        0bdd      52
10  2015-06-12  17:36:15    lic0003    1        0bdd      52
I need a count for each deviceid based on the number of timestamps it appears in. For example, 0add appears in 2 timestamps, so its count is 2, while 0bdd appears in only one timestamp, so its count is 1. The number of licenses recorded for the device at each timestamp does not contribute to the count. The result would look something like this:
date        deviceid  count
2015-06-12  0add      2
2015-06-12  0bdd      1
I am trying to execute the query below, but I cannot verify whether it works, because it has been running for quite some time without returning any result:
select date, deviceid, count(deviceid) from my_table group by deviceid, time_stamp
Note that the table this query runs on has 2,000,000 rows.
- Is the above query correct for my desired output?
- If so, how can I optimize it to run quickly for my table size?
EDIT: The column labeled time_stamp is of type TIME.
I think you need to consider a couple of things here:
- If you want the number of timestamps per device for each date, you must group by device and date, not by device and timestamp.
- You have rows where the same device ID has the same date and timestamp, so you need to count distinct timestamps on each date.
The fix for the first one is self-explanatory, and for the second one you can change the aggregation to COUNT(DISTINCT time_stamp). Try this query:
SELECT deviceid, date, COUNT(DISTINCT time_stamp) AS numRows
FROM my_table
GROUP BY deviceid, date;
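To check this against the sample rows from the question, here is a minimal self-contained sketch. The column types are assumptions (only time_stamp is stated to be TIME in the question), and the table and column names follow the question rather than any real schema.

```sql
-- Sample table mirroring the question's columns.
-- Column types are assumed, except time_stamp (stated as TIME in the question).
CREATE TABLE my_table (
    id         INT PRIMARY KEY,
    date       DATE,
    time_stamp TIME,
    licenseid  VARCHAR(16),
    storeid    INT,
    deviceid   VARCHAR(16),
    value      INT
);

-- The ten rows shown in the question.
INSERT INTO my_table (id, date, time_stamp, licenseid, storeid, deviceid, value) VALUES
    (1,  '2015-06-12', '17:36:15', 'lic0001', 1, '0add', 52),
    (2,  '2015-06-12', '17:36:15', 'lic0002', 1, '0add', 54),
    (3,  '2015-06-12', '17:36:15', 'lic0003', 1, '0add', 53),
    (4,  '2015-06-12', '17:36:21', 'lic0001', 1, '0add', 54),
    (5,  '2015-06-12', '17:36:21', 'lic0002', 1, '0add', 59),
    (6,  '2015-06-12', '17:36:21', 'lic0003', 1, '0add', 62),
    (7,  '2015-06-12', '17:36:21', 'lic0004', 1, '0add', 55),
    (8,  '2015-06-12', '17:36:15', 'lic0001', 1, '0bdd', 53),
    (9,  '2015-06-12', '17:36:15', 'lic0002', 1, '0bdd', 52),
    (10, '2015-06-12', '17:36:15', 'lic0003', 1, '0bdd', 52);

-- One output row per (date, deviceid).
-- Rows sharing the same time_stamp are counted once, so licenses do not inflate the count.
SELECT date, deviceid, COUNT(DISTINCT time_stamp) AS numRows
FROM my_table
GROUP BY date, deviceid
ORDER BY deviceid;
```

On this data the query returns 2 for 0add and 1 for 0bdd, matching the desired output. By contrast, grouping by deviceid and time_stamp produces three groups with counts 3, 4 and 3, which are the license counts per timestamp rather than the timestamp counts per device.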
Here is a SQL Fiddle example using your example data. It is also worth noting that adding an index on the deviceid and date columns can help this query execute faster if it is still slow for you. See the comments for a more detailed discussion of this issue.
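As a rough sketch of the indexing suggestion, an index could be created as shown here. The index name idx_device_date_time is hypothetical, and appending time_stamp as a third column is my own assumption rather than part of the advice above; the idea is that the database may then be able to answer the query from the index alone, but whether it does depends on the DBMS and its optimizer.

```sql
-- Hypothetical index name; the leading columns match the GROUP BY.
-- time_stamp is appended (an assumption) so COUNT(DISTINCT time_stamp)
-- might be resolved from the index without reading the table rows.
CREATE INDEX idx_device_date_time
    ON my_table (deviceid, date, time_stamp);
```

Most database systems offer some form of EXPLAIN to check whether the index is actually used. Building the index on 2,000,000 rows takes time itself, so it is worth trying on a copy of the data first.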