Select the earliest date and time from the list of individual user sessions
I have a table of user access sessions that logs the activity of a site visitor:
accessid, userid, date, time, url
I am trying to get all the individual sessions for userid 1234, as well as the earliest date and time for each of those individual sessions.
SELECT
DISTINCT accessid,
date,
time
FROM
accesslog
WHERE userid = '1234'
GROUP BY accessid
This gives me the date and time of a random string in each individual accessid. I read several posts recommending using min () and max (), so I tried:
SELECT DISTINCT accessid, MIN(DATE) AS date, MIN(TIME) AS time FROM accesslog WHERE userid = '1234' GROUP BY accessid ORDER BY date DESC, time DESC
... and even...
SELECT DISTINCT accessid, MIN(CONCAT(DATE, ' ', TIME)) AS datetime FROM accesslog WHERE userid = '1234' GROUP BY accessid ORDER BY date DESC, time DESC
... but I never get the correct result from the earliest date and time.
What's the trick for ordering this kind of request?
EDIT -
Something strange is happening ...
The code given below by Bill Carwin correctly retrieves the earliest date and time for sessions started in 2009-09. But for sessions that started on some day 2009-08, the time and date of the first hit in the current month is what is returned. In other words, the request doesn't seem to cover months!
Example dataset:
accessid | userid | date | time
1 | 1234 | 2009-08-15 | 01:01:01
1 | 1234 | 2009-09-01 | 12:01:01
1 | 1234 | 2009-09-15 | 13:01:01
2 | 1234 | 2009-09-01 | 14:01:01
2 | 1234 | 2009-09-15 | 15:01:01
At least in my actual datasheet, the query posted below finds the following early dates and times for each of the two access IDs:
accessid | userid | date | time
1 | 1234 | 2009-09-01 | 12:01:01
2 | 1234 | 2009-09-01 | 14:01:01
... and I would guess that the only reason the result for accessid 2 looks correct is because it had no hits in the previous month.
I'm going crazy?
EDIT 2 -
The answer is yes, I'm going crazy. The query works on the above sample data when it is placed in the duplicated structure table.
Here is the (truncated) raw data. I included the first hit, another hit in the same month, the first hit of the next month, and then the last hit of the month. The original dataset has many more strokes between these points, for a total of 462 lines.
accessid | date | time
cbb82c08d3103e721a1cf0c3f765a842 | 2009-08-18 | 04:01:42
cbb82c08d3103e721a1cf0c3f765a842 | 2009-08-23 | 23:18:52
cbb82c08d3103e721a1cf0c3f765a842 | 2009-09-17 | 05:12:16
cbb82c08d3103e721a1cf0c3f765a842 | 2009-09-18 | 06:29:59
... the query returns 2009-09-17 as the earliest value when querying the source table. But, when I copy ........ oh, balls.
This is because hits from 2009-08% have an empty field userid
.
source to share
This is a variation of the "largest n-by-group" problem that appears on StackOverflow several times a week.
SELECT
a1.accessid,
a1.date,
a1.time
FROM
accesslog a1
LEFT OUTER JOIN
accesslog a2
ON (a1.accessid = a2.accessid AND a1.userid = a2.userid
AND (a1.date > a2.date OR a1.date = a2.date AND a1.time > a2.time))
WHERE a1.userid = '1234'
AND a2.accessid IS NULL;
The way it works is we are trying to find a string (a2) that has the same access and user ID and an earlier date or time than string a1. When we cannot find an earlier line, then a1 must be the earliest line.
Repeat your comment, I just tried it with the example data you provided. This is what I get:
+----------+------------+----------+
| accessid | date | time |
+----------+------------+----------+
| 1 | 2009-08-15 | 01:01:01 |
| 2 | 2009-09-01 | 14:01:01 |
+----------+------------+----------+
I am using MySQL 5.0.75 on Mac OS X.
source to share