Find records that have related records in the past

Basically, the result I am trying to do is "get the number of successful entries that have 0 failed entries in a given amount of time in the past." "success" and "failure" simply refer to the column value.

While it's a little more complicated, here's a description of the table I'm dealing with:

`log`
  id                int PRIMARY KEY AUTO_INCREMENT
  fingerprint_id    int (foreign key)
  status            boolean
  date              timestamp

      

The workflow of the little system we have is that when the user is fingerprinted, a record is added to this table and status

set based on whether it matches (again, there is more, I'm just trying to simplify). We get fingerprint_id

based on what the user is doing, so this is the identifier for linking records to a person.

We now require them to try it no more than 3 times. Thus, they can match 1 of 3, 2 of 3, 3 of 3, or not at all. And that means they can have 1, 2, or 3 entries in their "group". While this is not the case, we can assume that the user will keep trying until they match or reach 3 failed attempts (we found that sometimes people don't continue after failing once or twice, perhaps).

Here's an example of some data:

id  fp_id status   date
----------------------------------------
20    2     0      '2013-01-21 12:30:01'
21    2     0      '2013-01-21 12:30:05'
22    2     0      '2013-01-21 12:30:10'
23    9     1      '2013-01-21 12:31:30'
24    1     0      '2013-01-21 12:35:00'
25    1     1      '2013-01-21 12:35:05'

      

In the data, user ( fingerprint_id

) 2 tried 3 times and never matched. User 9 matches the first attempt. User 1 tried once and failed, then tried again and picked up.

The challenge is to find out how many successful ( status

= 1) log entries have 0 failed ( status

= 0) entries within 35 seconds ago. Of course, the only way to "connect" them is fingerprint_id

.

Again, we accept many things, but that's fine.

Here's my attempts:

SELECT  COUNT(*)
FROM    log AS log_main
WHERE   log_main.status=1 AND
        (SELECT COUNT(*)
         FROM   log AS log_inner
         WHERE  log_inner.fingerprint_id=log_main.fingerprint_id AND
                log_inner.status=0 AND
                log_inner.date<log_main.date AND log_inner.date>=(log_main.date - INTERVAL 35 SECOND))=0

      

^ I would expect this to display all successful entries that have a score of 0 unsuccessful entries that happened 35 seconds ago (for this user). But I don't know because the request takes over 600 seconds. I just found out how to extend the MySQL Workbench maximum timeout, but it takes a very long time anyway. There are about 120,000 records in the table, so I'm not sure if that's enough to make this query slow down.

Anyway, here's another try:

SELECT  COUNT(*)
FROM    (SELECT log.fingerprint_id, log.date
         FROM log
         WHERE log.status=1) successful,
        (SELECT log.fingerprint_id, log.date
         FROM log
         WHERE log.status=0) unsuccessful
WHERE   successful.fingerprint_id=unsuccessful.fingerprint_id AND
        unsuccessful.date<successful.date AND unsuccessful.date>=(successful.date - INTERVAL 35 SECOND)

      

^ I feel like this one is closer, but of course there is no comparison of "counting" the number of records matched in the past. This is the part I am confused about on how to solve it. I have a feeling it has to do with GROUP BY

or instead use IN

, but what I did just doesn't work (in the sense that it takes over 600 seconds or something). Here is an example of what I have tried withGROUP BY

SELECT  successful.id, COUNT(*) cnt
FROM    (SELECT log.fingerprint_id, log.date, log.id
         FROM log
         WHERE log.status=1) successful,
        (SELECT log.fingerprint_id, log.date, log.id
         FROM log
         WHERE log.status=0) unsuccessful
WHERE   successful.fingerprint_id=unsuccessful.fingerprint_id AND
        unsuccessful.date<successful.date AND unsuccessful.date>=(successful.date - INTERVAL 35 SECOND)
GROUP BY successful.id

      

^ But the results only contain strings that are NOT 0. And I assume it is because of the suggestion WHERE

. But I ONLY need a score of 0.

I've tried so many combinations, I think my brain is just fried.

+3


source to share


1 answer


Try using NOT EXISTS

instead COUNT = 0

. This should work much better.

SELECT  COUNT(*)
FROM    log AS log_main
WHERE   log_main.status=1 
AND     NOT EXISTS
        (   SELECT 1
            FROM   log AS log_inner
            WHERE   log_inner.fingerprint_id=log_main.fingerprint_id
            AND     log_inner.status = 0
            AND     log_inner.date < log_main.date 
            AND     log_inner.date >= (log_main.date - INTERVAL 35 SECOND)
        );

      

You also need to make sure the table is indexed correctly.

EDIT

I find that using is LEFT JOIN/IS NULL

more efficient in MySQL than using NOT EXISTS

, so this will work better than the above (although perhaps not significantly):



SELECT  COUNT(*)
FROM    log AS log_main
        LEFT JOIN log AS log_inner
            ON log_inner.fingerprint_id=log_main.fingerprint_id
            AND log_inner.status = 0
            AND log_inner.date < log_main.date 
            AND log_inner.date >= (log_main.date - INTERVAL 35 SECOND)
WHERE   log_main.status = 1 
AND     Log_inner.fingerprint_id IS NULL;

      

EDIT 2

To get records with 1 or 2 attempts, etc., I will still use JOIN, but like this:

SELECT  COUNT(*)
FROM    (   SELECT  log_Main.id
            FROM    log AS log_main
                    INNER JOIN log AS log_inner
                        ON log_inner.fingerprint_id=log_main.fingerprint_id
                        AND log_inner.status = 0
                        AND log_inner.date < log_main.date 
                        AND log_inner.date >= (log_main.date - INTERVAL 35 SECOND)
            WHERE   log_main.status = 1 
            AND     Log_inner.fingerprint_id IS NULL
            GROUP BY log_Main.id
            HAVING COUNT(log_Inner.id) = 1
        ) d

      

+1


source







All Articles