How to select records one hour apart on sql server

I have a user_id login dataset with a timestamp.

the user may be logged in multiple times, but we need to return records at least one hour apart, starting with the min record. deduplication must be done at the user level (there can be multiple users)

eg.

  • user1 2012-03-07 14: 24: 30.000
  • user1 2012-03-07 14: 34: 30.000
  • user1 2012-03-07 15: 14: 30.000
  • user1 2012-03-07 15: 20: 30.000
  • user1 2012-03-07 15: 30: 30.000
  • user1 2012-03-08 09: 20: 30.000
  • user1 2012-03-08 09: 50: 30.000
  • user1 2012-03-08 10: 30: 30.000
  • user2 2012-03-07 15: 20: 30.000

I would like to see only the following entries

  • user1 2012-03-07 14: 24: 30.000
  • user1 2012-03-07 15: 30: 30.000
  • user1 2012-03-08 09: 20: 30.000
  • user1 2012-03-08 10: 30: 30.000
  • user2 2012-03-07 15: 20: 30.000

=============================================== === =======================

Is there a way to do this in a clean way? we could do it recursively, but I was hoping there might be a way to use the row_number section.

any help is greatly appreciated !!

+3


source to share


1 answer


In Sql Server 2005 or later, the CTE will return a LoginAt table that removes those less than an hour than the LoginAts already selected.

;with SkipHour(UserID, LoginAT, rn) as (
  select UserID, min(LoginAt), cast (1 as bigint)
    from LogTable
   group by UserID
  union all
  select SkipHour.UserID, LogTable.LoginAt,
         row_number() over (partition by SkipHour.UserID 
                            order by Logtable.LoginAt) rn
  from SkipHour
     inner join LogTable
        on LogTable.UserID = SkipHour.UserID
     where datediff(minute, SkipHour.LoginAt, LogTable.LoginAt) >= 60
     -- Only first rows from previous generation qualify to have children
        and rn = 1
)
select *
from SkipHour
where rn = 1
order by UserID, LoginAT

      

The key part is row_number (). Since Sql Server does not allow aggregate functions or upper predicate, row_number () is the only way (IMO) to order a loginAt datetime and only store the first one.

The Sql Fiddle Playground is the way to go .



UPDATE

Line numbers apply separately for each generation. Excerpt from WITH common_table_expression (Transact-SQL) :

The analytic and aggregate functions in the recursive part of the CTE are applied to the set for the current level of recursion, not the set for the CTE. Functions like ROW_NUMBER only work on a subset of the data passed to them at the current recursion level, not across the entire dataset, staged for the recursive part of the CTE. For more information, see J. Using analytic functions in a recursive CTE.

+3


source







All Articles