SQL query to compute calculations from different rows of the same table on SQL server

Question

I need a SQL query for the following. I am new to SQL, and the table below is just an example of the data I have. The real table has about 30 million rows, and I would like to write a query that produces the result table shown further down.

   Id        type        data          time
-----------------------------------------------------------
    1          30          3.9          15:50:10.660555
    1          30          4.0          15:50:10.660777
    1          70          11.5         15:50:10.797966
    1          30          4.1          15:50:10.834444
    1          70          12.6         15:50:10.853114
    1          70          16.7         15:50:10.955086
    1          30          5            15:50:10.99
    11         30          3.8          15:50:11.660555
    11         30          4.1          15:50:11.660777
    11         70          12.5         15:50:11.797966
    11         30          4.7          15:50:11.834444
    11         70          12.68        15:50:11.853114
    11         70          16.76        15:50:11.955086
    11         30          5.1          15:50:11.99

I have a table as above. For each type 70 row I need to compute something with the last known type 30 row. For example, for Id = 1, for the first type = 70 row at 15:50:10.797966 I need the type = 30 data at 15:50:10.660777, so I can calculate the result = 11.5 / 4.0. Likewise, for type = 70 at 15:50:10.853114, I need the type = 30 data at 15:50:10.834444, so my result is 12.6 / 4.1.

I want the result to look like this:

Id          type           result             time
------------------------------------------------------
1            70             11.5/4.0        15:50:10.797966
1            70             12.6/4.1        15:50:10.853114
1            70             16.7/4.1        15:50:10.955086
11           70             12.5/4.1        15:50:11.797966
11           70             12.68/4.7       15:50:11.853114
11           70             16.76/4.7       15:50:11.955086

I would like to be able to execute these SQL queries in python using pyodbc.

Any help would be greatly appreciated! Thanks in advance!

python sql pandas sql-server

Gingerbread 05 jul. '17 at 1:00

2 answers

You can do this using only window functions.

For each row, get the previous type and value. Also, enumerate the 70s in such a way that they can be identified as a group (you can do this with a cumulative sum).

In the next step, use a partitioned max to get the type 30 value for the group, and finally do your calculation.

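-- Keep only type 70 rows and divide by the type 30 value of their group.
select t.*,
       data / data_30 as result
from (select t.*,
             -- The first 70 of each group is the row whose previous row is the 30.
             max(case when type = 70 and prev_type = 30 then prev_data end)
                 over (partition by id, grp) as data_30
      from (select t.*,
                   -- Running count of non-70 rows: a 30 and the 70s after it share one grp.
                   sum(case when type <> 70 then 1 else 0 end) over (partition by id order by time) as grp,
                   lag(type) over (partition by id order by time) as prev_type,
                   lag(data) over (partition by id order by time) as prev_data
            from t
            where type in (30, 70)
           ) t
     ) t
where type = 70;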

One interesting aspect of this: by limiting the types to 30 and 70, we ensure that each group of 70s is immediately preceded by a 30 row, provided at least one 30 row comes before it for that id.
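To see what the three window expressions compute, here is a small pure-Python sketch of the same steps, run on the Id = 1 sample rows from the question. It is an illustration only, not how SQL Server evaluates the query; the function name `ratios_by_group` and the tuple layout are made up for this example. Note the assumption that the previous value is only picked up from rows that are themselves type 70, so that of two consecutive type 30 rows the later one is used.

```python
from collections import defaultdict

# (id, type, data, time) rows for Id = 1, copied from the question.
ROWS = [
    (1, 30, 3.9, "15:50:10.660555"),
    (1, 30, 4.0, "15:50:10.660777"),
    (1, 70, 11.5, "15:50:10.797966"),
    (1, 30, 4.1, "15:50:10.834444"),
    (1, 70, 12.6, "15:50:10.853114"),
    (1, 70, 16.7, "15:50:10.955086"),
    (1, 30, 5.0, "15:50:10.99"),
]


def ratios_by_group(rows):
    """Return (id, time, data, data_30) for every type 70 row."""
    # WHERE type IN (30, 70), then PARTITION BY id ORDER BY time.
    # Assumption: zero-padded time strings sort chronologically.
    partitions = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r[0], r[3])):
        if row[1] in (30, 70):
            partitions[row[0]].append(row)

    out = []
    for id_, part in partitions.items():
        grp = 0
        data_30 = {}   # grp -> type 30 value shared by that group
        tagged = []
        prev = None    # plays the role of LAG(...)
        for row in part:
            type_ = row[1]
            if type_ != 70:
                grp += 1   # running SUM(CASE WHEN type <> 70 ...)
            if type_ == 70 and prev is not None and prev[1] == 30:
                data_30[grp] = prev[2]   # value the partitioned MAX picks up
            tagged.append((grp, row))
            prev = row
        for row_grp, (_, type_, data, time) in tagged:
            if type_ == 70:
                out.append((id_, time, data, data_30.get(row_grp)))
    return out
```

For the sample rows this pairs 11.5 with 4.0, and both 12.6 and 16.7 with 4.1, which matches the expected output in the question.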

Gordon Linoff 05 jul. '17 at 2:30

Vamsi Prabhala · Accepted Answer · 2017-07-05T01:28:58+0000

Assuming at least one type = 30 row precedes each type = 70 row for a given id, you can do this with OUTER APPLY: get the max time of the type = 30 rows before each type = 70 row, then join back on that time to get the value to divide by.

SELECT x.id,
       x.type,
       x.time,
       x.data*1.0/t.data as result
FROM
  (SELECT t.*,t1.maxtime_before
   FROM t 
   OUTER APPLY
     (SELECT max(time) AS maxtime_before
      FROM t t1
      WHERE t1.id=t.id AND t1.type=30 AND t1.time<t.time) t1
   WHERE type = 70
  ) x
JOIN t ON t.id=x.id AND t.time=x.maxtime_before AND t.type=30

If no type = 30 row precedes a type = 70 row, you can show NULL in the result column for that row using:

WITH x AS
  (SELECT t.*,
          t1.maxtime_before
   FROM t
   OUTER APPLY
     (SELECT max(time) AS maxtime_before
      FROM t t1
      WHERE t1.id=t.id AND t1.type=30 AND t1.time<t.time) t1
   WHERE type = 70
  )
SELECT x.id,
       x.type,
       x.time,
       x.data*1.0/t.data as result
FROM t
JOIN x ON t.id=x.id AND t.time=x.maxtime_before AND t.type=30
UNION ALL
SELECT id,
       type,
       time,
       NULL
FROM x
WHERE maxtime_before IS NULL
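The lookup that OUTER APPLY performs for each type 70 row can be mimicked in Python with a binary search over the type 30 timestamps of the same id. This is only a sketch for checking the logic on small samples, not a replacement for the query. `last_30_before` and the tuple layout are hypothetical names, and times are kept as zero-padded strings on the assumption that they then sort chronologically. As in the second query, a type 70 row with no earlier type 30 row gets a NULL result (None here).

```python
from bisect import bisect_left
from collections import defaultdict


def last_30_before(rows):
    """For each type 70 row, divide by the latest earlier type 30 row of the same id.

    rows: list of (id, type, data, time) tuples. Returns a list of
    (id, time, result), where result is None if no type 30 row precedes it.
    """
    times_30 = defaultdict(list)   # id -> sorted times of type 30 rows
    data_30 = {}                   # (id, time) -> data of that type 30 row
    for id_, type_, data, time in sorted(rows, key=lambda r: (r[0], r[3])):
        if type_ == 30:
            times_30[id_].append(time)
            data_30[(id_, time)] = data

    results = []
    for id_, type_, data, time in rows:
        if type_ != 70:
            continue
        candidates = times_30.get(id_, [])
        # bisect_left counts the type 30 times strictly less than `time`,
        # mirroring the t1.time < t.time condition.
        pos = bisect_left(candidates, time)
        if pos == 0:
            results.append((id_, time, None))
        else:
            divisor = data_30[(id_, candidates[pos - 1])]
            results.append((id_, time, data / divisor))
    return results
```

Calling it with the Id = 11 rows from the question should give 12.5 / 4.1 for the first type 70 row and 12.68 / 4.7 for the second.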

Another way to do this is to use the window function MAX to keep track of the running maximum time of the type = 30 rows per id.

WITH x AS
  (SELECT t.*,
          MAX(CASE WHEN type=30 THEN time END) OVER(PARTITION BY id ORDER BY time) AS running_max
   FROM t
  )
SELECT x.id,
       x.type,
       x.time,
       x.data*1.0/t.data as result
FROM x
JOIN t ON t.id=x.id AND t.time=x.running_max AND t.type=30
WHERE x.type=70
UNION ALL
SELECT id,
       type,
       time,
       NULL
FROM x
WHERE running_max IS NULL AND type = 70
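Since the question asks about running this from Python, here is a hedged sketch of streaming the running-max query through a DB-API cursor. The helper names `iter_results` and `demo_connection` and the batch size are assumptions of this example, not part of the answer. It relies only on `cursor()`, `execute()` and `fetchmany()`, which pyodbc connections and cursors provide. The query text includes type filters (`t.type = 30` in the join, `type = 70` in the NULL branch) as a defensive assumption, in case rows of other types share a timestamp or appear before the first type 30 row. The last data row in the demo table is invented to exercise the NULL branch.

```python
import sqlite3

QUERY = """
WITH x AS (
    SELECT t.*,
           MAX(CASE WHEN type = 30 THEN time END)
               OVER (PARTITION BY id ORDER BY time) AS running_max
    FROM t
)
SELECT x.id, x.type, x.time, x.data * 1.0 / t.data AS result
FROM x
JOIN t ON t.id = x.id AND t.time = x.running_max AND t.type = 30
WHERE x.type = 70
UNION ALL
SELECT id, type, time, NULL
FROM x
WHERE running_max IS NULL AND type = 70
"""


def iter_results(conn, batch_size=10000):
    """Yield result rows, fetching them from the cursor in batches."""
    cursor = conn.cursor()
    try:
        cursor.execute(QUERY)
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            for row in batch:
                yield row
    finally:
        cursor.close()


def demo_connection():
    """In-memory SQLite stand-in for trying iter_results without a server."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, type INTEGER, data REAL, time TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?, ?)",
        [
            (1, 30, 3.9, "15:50:10.660555"),
            (1, 30, 4.0, "15:50:10.660777"),
            (1, 70, 11.5, "15:50:10.797966"),
            (1, 30, 4.1, "15:50:10.834444"),
            (1, 70, 12.6, "15:50:10.853114"),
            (1, 70, 16.7, "15:50:10.955086"),
            (2, 70, 9.0, "15:50:12.000000"),  # invented row: no earlier type 30
        ],
    )
    return conn
```

With pyodbc, `conn` would instead come from `pyodbc.connect(...)` with a connection string for your server. The SQLite connection is only a stand-in for trying the helper locally and assumes an SQLite build with window function support (3.25 or later).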
