SQL query to compute values from different rows of the same table in SQL Server
I am new to SQL and need a query for the following. The table below is just a sample of my data; the real table has about 30 million rows.
Id   type   data    time
-----------------------------------------
1    30     3.9     15:50:10.660555
1    30     4.0     15:50:10.660777
1    70     11.5    15:50:10.797966
1    30     4.1     15:50:10.834444
1    70     12.6    15:50:10.853114
1    70     16.7    15:50:10.955086
1    30     5       15:50:10.99
11   30     3.8     15:50:11.660555
11   30     4.1     15:50:11.660777
11   70     12.5    15:50:11.797966
11   30     4.7     15:50:11.834444
11   70     12.68   15:50:11.853114
11   70     16.76   15:50:11.955086
11   30     5.1     15:50:11.99
I have a table as above. For each type = 70 row I need to compute something using the last known type = 30 row. For example, for Id = 1, the first type = 70 row at 15:50:10.797966 needs the type = 30 data at 15:50:10.660777, so the result is 11.5 / 4.0. Likewise, the type = 70 row at 15:50:10.853114 needs the type = 30 data at 15:50:10.834444, so the result is 12.6 / 4.1.
I want the result to look like this:
Id   type   result      time
--------------------------------------------
1    70     11.5/4.0    15:50:10.797966
1    70     12.6/4.1    15:50:10.853114
1    70     16.7/4.1    15:50:10.955086
11   70     12.5/4.1    15:50:11.797966
11   70     12.68/4.7   15:50:11.853114
11   70     16.76/4.7   15:50:11.955086
I would like to be able to execute these SQL queries in Python using pyodbc.
Any help would be greatly appreciated! Thanks in advance!
Assuming every type = 70 row is preceded by at least one type = 30 row with the same id, you can do this with OUTER APPLY: get the max time of the type = 30 rows before each type = 70 row, then join back on that time to get the value to divide by.
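SELECT x.id,
       x.type,
       x.time,
       x.data*1.0/t.data AS result
FROM
  (SELECT t.*, t1.maxtime_before
   FROM t
   OUTER APPLY
     -- latest type 30 time before the current row, per id
     (SELECT max(time) AS maxtime_before
      FROM t t1
      WHERE t1.id=t.id AND t1.type=30 AND t1.time<t.time) t1
   WHERE t.type = 70
  ) x
-- join back to the type 30 row itself (not another type sharing that time)
JOIN t ON t.id=x.id AND t.time=x.maxtime_before AND t.type=30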
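To make the lookup concrete, here is a minimal Python sketch of what the OUTER APPLY subquery computes, run on the Id = 1 rows from the question. The function name `ratios_by_lookup` is made up for illustration, and the sketch returns None where the subquery would return NULL (the inner join in the query above would drop such rows instead).

```python
from bisect import bisect_left
from collections import defaultdict

# Sample rows from the question: (id, type, data, time).
# Times stay as strings; for this sample they compare chronologically.
rows = [
    (1, 30, 3.9, "15:50:10.660555"),
    (1, 30, 4.0, "15:50:10.660777"),
    (1, 70, 11.5, "15:50:10.797966"),
    (1, 30, 4.1, "15:50:10.834444"),
    (1, 70, 12.6, "15:50:10.853114"),
    (1, 70, 16.7, "15:50:10.955086"),
    (1, 30, 5.0, "15:50:10.99"),
]


def ratios_by_lookup(rows):
    """For each type 70 row, divide by the latest earlier type 30 row of the same id."""
    ordered = sorted(rows, key=lambda r: (r[0], r[3]))

    # Per id: time-sorted type 30 times and the matching data values.
    times_30 = defaultdict(list)
    data_30 = defaultdict(list)
    for id_, type_, data, time in ordered:
        if type_ == 30:
            times_30[id_].append(time)
            data_30[id_].append(data)

    out = []
    for id_, type_, data, time in ordered:
        if type_ != 70:
            continue
        # Count of type 30 times strictly earlier than this row (t1.time < t.time).
        pos = bisect_left(times_30[id_], time)
        if pos == 0:
            result = None  # no earlier type 30 row: max() over no rows is NULL
        else:
            result = data / data_30[id_][pos - 1]
        out.append((id_, type_, time, result))
    return out


results = ratios_by_lookup(rows)
for row in results:
    print(row)
```

The binary search stands in for the per-row subquery; on SQL Server the equivalent speed-up would come from an index, which is a separate decision not covered here.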
If a type = 70 row has no preceding type = 30 row, you can show NULL in the result column for that row using
WITH x AS
  (SELECT t.*,
          t1.maxtime_before
   FROM t
   OUTER APPLY
     (SELECT max(time) AS maxtime_before
      FROM t t1
      WHERE t1.id=t.id AND t1.type=30 AND t1.time<t.time) t1
   WHERE t.type = 70
  )
SELECT x.id,
       x.type,
       x.time,
       x.data*1.0/t.data AS result
FROM t
JOIN x ON t.id=x.id AND t.time=x.maxtime_before AND t.type=30
UNION ALL
-- type 70 rows with no earlier type 30 row
SELECT id,
       type,
       time,
       NULL
FROM x
WHERE maxtime_before IS NULL
Another way to do this is to use the window function max to keep a running maximum of the type = 30 times per id (this needs SQL Server 2012 or later, where aggregates accept ORDER BY in the OVER clause).
WITH x AS
  (SELECT t.*,
          -- latest type 30 time seen so far within this id
          MAX(CASE WHEN type=30 THEN time END) OVER(PARTITION BY id ORDER BY time) AS running_max
   FROM t
  )
SELECT x.id,
       x.type,
       x.time,
       x.data*1.0/t.data AS result
FROM x
JOIN t ON t.id=x.id AND t.time=x.running_max AND t.type=30
WHERE x.type=70
UNION ALL
-- x holds rows of every type here, so restrict the NULL branch to type 70
SELECT id,
       type,
       time,
       NULL
FROM x
WHERE running_max IS NULL AND type=70
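The running-maximum idea amounts to one pass over the rows in time order, remembering the most recent type 30 row per id. A minimal Python sketch follows, using the Id = 11 rows from the question plus three made-up rows for a hypothetical id 99 whose first row is type 70. Unlike the SQL, which tracks the time and joins back, the sketch carries the data value forward directly; `ratios_running` is an illustrative name.

```python
# (id, type, data, time); the id 99 rows are hypothetical, added to show the NULL case.
rows = [
    (11, 30, 3.8, "15:50:11.660555"),
    (11, 30, 4.1, "15:50:11.660777"),
    (11, 70, 12.5, "15:50:11.797966"),
    (11, 30, 4.7, "15:50:11.834444"),
    (11, 70, 12.68, "15:50:11.853114"),
    (11, 70, 16.76, "15:50:11.955086"),
    (11, 30, 5.1, "15:50:11.99"),
    (99, 70, 8.0, "15:50:12.000000"),
    (99, 30, 2.0, "15:50:12.500000"),
    (99, 70, 9.0, "15:50:13.000000"),
]


def ratios_running(rows):
    """Single pass in (id, time) order, carrying forward the latest type 30 value."""
    last_30 = {}  # id -> data of the most recent type 30 row seen so far
    out = []
    for id_, type_, data, time in sorted(rows, key=lambda r: (r[0], r[3])):
        if type_ == 30:
            last_30[id_] = data
        elif type_ == 70:
            prev = last_30.get(id_)
            # None plays the role of NULL when no type 30 row has been seen yet.
            out.append((id_, type_, time, None if prev is None else data / prev))
    return out


results = ratios_running(rows)
for row in results:
    print(row)
```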
You can also do this using only window functions.
For each row, get the previous row's type and data. Also number the rows so that each type 30 row and the 70s that follow it can be identified as a group (you can do this with a cumulative sum).
In the next step, use a partitioned max to get the type 30 data for each group, and finally do your calculation.
select t.*,
       data / data_30 as result
from (select t.*,
             -- only look at 70 rows: a 30 row's own predecessor may be an older 30
             max(case when type = 70 and prev_type = 30 then prev_data end) over (partition by id, grp) as data_30
      from (select t.*,
                   sum(case when type <> 70 then 1 else 0 end) over (partition by id order by time) as grp,
                   lag(type) over (partition by id order by time) as prev_type,
                   lag(data) over (partition by id order by time) as prev_data
            from t
            where type in (30, 70)
           ) t
     ) t
where type = 70;
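To see how the grouping works, here is a small Python sketch that computes the grp column for the Id = 1 rows from the question. It is only an illustration: it reads each group's type 30 value directly instead of going through lag, and the variable names are made up.

```python
from itertools import accumulate

# Id = 1 rows from the question, already in time order.
types = [30, 30, 70, 30, 70, 70, 30]
data = [3.9, 4.0, 11.5, 4.1, 12.6, 16.7, 5.0]

# Cumulative count of non-70 rows: the sum(case ...) over (order by time) column.
# Each type 30 row starts a new group; the 70s that follow share its number.
grp = list(accumulate(1 if t != 70 else 0 for t in types))

# Each group contains exactly one type 30 row; remember its data value.
data_30 = {g: d for g, t, d in zip(grp, types, data) if t == 30}

# Pair every type 70 value with the type 30 value of its group.
pairs = [(d, data_30.get(g)) for g, t, d in zip(grp, types, data) if t == 70]

print(grp)
print(pairs)
```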
One interesting aspect of this query: restricting the rows to types 30 and 70 ensures that each group of 70s is immediately preceded by a 30 (unless an id starts with a 70, in which case data_30 is NULL for that group).
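Since the question asks about running the query from Python, here is a hedged sketch of fetching a large result in batches through the standard DB-API cursor calls (`execute`, `fetchmany`), which pyodbc also provides. The helper name `iter_rows` and the batch size are illustrative, the connection string in the comment is a placeholder, and sqlite3 is used only as a stand-in so the sketch runs without a SQL Server instance.

```python
import sqlite3  # stand-in driver; a pyodbc connection offers the same cursor calls


def iter_rows(conn, query, batch_size=10_000):
    """Yield result rows batch by batch so the full result is never held in memory."""
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            yield from batch
    finally:
        cursor.close()


# With SQL Server this would look something like (placeholder values):
#   import pyodbc
#   conn = pyodbc.connect(
#       "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;DATABASE=<db>;Trusted_Connection=yes"
#   )
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, type INTEGER, data REAL, time TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 30, 4.0, "15:50:10.660777"),
        (1, 70, 11.5, "15:50:10.797966"),
        (1, 30, 4.1, "15:50:10.834444"),
        (1, 70, 12.6, "15:50:10.853114"),
        (1, 70, 16.7, "15:50:10.955086"),
    ],
)

# Any of the queries above could be passed here when connected to SQL Server.
query = "SELECT id, type, data, time FROM t WHERE type = 70 ORDER BY id, time"
fetched = list(iter_rows(conn, query, batch_size=2))
print(fetched)
```

With 30 million source rows it is usually better to let the server do the division and stream only the type 70 results, rather than pulling the raw table into Python.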