Counting ordered data
I have the following problem to solve and I cannot come up with an algorithm yet, let alone a real solution.
I have a table with a structure and data similar to what is shown below, where IDs are not always in sequence for the same Ticker / QuoteType:
ID      Ticker PriceDateTime    QuoteType OpenPrice HighPrice LowPrice ClosePrice
------- ------ ---------------- --------- --------- --------- -------- ----------
2036430 ^COMP  2012-02-10 20:50 95/Minute 2901.57   2905.04   2895.37  2901.71
2036429 ^COMP  2012-02-10 19:15 95/Minute 2909.63   2910.98   2899.95  2901.67
2036428 ^COMP  2012-02-10 17:40 95/Minute 2905.9    2910.27   2904.29  2909.64
2036427 ^COMP  2012-02-10 16:05 95/Minute 2902      2908.29   2895.1   2905.89
2036426 ^COMP  2012-02-09 21:00 95/Minute 2926.12   2928.01   2925.53  2927.21
The information I need to extract from this data is as follows:
- How many consecutive rows, counting back from the most recent one (by PriceDateTime), have ClosePrice moving in the same direction?
For example: for the data above, the answer should be 2. ClosePrice (line 1) = 2901.71 is greater than ClosePrice (line 2) = 2901.67, but ClosePrice (line 2) is lower than ClosePrice (line 3) = 2909.64. So, looking back from the most recent price, we have 2 rows that "go in the same direction".
Of course I have to do this for many other tickers as well, so speed is very important.
PS: Thanks everyone for your help, I took inspiration from all of your answers while building the final procedure. You are very kind!
Try this (I've simplified the test data, since only two columns are needed to demonstrate the logic):
CREATE TABLE #Test (PriceDateTime DATETIME, ClosePrice DECIMAL(6, 2))
INSERT #Test VALUES
('20120210 20:50:00.000', 2901.71),
('20120210 19:15:00.000', 2901.67),
('20120210 17:40:00.000', 2900.64),
('20120210 16:05:00.000', 2905.89),
('20120209 21:00:00.000', 2927.21)
-- FIRST CTE, JUST DEFINES A VIEW GIVING EACH ENTRY A ROW NUMBER
;WITH CTE AS
(   SELECT  *,
            ROW_NUMBER() OVER(ORDER BY PriceDateTime DESC) [RowNumber]
    FROM    #Test
),
-- SECOND CTE, ASSIGNS EACH ENTRY +1, 0 OR -1 DEPENDING ON HOW THE VALUE HAS CHANGED COMPARED TO THE PREVIOUS (OLDER) RECORD
CTE2 AS
(   SELECT  a.*, SIGN(a.ClosePrice - b.ClosePrice) [Movement]
    FROM    CTE a
            LEFT JOIN CTE b
                ON a.RowNumber = b.RowNumber - 1
),
-- THIRD CTE, WILL LOOP THROUGH THE DATA AS MANY TIMES AS POSSIBLE WHILE THE PREVIOUS ENTRY HAS THE SAME "MOVEMENT"
CTE3 AS
(   SELECT  *, 1 [Recursion]
    FROM    CTE2
    UNION ALL
    SELECT  a.PriceDateTime, a.ClosePrice, a.RowNumber, a.Movement, b.Recursion + 1
    FROM    CTE2 a
            INNER JOIN CTE3 b
                ON a.RowNumber = b.RowNumber - 1
                AND a.Movement = b.Movement
)
SELECT  MAX(Recursion) + 1 -- ADD 1 BECAUSE N EQUAL MOVEMENTS SPAN N + 1 ROWS
FROM    CTE3
WHERE   RowNumber = 1; -- LATEST ENTRY
DROP TABLE #Test
I've commented the code to explain each step. If anything is unclear from the comments, let me know and I will try to explain further.
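If you are on SQL Server 2012 or later (an assumption on my part, since the window functions LEAD and FIRST_VALUE do not exist in earlier versions), the same idea could be sketched without recursion, and for every Ticker / QuoteType in one pass. The table name #Quotes and its trimmed-down column list are made up for illustration; the rows are the ones from the question.

```sql
-- Hypothetical temp table holding only the columns the logic needs
CREATE TABLE #Quotes
(
    Ticker        VARCHAR(10),
    QuoteType     VARCHAR(20),
    PriceDateTime DATETIME,
    ClosePrice    DECIMAL(10, 2)
);

INSERT INTO #Quotes (Ticker, QuoteType, PriceDateTime, ClosePrice)
VALUES ('^COMP', '95/Minute', '20120210 20:50', 2901.71),
       ('^COMP', '95/Minute', '20120210 19:15', 2901.67),
       ('^COMP', '95/Minute', '20120210 17:40', 2909.64),
       ('^COMP', '95/Minute', '20120210 16:05', 2905.89),
       ('^COMP', '95/Minute', '20120209 21:00', 2927.21);

WITH Moves AS
(   -- RowNumber 1 is the latest row per ticker; Movement compares each row with the next older one
    SELECT  Ticker, QuoteType,
            ROW_NUMBER() OVER (PARTITION BY Ticker, QuoteType ORDER BY PriceDateTime DESC) AS RowNumber,
            SIGN(ClosePrice - LEAD(ClosePrice) OVER (PARTITION BY Ticker, QuoteType ORDER BY PriceDateTime DESC)) AS Movement
    FROM    #Quotes
),
Tagged AS
(   -- Carry the latest row's movement onto every row of the same ticker
    SELECT  Ticker, QuoteType, RowNumber, Movement,
            FIRST_VALUE(Movement) OVER (PARTITION BY Ticker, QuoteType ORDER BY RowNumber) AS LatestMovement
    FROM    Moves
)
-- The first row whose movement differs (or is NULL, i.e. the oldest row) ends the run
SELECT  Ticker, QuoteType, MIN(RowNumber) AS ConsecutiveRows
FROM    Tagged
WHERE   Movement IS NULL OR Movement <> LatestMovement
GROUP BY Ticker, QuoteType;

DROP TABLE #Quotes;
```

For the rows in the question this returns 2 for ^COMP. The result is simply the row number of the first row whose movement differs from the latest one, because every row before it moved the same way.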
The solution below should be efficient enough, but it will fail if there are gaps in the ID sequence.
Please update your question if that is the case.
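DECLARE @t TABLE (
    ID INT,
    ClosePrice DECIMAL(10, 5)
)
INSERT @t (ID, ClosePrice)
VALUES (2036430, 2901.71), (2036429, 2901.67), (2036428, 2909.64), (2036427, 2905.89), (2036426, 2927.21)
;WITH CTE AS (
    -- Anchor: the most recent row. TOP ... ORDER BY cannot be used directly before UNION ALL, so filter on MAX(ID) instead
    SELECT ID, ClosePrice, 1 AS lvl
    FROM @t
    WHERE ID = (SELECT MAX(ID) FROM @t)
    UNION ALL
    -- Step back one ID at a time while the older price is lower (this only follows a rising run)
    SELECT s.ID, s.ClosePrice, CTE.lvl + 1
    FROM @t AS s
    INNER JOIN CTE
        ON s.ID = CTE.ID - 1 AND s.ClosePrice < CTE.ClosePrice
)
SELECT MAX(lvl) AS answer
FROM CTE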
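Since the question mentions that IDs are not always in sequence, one possible workaround is to number the rows with ROW_NUMBER() and recurse on that number instead of on ID. This is only a sketch: the gapped IDs below are made up for illustration, and ordering by ID rather than PriceDateTime is an assumption carried over from the query above.

```sql
DECLARE @t TABLE (ID INT, ClosePrice DECIMAL(10, 5));

-- Same prices as before, but with made-up gaps in the ID sequence
INSERT @t (ID, ClosePrice)
VALUES (2036430, 2901.71), (2036425, 2901.67), (2036419, 2909.64), (2036411, 2905.89), (2036402, 2927.21);

WITH Numbered AS (
    -- rn = 1 is the most recent row; rn has no gaps even when ID does
    SELECT ClosePrice, ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn
    FROM @t
),
CTE AS (
    -- Anchor: the most recent row
    SELECT rn, ClosePrice, 1 AS lvl
    FROM Numbered
    WHERE rn = 1
    UNION ALL
    -- Step to the next older row while its price is lower (rising run only, as above)
    SELECT n.rn, n.ClosePrice, CTE.lvl + 1
    FROM Numbered AS n
    INNER JOIN CTE
        ON n.rn = CTE.rn + 1 AND n.ClosePrice < CTE.ClosePrice
)
SELECT MAX(lvl) AS answer
FROM CTE;
```

With these rows the result is still 2. On a large table it may be worth materialising the numbered rows into an indexed temp table first, since the Numbered CTE may be re-evaluated at each recursion step.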
I would join the table to itself (offset by 1 on the primary key / ordering column), then use a simple CASE expression to track the change (assuming I understand your question correctly).
For example:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_NumericSequence](
[ID] [int] NULL,
[Value] [int] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (1, 1)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (2, 2)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (3, 3)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (4, 2)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (5, 1)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (6, 3)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (7, 3)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (8, 8)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (9, 1)
GO
WITH RawData ( [ID], [Value] )
AS ( SELECT [ID] ,
            [Value]
     FROM   [dbo].[tbl_NumericSequence]
   )
SELECT  RawData.ID ,
        RawData.Value ,
        CASE WHEN RawDataLag.Value = RawData.Value THEN 'No Change'
             WHEN RawDataLag.Value > RawData.Value THEN 'Down'
             WHEN RawDataLag.Value < RawData.Value THEN 'Up'
        END AS Change
FROM    RawData
        LEFT OUTER JOIN RawData RawDataLag ON RawData.ID = RawDataLag.ID + 1
ORDER BY RawData.ID ASC
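On SQL Server 2012 or later (assumed here), LAG can fetch the previous row's value without the self-join, and it keeps working when the IDs have gaps because it relies on ordering rather than on ID + 1. A minimal sketch, using a table variable @Sequence with made-up values:

```sql
DECLARE @Sequence TABLE (ID INT, Value INT);

-- Made-up values; note the gaps in ID (4 and 7 are missing)
INSERT @Sequence (ID, Value)
VALUES (1, 1), (2, 2), (3, 3), (5, 2), (6, 2), (8, 8);

SELECT  ID,
        Value,
        CASE WHEN PreviousValue = Value THEN 'No Change'
             WHEN PreviousValue > Value THEN 'Down'
             WHEN PreviousValue < Value THEN 'Up'
        END AS Change
FROM    (   -- LAG returns NULL for the first row, so its Change is NULL as well
            SELECT  ID, Value, LAG(Value) OVER (ORDER BY ID) AS PreviousValue
            FROM    @Sequence
        ) AS WithPrevious
ORDER BY ID ASC;
```

To apply this to the quotes table, the ORDER BY inside OVER would presumably be PriceDateTime, with a PARTITION BY on Ticker and QuoteType.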
I would approach it with a recursive common table expression:
CREATE TABLE #MyTable (ID INT, ClosePrice MONEY)
INSERT INTO #MyTable ( ID, ClosePrice )
VALUES (2036430,2901.71),
       (2036429,2901.67),
       (2036428,2909.64),
       (2036427,2905.89),
       (2036426,2927.21)
-- The statement before WITH must be terminated with a semicolon
;WITH CTE AS (
    -- Anchor: the most recent row (TOP ... ORDER BY is not allowed directly before UNION ALL)
    SELECT ID, ClosePrice, 1 AS Consecutive
    FROM #MyTable
    WHERE ID = (SELECT MAX(ID) FROM #MyTable)
    UNION ALL
    -- Walk back one ID at a time; the counter restarts at 1 whenever the run is broken
    SELECT A.ID, A.ClosePrice, CASE WHEN A.ClosePrice < B.ClosePrice THEN Consecutive+1 ELSE 1 END
    FROM #MyTable A INNER JOIN CTE B ON A.ID=B.ID -1
)
SELECT * FROM CTE
--OR to just get the max consecutive
--select max(Consecutive) from CTE
DROP TABLE #MyTable