Counting ordered data

I have the following problem to solve and I cannot come up with an algorithm yet, let alone a real solution.

I have a table with a structure and data similar to what is shown below, where the IDs are not always sequential for the same Ticker / QuoteType:

ID      Ticker PriceDateTime    QuoteType OpenPrice HighPrice LowPrice ClosePrice
------- ------ ---------------- --------- --------- --------- -------- ----------
2036430 ^COMP  2012-02-10 20:50 95/Minute 2901.57   2905.04   2895.37  2901.71
2036429 ^COMP  2012-02-10 19:15 95/Minute 2909.63   2910.98   2899.95  2901.67
2036428 ^COMP  2012-02-10 17:40 95/Minute 2905.9    2910.27   2904.29  2909.64
2036427 ^COMP  2012-02-10 16:05 95/Minute 2902      2908.29   2895.1   2905.89
2036426 ^COMP  2012-02-09 21:00 95/Minute 2926.12   2928.01   2925.53  2927.21

      

The information I need to extract from this data is as follows:

  • How many consecutive rows, counting back from the most recent one (by PriceDateTime), move in the same direction when looking at ClosePrice?

For example, for the data above the answer should be 2: ClosePrice (row 1) = 2901.71 is greater than ClosePrice (row 2) = 2901.67, but ClosePrice (row 2) is lower than ClosePrice (row 3) = 2909.64. So, looking back from the most recent price, we have 2 rows that "go in the same direction".

Of course, I have to do this for many other tickers, so speed is very important.

PS: Thanks everyone for your help, I took inspiration from all of your answers while building the final procedure. You are very kind!



4 answers


Try this: (I've simplified the test data I'm using since it only needs 2 columns to demonstrate the logic).

CREATE TABLE #Test (PriceDateTime DATETIME, ClosePrice DECIMAL(6, 2))
INSERT #Test VALUES 
('20120210 20:50:00.000', 2901.71),
('20120210 19:15:00.000', 2901.67),
('20120210 17:40:00.000', 2900.64),
('20120210 16:05:00.000', 2905.89),
('20120209 21:00:00.000', 2927.21)

-- FIRST CTE, JUST DEFINES A VIEW GIVING EACH ENTRY A ROW NUMBER
;WITH CTE AS
(   SELECT  *,
            ROW_NUMBER() OVER(ORDER BY PriceDateTime DESC) [RowNumber]
    FROM    #Test
), 
-- SECOND CTE, ASSIGNS EACH ENTRY +1, 0 OR -1 DEPENDING ON HOW THE VALUE HAS CHANGED COMPARED TO THE PREVIOUS RECORD
CTE2 AS
(   SELECT  a.*, SIGN(a.ClosePrice - b.ClosePrice) [Movement]
    FROM    CTE a
            LEFT JOIN CTE b
                ON a.RowNumber = b.RowNumber - 1
), 
-- THIRD CTE, WILL LOOP THROUGH THE DATA AS MANY TIMES AS POSSIBLE WHILE THE PREVIOUS ENTRY HAS THE SAME "MOVEMENT"
CTE3 AS
(   SELECT  *, 1 [Recursion]
    FROM    CTE2
    UNION ALL
    SELECT  a.PriceDateTime, a.ClosePrice, a.RowNumber, a.Movement, b.Recursion + 1
    FROM    CTE2 a
            INNER JOIN CTE3 b
                ON a.RowNumber = b.RowNumber - 1
                AND a.Movement = b.Movement
)

SELECT  MAX(Recursion) + 1 -- ADD 1 TO THE RECORD BECAUSE THERE WILL ALWAYS BE AT LEAST TWO ROWS
FROM    CTE3
WHERE   RowNumber = 1 -- LATEST ENTRY

DROP TABLE #Test

      



I have commented the code to explain what each step does. If anything is unclear from the comments, let me know and I will try to explain further.
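Since the question needs this for many tickers and cares about speed, here is a rough sketch of how the same row-number and movement idea might be applied per ticker without recursion. The table #Quotes, the second ticker 'TEST' and its prices are made up for illustration. The idea is that the streak length equals the row number of the first row whose movement differs from that of the latest row. I have not measured whether this is actually faster on real data.

```sql
-- Hypothetical test table; only the columns needed for the logic
CREATE TABLE #Quotes (Ticker VARCHAR(10), PriceDateTime DATETIME, ClosePrice DECIMAL(10, 2));

INSERT #Quotes VALUES
('^COMP', '20120210 20:50:00.000', 2901.71),
('^COMP', '20120210 19:15:00.000', 2901.67),
('^COMP', '20120210 17:40:00.000', 2909.64),
('^COMP', '20120210 16:05:00.000', 2905.89),
('^COMP', '20120209 21:00:00.000', 2927.21),
('TEST',  '20120210 20:50:00.000', 10.00),   -- made-up ticker and prices
('TEST',  '20120210 19:15:00.000', 11.00),
('TEST',  '20120210 17:40:00.000', 12.00);

;WITH Numbered AS
(   -- RowNumber = 1 is the latest entry of each ticker
    SELECT  Ticker, ClosePrice,
            ROW_NUMBER() OVER(PARTITION BY Ticker ORDER BY PriceDateTime DESC) [RowNumber]
    FROM    #Quotes
),
Moves AS
(   -- +1, 0 or -1 compared to the previous (older) entry; NULL for the oldest entry
    SELECT  a.Ticker, a.RowNumber, SIGN(a.ClosePrice - b.ClosePrice) [Movement]
    FROM    Numbered a
            LEFT JOIN Numbered b
                ON b.Ticker = a.Ticker
                AND b.RowNumber = a.RowNumber + 1
)
-- The first row whose movement differs from the latest row's movement ends the streak
SELECT  m.Ticker, MIN(m.RowNumber) [RowsInSameDirection]
FROM    Moves m
        INNER JOIN Moves latest
            ON latest.Ticker = m.Ticker
            AND latest.RowNumber = 1
WHERE   m.Movement IS NULL
OR      m.Movement <> latest.Movement
GROUP BY m.Ticker;

DROP TABLE #Quotes;
```

With this sample data the query should return 2 for ^COMP (matching the example in the question) and 3 for the made-up TEST ticker, where all three closes fall.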



The solution below should be efficient enough, but it will fail if there are gaps in the ID sequence.

Please update your question if that is the case.



DECLARE @t TABLE (
    ID INT,
    ClosePrice DECIMAL(10, 5)
)

INSERT @t (ID, ClosePrice)
VALUES  (2036430, 2901.71), (2036429, 2901.67), (2036428, 2909.64), (2036427, 2905.89), (2036426, 2927.21)


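;WITH CTE AS (
    -- Anchor: the most recent row. TOP ... ORDER BY sits in a derived table
    -- because ORDER BY cannot directly precede UNION ALL.
    SELECT ID, ClosePrice, 1 AS lvl
    FROM (SELECT TOP 1 ID, ClosePrice FROM @t ORDER BY ID DESC) AS latest

    UNION ALL

    -- Recursive step: move to the previous ID while its close is lower
    -- (this counts a rising run only)
    SELECT s.ID, s.ClosePrice, CTE.lvl + 1
    FROM @t AS s
    INNER JOIN CTE
        ON s.ID = CTE.ID - 1 AND s.ClosePrice < CTE.ClosePrice
)   
SELECT MAX(lvl) AS answer 
FROM CTE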
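
If the IDs do have gaps, one possible workaround is to number the rows first and join on that number instead of on ID. The following is only a sketch: the gapped ID values are invented, and it assumes that a higher ID always means a more recent row (ordering by PriceDateTime would be safer on the real table). Like the query above, it follows a rising run only.

```sql
DECLARE @t TABLE (
    ID INT,
    ClosePrice DECIMAL(10, 5)
);

-- Same prices as above, but with made-up IDs that contain gaps
INSERT @t (ID, ClosePrice)
VALUES  (2036430, 2901.71), (2036425, 2901.67), (2036419, 2909.64), (2036410, 2905.89), (2036402, 2927.21);

;WITH Numbered AS (
    -- rn = 1 is the most recent row; rn has no gaps even when ID does
    SELECT ID, ClosePrice,
           ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn
    FROM @t
),
CTE AS (
    SELECT rn, ClosePrice, 1 AS lvl
    FROM Numbered
    WHERE rn = 1

    UNION ALL

    -- step to the next older row while its close is lower than the newer one
    SELECT s.rn, s.ClosePrice, CTE.lvl + 1
    FROM Numbered AS s
    INNER JOIN CTE
        ON s.rn = CTE.rn + 1 AND s.ClosePrice < CTE.ClosePrice
)
SELECT MAX(lvl) AS answer
FROM CTE;
```

Keep in mind that SQL Server stops a recursive CTE after 100 levels by default, so very long runs would need an OPTION (MAXRECURSION n) hint.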

      



I would join your data to itself (offset by 1 on the primary key / ordering column), then use a simple CASE to track the change (assuming I understand your question correctly).

For example:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[tbl_NumericSequence](
    [ID] [int] NULL,
    [Value] [int] NULL
) ON [PRIMARY]

GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (1, 1)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (2, 2)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (3, 3)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (4, 2)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (5, 1)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (6, 3)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (7, 3)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (8, 8)
GO
INSERT [dbo].[tbl_NumericSequence] ([ID], [Value]) VALUES (9, 1)
GO
WITH    RawData ( [ID], [Value] )
          AS ( SELECT   [ID] ,
                        [Value]
               FROM     [dbo].[tbl_NumericSequence]
             )
    SELECT  RawData.ID ,
            RawData.Value ,
            CASE WHEN RawDataLag.Value = RawData.Value THEN 'No Change'
                 WHEN RawDataLag.Value > RawData.Value THEN 'Down'
                 WHEN RawDataLag.Value < RawData.Value THEN 'Up'
            END AS Change
    FROM    RawData
            LEFT OUTER JOIN RawData RawDataLag ON RawData.ID = RawDataLag.ID + 1
    ORDER BY RawData.ID ASC
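
The join on ID + 1 above assumes the IDs have no gaps. On SQL Server 2012 or later, the LAG window function could probably replace the self-join and remove that assumption. A minimal sketch, using a table variable @seq (a name made up for this example) with the same sample values:

```sql
DECLARE @seq TABLE (ID INT, Value INT);

INSERT @seq (ID, Value)
VALUES (1, 1), (2, 2), (3, 3), (4, 2), (5, 1), (6, 3), (7, 3), (8, 8), (9, 1);

-- LAG(Value) returns the Value of the previous row in ID order (NULL for the first row)
SELECT  ID ,
        Value ,
        CASE WHEN LAG(Value) OVER (ORDER BY ID) = Value THEN 'No Change'
             WHEN LAG(Value) OVER (ORDER BY ID) > Value THEN 'Down'
             WHEN LAG(Value) OVER (ORDER BY ID) < Value THEN 'Up'
        END AS Change
FROM    @seq
ORDER BY ID ASC;
```

As with the self-join, the first row gets NULL in the Change column because it has no previous row to compare against.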

      



I would approach it with recursive common table expressions:

CREATE TABLE #MyTable (ID INT, ClosePrice MONEY)

INSERT INTO #MyTable ( ID, ClosePrice )
VALUES (2036430,2901.71),
(2036429,2901.67),
(2036428,2909.64),
(2036427,2905.89),
(2036426,2927.21)

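;WITH CTE AS (
    -- Anchor: the most recent row (TOP ... ORDER BY wrapped in a derived table,
    -- since ORDER BY cannot directly precede UNION ALL)
    SELECT id, closeprice, 1 Consecutive 
    FROM (SELECT TOP 1 id, closeprice FROM #MyTable ORDER BY id DESC) latest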
    UNION ALL
    SELECT A.id, A.closeprice, CASE WHEN A.ClosePrice < B.ClosePrice THEN Consecutive+1 ELSE 1 END
    FROM #MyTable A INNER JOIN cte B ON A.ID=B.id -1
)
SELECT * FROM cte

--OR to just get the max consecutive
--select max(Consecutive) from cte

DROP TABLE #MyTable

      
