Calculating an average that excludes the top 10% and bottom 10% of values using T-SQL

I need to calculate the average duration of one of our stored procedures. For example, the durations (in seconds) of the SP are: 1, 30, 2, 5, 2, 15, 35, 7, 3, 4, 2, 1, 2, 40

I need to eliminate the top 10% of calls (the fastest) and the bottom 10% of calls (the slowest) and calculate the average of the rest.

Is there a better approach that minimizes the performance impact, since this needs to run on a huge dataset on a regular basis?

The approach I know of is:

Eliminate the top 10% of the entries using the following query, which returns two values (1, 1):

SELECT TOP 10 PERCENT WITH TIES value FROM #t order by value asc

      

The bottom 10% gives (35, 40):

SELECT TOP 10 PERCENT WITH TIES value FROM #t order by value desc

      

After eliminating these values (1, 1, 35 and 40), the ten remaining values sum to 72, so the average is 7 (7.2 before AVG truncates the integer result).
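The two queries above can be combined into a single trimmed average. The following is only a sketch under assumptions: it reuses the #t table and value column from the question, loads the sample durations, and introduces two hypothetical variables, @low and @high, for the cutoffs. Because WITH TIES returns every row that shares the boundary value, removing those rows is the same as keeping only the values strictly between the two cutoffs.

```sql
-- Sample durations from the question (skip this if #t already exists).
CREATE TABLE #t (value int);
INSERT INTO #t (value)
VALUES (1),(30),(2),(5),(2),(15),(35),(7),(3),(4),(2),(1),(2),(40);

-- @low and @high are hypothetical names for the two cutoffs.
-- Largest value among the fastest 10% (ties included).
DECLARE @low int = (
    SELECT MAX(value)
    FROM (SELECT TOP (10) PERCENT WITH TIES value
          FROM #t ORDER BY value ASC) AS fastest
);

-- Smallest value among the slowest 10% (ties included).
DECLARE @high int = (
    SELECT MIN(value)
    FROM (SELECT TOP (10) PERCENT WITH TIES value
          FROM #t ORDER BY value DESC) AS slowest
);

-- Cast before averaging so the result is not truncated to an integer.
SELECT AVG(CAST(value AS decimal(18, 2))) AS AvgSeconds
FROM #t
WHERE value > @low AND value < @high;
```

On the sample data the cutoffs are 1 and 35, so the rows kept are 2, 2, 2, 2, 3, 4, 5, 7, 15 and 30, and the average is 7.2. Note that this scans #t three times, which is part of why a single-pass approach is worth looking for on a large table.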



2 answers


One option that works relatively well in my tests (under 1 second on 300,000 rows, up to 12 seconds on 10 million rows):



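-- Total row count, used to work out how many rows make up 10%
declare @tot int = 
(select count(*)
from #MyData)

-- Number the rows from fastest to slowest
; with cte as (select Data, ROW_NUMBER() over (order by Data) RN from #MyData)

-- BETWEEN is inclusive, so use > and <= to trim the same number of rows from each end
select avg(Data)
from cte
where rn > @tot/10 and rn <= @tot - @tot/10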
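One detail worth checking with this approach is rounding. Integer division rounds down, whereas TOP ... PERCENT rounds a fractional row count up, so on the question's 14 rows the two approaches trim a different number of rows. The sketch below is one possible variation, not the answerer's code: it assumes a #MyData temp table with an int Data column holding the question's sample values, and it introduces a hypothetical @cut variable that uses CEILING to match the TOP 10 PERCENT row count. It trims by row position, so ties at the boundary are not handled specially.

```sql
-- Hypothetical setup: the question's sample durations in #MyData.
CREATE TABLE #MyData (Data int);
INSERT INTO #MyData (Data)
VALUES (1),(30),(2),(5),(2),(15),(35),(7),(3),(4),(2),(1),(2),(40);

DECLARE @tot int = (SELECT COUNT(*) FROM #MyData);

-- Rows to drop from each end; CEILING rounds 1.4 up to 2 for 14 rows.
DECLARE @cut int = CEILING(@tot * 0.1);

WITH cte AS (
    SELECT Data,
           ROW_NUMBER() OVER (ORDER BY Data) AS RN
    FROM #MyData
)
-- Cast before averaging so the result is not truncated to an integer.
SELECT AVG(CAST(Data AS decimal(18, 2))) AS AvgData
FROM cte
WHERE RN > @cut AND RN <= @tot - @cut;
```

With the sample data this keeps rows 3 through 12 (2, 2, 2, 2, 3, 4, 5, 7, 15, 30), giving an average of 7.2, the same set the question arrives at.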

      



Edited to provide a crisp "haircut" (the lowest and highest buckets are trimmed from both directions):

Declare @YourTable table (Seconds int)
Insert Into @YourTable values
(1),(30),(2),(5),(2),(15),(35),(7),(3),(4),(2),(1),(2),(40)

Select AvgSeconds = avg(Seconds)
 From (
        Select *
              ,Dec1 = NTile(10) over (Order By Seconds)
              ,Dec2 = NTile(10) over (Order By Seconds Desc)
          From  @YourTable
      ) A
 Where Dec1 between 2 and 9
   and Dec2 between 2 and 9

      



Returns

AvgSeconds
7
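To see why the query filters on two NTILE columns instead of one, it can help to list the bucket numbers per row. The diagnostic below is a sketch that reuses the answer's @YourTable data; the Kept column is a hypothetical addition for illustration. With 14 rows, NTILE(10) produces four buckets of two rows followed by six buckets of one row, so the ascending ordering alone would drop two rows at the fast end but only one at the slow end, and the descending ordering does the reverse.

```sql
DECLARE @YourTable table (Seconds int);
INSERT INTO @YourTable VALUES
(1),(30),(2),(5),(2),(15),(35),(7),(3),(4),(2),(1),(2),(40);

-- Show each row's bucket in both directions and whether it survives the filter.
SELECT Seconds,
       Dec1,
       Dec2,
       Kept = CASE WHEN Dec1 BETWEEN 2 AND 9
                    AND Dec2 BETWEEN 2 AND 9 THEN 1 ELSE 0 END
FROM (
        SELECT Seconds,
               Dec1 = NTILE(10) OVER (ORDER BY Seconds),
               Dec2 = NTILE(10) OVER (ORDER BY Seconds DESC)
        FROM @YourTable
     ) AS A
ORDER BY Seconds;
```

Both 1-second rows fall in Dec1 = 1, and 35 and 40 both fall in Dec2 = 1, so combining the two filters removes two rows from each end and leaves ten rows. AVG over the int column then returns 7, the truncated form of 7.2.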

      
