Calculating an average while excluding the top 10% and bottom 10% of values using T-SQL
I need to calculate the average duration of one of our stored procedures. For example, the durations (in seconds) of the SP calls are: 1,30,2,5,2,15,35,7,3,4,2,1,2,40
I need to eliminate the top 10% of calls (the fastest) and the bottom 10% of calls (the slowest) and calculate the average of the rest.
Is there a better approach that minimizes the performance cost, as this needs to be done on a huge dataset on a regular basis?
The approach I know of is:
Eliminate the top 10% of the entries using the following query, which returns two values (1, 1):
SELECT TOP 10 PERCENT WITH TIES value FROM #t order by value asc
The bottom 10% gives (35, 40):
SELECT TOP 10 PERCENT WITH TIES value FROM #t order by value desc
After eliminating these values (1, 1, 35 and 40), the average is 7 (72 / 10 = 7.2, truncated to 7 by integer arithmetic).
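For reference, the two TOP queries above can be combined into a single statement. This is only a minimal sketch of the approach described in the question, not a tuned solution: the table #t and column value are taken from the question, the sample rows are the durations listed there, and the cast to decimal is an added assumption so that the average is not truncated to an integer.

```sql
-- Sample data from the question, loaded into the temp table #t
CREATE TABLE #t (value int NOT NULL);
INSERT INTO #t (value) VALUES
    (1),(30),(2),(5),(2),(15),(35),(7),(3),(4),(2),(1),(2),(40);

-- Average of everything that is in neither the lowest nor the highest 10%.
-- WITH TIES returns every row sharing a boundary value, so filtering on the
-- value itself removes exactly the rows the two TOP queries return.
SELECT AVG(CAST(value AS decimal(10, 2))) AS AvgSeconds  -- 72 / 10 = 7.2 for the sample data
FROM #t
WHERE value NOT IN (SELECT TOP 10 PERCENT WITH TIES value FROM #t ORDER BY value ASC)
  AND value NOT IN (SELECT TOP 10 PERCENT WITH TIES value FROM #t ORDER BY value DESC);

DROP TABLE #t;
```

Note that this reads the table three times and sorts it twice, which is part of why a single-pass alternative is worth looking for on a large dataset.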
One option that performs relatively well in my tests (under 1 second with 300,000 rows, up to 12 seconds with 10 million rows):
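-- Total number of rows
declare @tot int = (select count(*) from #MyData);

-- Number the rows in ascending order of Data
with cte as (
    select Data, ROW_NUMBER() over (order by Data) as RN
    from #MyData
)
-- BETWEEN is inclusive on both ends and would keep row @tot/10,
-- so skip the first @tot/10 rows with a strict lower bound instead
select avg(Data)
from cte
where RN > @tot/10 and RN <= 9*@tot/10;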
Edited to provide a crisp "haircut"
Declare @YourTable table (Seconds int)
Insert Into @YourTable values
(1),(30),(2),(5),(2),(15),(35),(7),(3),(4),(2),(1),(2),(40)

Select AvgSeconds = avg(Seconds)
From (
    Select *
          ,Dec1 = NTile(10) over (Order By Seconds)
          ,Dec2 = NTile(10) over (Order By Seconds Desc)
    From @YourTable
) A
Where Dec1 between 2 and 9
  and Dec2 between 2 and 9
Returns
AvgSeconds
7
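To see why both NTile columns are needed, it can help to list the buckets per row. The sketch below reuses the same sample table and only displays the Dec1 and Dec2 assignments. With 14 rows and 10 buckets, NTILE puts two rows in each of the first four buckets and one row in each of the remaining six, so the ascending pass alone would drop 1, 1 and 40, and the descending pass alone would drop 40, 35 and a single 1. Requiring both conditions drops 1, 1, 35 and 40, matching the question.

```sql
DECLARE @YourTable TABLE (Seconds int);
INSERT INTO @YourTable VALUES
    (1),(30),(2),(5),(2),(15),(35),(7),(3),(4),(2),(1),(2),(40);

-- Show the decile each row falls into from either end.
-- Rows with Dec1 or Dec2 equal to 1 or 10 are the ones the filter removes.
SELECT Seconds,
       Dec1 = NTILE(10) OVER (ORDER BY Seconds),
       Dec2 = NTILE(10) OVER (ORDER BY Seconds DESC)
FROM @YourTable
ORDER BY Seconds;
```

Because Seconds is an int, avg(Seconds) truncates 7.2 to 7; casting the column to a decimal type before averaging would keep the fractional part.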