Is using COUNT() inside a CTE more expensive than outside a CTE?

I am paging from SQL Server and I would like to avoid duplication by counting the total number of results as part of my partial result set, rather than getting that result set and then running a separate query to get the count afterwards. The trouble is that this seems to increase execution time. For example, if I check with SET STATISTICS TIME ON, this:

WITH PagedResults AS (
    SELECT
        ROW_NUMBER() OVER (ORDER BY AggregateId ASC) AS RowNumber,
        COUNT(PK_MatrixItemId) OVER() AS TotalRowCount,
        *
    FROM [MyTable] myTbl WITH(NOLOCK)
)
SELECT * FROM PagedResults
WHERE RowNumber BETWEEN 3 AND 4810

      

... or this (whose execution plan is identical):

SELECT * FROM (
    SELECT TOP (4813)
        ROW_NUMBER() OVER (ORDER BY AggregateId ASC) AS RowNumber,
        COUNT(PK_MatrixItemId) OVER() AS TotalRowCount,
        *
    FROM [MyTable] myTbl WITH(NOLOCK)
) PagedResults
WHERE PagedResults.RowNumber BETWEEN 3 AND 4810

      

... seems to use, on average, 1.5 to 2 times more CPU time (summed over all statements) than this:

SELECT * FROM (
    SELECT TOP (4813)
        ROW_NUMBER() OVER (ORDER BY AggregateId ASC) AS RowNumber,
        *
    FROM [MyTable] myTbl WITH(NOLOCK)
) PagedResults
WHERE PagedResults.RowNumber BETWEEN 3 AND 4810

SELECT COUNT(*) FROM [MyTable] myTbl WITH(NOLOCK)

      

Obviously, I would rather use the former than the latter, because the latter needlessly repeats the FROM clause (and would repeat any WHERE clauses if I had them), but its run time is so much better that I really have to use it. Is there any way I can bring the former's run time down?

1 answer


CTEs are inlined into the query plan. They perform exactly the same as derived tables.

Derived tables do not correspond to physical operations. They do not "materialize" the result set into a temp table. (I believe MySQL does this, but MySQL is about the most primitive mainstream RDBMS.)



Using OVER() does show up in the query plan as buffering to a temp table. It is not at all clear why that would be faster than simply re-reading the underlying table. Buffering is rather slow because writes are more CPU intensive than reads in SQL Server, whereas the two-query version just reads the original table twice. That is probably why the latter option is faster.
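If you want to check this on your own system, a rough sketch is to turn on I/O statistics alongside the timing you already use and compare the two variants from the question. This is only an illustration: whether a `Worktable` entry appears in the Messages output depends on the plan the optimizer picks on your version and data, so treat that as something to look for, not a guarantee.

```sql
-- Report per-table I/O and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: windowed count. If the plan buffers rows for OVER(),
-- the I/O output will usually list a 'Worktable' in addition to MyTable.
SELECT * FROM (
    SELECT TOP (4813)
        ROW_NUMBER() OVER (ORDER BY AggregateId ASC) AS RowNumber,
        COUNT(PK_MatrixItemId) OVER() AS TotalRowCount,
        *
    FROM [MyTable] myTbl WITH(NOLOCK)
) PagedResults
WHERE PagedResults.RowNumber BETWEEN 3 AND 4810;

-- Variant 2: page query plus a separate count. Both statements read
-- MyTable directly, so no buffering of the window input is needed.
SELECT * FROM (
    SELECT TOP (4813)
        ROW_NUMBER() OVER (ORDER BY AggregateId ASC) AS RowNumber,
        *
    FROM [MyTable] myTbl WITH(NOLOCK)
) PagedResults
WHERE PagedResults.RowNumber BETWEEN 3 AND 4810;

SELECT COUNT(*) FROM [MyTable] myTbl WITH(NOLOCK);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the logical reads and CPU time reported for each variant should show where the extra cost comes from.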

If you want to avoid repeating parts of the query, use a view or a table-valued function. Granted, these are not great options for one-off queries. You can also generate the SQL at the application layer and reuse strings. ORMs make this a lot easier, too.
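As a minimal sketch of the table-valued function route: the function name `dbo.PagedSource` and the `dbo` schema are hypothetical, and the commented-out WHERE stands in for whatever filters you would otherwise repeat. An inline function like this is expanded into the calling query much like a view, so it should not change the plans by itself.

```sql
-- Hypothetical inline table-valued function holding the shared
-- FROM (and any WHERE) in one place. In real code, list the columns
-- explicitly instead of using SELECT *.
CREATE FUNCTION dbo.PagedSource ()
RETURNS TABLE
AS
RETURN
(
    SELECT *
    FROM [MyTable] WITH(NOLOCK)
    -- WHERE ... shared filters would go here
);
GO

-- Page query: same shape as in the question, reading from the function.
SELECT * FROM (
    SELECT TOP (4813)
        ROW_NUMBER() OVER (ORDER BY AggregateId ASC) AS RowNumber,
        *
    FROM dbo.PagedSource()
) PagedResults
WHERE PagedResults.RowNumber BETWEEN 3 AND 4810;

-- Separate count over the same source, without repeating the FROM/WHERE.
SELECT COUNT(*) AS TotalRowCount
FROM dbo.PagedSource();
```

If the filters need inputs, they can be added as function parameters, which is the main reason to prefer this over a plain view.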
