Optimizing LINQ Queries for Slow Grouping

I have a LINQ query that gets data through Entity Framework Code First from a SQL database. It works, but it is very slow.

This is the original request:

      var tmpResult = from mdv in allMetaDataValues
                  where mdv.Metadata.InputType == MetadataInputType.String && mdv.Metadata.ShowInFilter && !mdv.Metadata.IsHidden && !string.IsNullOrEmpty(mdv.ValueString)
                  group mdv by new
                  {
                    mdv.ValueString,
                    mdv.Metadata
                  } into g
                  let first = g.FirstOrDefault()
                  select new
                  {
                    MetadataTitle = g.Key.Metadata.Title,
                    MetadataID = g.Key.Metadata.ID,
                    CollectionColor = g.Key.Metadata.Collection.Color,
                    CollectionID = g.Key.Metadata.Collection.ID,

                    MetadataValueCount = 0,
                    MetadataValueTitle = g.Key.ValueString,
                    MetadataValueID = first.ID
                  };

      

This is the generated SQL from the original query:

{SELECT 
0 AS [C1], 
[Project4].[Title] AS [Title], 
[Project4].[ID] AS [ID], 
[Extent9].[Color] AS [Color], 
[Project4].[Collection_ID] AS [Collection_ID], 
[Project4].[ValueString] AS [ValueString], 
[Project4].[C1] AS [C2]
FROM   (SELECT 
    [Project2].[ValueString] AS [ValueString], 
    [Project2].[ID] AS [ID], 
    [Project2].[Title] AS [Title], 
    [Project2].[Collection_ID] AS [Collection_ID], 
    (SELECT TOP (1) 
        [Filter4].[ID1] AS [ID]
        FROM ( SELECT [Extent6].[ID] AS [ID1], [Extent6].[ValueString] AS [ValueString], [Extent7].[Collection_ID] AS [Collection_ID1], [Extent8].[ID] AS [ID2], [Extent8].[InputType] AS [InputType], [Extent8].[ShowInFilter] AS [ShowInFilter], [Extent8].[IsHidden] AS [IsHidden1]
            FROM   [dbo].[MetadataValue] AS [Extent6]
            LEFT OUTER JOIN [dbo].[Media] AS [Extent7] ON [Extent6].[Media_ID] = [Extent7].[ID]
            INNER JOIN [dbo].[Metadata] AS [Extent8] ON [Extent6].[Metadata_ID] = [Extent8].[ID]
            WHERE ( NOT (([Extent6].[ValueString] IS NULL) OR (( CAST(LEN([Extent6].[ValueString]) AS int)) = 0))) AND ([Extent7].[IsHidden] <> cast(1 as bit))
        )  AS [Filter4]
        WHERE (2 =  CAST( [Filter4].[InputType] AS int)) AND ([Filter4].[ShowInFilter] = 1) AND ([Filter4].[IsHidden1] <> cast(1 as bit)) AND ([Filter4].[Collection_ID1] = @p__linq__0) AND (([Project2].[ValueString] = [Filter4].[ValueString]) OR (([Project2].[ValueString] IS NULL) AND ([Filter4].[ValueString] IS NULL))) AND (([Project2].[ID] = [Filter4].[ID2]) OR (1 = 0))) AS [C1]
    FROM ( SELECT 
        [Distinct1].[ValueString] AS [ValueString], 
        [Distinct1].[ID] AS [ID], 
        [Distinct1].[Title] AS [Title], 
        [Distinct1].[Collection_ID] AS [Collection_ID]
        FROM ( SELECT DISTINCT 
            [Filter2].[ValueString] AS [ValueString], 
            [Filter2].[ID3] AS [ID], 
            [Filter2].[InputType1] AS [InputType], 
            [Filter2].[Title1] AS [Title], 
            [Filter2].[ShowInFilter1] AS [ShowInFilter], 
            [Filter2].[IsHidden2] AS [IsHidden], 
            [Filter2].[Collection_ID2] AS [Collection_ID]
            FROM ( SELECT [Filter1].[ValueString], [Filter1].[Collection_ID3], [Filter1].[IsHidden3], [Filter1].[ID3], [Filter1].[InputType1], [Filter1].[Title1], [Filter1].[ShowInFilter1], [Filter1].[IsHidden2], [Filter1].[Collection_ID2]
                FROM ( SELECT [Extent1].[ValueString] AS [ValueString], [Extent2].[Collection_ID] AS [Collection_ID3], [Extent4].[IsHidden] AS [IsHidden3], [Extent5].[ID] AS [ID3], [Extent5].[InputType] AS [InputType1], [Extent5].[Title] AS [Title1], [Extent5].[ShowInFilter] AS [ShowInFilter1], [Extent5].[IsHidden] AS [IsHidden2], [Extent5].[Collection_ID] AS [Collection_ID2]
                    FROM     [dbo].[MetadataValue] AS [Extent1]
                    LEFT OUTER JOIN [dbo].[Media] AS [Extent2] ON [Extent1].[Media_ID] = [Extent2].[ID]
                    INNER JOIN [dbo].[Metadata] AS [Extent3] ON [Extent1].[Metadata_ID] = [Extent3].[ID]
                    LEFT OUTER JOIN [dbo].[Metadata] AS [Extent4] ON [Extent1].[Metadata_ID] = [Extent4].[ID]
                    LEFT OUTER JOIN [dbo].[Metadata] AS [Extent5] ON [Extent1].[Metadata_ID] = [Extent5].[ID]
                    WHERE ( NOT (([Extent1].[ValueString] IS NULL) OR (( CAST(LEN([Extent1].[ValueString]) AS int)) = 0))) AND ([Extent2].[IsHidden] <> cast(1 as bit)) AND (2 =  CAST( [Extent3].[InputType] AS int)) AND ([Extent3].[ShowInFilter] = 1)
                )  AS [Filter1]
                WHERE [Filter1].[IsHidden3] <> cast(1 as bit)
            )  AS [Filter2]
            WHERE [Filter2].[Collection_ID3] = @p__linq__0
        )  AS [Distinct1]
    )  AS [Project2] ) AS [Project4]
LEFT OUTER JOIN [dbo].[Collection] AS [Extent9] ON [Project4].[Collection_ID] = [Extent9].[ID]}

      

If we remove " let first = g.FirstOrDefault () " and change " MetadataValueID = first.ID " to " MetadataValueID = 0 " so that we just have a fixed ID = 0 for testing purposes, then the data loads very quickly and itself the generated query is half the size of the original It looks like this part makes the query very slow:

let first = g.FirstOrDefault()
...
  MetadataValueID = first.ID
};

      

How can this be rewritten? If I try to rewrite the code, it is still slow:

MetadataValueID = g.Select(x => x.ID).FirstOrDefault()

      

or

let first = g.Select(x => x.ID).FirstOrDefault()
...
  MetadataValueID = first
};

      

Any suggestions?

+3


source to share


2 answers


Using EF I have always thought that it has performance issues in translating things like g.Key.Metadata.Collection

, so I am trying to join them more clearly and include only those fields that are necessary for your result. You can use include

instead of connecting using the repository pattern.

Then your request will look like this:

   from mdv in allMetaDataValues.Include("Metadata").Include("Metadata.Collection")
   where mdv.Metadata.InputType == MetadataInputType.String && 
         mdv.Metadata.ShowInFilter && 
         !mdv.Metadata.IsHidden && 
         !string.IsNullOrEmpty(mdv.ValueString)
   group mdv by new
   {
     MetadataID = mdv.Metadata.ID,
     CollectionID = mdv.Metadata.Collection.ID,
     mdv.Metadata.Title,
     mdv.Metadata.Collection.Color,
     mdv.ValueString
   } into g
   let first = g.FirstOrDefault().ID
   select new
   {
     MetadataTitle = g.Key.Title,
     MetadataID = g.Key.MetadataID,
     CollectionColor = g.Key.Color,
     CollectionID = g.Key.CollectionID,
     MetadataValueCount = 0,
     MetadataValueTitle = g.Key.ValueString,
     MetadataValueID = first
   }

      

LinqPad is a good tool for playing with linq .

The problem is also that:



  let first = g.FirstOrDefault().ID

      

cannot be easily translated to SQL, see this answer . But this rewrite simplifies the basic query for it at least. It remains unclear to me why you want the first id from the set without using orderby

.

It can be rewritten like this:

let first =  (from f in allMetaDataValues
              where f.Metadata.ID == g.Key.MetadataID && 
                    f.ValuesString == g.Key.ValuesString select f.ID)
             .FirstOrDefault()

      

This way you prevent EF from writing the query for you and you can specify exactly how to make the choice. To speed up your query, you might also consider adding indexes to the database according to the generated query, namely an index using both columns used in the where clause of that query let first

.

+1


source


Try the next solution.
 Replace FirstOrDefault()

with .Take(1)

. FirstOrDefault()

is not downloading.



var tmpResult = from mdv in allMetaDataValues
                  where mdv.Metadata.InputType == MetadataInputType.String && mdv.Metadata.ShowInFilter && !mdv.Metadata.IsHidden && !string.IsNullOrEmpty(mdv.ValueString)
                  group mdv by new
                  {
                    mdv.ValueString,
                    mdv.Metadata
                  } into g
                  let first = g.Take(1)
                  select new
                  {
                    MetadataTitle = g.Key.Metadata.Title,
                    MetadataID = g.Key.Metadata.ID,
                    CollectionColor = g.Key.Metadata.Collection.Color,
                    CollectionID = g.Key.Metadata.Collection.ID,

                    MetadataValueCount = 0,
                    MetadataValueTitle = g.Key.ValueString,
                    MetadataValueID = first.ID
                  };

      

0


source







All Articles