C# object property not available in Enumerable.Where inside foreach
I found a C# head-scratcher. In a foreach loop, using the parent.Id property directly in Enumerable.Where doesn't work; it only works if the value is copied into a variable first. Accessing parent.Id directly in the Select statement is no problem.
List<Person> people = new List<Person>() {
    new Person() { Id = 1, name = "John", parentId = null },
    new Person() { Id = 2, name = "Sarah", parentId = null },
    new Person() { Id = 3, name = "Daniel", parentId = 1 },
    new Person() { Id = 4, name = "Peter", parentId = 1 }
};
List<object> peopleTree = new List<object>();
var parents = people.Where(p => !p.parentId.HasValue);
foreach (Person parent in parents)
{
    int parentId = parent.Id;
    var children = people
        //.Where(p => p.parentId.Equals(parentId)) //This works, is able to find the children
        .Where(p => p.parentId.Equals(parent.Id)) //This does not work, no children for John
        .Select(p => new { Id = p.Id, Name = p.name, pId = parent.Id }); //pId set correctly
    peopleTree.Add(new
    {
        Id = parent.Id,
        Name = parent.name,
        Children = children
    });
}
Alternatively, if I use a for loop and put the parent into a variable first, I can access the parent.Id property directly in the Where statement.
var parents = people.Where(p => !p.parentId.HasValue).ToArray();
for (int idx = 0; idx < parents.Count(); idx++)
{
    var parent = parents[idx];
    ...
I couldn't find an answer as to why it behaves like this. Can anyone explain this?
This problem is caused by the deferred execution of children. Essentially, the value of parent is different by the time children is evaluated. The geek-speak for this is "access to modified closure".
You can fix it by introducing a temporary variable the way you did, or by forcing the evaluation while the foreach loop is still in the current iteration:
var children = people
    .Where(p => p.parentId.Equals(parent.Id))
    .Select(p => new { Id = p.Id, Name = p.name, pId = parent.Id })
    .ToList();
This is caused by the lazy nature of LINQ queries. A LINQ query is "materialized" as late as possible to avoid potentially unnecessary work. children is an IEnumerable<T> that has not been materialized yet, so it isn't actually populated with items inside the loop.
There is a significant difference between parent and parentId as used in your two .Where() calls. parent is declared only once, but parentId is declared inside the loop body, so a new one is created on each iteration. By the time children is materialized, the value of parent has changed: it refers to the last element in parents, which is not the one you intended.
You can force the evaluation:
var children = people
    .Where(p => p.parentId.Equals(parent.Id))
    .Select(p => new { Id = p.Id, Name = p.name, pId = parent.Id })
    .ToArray(); // this forces materialization
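To see the deferred execution for yourself, here is a minimal sketch that counts how often the Where predicate actually runs. It is not taken from the question: DeferredDemo, PredicateCalls and the sample numbers are made-up names and data, and the tuple return type assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical helper: reports how many times the predicate has run
    // before and after the query is enumerated.
    public static (int Before, int After) PredicateCalls()
    {
        var numbers = new List<int> { 1, 2, 3 };
        int calls = 0;

        // Building the query only stores the lambda; the predicate does not run here.
        IEnumerable<int> evens = numbers.Where(n => { calls++; return n % 2 == 0; });
        int before = calls;

        // ToArray enumerates the query, so the predicate runs once per element.
        evens.ToArray();
        int after = calls;

        return (before, after);
    }
}
```

Calling DeferredDemo.PredicateCalls() should give Before = 0 and After = 3, which is why whatever the lambda captured is only read at enumeration time.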
The problem is with a statement that starts out like this:
var children = people ...
That statement does not produce a collection that actually holds the values. It produces an IEnumerable that knows how to iterate over the collection. The code this object runs refers to the parent variable from the loop, which is captured for the enumerable in something called a closure. Later, when you actually enumerate the object to get at the elements, it reads that parent variable.
Here's the trick: there is only one parent variable, and it is reassigned on each iteration of the original loop. Once the loop has finished, every children query you built is looking at the same parent, namely the last one. Copying parent.Id to a variable declared inside the loop fixes the problem, because each iteration now closes over a brand-new variable.
You can also fix this by calling .ToList() at the end of the statement above, which evaluates the enumerable while you are still inside the loop. However, I prefer your existing solution, because it is more memory efficient if you don't need to materialize all of those children at the same time.
The good news is that this was fixed in C# 5, where foreach gives each iteration its own fresh loop variable.
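Since a modern compiler no longer reproduces the original behaviour with foreach, here is a rough sketch that imitates it with an explicitly shared variable. Everything in it is an assumption for illustration: ClosureDemo, SharedVariable, PerIterationCopy and the stripped-down id arrays are not from the question, they just mirror its data (John and Sarah as parents, Daniel and Peter as John's children).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    static readonly int[] ParentIds = { 1, 2 };        // John, Sarah
    static readonly int?[] ChildParentIds = { 1, 1 };  // Daniel, Peter

    // Every lambda captures the same variable, like the pre-C# 5 foreach variable.
    public static List<int> SharedVariable()
    {
        var queries = new List<IEnumerable<int?>>();
        int current = 0;
        foreach (int id in ParentIds)
        {
            current = id;
            queries.Add(ChildParentIds.Where(p => p == current)); // deferred, not run yet
        }
        // Enumerated after the loop, so every query sees current == 2.
        return queries.Select(q => q.Count()).ToList();
    }

    // Each lambda captures its own variable, like the parentId copy in the question.
    public static List<int> PerIterationCopy()
    {
        var queries = new List<IEnumerable<int?>>();
        foreach (int id in ParentIds)
        {
            int copy = id; // new variable on every iteration
            queries.Add(ChildParentIds.Where(p => p == copy));
        }
        return queries.Select(q => q.Count()).ToList();
    }
}
```

SharedVariable() should return child counts of 0 and 0 (no children for John, as in the question), while PerIterationCopy() should return 2 and 0.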