When to use List<T>, IEnumerable<T> and ArrayList

My question is very simple: when should I use List, IEnumerable and ArrayList?

Here's my scenario. I am working in a web application using LINQ. The information is returned as IEnumerable:

IEnumerable<Inventory> result = from Inventory i in db where.... 

      

I'm not sure how IEnumerable works, but each operation on it takes a long time to complete. More specifically, result.Count(), result.ElementAt(i), result.ToList(), etc. each take a significant amount of time.

So I was wondering whether I should convert it to a list with result.ToList() instead of working with the IEnumerable variable.

Thanks!

+3




7 replies


If I understand correctly what you are doing, you have a query like from Inventory i in db select i, and then you perform multiple operations on the result:

var count = result.Count();
var fifth = result.ElementAt(5);
var allItems = result.ToList();

      

Now let's look at what happens when the query is held as different types:

  • IQueryable<T>

    var result = from Inventory i in db select i;
    IQueryable<Inventory> result = from Inventory i in db select i;
    
          

    The two lines above are the same. They don't actually go to the database; they just create a representation of the query. With this type, Count() will execute an SQL query such as SELECT COUNT(*) FROM Inventory, ElementAt(5) will execute another query that fetches only the item at index 5 (provided the LINQ provider can translate it), and ToList() will execute something like SELECT * FROM Inventory, which is exactly what we want in that case.

  • IEnumerable<T>

    IEnumerable<Inventory> result = from Inventory i in db select i;
    
          

    Again, this doesn't go to the database; it only creates a representation of the query. But this representation cannot use the methods specific to IQueryable<T>, so any LINQ operation will enumerate the collection, which executes an SQL query such as SELECT * FROM Inventory.

    So, for example: Count() will execute the query SELECT * … only to count the items in the result. ElementAt(5) will execute the entire query again, only to discard all but one item. And ToList() will execute the query yet again.

  • List<T>

    List<Inventory> result = (from Inventory i in db select i).ToList();
    
          

    This will actually execute the query SELECT * FROM Inventory immediately, and only once. All subsequent operations on result will not touch the database; they will be performed in memory.



What should you take away from this? First, never use IEnumerable<T> as the type of a database query. It has terrible performance.

If you want to do several different operations on the result, using IQueryable<T> may be the best solution.

If you want to get the whole result anyway, call ToList() (or ToArray()) as soon as possible and then do the work with the resulting List<T>.
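The difference between re-running a deferred query and working on a materialized list can be sketched without a database. This is a minimal illustration, not the LINQ to SQL behavior itself: ReadInventoryIds is a hypothetical stand-in for the table, and SourceReads is an assumed counter that records each full read of that source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Hypothetical counter: how many times the "table" has been read.
    public static int SourceReads = 0;

    // Hypothetical stand-in for a database table.
    // The body runs again every time the sequence is enumerated.
    public static IEnumerable<int> ReadInventoryIds()
    {
        SourceReads++;
        for (int id = 1; id <= 10; id++)
        {
            yield return id;
        }
    }

    // Three operations on a deferred sequence: each one re-reads the source.
    public static int ReadsWithLazySequence()
    {
        SourceReads = 0;
        IEnumerable<int> lazy = ReadInventoryIds().Where(id => id % 2 == 0);
        int count = lazy.Count();
        int third = lazy.ElementAt(2);
        List<int> all = lazy.ToList();
        return SourceReads;
    }

    // The same three operations after materializing once with ToList().
    public static int ReadsWithList()
    {
        SourceReads = 0;
        List<int> list = ReadInventoryIds().Where(id => id % 2 == 0).ToList();
        int count = list.Count;
        int third = list[2];
        List<int> copy = list.ToList(); // copies in memory, no new read
        return SourceReads;
    }

    public static void Main()
    {
        Console.WriteLine(ReadsWithLazySequence()); // prints 3
        Console.WriteLine(ReadsWithList());         // prints 1
    }
}
```

With a real query provider each of those reads would be a round trip to the database, which is why materializing early pays off when the whole result is needed anyway.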

+6




Never use ArrayList. ArrayList exists for compatibility with pre-.NET 2.0 code. It is equivalent to List<object>, and there is no reason not to use generic types in any normal situation.
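As a rough illustration of why the generic type is preferable, the sketch below sums the same numbers through an ArrayList and through a List<int>. The class and method names are made up for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ArrayListVersusList
{
    public static int SumArrayList()
    {
        ArrayList items = new ArrayList();
        items.Add(1); // the int is boxed into an object
        items.Add(2);
        // items.Add("three") would also compile, and only fail later at the cast.

        int sum = 0;
        foreach (object item in items)
        {
            sum += (int)item; // explicit cast (unboxing) on every read
        }
        return sum;
    }

    public static int SumGenericList()
    {
        List<int> items = new List<int> { 1, 2 };
        // items.Add("three") is rejected by the compiler.

        int sum = 0;
        foreach (int item in items)
        {
            sum += item; // no cast, no boxing
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumArrayList());   // prints 3
        Console.WriteLine(SumGenericList()); // prints 3
    }
}
```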

Judging by your sample code, you are using LINQ to SQL or a similar framework to retrieve data from the DB. In this case, the select statement does not fetch the data itself; it simply constructs the query. The data is fetched when you call a method like Count() or ToList(), which is why those calls seem slow. It's not slower, it's just lazy loading in action.



The advantage of using IEnumerable is that you don't have to load all the data at once. Whether you are filtering with a where clause or calling Take(1) to get the first item, the LINQ provider should be smart enough to fetch only the items you need from the DB. But if you call Count() or ToList(), it has to go through the entire dataset. If you need that information, you probably want to call ToList or ToArray and do the rest of your work on the in-memory list, so you don't have to hit the DB again.

+4




Your query is only executed when you call ToList() or another similar method.

This is called deferred execution.

Use IEnumerable whenever possible for your result. The execution performance of LINQ doesn't depend on which type you use for result, because in the end it is still treated as IEnumerable.

But LINQ performance does depend on the underlying data.


+2




The distinction between using IEnumerable or IList is actually pretty straightforward (on the surface).

You should look at the contract defined by both interfaces. IEnumerable just lets you enumerate a sequence. In other words, the only way to access the data is to use the Enumerator, usually in a foreach loop. Thus, a naive implementation of the count function would look something like this:

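// Extension methods must be declared inside a static class.
public static int Count<T>(this IEnumerable<T> source) {
    int count = 0;
    // Walk the whole sequence; there is no other way to learn its length.
    foreach (var item in source)
    {
        count++;
    }
    return count;
}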

      

This means that the time it takes to compute the number of items in your enumeration will increase linearly with the number of items. Also, since this is not stored in any way internally, you will need to do this loop every time you want to count.

IList already provides a Count property. This is part of the contract. To implement Count(), you simply end up calling the Count property. It will take the same amount of time regardless of the number of items.

An easy way to think about this (especially when using LINQ) is to think of IEnumerable as a specification of the elements you need. As long as you don't access the data, nothing is actually built. Once you start enumerating (basically, by calling anything that returns something other than an IEnumerable), the code is executed, and that may take a while.

As far as your context is concerned, I usually like to keep the LINQ execution in the controller. So I build my query and then call ToList or ToArray before sending the result to the view. The reason is quite simple: if I need to do more than just access the data in the view, then in my opinion the view is doing too much. This forces me to move that logic into my controller action, keeping my views as clean as possible.

+1




If you use a LINQ expression against a LINQ query provider, the result is an IQueryable<T>, which extends IEnumerable<T>.

Each time you iterate over the IQueryable<T>, the LINQ query provider will query the underlying data source. Therefore, if you want to iterate over the result more than once, it may be more efficient to convert it to a list first (.ToList()).

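Note that once you have converted the result to a list, you should use the actual members of List<T> instead of the IEnumerable<T> extension methods. For example, list.ElementAt(i) and list.Count() are O(n) for a general sequence (LINQ to Objects special-cases types that implement IList<T> or ICollection<T>, but that still costs a type check), whereas list[i] and list.Count always run in constant time.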
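The cost difference between positional access on a lazy sequence and on a list can be made visible with a counter. In this sketch Names and ItemsProduced are hypothetical helpers, not part of any library; the sequence is an ordinary iterator rather than a database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAccessDemo
{
    // Hypothetical counter: how many items the iterator has yielded.
    public static int ItemsProduced = 0;

    // A lazy sequence that records every item it hands out.
    public static IEnumerable<string> Names()
    {
        string[] names = { "a", "b", "c", "d", "e" };
        foreach (string name in names)
        {
            ItemsProduced++;
            yield return name;
        }
    }

    // ElementAt on a plain sequence has to walk to the requested position.
    public static int CostOfElementAtOnSequence()
    {
        ItemsProduced = 0;
        string fourth = Names().ElementAt(3);
        return ItemsProduced;
    }

    // The indexer and Count property of a list touch no further items.
    public static int CostOfIndexerOnList()
    {
        List<string> list = Names().ToList(); // the one-time cost is paid here
        ItemsProduced = 0;
        string fourth = list[3];
        int count = list.Count;
        return ItemsProduced;
    }

    public static void Main()
    {
        Console.WriteLine(CostOfElementAtOnSequence()); // prints 4
        Console.WriteLine(CostOfIndexerOnList());       // prints 0
    }
}
```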

0




Use generic lists / IEnumerable<T> whenever possible.

Avoid ArrayList. It can result in boxing for value types and casting for reference types. The non-generic IEnumerable has the same problem, so it is best avoided unless you really are dealing with plain objects.

IEnumerable<T> is covariant, which is very convenient. However, it exhibits deferred execution, which is a curse as well as a blessing.

List<T> is better for internal use, while interfaces are better exposed as IEnumerable<T>. List<T> does not support variance (neither covariance nor contravariance).
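To illustrate the variance point: IEnumerable<T> is declared with an out type parameter, so a sequence of a derived type can be passed where a sequence of the base type is expected, while List<T> allows no such conversion. Animal and Dog below are hypothetical types invented for this sketch.

```csharp
using System;
using System.Collections.Generic;

public class Animal { }
public class Dog : Animal { }

public static class VarianceDemo
{
    // Accepts any sequence of animals, read-only.
    public static int CountAnimals(IEnumerable<Animal> animals)
    {
        int n = 0;
        foreach (Animal animal in animals)
        {
            n++;
        }
        return n;
    }

    public static int CountDogsAsAnimals()
    {
        List<Dog> dogs = new List<Dog> { new Dog(), new Dog() };

        // Allowed: IEnumerable<Dog> converts implicitly to IEnumerable<Animal>.
        IEnumerable<Animal> animals = dogs;

        // Not allowed: List<T> is invariant, so this line would not compile.
        // List<Animal> animalList = dogs;

        return CountAnimals(animals);
    }

    public static void Main()
    {
        Console.WriteLine(CountDogsAsAnimals()); // prints 2
    }
}
```

This is one reason to accept IEnumerable<T> in method signatures while using List<T> internally.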

0




The answer is "it depends, but mostly List is used".

Based on the full content of your question (long delays on .Count() and other methods), you should first call ToList() on the query results and then use the list for further access.

Here's why. This IEnumerable is pretty much a query. Since the requested data may change between query runs, every method call on this IEnumerable results in another database lookup.

So every time you call .Count(), something has to go to the database and count all the objects that match your query. Every time you call ElementAt(x), even if x does not change, something still has to go to the database and get everything there is, because IEnumerable cannot assume the data hasn't changed.

On the other hand, if you got a snapshot of your query using a List, then getting Count or accessing random elements is pretty fast.

So what to use depends. If every time you access the IEnumerable you need to know what is in the database (or any data source) NOW, then you need to use IEnumerable. If you only care about what was there when you ran the original query, or need to operate on a consistent (and/or static) data source, use List. You will still pay the cost on your first access, but everything after that will be quick.
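The "live view versus snapshot" trade-off can be sketched with an in-memory list standing in for the database. This is only an analogy under that assumption; the names below are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    // Returns { live count, snapshot count } after the source has changed.
    public static int[] CountsAfterSourceChanges()
    {
        // Hypothetical stand-in for the data source.
        List<int> source = new List<int> { 1, 2, 3 };

        // Deferred query: re-evaluated against the source on every use.
        IEnumerable<int> live = source.Where(x => x > 1);

        // Snapshot: evaluated once, right now.
        List<int> snapshot = live.ToList();

        // The data source changes after the snapshot was taken.
        source.Add(4);

        return new[] { live.Count(), snapshot.Count };
    }

    public static void Main()
    {
        int[] counts = CountsAfterSourceChanges();
        Console.WriteLine(counts[0]); // prints 3: the live query sees the new item
        Console.WriteLine(counts[1]); // prints 2: the snapshot does not
    }
}
```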

0

