Multi-block work list

A list is a wrapper over an array. As you add items to the list, it creates more and more array under cover (and the previous array is garbage collected). But if you handle large lists at some point, you will get an OutOfMemoryException even if there is free memory due to memory fragmentation. I'm looking for an ICollection implementation that will work with a set of undercover arrays similar to what MemoryTributary does .

Refresh.

I found BigArray implementation here:

http://blogs.msdn.com/b/joshwil/archive/2005/08/10/450202.aspx .

While it tries to solve another problem (creating an array> 2GB in size) it also solves my problem. But this implementation is incomplete and doesn't even compile. So if I don't find a better one, I'll improve it and use it.

+3


source to share


2 answers


I haven't found a good implementation. So I wrote my own ChunkyList. Usually ChunkyList is a list of array wrappers. The block is initially of size 1, but it is multiplied by two (when it reaches MaxBlockSize, the next block is created) every time you need to expand the current block (list-like behavior).

Here is a generic ChunkyList:

public class ChunkyList<T> : IList<T>
{
    public ChunkyList()
    {
        MaxBlockSize = 65536;
    }

    public ChunkyList(int maxBlockSize)
    {
        MaxBlockSize = maxBlockSize;
    }

    private List<T[]> _blocks = new List<T[]>();

    public int Count { get; private set; }

    public int MaxBlockSize { get; private set; }

    public bool IsReadOnly
    {
        get { throw new NotImplementedException(); }
    }

    public IEnumerator<T> GetEnumerator()
    {
        var index = 0;
        foreach (var items in _blocks)
            foreach (var item in items)
            {
                yield return item;
                if (Count <= ++index)
                    break;
            }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    public void Add(T item)
    {
        var indexInsideBlock = GetIndexInsideBlock(Count);
        if (indexInsideBlock == 0)
            _blocks.Add(new T[1]);
        else
        {
            var lastBlockIndex = _blocks.Count - 1;
            var lastBlock = _blocks[lastBlockIndex];
            if(indexInsideBlock >= lastBlock.Length)
            {
                var newBlockSize = lastBlock.Length*2;
                if (newBlockSize >= MaxBlockSize)
                    newBlockSize = MaxBlockSize;

                _blocks[lastBlockIndex] = new T[newBlockSize];
                Array.Copy(lastBlock, _blocks[lastBlockIndex], lastBlock.Length);
            }
        }

        _blocks[GetBlockIndex(Count)][indexInsideBlock] = item;

        Count++;
    }

    public void AddRange(IEnumerable<T> items)
    {
        foreach (var item in items)
            Add(item);
    }

    public void Clear()
    {
        throw new NotImplementedException();
    }

    public bool Contains(T item)
    {
        throw new NotImplementedException();
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        throw new NotImplementedException();
    }

    public bool Remove(T item)
    {
        throw new NotImplementedException();
    }

    public int IndexOf(T item)
    {
        throw new NotImplementedException();
    }

    public void Insert(int index, T item)
    {
        throw new NotImplementedException();
    }

    public void RemoveAt(int index)
    {
        throw new NotImplementedException();
    }

    public T this[int index]
    {
        get
        {
            if (index >= Count)
                throw new ArgumentOutOfRangeException("index");

            var blockIndex = GetBlockIndex(index);
            var block = _blocks[blockIndex];

            return block[GetIndexInsideBlock(index)];
        }
        set { throw new NotImplementedException(); }
    }

    private int GetBlockIndex(int index)
    {
        return index / MaxBlockSize;
    }

    private long GetIndexInsideBlock(int index)
    {
        return index % MaxBlockSize;
    }
}

      



And here are the tests that prove this implementation works:

 [TestClass]
    public class ChunkyListTests
    {
        [TestMethod]
         public void GetEnumerator_NoItems()
         {
             var chunkyList = new ChunkyList<float>();

             var wasInsideForeach = false;
             foreach (var item in chunkyList)
                 wasInsideForeach = true;

             Assert.IsFalse(wasInsideForeach);
         }

        [TestMethod]
        public void GetEnumerator_MaxBlockSizeOfThreeWithThreeItems()
        {
            var chunkyList = new ChunkyList<float> (3) { 1, 2, 3 };

            var wasInsideForeach = false;
            var iteratedItems = new List<float>();
            foreach (var item in chunkyList)
            {
                wasInsideForeach = true;
                iteratedItems.Add(item);
            }

            Assert.IsTrue(wasInsideForeach);
            CollectionAssert.AreEqual(new List<float> { 1, 2, 3 }, iteratedItems);
        }

        [TestMethod]
        public void GetEnumerator_MaxBlockSizeOfTwoWithThreeItems()
        {
            var chunkyList = new ChunkyList<float>(2) {1,2,3};
            var wasInsideForeach = false;
            var iteratedItems = new List<float>();

            foreach (var item in chunkyList)
            {
                wasInsideForeach = true;
                iteratedItems.Add(item);
            }

            Assert.IsTrue(wasInsideForeach);
            CollectionAssert.AreEqual(new List<float>() { 1, 2, 3 }, iteratedItems);
            Assert.AreEqual(chunkyList.MaxBlockSize, 2);
        }
    }

      

P. S. I have implemented only the IList methods that are used in my code. Therefore, you can improve this implementation.

0


source


You can initialize the list with capacity if you have a rough estimate of the maximum size. As far as I know, this will avoid heavy garbage collection (with older smaller arrays), which can also avoid memory fragmentation.



+1


source







All Articles