Sorting integers

I am somehow new to the art and science of algorithms, and when I was learning about "quicksort" (which allegedly is pretty fast) I had an idea for a kind that uses a dictionary. I coded it and I was very surprised that I was interested in (sorting, say, ground temperature or elevation data) that what I coded was actually fasterthan C # .NET List. Sort () when I compiled it in Release mode. For example, if I create a list of a million integers loaded with values ​​between zero and 8000 (a good range for typical Earth heights), the .NET List.Sort () method averages around 88 milliseconds to sort the list, whereas algo below does it in about 58 milliseconds. So it seems to me that I should either tackle the Nobel Prize in Computer Science (unlikely) or that there is something I am missing and that there is a much more efficient way to sort a large number of integers in the range zero to 10,000. How do you , experts, sorting a large amount of data in this range?

private static long DictionarySort(List<int> myList, out List<int> sortedList)
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    int max = myList.Max();
    Dictionary<int, int> sorter = new Dictionary<int, int>();
    int myListCount = myList.Count;
    for (int i = 0; i < myListCount; i++)
    {
        int val = myList[i];
        int occurances = 0;
        sorter[val] = sorter.TryGetValue(val, out occurances) ? occurances + 1 : 1;
    }
    sortedList = new List<int>(myList.Count + 1);
    int numOccur = 0;
    for (int i = 0; i <= max; i++)
    {
        if (sorter.TryGetValue(i, out numOccur))
        {
            for (int j = 0; j < numOccur; j++)
            {
                sortedList.Add(i);
            }
        }
    }
    sw.Stop();
    return sw.ElapsedMilliseconds;
}

      

+3


source to share


1 answer


You rediscovered that Wikipedia invokes counting sorting , a very simple distribution sorting algorithm. This is the optimal algorithm for your dataset: it runs in O (N + k) time (N is the number of records and k is the number of distinct keys), uses O (k) additional storage, and has very low ratios.



+3


source







All Articles