A very fast way to sort an array of integers?

I come here with a solution I just wrote, but I definitely cannot find a better way to sort my array as quickly as possible.

I really need an algorithm to give me an answer in less than 1 second, to an array containing 100,000 integers.

Here's the code:

read N
for (( i=0; i<N; i++ )); do
    read arr[i]
done

sorted=($(printf '%s\n' "${arr[@]}"|sort -n))

min=$(( sorted[1]-sorted[0] ))

for (( i=1; i<N-1; i++ )); do
    diff=$(( sorted[i+1] - sorted[i] ))
    if [ "$diff" -lt "$min" ]; then
        min=$diff
    fi
done

echo $min

      

N is the number of elements, in our case 100,000, as I said.

The thing is, after sorting the array, I need to loop through it and calculate the minimum distance between two integers:

For example, with (3, 5, 8, 9) the minimum distance is 1 (between 8 and 9).

I'm such a newbie to Bash, so I don't even know if it's efficient;

In fact, the speed gain might come from the second part of the code, not the sorting part ...

Any ideas?

Thanks in advance,

+3


source to share


3 answers


Finally I came up with a simple and rather elegant solution:

read N

for (( i=0; i<N; i++ )); do
    read tab[i]
done

printf "%i\n" ${tab[@]} | sort -n | awk 'BEGIN{dmin=1000000;} (NR>1){d=$1-prec; if (d<dmin) dmin=d;} {prec=$1;} END{print dmin;}'

      



And what it is. :) Thank you all for taking the time to help me!;)

+2


source


the range of numbers <0,1000000)

is small enough

  • since your numbers don't repeat (slightest distance> = 40)
  • then you just need a bit for the value (if present or not)
  • so you need up to 1 MB for byte / value or up to 128 KB for bits / value
  • (since K, M are 1024-based, so you have a large enough reserve)

so sorting in a basket is possible:



  • create an array of flags for each possible value

    • bool flag[1000000];

  • remove flags O(M)

    • for (int i=0;i<1000000;i++) flag[i]=0;

  • calculate flags by processing an array arr[N]

    ...O(N)

    • for (int i=0;i<N;i++) flag[arr[i]]=1;

    • to iterate over just increment the [] flag if it's too big to avoid overflow
    • if flag[arr[i]]

      already 1

      then returndistance = 0

  • restore array O(M)

    • for (int i=0,j=0;i<1000000;i++) if (flag[i]) { arr[j]=i; j++; } N=j;

  • now calculate the distance O(N)

    • you can combine these steps together so you don't need it arr[N]

      in memory
    • instead, just count the trailing zeros in flag[]

      ...

[Notes]

  • M=1000000

  • N<=M

  • if N

    much less than M

    , then this may not be the fastest way ...
+2


source


Considering this awesome page on the complexity of the sorting algorithm, I would use Radix Sort ( here in Python I didn't find a Bash implementation, but I'm still looking):

#python2.6 <
from math import log

def getDigit(num, base, digit_num):
    # pulls the selected digit
    return (num // base ** digit_num) % base  

def makeBlanks(size):
    # create a list of empty lists to hold the split by digit
    return [ [] for i in range(size) ]  

def split(a_list, base, digit_num):
    buckets = makeBlanks(base)
    for num in a_list:
        # append the number to the list selected by the digit
        buckets[getDigit(num, base, digit_num)].append(num)  
    return buckets

# concatenate the lists back in order for the next step
def merge(a_list):
    new_list = []
    for sublist in a_list:
       new_list.extend(sublist)
    return new_list

def maxAbs(a_list):
    # largest abs value element of a list
    return max(abs(num) for num in a_list)

def split_by_sign(a_list):
    # splits values by sign - negative values go to the first bucket,
    # non-negative ones into the second
    buckets = [[], []]
    for num in a_list:
        if num < 0:
            buckets[0].append(num)
        else:
            buckets[1].append(num)
    return buckets

def radixSort(a_list, base):
    # there are as many passes as there are digits in the longest number
    passes = int(round(log(maxAbs(a_list), base)) + 1) 
    new_list = list(a_list)
    for digit_num in range(passes):
        new_list = merge(split(new_list, base, digit_num))
    return merge(split_by_sign(new_list))

      

+1


source







All Articles