Determine if more than half of the keys in an array of size n are the same key in O(n) time?

You have an array or list of keys of known size n. It is not known how many unique keys the list contains: anywhere from 1 (all elements equal) up to and including n (all distinct). The keys are in no particular order, and they really cannot be, since these keys have no concept of more or less, only equality and inequality. Now, before you say "hashmap", here is another condition that I think throws a wrench into that idea: the value of each key is private. The only information you can get about a key is whether it matches another key. So basically:

class key{
    private:
        T data;
        ...
    public:
        ...
        bool operator==(const key &k) const { return data == k.data; }
        bool operator!=(const key &k) const { return data != k.data; }
};

key array[n];


Now, is there an algorithm that can determine whether more than half of the keys in the array are the same key in linear time? If not, how about O(n * log(n))? For example, suppose an array has only 3 unique keys: 60% of the entries have key.data == foo, 30% have key.data == bar, and 10% have key.data == derp. The algorithm only needs to determine that more than 50% of the keys are the same key (key.data == foo) and return one of those keys.

According to my professor, it can be done in O(n) time, but he says we only need to find an algorithm that does it in O(n * log(n)) time.

+3




3 answers


If you can retrieve and hold any key for further comparisons, then your friend is the Boyer–Moore majority vote algorithm.



+3




If you don't want to use the Boyer–Moore algorithm, you can use the following two algorithms, which are based on the same idea.

Algorithm a. Maintain a set S of M (element, count) pairs, where M is small relative to N (for example, 10). Traverse the array, and for each element E:

1.1. If E is in the set, increment the count of its pair: (E, count) -> (E, count+1).
1.2. If not, evict the pair with the minimal count and insert the new pair (E, 1).




If an element has frequency F > 0.5, it will be in the set at the end of this procedure with probability roughly 1 - (1-F)^M (in fact much higher). In a second pass, compute the actual frequencies of the elements in the set S.

Algorithm b. Take N samples of M randomly selected elements of the array, select the most frequent element of each sample by any method, and compute its frequency, averaging over the samples. The maximum error of the frequency estimate is about F(realFrequency) / sqrt(N). So if you get F_e * (1 - 1.0 / sqrt(N)) > 0.5, you have found the most frequent item; if you get F_e * (1 + 1.0 / sqrt(N)) < 0.5, no such item exists. F_e is the estimated frequency.

0




One solution that comes to my mind is this: select the first element of the array, go through the list, and put all matching elements into a separate list. Then select the second item from the original list and compare it with the first; if they are equal, skip it and select the next one, otherwise repeat the counting for it. This might be a possible solution.

-2








