How do I get 100 random elements from a HashSet in Java?

I have a HashSet with 10,000 items, and I want to extract 100 random items from it. I thought I could shuffle the set, but that doesn't work.

Set<String> users = new HashSet<String>();

// for randomness, but this doesn't work
Collections.shuffle(users, new Random(System.nanoTime()));  

// and use for loop to get 100 elements

      

Since I can't use shuffle on a set, is there a better way to get 100 random elements from a HashSet in Java?

+3




3 answers


Without creating a new list, you can implement the following algorithm:

n = 100
d = 10000  # length(users)
for user in users:
    generate a random number p between 0 and 1
    if p <= n / d:
       select user
       n -= 1
    d -= 1
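
The pseudocode above could be written in Java roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `SelectionSample` and the helper `sample` are hypothetical names chosen for illustration, and the code assumes the set holds at least as many elements as requested.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

class SelectionSample {

    // Hypothetical helper: picks exactly k elements from items in one pass.
    public static <T> List<T> sample(Set<T> items, int k, Random random) {
        if (k < 0 || k > items.size()) {
            throw new IllegalArgumentException("k must be between 0 and items.size()");
        }
        List<T> selected = new ArrayList<>(k);
        int needed = k;                // n in the pseudocode
        int remaining = items.size();  // d in the pseudocode
        for (T item : items) {
            if (needed == 0) {
                break;                 // the sample is already complete
            }
            // nextDouble() returns a value in [0, 1), so the item is always
            // taken once needed == remaining (the ratio is exactly 1.0).
            if (random.nextDouble() < (double) needed / remaining) {
                selected.add(item);
                needed--;
            }
            remaining--;
        }
        return selected;
    }

    public static void main(String[] args) {
        Set<String> users = new HashSet<>();
        for (int i = 0; i < 10_000; i++) {
            users.add("user" + i);
        }
        List<String> picked = sample(users, 100, new Random());
        System.out.println(picked.size()); // prints 100
    }
}
```

The result follows the set's iteration order, so shuffle the returned list afterwards if the order of the 100 picked elements also needs to be random.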

      



As you iterate over the set, each selection lowers the probability for later items by decreasing n, while every step raises it by decreasing d. Initially, the first item has a 100/10000 chance of being selected. If you take it, the second item has a 99/9999 chance of being selected; if you don't, the second item has a slightly better 100/9999 chance. The math works out so that, in the end, each item has a 100/10000 probability of being selected for output.
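
As a quick check of that claim for the second item, you can condition on whether the first item was taken, using only the numbers from the example above:

```latex
\begin{align*}
P(\text{second item selected})
  &= \frac{100}{10000}\cdot\frac{99}{9999}
   + \frac{9900}{10000}\cdot\frac{100}{9999} \\
  &= \frac{9900 + 990000}{99990000}
   = \frac{999900}{99990000}
   = \frac{100}{10000}.
\end{align*}
```

The same conditioning argument can be repeated for each later item.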

+5




Shuffling a collection assumes that its elements have a certain order, so that they can be reordered. HashSet is not an ordered collection: there is no order of items within it (or rather, the order details are not exposed to the user). So, implementation-wise, it doesn't make sense to shuffle a HashSet.

What you can do is add all elements from the set to an ArrayList, shuffle it, and take the results.



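List<String> usersList = new ArrayList<String>(users);
Collections.shuffle(usersList);
// take the first 100 elements of the shuffled list (fewer if the set is smaller)
List<String> randomUsers = usersList.subList(0, Math.min(100, usersList.size()));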

      

+6




java.util.HashSet has no defined order, so you can't shuffle a Set. If you must use Sets, you can iterate over the Set and pick each element with a probability that depends on how many elements you still need.

pseudocode:

Set<String> randomUsers = new HashSet<String>();
Random r = new Random();
Iterator<String> it = users.iterator();
int numUsersNeeded = 100;
int numUsersLeft = users.size();
while (it.hasNext() && numUsersNeeded > 0) {
  String user = it.next();
  // chance of taking this element: users still needed / users not yet visited
  double prob = (double) numUsersNeeded / numUsersLeft;
  --numUsersLeft;
  if (r.nextDouble() < prob && randomUsers.add(user)) {
    --numUsersNeeded;
  }
}

      

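Once the number of users still needed equals the number of users left, the probability becomes 1, so this loop yields exactly 100 items whenever the set holds at least 100; with a smaller set you simply get every element.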

If memory is not a concern, you can copy the set into an array and pick 100 random elements from it:

Pseudocode II:

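String[] userArray = users.toArray(new String[0]);
Set<String> randoms = new HashSet<String>();
Random random = new Random();
// assumes users holds at least 100 elements; otherwise this loop never ends
while (randoms.size() < 100) {
  String randomUser = userArray[random.nextInt(userArray.length)];
  randoms.add(randomUser); // a repeated pick is ignored by the set
}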

      

-1

