Pick a point at random, but without density bias

I have this distribution of points, stored as a list of lists: [[x1, y1], [x2, y2], [x3, y3], [x4, y4], ..., [xn, yn]].

[scatter plot of the point distribution]

I would like to select points from it at random.

In Python I would do something like:

from random import *
point = choice(allPoints)

      

However, I need the random selection not to be biased by the existing density. With a plain choice here, for example, the pick would tend to land on the far left of the graph.

How can I get rid of this bias in Python? I tried splitting the space into a grid of div × div cells and then picking a point inside a randomly chosen cell, but many cells contain no points at all, so the while loop never finds a solution:

def column(matrix, i):
    return [row[i] for row in matrix]

div = 10

min_x, max_x = min(column(allPoints, 0)), max(column(allPoints, 0))
min_y, max_y = min(column(allPoints, 1)), max(column(allPoints, 1))

# pick a random cell of the div x div grid
zone_x_min = randint(0, div - 1) * (max_x - min_x) / div + min_x
zone_x_max = zone_x_min + (max_x - min_x) / div

zone_y_min = randint(0, div - 1) * (max_y - min_y) / div + min_y
zone_y_max = zone_y_min + (max_y - min_y) / div

# draw points until one falls inside the chosen cell
# (this never terminates when the cell is empty)
p = choice(allPoints)
while not (zone_x_min < p[0] < zone_x_max and zone_y_min < p[1] < zone_y_max):
    p = choice(allPoints)

      

What would be a correct and, if possible, inexpensive solution to this problem?

Even if it looks a bit silly, I think something like this should work, in theory:

p = [uniform(min_x,max_x),uniform(min_y,max_y)]
while p not in allPoints:
    p = [uniform(min_x,max_x),uniform(min_y,max_y)]

      

+3




4 answers


The question is a bit ill-posed, but here's a stab at it.

The idea is to use a Gaussian kernel density estimate, then sample from your data with weights equal to the inverse PDF at each point.



This is not statistically justified in any real sense.

import numpy as np
from scipy import stats

#random data
x = np.random.normal(size = 200)
y = np.random.normal(size = 200)

#estimate the density
kernel = stats.gaussian_kde(np.vstack([x,y]))

#calculate the inverse of the pdf at each point, and normalise to sum to 1
inv_pdf = 1 / kernel.pdf(np.vstack([x,y]))
pvector = inv_pdf / inv_pdf.sum()

#get a vector of indices based on your weights
np.random.choice(range(len(x)), size = 10, replace = True, p = pvector) 
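
For example, the sampled indices can be mapped back to coordinate pairs like this (idx and sampled_points are just illustrative names, continuing with the x and y arrays above):

idx = np.random.choice(len(x), size = 10, replace = True, p = pvector)
sampled_points = np.column_stack([x, y])[idx]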

      

+3




I believe you want to randomly select an actual point from your graph, i.e. one of the little black dots.

Compute the centroid, or just pick a point such as (1.0, 70).



Calculate the distance from each point to that centre and use it as the weight for picking that point.

That is, if distance(P, C) is 100 and distance(Q, C) is 1, make P 100 times more likely to be picked. Every point is still eligible to win, but points sitting in crowded regions are individually less likely to be chosen.
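
A minimal sketch of that weighting idea, assuming allPoints is the asker's list of [x, y] pairs and using the centroid as the centre C:

import numpy as np

points = np.asarray(allPoints, dtype = float)   # shape (n, 2)
centroid = points.mean(axis = 0)                # the centre C

# distance of each point to the centre, used as a selection weight
dist = np.linalg.norm(points - centroid, axis = 1)
weights = dist / dist.sum()

# points far from the centre are proportionally more likely to be drawn
chosen = points[np.random.choice(len(points), p = weights)]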

+2




If I understood your initial attempt correctly, I believe a simple adjustment can make it work.

Randomly generate an x value in (0, 4.5) and a y value in (0, 70). Then loop over allPoints to find the closest point.

This has the disadvantage that large empty areas all funnel towards a single point. One way to mitigate (not eliminate) this problem is to give your random location a search radius: if no point falls within that radius, draw a new random location.
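
A rough sketch of that approach, again assuming allPoints holds the [x, y] pairs from the question; max_radius is a hypothetical tuning parameter and pick_near_random_location is just an illustrative name:

import numpy as np
from random import uniform

points = np.asarray(allPoints, dtype = float)
min_x, min_y = points.min(axis = 0)
max_x, max_y = points.max(axis = 0)

def pick_near_random_location(max_radius = None):
    # draw a uniform location in the bounding box and return the nearest point;
    # if max_radius is given and no point is that close, try a new location
    while True:
        target = np.array([uniform(min_x, max_x), uniform(min_y, max_y)])
        dist = np.linalg.norm(points - target, axis = 1)
        nearest = dist.argmin()
        if max_radius is None or dist[nearest] <= max_radius:
            return points[nearest]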

+1




Assuming you want your selected points to be visually spread out, I can think of at least one "effective/simple" method:

  • pick a random point (for example with random.choice);
  • remove from your source set every point that is too "close"*;
  • repeat until no point is left in your set.

* This requires you to decide from the start how dense you want your sample to be (see the sketch below).
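
A possible sketch of that thinning loop, where min_dist stands for the density parameter mentioned in the footnote (the value used in the final call is arbitrary):

import random

def thin_points(points, min_dist):
    # repeatedly pick a random point and discard everything closer than min_dist
    remaining = list(points)
    selected = []
    while remaining:
        p = random.choice(remaining)
        selected.append(p)
        remaining = [q for q in remaining
                     if (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2 >= min_dist ** 2]
    return selected

sample = thin_points(allPoints, min_dist = 0.5)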

0

