可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
Suppose we have a hollow square lamina of size n. That is, we have a nxn square from which a k*l rectangle has been removed (1<=k,l<=n-2). I want to calculate an average of distances between 2 random, uniformly distributed points within such hollow square lamina.
For the sake of simplicity let's consider n=3, k=l=1, or a 3x3 square from whose center a unit square has been removed
I wrote this code for numpy, but it has at least 2 problems: I have to throw away approximately 1/9 of all generated points and removing the numpy.array elements requires lots of RAM:
x,y = 3*np.random.random((2,size,2))
x = x[
np.logical_not(np.logical_and(
np.logical_and(x[:,0] > 1, x[:,0] < 2),
np.logical_and(x[:,1] > 1, x[:,1] < 2)
))
]
y = y[
np.logical_not(np.logical_and(
np.logical_and(y[:,0] > 1, y[:,0] < 2),
np.logical_and(y[:,1] > 1, y[:,1] < 2)
))
]
n = min(x.shape[0], y.shape[0])
UPD:Here size
is the sample size of which I'm going to calculate average.
Is there an elegant way to generate those points right away, without removing the unfit ones?
UPD: Here is the full code just for the reference:
def calc_avg_dist(size):
x,y = 3*np.random.random((2,size,2))
x = x[
np.logical_not(np.logical_and(
np.logical_and(x[:,0] > 1, x[:,0] < 2),
np.logical_and(x[:,1] > 1, x[:,1] < 2)
))
]
y = y[
np.logical_not(np.logical_and(
np.logical_and(y[:,0] > 1, y[:,0] < 2),
np.logical_and(y[:,1] > 1, y[:,1] < 2)
))
]
n = min(x.shape[0], y.shape[0])
diffs = x[:n,:] - y[:n,:]
return np.sum(np.sqrt(np.einsum('ij,ij->i',diffs,diffs)))/n
回答1:
With the center removed, there are 8 regions that should contain points. These are their lower-left corners:
In [350]: llcorners = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [2, 1], [0, 2], [1, 2], [2, 2]])
The regions are 1x1, so they have the same area and are equally likely to contain a given random point. The following chooses size
lower-left corners:
In [351]: corner_indices = np.random.choice(len(llcorners), size=size)
Now generate size
(x,y) coordinates in the unit square:
In [352]: unit_coords = np.random.random(size=(size, 2))
Add those to the lower-left corners chosen previously:
In [353]: pts = unit_coords + llcorners[corner_indices]
pts
has shape (size, 2)
. Here's a plot, with size = 2000
:
In [363]: plot(pts[:,0], pts[:,1], 'o')
Out[363]: [<matplotlib.lines.Line2D at 0x11000f950>]
Update to address the updated question...
The following function generalizes the above idea to a rectangular shape containing a rectangular hollow. The rectangle is still considered to be nine regions, with the middle region being the hollow. The probability of a random point being in a region is determined by the area of the region; numpy.random.multinomial
is used to select the number of points in each region.
(I'm sure there is room for optimization of this code.)
from __future__ import division
import numpy as np
def sample_hollow_lamina(size, outer_width, outer_height, a, b, inner_width, inner_height):
"""
(a, b) is the lower-left corner of the "hollow".
"""
llcorners = np.array([[0, 0], [a, 0], [a+inner_width, 0],
[0, b], [a+inner_width, b],
[0, b+inner_height], [a, b+inner_height], [a+inner_width, b+inner_height]])
top_height = outer_height - (b + inner_height)
right_width = outer_width - (a + inner_width)
widths = np.array([a, inner_width, right_width, a, right_width, a, inner_width, right_width])
heights = np.array([b, b, b, inner_height, inner_height, top_height, top_height, top_height])
areas = widths * heights
shapes = np.column_stack((widths, heights))
regions = np.random.multinomial(size, areas/areas.sum())
indices = np.repeat(range(8), regions)
unit_coords = np.random.random(size=(size, 2))
pts = unit_coords * shapes[indices] + llcorners[indices]
return pts
For example,
In [455]: pts = sample_hollow_lamina(2000, 5, 5, 1, 1, 2, 3)
In [456]: plot(pts[:,0], pts[:,1], 'o', alpha=0.75)
Out[456]: [<matplotlib.lines.Line2D at 0x116da0a50>]
In [457]: grid()
Note that the arguments do not have to be integers:
In [465]: pts = sample_hollow_lamina(2000, 3, 3, 0.5, 1.0, 1.5, 0.5)
In [466]: plot(pts[:,0], pts[:,1], 'o', alpha=0.75)
Out[466]: [<matplotlib.lines.Line2D at 0x116e60390>]
In [467]: grid()
回答2:
I've already posted a shorter and somehow unclear answer,
here I've taken the time to produce what I think is a better answer.
Generalizing the OP problem, we have a "surface" composed of nsc * nsr
squares disposed in nsc
columns and nsr
rows, and a "hole" composed
of nhc * nhr
squares (corresponding to the surface squares) disposed
in a rectangular arrangement with nhr
rows and nhc
columns, with
the origin placed at ohc, ohr
+------+------+------+------+------+------+------+------+------+ nsr
| | | | | | | | | |
| | | | | | | | | |
...+------+------+------+------+------+------+------+------+------+ nsr-1
| | | oooo | oooo | | | | | |
| | | oooo | oooo | | | | | |
+------+------+------+------+------+------+------+------+------+ ...
| | | oooo | oooo | | | | | |
| | | oooo | oooo | | | | | |
+------+------+------+------+------+------+------+------+------+ ...
| | | oooo | oooo | | | | | |
| | | oooo | oooo | | | | | |
ohc+------+------+------+------+------+------+------+------+------+ 1
| | | | | | | | | |
| | | | | | | | | |
+------+------+------+------+------+------+------+------+------+ 0
0 1 2 ... ... nsc-1 nsc
| |
ohr=2 ohr+nhr
Our aim is to draw n
random points from the surface without the hole
(the admissible surface) with a uniform distribution over the
admissible surface.
We observe that the admissible area is composed of
nsq = nsc*nsr -nhc*nhr
equal squares:
if we place a random point in an abstract unit square and then assign, with equal probability, that
point to one of the squares of the admissible area, we have done our
job.
In pseudocode, if random()
samples from an uniformly distributed random variable over [0, 1)
(x, y) = (random(), random())
square = integer(random()*nsq)
(x, y) = (x, y) + (column_of_square_origin(square), row_of_square_origin(square))
To speed up the process, we are using numpy
and we try to avoid, as far as possible,
explicit for
loops.
With the names previously used, we need a listing of the origins of
the admissible squares, to implement the last line of the pseudo code.
def origins_of_OK_squares(nsc, nsr, ohc, ohr, nhc, nhr):
# a set of tuples, each one the origin of a square, for ALL the squares
s_all = {(x, y) for x in range(nsc) for y in range(nsr)}
# a set of tuples with the origin of the hole squares
s_hole = {(x, y) for x in range(ohc,ohc+nhc) for y in range(ohr,ohr+nhr)}
# the set of the origins of admissible squares is the difference
s_adm = s_all - s_hole
# return an array with all the origins --- the order is not important!
# np.array doesn't like sets
return np.array(list(s_adm))
We need to generate n
random points in the unit square, organized in an array of shape (n,2)
rand_points = np.random.random((n, 2))
We need an array of n
admissible squares
placements = np.random.randint(0, nsq, n)
We translate each point in rand_points
in one of the admissible
squares, as specified by the elements of placements
.
rand_points += origins_of_OK_squares(nsc, nsr, ohc, ohr, nhc, nhr)[placements]
taking advantage of the extended addressing that it is possible for
numpy
arrays, and its done...
In an overly compact function
import numpy as np
def samples_wo_hole(n, nsc, nsr, ohc, ohr, nhc, nhr):
s_all = {(x, y) for x in range(nsc) for y in range(nsr)}
s_hole = {(x, y) for x in range(ohc,ohc+nhc) for y in range(ohr,ohr+nhr)}
rand_points = np.random.random((n, 2))
placements = np.random.randint(0, nsc*nsr - nhc*nhr, n)
return rand_points+np.array(list(s_all-s_hole))[placements]
回答3:
You have 8 equal unit squares in which it is admissible to place points, so draw as many points in an unit square as you want as in
x = np.random(n, 2)
now it suffices to choose at random in which of the 8 admissible squares each point is placed
sq = np.random.randomint(0, 8, n)
you need also an array of origins
delta = np.array([[0, 0],
[1, 0],
[2, 0],
[0, 1],
# no central square
[2, 1],
[0, 2],
[1, 2]
[2, 2]])
and finally
x = x + delta[sq]
To generalize the solution, write a function to compute an array of the origins of the admissible squares, a possible implementation using sets being
def origins(n, hole_xor, hole_wd, hole_yor, hole_hg):
all_origins = {(x,y) for x in range(n) for y in range(n)}
hole_origins = {(x,y) for x in range(hole_xor, hole_xor+hole_wd)
for y in range(hole_yor, hole_yor+hole_hg)}
return np.array(list(all_origins-hole_origins)))
and use it like this
delta = origins(12, 4,5, 6,2)
n_squares = len(delta) # or n*n - width*height
target_square = np.random.randomint(0, n_squares, size)
x = x + delta[target_square]