In MATLAB, I would like to generate n
pairs of random integers in the range [1, m]
, where each pair is unique. For uniqueness, I consider the order of the numbers in the pair to be irrelevant such that [3, 10]
is equal to [10, 3]
.
Also, each pair should consist of two distinct integers; i.e. [3, 4]
is ok but [3, 3]
would be rejected.
EDIT: Each possible pair should be chosen with equal likelihood.
(Obviously a constraint on the parameters is that n <= m(m-1)/2
.)
I have been able to successfully do this when m
is small, like so:
m = 500; n = 10; % setting parameters
A = ((1:m)'*ones(1, m)); % each column has the numbers 1 -> m
idxs1 = squareform(tril(A', -1))';
idxs2 = squareform(tril(A, -1))';
all_pairs = [idxs1, idxs2]; % this contains all possible pairs
idx_to_use = randperm( size(all_pairs, 1), n ); % choosing random n pairs
pairs = all_pairs(idx_to_use, :)
pairs =
254 414
247 334
111 146
207 297
45 390
229 411
9 16
75 395
12 338
25 442
However, the matrix A
is of size m x m
, meaning when m
becomes large (e.g. upwards of 10,000), MATLAB runs out of memory.
I considered generating a load of random numbers randi(m, [n, 2])
, and repeatedly rejecting the rows which repeated, but I was concerned about getting stuck in a loop when n
was close to m(m-1)/2
.
Is there an easier, cleaner way of generating unique pairs of distinct integers?
This is more of a general approach rather then a matlab solution.
How about you do the following first you fill a vector like the following.
Then you do the following again
And if
You would make sure that the same pair is not already selected.
After you are done you can run a permutation algorithm on the matrix if you want them to be randomly spaced.
This approach will give you all the possibility or 2 integer pairs, and it runs in O(n) where n is the height of the matrix.
The following code does what you need:
Easy, peasy, when viewed in the proper way.
You wish to generate n pairs of integers, [p,q], such that p and q lie in the interval [1,m], and p
How many possible pairs are there? The total number of pairs is just m*(m-1)/2. (I.e., the sum of the numbers from 1 to m-1.)
So we could generate n random integers in the range [1,m*(m-1)/2]. Randperm does this nicely. (Older matlab releases do not allow the second argument to randperm.)
(Note that I've written this expression with m in a funny way, dividing by 2 in perhaps a strange place. This avoids precision problems for some values of m near the upper limits.)
Now, if we associate each possible pair [p,q] with one of the integers in k, we can work backwards, from the integers generated in k, to a pair [p,q]. Thus the first few pairs in that list are:
We can think of them as the elements in a strictly upper triangular array of size m by m, thus those elements above the main diagonal.
See that these formulas recover p and q from the unrolled elements in k. We can convince ourselves that this does indeed work, but perhaps a simple way here is just this test:
Another way of testing it is to show that all pairs get generated for a small case.
Yup, it looks like everything works perfectly. Now try it for some large numbers for m and n to test the time used.
This scheme will work for m as large as roughly 1e8, before it fails due to precision errors in double precision. The exact limit should be m no larger than 134217728 before m/2*(m-1) exceeds 2^53. A nice feature is that no rejection for repeat pairs need be done.