Matlab — random walk with boundaries, vectorized

2019-02-25 00:55发布

问题:

Suppose I have a vector J of jump sizes and an initial starting point X_0. Also I have boundaries 0, B (assume 0 < X_0 < B). I want to do a random walk where X_i = [min(X_{i-1} + J_i,B)]^+. (positive part). Basically if it goes over a boundary, it is made equal to the boundary. Anyone know a vectorized way to do this? The current way I am doing it consists of doing cumsums and then finding places where it violates a condition, and then starting from there and repeating the cumsum calculation, etc until I find that I stop violating the boundaries. It works when the boundaries are rarely hit, but if they are hit all the time, it basically becomes a for loop.

In the code below, I am doing this across many samples. To 'fix' the ones that go out of the boundary, I have to loop through the samples to check...(don't think there is a vectorized 'find')

% X_init is a row vector describing initial resource values to use for
% each sample

% J is matrix where each col is a sequence of Jumps (columns = sample #)
% In this code the jumps are subtracted, but same thing

X_intvl = repmat(X_init,NumJumps,1) - cumsum(J);
X = [X_init; X_intvl];
for sample = 1:NumSamples
    k = find(or(X_intvl(:,sample) > B, X_intvl(:,sample) < 0),1);
    while(~isempty(k))
        change = X_intvl(k-1,sample) - X_intvl(k,sample);
        X_intvl(k:end,sample) = X_intvl(k:end,sample)+change;
        k = find(or(X_intvl(:,sample) > B, X_intvl(:,sample) < 0),1);
    end
end

回答1:

Interesting question (+1).

I faced a similar problem a while back, although slightly more complex as my lower and upper bound depended on t. I never did work out a fully-vectorized solution. In the end, the fastest solution I found was a single loop which incorporates the constraints at each step. Adapting the code to your situation yields the following:

%# Set the parameters
LB = 0; %# Lower bound
UB = 5; %# Upper bound
T = 100; %# Number of observations
N = 3; %# Number of samples
X0 = (1/2) * (LB + UB); %# Arbitrary start point halfway between LB and UB

%# Generate the jumps
Jump = randn(N, T-1);

%# Build the constrained random walk
X = X0 * ones(N, T);
for t = 2:T
    X(:, t) = max(min(X(:, t-1) + Jump(:, t-1), UB), 0);
end
X = X';

I would be interested in hearing if this method proves faster than what you are currently doing. I suspect it will be for cases where the constraint is binding in more than one or two places. I can't test it myself as the code you provided is not a "working" example, ie I can't just copy and paste it into Matlab and run it, as it depends on several variables for which example (or simulated) values are not provided. I tried adapting it myself, but couldn't get it to work properly?

UPDATE: I just switched the code around so that observations are indexed on columns and samples are indexed on rows, and then I transpose X in the last step. This will make the routine more efficient, since Matlab allocates memory for numeric arrays column-wise - hence it is faster when performing operations down the columns of an array (as opposed to across the rows). Note, you will only notice the speed-up for large N.

FINAL THOUGHT: These days, the JIT accelerator is very good at making single loops in Matlab efficient (double loops are still pretty slow). Therefore personally I'm of the opinion that every time you try and obtain a fully-vectorized solution in Matlab, ie no loops, you should weigh up whether the effort involved in finding a clever solution is worth the slight gains in efficiency to be made over an easier-to-obtain method that utilizes a single loop. And it is important to remember that fully-vectorized solutions are sometimes slower than solutions involving single loops when T and N are small!



回答2:

I'd like to propose another vectorized solution.

So, first we should set the parameters and generate random Jumpls. I used the same set of parameters as Colin T Bowers:

% Set the parameters
LB = 0; % Lower bound
UB = 20; % Upper bound
T = 1000; % Number of observations
N = 3; % Number of samples
X0 = (1/2) * (UB + LB); % Arbitrary start point halfway between LB and UB

% Generate the jumps
Jump = randn(N, T-1);

But I changed generation code:

% Generate initial data without bounds
X = cumsum(Jump, 2);

% Apply bounds
Amplitude = UB - LB;
nsteps = ceil( max(abs(X(:))) / Amplitude - 0.5 );
for ii = 1:nsteps
    ind = abs(X) > (1/2) * Amplitude;
    X(ind) = Amplitude * sign(X(ind)) - X(ind);
end

% Shifting X
X = X0 + X;

So, instead of for loop I'm using cumsum function with smart post-processing.

N.B. This solution works significantly slower than Colin T Bowers's one for tight bounds (Amplitude < 5), but for loose bounds (Amplitude > 20) it works much faster.