Gridwise application of the bisection method

2019-05-10 15:15发布

I need to find roots for a generalized state space. That is, I have a discrete grid of dimensions grid=AxBx(...)xX, of which I do not know ex ante how many dimensions it has (the solution should be applicable to any grid.size) .

I want to find the roots (f(z) = 0) for every state z inside grid using the bisection method. Say remainder contains f(z), and I know f'(z) < 0. Then I need to

  • increase z if remainder > 0
  • decrease z if remainder < 0

Wlog, say the matrix historyof shape (grid.shape, T) contains the history of earlier values of z for every point in the grid and I need to increase z (since remainder > 0). I will then need to select zAlternative inside history[z, :] that is the "smallest of those, that are larger than z". In pseudo-code, that is:

zAlternative =  hist[z,:][hist[z,:] > z].min()

I had asked this earlier. The solution I was given was

b = sort(history[..., :-1], axis=-1)
mask = b > history[..., -1:]
index = argmax(mask, axis=-1)
indices = tuple([arange(j) for j in b.shape[:-1]])
indices = meshgrid(*indices, indexing='ij', sparse=True)
indices.append(index)
indices = tuple(indices)
lowerZ = history[indices]

b = sort(history[..., :-1], axis=-1)
mask = b <= history[..., -1:]
index = argmax(mask, axis=-1)
indices = tuple([arange(j) for j in b.shape[:-1]])
indices = meshgrid(*indices, indexing='ij', sparse=True)
indices.append(index)
indices = tuple(indices)
higherZ = history[indices]

newZ = history[..., -1]
criterion = 0.05
increase = remainder > 0 + criterion
decrease = remainder < 0 - criterion
newZ[increase] = 0.5*(newZ[increase] + higherZ[increase])
newZ[decrease] = 0.5*(newZ[decrease] + lowerZ[decrease])

However, this code ceases to work for me. I feel extremely bad about admitting it, but I never understood the magic that is happening with the indices, therefore I unfortunately need help.

What the code actually does, it to give me the lowest respectively the highest. That is, if I fix on two specific z values:

history[z1] = array([0.3, 0.2, 0.1])
history[z2] = array([0.1, 0.2, 0.3])

I will get higherZ[z1] = 0.3 and lowerZ[z2] = 0.1, that is, the extrema. The correct value for both cases would have been 0.2. What's going wrong here?

If needed, in order to generate testing data, you can use something along the lines of

history = tile(array([0.1, 0.3, 0.2, 0.15, 0.13])[newaxis,newaxis,:], (10, 20, 1))
remainder = -1*ones((10, 20))

to test the second case.

Expected outcome

I adjusted the history variable above, to give test cases for both upwards and downwards. Expected outcome would be

lowerZ = 0.1 * ones((10,20))
higherZ = 0.15 * ones((10,20))

Which is, for every point z in history[z, :], the next highest previous value (higherZ) and the next smallest previous value (lowerZ). Since all points z have exactly the same history ([0.1, 0.3, 0.2, 0.15, 0.13]), they will all have the same values for lowerZ and higherZ. Of course, in general, the histories for each z will be different and hence the two matrices will contain potentially different values on every grid point.

2条回答
混吃等死
2楼-- · 2019-05-10 15:34

z-shaped arrays for the next highest and lowest observation in history relative to the current observation, given the current observation is history[...,-1:]

This constructs the higher and lower arrays by manipulating the strides of history to make it easier to iterate over the observations of each element of z. This is accomplished using numpy.lib.stride_tricks.as_strided and an n-dim generalzed function found at Efficient Overlapping Windows with Numpy - I will include it's source at the end

There is a single python loop that has 200 iterations for history.shape of (10,20,x).

import numpy as np

history = np.tile(np.array([0.1, 0.3, 0.2, 0.15, 0.13])[np.newaxis,np.newaxis,:], (10, 20, 1))
remainder = -1*np.ones((10, 20))

z_shape = final_shape = history.shape[:-1]
number_of_observations = history.shape[-1]
number_of_elements_in_z = np.product(z_shape)

# manipulate histories to efficiently iterate over
# the observations of each "element" of z
s = sliding_window(history, (1,1,number_of_observations))
# s.shape will be (number_of_elements_in_z, number_of_observations)

# create arrays of the next lower and next higher observation
lowerZ = np.zeros(number_of_elements_in_z)
higherZ = np.zeros(number_of_elements_in_z)
for ndx, observations in enumerate(s):
    current_observation = observations[-1]
    a = np.sort(observations)
    lowerZ[ndx] = a[a < current_observation][-1]
    higherZ[ndx] = a[a > current_observation][0]

assert np.all(lowerZ == .1)
assert np.all(higherZ == .15)
lowerZ = lowerZ.reshape(z_shape)
higherZ = higherZ.reshape(z_shape)

sliding_window from Efficient Overlapping Windows with Numpy

import numpy as np
from numpy.lib.stride_tricks import as_strided as ast
from itertools import product

def norm_shape(shape):
    '''
    Normalize numpy array shapes so they're always expressed as a tuple, 
    even for one-dimensional shapes.

    Parameters
        shape - an int, or a tuple of ints

    Returns
        a shape tuple

    from http://www.johnvinyard.com/blog/?p=268
    '''
    try:
        i = int(shape)
        return (i,)
    except TypeError:
        # shape was not a number
        pass

    try:
        t = tuple(shape)
        return t
    except TypeError:
        # shape was not iterable
        pass

    raise TypeError('shape must be an int, or a tuple of ints')


def sliding_window(a,ws,ss = None,flatten = True):
    '''
    Return a sliding window over a in any number of dimensions

    Parameters:
        a  - an n-dimensional numpy array
        ws - an int (a is 1D) or tuple (a is 2D or greater) representing the size 
             of each dimension of the window
        ss - an int (a is 1D) or tuple (a is 2D or greater) representing the 
             amount to slide the window in each dimension. If not specified, it
             defaults to ws.
        flatten - if True, all slices are flattened, otherwise, there is an 
                  extra dimension for each dimension of the input.

    Returns
        an array containing each n-dimensional window from a

    from http://www.johnvinyard.com/blog/?p=268
    '''

    if None is ss:
        # ss was not provided. the windows will not overlap in any direction.
        ss = ws
    ws = norm_shape(ws)
    ss = norm_shape(ss)

    # convert ws, ss, and a.shape to numpy arrays so that we can do math in every 
    # dimension at once.
    ws = np.array(ws)
    ss = np.array(ss)
    shape = np.array(a.shape)


    # ensure that ws, ss, and a.shape all have the same number of dimensions
    ls = [len(shape),len(ws),len(ss)]
    if 1 != len(set(ls)):
        error_string = 'a.shape, ws and ss must all have the same length. They were{}'
        raise ValueError(error_string.format(str(ls)))

    # ensure that ws is smaller than a in every dimension
    if np.any(ws > shape):
        error_string = 'ws cannot be larger than a in any dimension. a.shape was {} and ws was {}'
        raise ValueError(error_string.format(str(a.shape),str(ws)))

    # how many slices will there be in each dimension?
    newshape = norm_shape(((shape - ws) // ss) + 1)
    # the shape of the strided array will be the number of slices in each dimension
    # plus the shape of the window (tuple addition)
    newshape += norm_shape(ws)
    # the strides tuple will be the array's strides multiplied by step size, plus
    # the array's strides (tuple addition)
    newstrides = norm_shape(np.array(a.strides) * ss) + a.strides
    strided = ast(a,shape = newshape,strides = newstrides)
    if not flatten:
        return strided

    # Collapse strided so that it has one more dimension than the window.  I.e.,
    # the new array is a flat list of slices.
    meat = len(ws) if ws.shape else 0
    firstdim = (np.product(newshape[:-meat]),) if ws.shape else ()
    dim = firstdim + (newshape[-meat:])
    # remove any dimensions with size 1
    dim = filter(lambda i : i != 1,dim)
    return strided.reshape(dim)
查看更多
干净又极端
3楼-- · 2019-05-10 15:38

I compared what you posted here to the solution for your previous post and noticed some differences.

For the smaller z, you said

mask = b > history[..., -1:]
index = argmax(mask, axis=-1)

They said:

mask = b >= a[..., -1:]
index = np.argmax(mask, axis=-1) - 1

For the larger z, you said

mask = b <= history[..., -1:]
index = argmax(mask, axis=-1)

They said:

mask = b > a[..., -1:]
index = np.argmax(mask, axis=-1)

Using the solution for your previous post, I get:

import numpy as np
history = np.tile(np.array([0.1, 0.3, 0.2, 0.15, 0.13])[np.newaxis,np.newaxis,:], (10, 20, 1))
remainder = -1*np.ones((10, 20))

a = history

# b is a sorted ndarray excluding the most recent observation
# it is sorted along the observation axis
b = np.sort(a[..., :-1], axis=-1)

# mask is a boolean array, comparing the (sorted)
# previous observations to the current observation - [..., -1:]
mask = b > a[..., -1:]

# The next 5 statements build an indexing array.
# True evaluates to one and False evaluates to zero.
# argmax() will return the index of the first True,
# in this case along the last (observations) axis.
# index is an array with the shape of z (2-d for this test data).
# It represents the index of the next greater
# observation for every 'element' of z.
index = np.argmax(mask, axis=-1)

# The next two statements construct arrays of indices
# for every element of z - the first n-1 dimensions of history.
indices = tuple([np.arange(j) for j in b.shape[:-1]])
indices = np.meshgrid(*indices, indexing='ij', sparse=True)

# Adding index to the end of indices (the last dimension of history)
# produces a 'group' of indices that will 'select' a single observation
# for every 'element' of z
indices.append(index)
indices = tuple(indices)
higherZ = b[indices]

mask = b >= a[..., -1:]
# Since b excludes the current observation, we want the
# index just before the next highest observation for lowerZ,
# hence the minus one.
index = np.argmax(mask, axis=-1) - 1
indices = tuple([np.arange(j) for j in b.shape[:-1]])
indices = np.meshgrid(*indices, indexing='ij', sparse=True)
indices.append(index)
indices = tuple(indices)
lowerZ = b[indices]
assert np.all(lowerZ == .1)
assert np.all(higherZ == .15)

which seems to work

查看更多
登录 后发表回答