Creating a tiled multi-dimensional array while rem

I am trying to tile an array in which each element is itself multi-dimensional, and then remove the i-th sub-element from the i-th copy.

For example, starting with this array:

>>> import numpy as np
>>> a = np.array([[    1.     ,     7.     ,     0.     ],
                  [    2.     ,     7.     ,     0.     ],
                  [    3.     ,     7.     ,     0.     ]])
>>> a = np.tile(a, (a.shape[0],1,1))

>>> a
array([[[    1.     ,     7.     ,     0.     ],
        [    2.     ,     7.     ,     0.     ],
        [    3.     ,     7.     ,     0.     ]],

       [[    1.     ,     7.     ,     0.     ],
        [    2.     ,     7.     ,     0.     ],
        [    3.     ,     7.     ,     0.     ]],

       [[    1.     ,     7.     ,     0.     ],
        [    2.     ,     7.     ,     0.     ],
        [    3.     ,     7.     ,     0.     ]]])

Desired output:

b = np.array([[[    2.     ,     7.     ,     0.     ],
               [    3.     ,     7.     ,     0.     ]],

              [[    1.     ,     7.     ,     0.     ],
               [    3.     ,     7.     ,     0.     ]],

              [[    1.     ,     7.     ,     0.     ],
               [    2.     ,     7.     ,     0.     ]]])

Is there a more efficient way to generate this output, without first creating the large tiled array and then deleting from it?
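
For reference, the tile-then-delete baseline described above might look like the following sketch. The use of np.delete and np.stack is an assumption, since the original deletion code is not shown.

```python
import numpy as np

a = np.array([[1., 7., 0.],
              [2., 7., 0.],
              [3., 7., 0.]])
n = a.shape[0]

# Build the full (n, n, 3) tiled array first ...
tiled = np.tile(a, (n, 1, 1))

# ... then drop row i from the i-th copy and restack the results.
b = np.stack([np.delete(tiled[i], i, axis=0) for i in range(n)])

print(b.shape)  # (3, 2, 3)
```

This allocates n*n rows only to throw n of them away, and it loops in Python, which is what a vectorized solution would avoid.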

[UPDATE]

The purpose of this permutation was to vectorize the computation instead of using Python for-loops. The answer provided by Divakar has been a great help in accomplishing this. I would also like to link to this post, which shows the inverse of this permutation and was useful for rearranging things back so that I could sum over all the values when I was done.

Additionally, I am attempting to use the same permutation technique on a tensor with TensorFlow (please see this post).

Approach #1 : Build a 2D array of indices in which row i contains every index except i, then use it to index into the first axis of the input array -

import numpy as np

def approach1(a):
    n = a.shape[0]
    # Row i of c holds every first-axis index except i
    c = np.nonzero(~np.eye(n, dtype=bool))[1].reshape(n, n-1)
    return a[c]

Sample run -

In [272]: a
Out[272]: 
array([[56, 95],
       [31, 73],
       [76, 61]])

In [273]: approach1(a)
Out[273]: 
array([[[31, 73],
        [76, 61]],

       [[56, 95],
        [76, 61]],

       [[56, 95],
        [31, 73]]])
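
To see why the indexing works, it may help to look at the index array on its own. This is a small sketch for n = 3; the variable names off_diag and cols are introduced here only for illustration.

```python
import numpy as np

n = 3
off_diag = ~np.eye(n, dtype=bool)   # True everywhere except on the diagonal
cols = np.nonzero(off_diag)[1]      # column index of each True entry, in row-major order
c = cols.reshape(n, n - 1)          # one row of indices per output block

print(c)
# [[1 2]
#  [0 2]
#  [0 1]]
```

Row i of c lists every index except i, so a[c] picks, for each i, all rows of a other than the i-th.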

Approach #2 : Use np.broadcast_to to create an extended view of the input array without copying it, then apply a boolean mask to that view to get the desired output -

def approach2(a):
    # Assumes a is 2D, with shape (n, m)
    n = a.shape[0]
    mask = ~np.eye(n,dtype=bool)  # False on the diagonal, i.e. at the entries to drop
    return np.broadcast_to(a, (n, n, a.shape[-1]))[mask].reshape(n,n-1,-1)
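
The following sketch inspects the intermediate view on the small sample array from above, to illustrate that the broadcast step itself does not copy data. Only the boolean-mask step allocates the output.

```python
import numpy as np

a = np.array([[56, 95],
              [31, 73],
              [76, 61]])
n = a.shape[0]

view = np.broadcast_to(a, (n, n, a.shape[-1]))
print(view.strides[0])            # 0: the leading axis repeats a without copying
print(view.flags.writeable)       # False: broadcast views are read-only
print(np.shares_memory(view, a))  # True

mask = ~np.eye(n, dtype=bool)     # shape (n, n), applied to the first two axes
out = view[mask].reshape(n, n - 1, -1)
print(out.shape)                  # (3, 2, 2)
```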

Runtime test

In [258]: a = np.random.randint(11,99,(200,3))

In [259]: np.allclose(approach1(a), approach2(a))
Out[259]: True

In [260]: %timeit approach1(a)
1000 loops, best of 3: 1.43 ms per loop

In [261]: %timeit approach2(a)
1000 loops, best of 3: 1.56 ms per loop
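
To repeat the comparison outside IPython, a sketch using the standard timeit module could look like this. The seed and repeat counts are arbitrary choices, and the timings will vary by machine and NumPy version.

```python
import timeit
import numpy as np

def approach1(a):
    n = a.shape[0]
    c = np.nonzero(~np.eye(n, dtype=bool))[1].reshape(n, n - 1)
    return a[c]

def approach2(a):
    n = a.shape[0]
    mask = ~np.eye(n, dtype=bool)
    return np.broadcast_to(a, (n, n, a.shape[-1]))[mask].reshape(n, n - 1, -1)

rng = np.random.RandomState(0)  # fixed seed, chosen arbitrarily
a = rng.randint(11, 99, (200, 3))

# Both approaches should produce identical output before timing them.
assert np.array_equal(approach1(a), approach2(a))

for fn in (approach1, approach2):
    best = min(timeit.repeat(lambda: fn(a), number=100, repeat=3)) / 100
    print("%s: %.3f ms per loop" % (fn.__name__, best * 1e3))
```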