One Hot Encoding using numpy

If the input is zero I want to make an array which looks like this:

[1,0,0,0,0,0,0,0,0,0]

and if the input is 5:

[0,0,0,0,0,1,0,0,0,0]

For the above I wrote:

np.put(np.zeros(10),5,1)

but it did not work.

Is there a way to implement this in one line?

Tags: python numpy one-hot one-hot-encoding

9 answers

你好瞎i

#2 · 2019-01-17 04:20

Use np.identity or np.eye. With your input i and the array size s, you can try something like this:

np.identity(s)[i:i+1]

For example, print(np.identity(10)[0:1]) prints:

[[ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]]
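A minimal sketch of the same idea with np.eye and plain row indexing. The names i and s follow the answer; the dtype=int argument is an assumption added here to get integers instead of floats:

```python
import numpy as np

s = 10  # array size (number of classes)
i = 5   # index that should be "hot"

# Row i of the s x s identity matrix is the one-hot vector for i.
one_hot = np.eye(s, dtype=int)[i]
print(one_hot)

# Slicing with i:i+1 instead keeps a leading axis, giving shape (1, s).
row = np.eye(s, dtype=int)[i:i + 1]
print(row.shape)
```

Indexing with [i] returns a flat vector like the one in the question, while the slice [i:i+1] returns a 2-D array with a single row.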

If you are using TensorFlow, you can use tf.one_hot: https://www.tensorflow.org/api_docs/python/array_ops/slicing_and_joining#one_hot


乱世女痞

#3 · 2019-01-17 04:21

Something like:

np.array([int(i == 5) for i in range(10)])

should do the trick, though I suppose there are other solutions that rely more on numpy itself.

Edit: the reason your one-liner does not work is that np.put does not return anything; it modifies the array passed as its first argument in place. The correct way to use np.put() is:

a = np.zeros(10)
np.put(a,5,1)

The drawback is that this cannot be done in one line with np.put(), because the array has to be defined before it is passed in.
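To illustrate the point about the return value, the sketch below checks that np.put returns None. It then builds the same vector in a single expression by comparing np.arange against the index; that comparison trick is an alternative offered here as an assumption, not something taken from this answer:

```python
import numpy as np

# np.put modifies its first argument in place and returns None,
# so the array built inside the original one-liner is lost.
result = np.put(np.zeros(10), 5, 1)
print(result)  # None

# Two-step version: keep a reference to the array, then modify it.
a = np.zeros(10)
np.put(a, 5, 1)

# One-expression alternative (not from the answer above):
# compare each position 0..9 with the target index, then cast to int.
b = (np.arange(10) == 5).astype(int)
print(b)
```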


三岁会撩人

#4 · 2019-01-17 04:23

import time
import numpy as np

# Approach 1: np.repeat + np.put for each index
start_time = time.time()
z = []
for l in [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 4, 6]:
    a = np.repeat(0, 10)
    np.put(a, l, 1)
    z.append(a)
print("--- %s seconds ---" % (time.time() - start_time))

#--- 0.00174784660339 seconds ---

# Approach 2: list comprehension for each index
start_time = time.time()
z = []
for l in [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 4, 6]:
    z.append(np.array([int(i == l) for i in range(10)]))
print("--- %s seconds ---" % (time.time() - start_time))

#--- 0.000400066375732 seconds ---


走好不送

#5 · 2019-01-17 04:25

The problem here is that you never store your array anywhere. The put function works in place on the array and returns nothing, and since you never give the array a name, you cannot refer to it afterwards. So this

one_pos = 5
x = np.zeros(10)
np.put(x, one_pos, 1)

would work, but then you could just use indexing:

one_pos = 5
x = np.zeros(10)
x[one_pos] = 1

In my opinion that is the right way to do it unless there is a special reason to write it as a one-liner. It is also arguably easier to read, and readable code is good code.
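If the indexing approach is needed in more than one place, it could be wrapped in a small helper so that each call site stays on one line. The function name one_hot and its parameters are hypothetical, chosen here only for illustration:

```python
import numpy as np

def one_hot(index, size=10):
    """Return a 1-D integer array of length `size` with a 1 at `index`."""
    x = np.zeros(size, dtype=int)
    x[index] = 1  # plain indexing, as in the answer above
    return x

print(one_hot(0))
print(one_hot(5))
```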


Copy-Paste solution

import numpy as np

def get_one_hot(targets, nb_classes):
    targets = np.asarray(targets)  # accept lists as well as arrays of class indices
    res = np.eye(nb_classes)[targets.reshape(-1)]
    return res.reshape(list(targets.shape) + [nb_classes])
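A short usage sketch, assuming targets is a NumPy array of integer class indices. The function follows the same approach as above and is repeated so the snippet runs on its own; the sample targets array is made up for illustration:

```python
import numpy as np

def get_one_hot(targets, nb_classes):
    # Pick one row of the identity matrix per class index,
    # then restore the original shape with a trailing class axis.
    res = np.eye(nb_classes)[np.array(targets).reshape(-1)]
    return res.reshape(list(targets.shape) + [nb_classes])

# A 2 x 3 array of class indices in the range 0..4.
targets = np.array([[0, 2, 4],
                    [1, 3, 0]])
encoded = get_one_hot(targets, nb_classes=5)

print(encoded.shape)   # (2, 3, 5)
print(encoded[0, 1])   # one-hot row for class 2
```

The result has the shape of the input plus one trailing axis of length nb_classes, so the function handles batches of any dimensionality.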

Package

You can use mpu.ml.indices2one_hot. It's tested and simple to use:

import mpu.ml
one_hot = mpu.ml.indices2one_hot([1, 3, 0], nb_classes=5)


Melony?

#7 · 2019-01-17 04:39

You could use a list comprehension:

[0 if i != 5 else 1 for i in range(10)]

which evaluates to

[0,0,0,0,0,1,0,0,0,0]

