I have two NumPy arrays: the first contains data and the second contains labels.
I want to shuffle the data together with their labels. In other words, how can I shuffle my labels and data in the same order?
import numpy as np
data=np.genfromtxt("dataset.csv", delimiter=',')
classes=np.genfromtxt("labels.csv",dtype=np.str , delimiter='\t')
x=np.random.shuffle(data)
y=x[classes]
Does this preserve the order of shuffling?
Generate a random order of elements with np.random.permutation and simply index into the arrays data and classes with it:
idx = np.random.permutation(len(data))
x,y = data[idx], classes[idx]
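Here is a minimal, self-contained sketch of the same idea, so you can check that the pairing survives. The small arrays are made-up stand-ins for the contents of the CSV files, and the seed is only there to make the run repeatable; neither comes from the question.

```python
import numpy as np

# Hypothetical stand-ins for the arrays loaded from dataset.csv and labels.csv
data = np.array([[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]])
classes = np.array(["a", "b", "c", "d"])

np.random.seed(0)  # assumption: seeded only so the example is repeatable
idx = np.random.permutation(len(data))  # one random ordering of the row indices

# Indexing both arrays with the same idx keeps each row paired with its label
x, y = data[idx], classes[idx]

# The originals are untouched: fancy indexing returns shuffled copies
for row, label in zip(x, y):
    print(row, label)
```

Because data and classes are never modified, you can draw a fresh idx whenever you need a new shuffle.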
Alternatively you can concatenate the data and labels together, shuffle them and then separate them into input x and label y as shown below:
import numpy as np

def read_data(filename, delimiter, datatype=float):  # Read data from a file
    return np.genfromtxt(filename, delimiter=delimiter, dtype=datatype)

classes = read_data('labels.csv', delimiter='\t', datatype=str)
data = read_data('data.csv', delimiter=',')
# Concatenate along the second axis; mixing floats and strings turns every entry into a string
dataset = np.column_stack((data, classes))

def dataset_shuffle(dataset):  # Returns separated shuffled data and classes from dataset
    np.random.shuffle(dataset)  # Shuffles the rows in place, so each label stays with its row
    n, m = dataset.shape
    x = dataset[:, 0:m-1].astype(float)  # Convert the feature columns back to float
    y = dataset[:, m-1]
    return x, y  # Return shuffled x and y with preserved order
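One thing to keep in mind with the concatenation approach: stacking numeric data and string labels into a single array converts every entry to a string, so the features have to be converted back afterwards. Below is a rough sketch on made-up arrays (the values are assumptions, and np.column_stack is simply one way to do the joining):

```python
import numpy as np

# Hypothetical stand-ins for the loaded data and labels
data = np.array([[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]])
classes = np.array(["a", "b", "c"])

# Joining float and string columns yields an array of strings
dataset = np.column_stack((data, classes))
print(dataset.dtype.kind)  # 'U' means a unicode string array

# np.random.shuffle reorders the rows in place and returns None,
# so its result must not be assigned to a variable
np.random.shuffle(dataset)

x = dataset[:, :-1].astype(float)  # features, converted back to float
y = dataset[:, -1]                 # labels, still paired with their rows
```

If the round trip through strings is a concern, indexing both arrays with a shared permutation avoids the conversion altogether.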