I am doing an exercise for a Machine Learning course. I loaded a dataset of images, in the form of arrays, into a matrix called datamatrix, then standardized it and computed the principal components. labels is an array containing the label of each image (the name of the subdirectory that contained it). I need to visualize pairs of principal components, starting with the first two. The professor suggested using matplotlib's scatter function; I found seaborn.scatterplot, which seems better, but with neither of the two did I manage to add a legend showing the label names.
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA

pca = PCA()
X_t = pca.fit_transform(datamatrix)
X_r = pca.inverse_transform(X_t)
plt.figure(figsize=(25,5))
colours = ['r','g','b','p']
plt.subplot(1, 3, 1)
sns.scatterplot(X_t[:,0], X_t[:,1], hue=labels, palette=colours, legend='full')
plt.title('PC 1 and 2')
I am new to Python and to machine learning libraries.
Edit: As suggested, I tried modifying the code:
data = {"x" : X_t[:,0], "y" : X_t[:,1], "label" : labels}
sns.scatterplot(x="x", y="y", hue="label", palette=colours, data=data, legend='full')
But I obtain the same result: I have the legend, but without the names of the labels.
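For reference, here is a minimal sketch of how the professor's matplotlib suggestion could be tried: one scatter call per class with label=..., followed by legend(). It assumes that labels holds strings (such as the subdirectory names). The data and the class names below are synthetic stand-ins, not the actual images, and the StandardScaler step is only an assumed way of standardizing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the image data: 4 hypothetical classes,
# 30 samples each, 50 features per sample.
rng = np.random.default_rng(0)
class_names = ["dog", "guitar", "house", "person"]  # hypothetical subdirectory names
datamatrix = np.vstack(
    [rng.normal(loc=i, size=(30, 50)) for i in range(len(class_names))]
)
labels = np.repeat(class_names, 30)  # one string label per row of datamatrix

# Standardize, then project onto the principal components.
X_t = PCA().fit_transform(StandardScaler().fit_transform(datamatrix))

fig, ax = plt.subplots(figsize=(6, 5))
colours = ["r", "g", "b", "m"]  # valid single-letter matplotlib colours
for name, colour in zip(class_names, colours):
    mask = labels == name  # boolean mask selecting the rows of this class
    ax.scatter(X_t[mask, 0], X_t[mask, 1], c=colour, label=name, s=15)
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.set_title("PC 1 and 2")
legend = ax.legend(title="label")  # entries come from the label= arguments
```

Because each class is drawn by its own scatter call, the legend text comes directly from the strings passed as label=, so this only helps if labels really contains the names rather than numeric codes.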