I am trying to classify data based on prespecified labels.
I have two columns, as shown below:
room_class                        room_cluster
Standard single sea view          Standard
Deluxe twin Single                Deluxe
Suite Superior room ocean view    Suite
Superior Double twin              Superior
Deluxe Double room                Deluxe
As seen above, room_cluster is the set of labels.
The code snippet is as follows:
from sklearn import neighbors, preprocessing
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

le = preprocessing.LabelEncoder()
datar = df  # df is the DataFrame holding the two columns shown above
#### Separate data into features and labels
x = datar.room_class
y = datar.room_cluster
#### Use LabelEncoder to convert the strings to integers
le.fit(x)
addv = le.transform(x)
asb = addv.reshape(-1, 1)
#### Split into training and testing sets, then use KNN
x_train, x_test, y_train, y_test = train_test_split(asb, y, test_size=0.40)
classifier = neighbors.KNeighborsClassifier(n_neighbors=3)
classifier.fit(x_train, y_train)
predictions = classifier.predict(x_test)
#### Check the accuracy
print(accuracy_score(y_test, predictions))
The accuracy I'm getting on the test data is only 78%. Is there something wrong in the code that is limiting the accuracy?
How do I use this model to predict on custom inputs? For example:
Input : 'Suite Single sea view'
Output : 'Suite'
Input : 'Superior Suite twin'
Output : 'Superior'
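A minimal sketch of what prediction on custom strings might look like. It assumes the room descriptions are turned into word counts with CountVectorizer instead of LabelEncoder, because LabelEncoder.transform raises an error on strings it has not seen during fit. The five-row DataFrame, the name `model`, and the choice of n_neighbors=1 are illustrative assumptions, not part of the code above:

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Toy data copied from the table in the question (five rows only).
df = pd.DataFrame({
    "room_class": [
        "Standard single sea view",
        "Deluxe twin Single",
        "Suite Superior room ocean view",
        "Superior Double twin",
        "Deluxe Double room",
    ],
    "room_cluster": ["Standard", "Deluxe", "Suite", "Superior", "Deluxe"],
})

# Assumption: CountVectorizer replaces the LabelEncoder step. It encodes
# each string as word counts, so new combinations of known words can
# still be transformed at prediction time.
model = make_pipeline(
    CountVectorizer(),
    # n_neighbors=1 only because the toy sample has five rows.
    KNeighborsClassifier(n_neighbors=1),
)
model.fit(df["room_class"], df["room_cluster"])

# Raw strings go straight into the pipeline; no manual encoding needed.
custom = ["Suite Single sea view", "Superior Suite twin"]
predictions = model.predict(custom)
print(list(predictions))
```

On such a tiny sample the nearest neighbour is decided purely by shared words, so the predictions need not match the desired outputs listed above. With the full data, the same pipeline could be passed through train_test_split and accuracy_score as in the original snippet.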
I have coded it roughly, so please bear with me.