I am trying this with a sample DataFrame:
import pandas as pd

data = [['Alex','USA',0],['Bob','India',1],['Clarke','SriLanka',0]]
df = pd.DataFrame(data, columns=['Name','Country','Target'])
Now from here, I used get_dummies to convert the string columns into indicator (0/1) columns:
column_names=['Name','Country']
one_hot = pd.get_dummies(df[column_names])
After conversion the columns of one_hot are: Name_Alex, Name_Bob, Name_Clarke, Country_India, Country_SriLanka, Country_USA
Slicing the data.
x = one_hot[["Name_Alex","Name_Bob","Name_Clarke","Country_India","Country_SriLanka","Country_USA"]].values
y = df['Target'].values
Splitting the dataset into train and test:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.5, random_state=0)
Logistic Regression
from sklearn.linear_model import LogisticRegression
logreg = LogisticRegression()
logreg.fit(x_train, y_train)
Now the model is trained.
For prediction, let's say I want to predict the "Target" by giving "Name" and "Country", e.g. ["Alex","USA"].
Prediction.
If I use this:
logreg.predict([["Alex","USA"]])
it obviously will not work, because the model was trained on the one-hot encoded columns, not on raw strings.
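As a sanity check, here is an illustrative reproduction of that failure (not code from the original post): feeding raw strings to the fitted model raises a ValueError, since the model expects the six 0/1 columns it was trained on.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

data = [['Alex', 'USA', 0], ['Bob', 'India', 1], ['Clarke', 'SriLanka', 0]]
df = pd.DataFrame(data, columns=['Name', 'Country', 'Target'])

one_hot = pd.get_dummies(df[['Name', 'Country']])   # six 0/1 columns
logreg = LogisticRegression().fit(one_hot.values, df['Target'].values)

try:
    logreg.predict([['Alex', 'USA']])               # raw strings, unencoded
    failed_as_expected = False
except ValueError:
    failed_as_expected = True

print('prediction on raw strings failed:', failed_as_expected)
```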
I suggest you use sklearn's LabelEncoder and OneHotEncoder classes instead of pd.get_dummies.
Once you fit a label encoder and a one-hot encoder per feature, save them somewhere, so that when you want to predict on new data you can load the saved encoders and encode the incoming features with them.
This way the features are encoded in exactly the same way as they were when you built the training set.
Below is the code which I use for saving encoders:
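The original snippet is not reproduced here; what follows is a minimal sketch of such a save step, assuming one LabelEncoder and one OneHotEncoder per categorical column, persisted with pickle. The dict names follow the answer's wording; the file name and data are illustrative.

```python
# Sketch: fit one LabelEncoder + OneHotEncoder per categorical column
# and pickle them for reuse at prediction time (illustrative code).
import pickle
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

train_data = {'Name': ['Alex', 'Bob', 'Clarke'],
              'Country': ['USA', 'India', 'SriLanka']}

label_encoder_dict = {}
onehotencoder_dict = {}

for col, values in train_data.items():
    le = LabelEncoder()
    codes = le.fit_transform(values)          # strings -> integer codes
    ohe = OneHotEncoder()
    ohe.fit(codes.reshape(-1, 1))             # integer codes -> one-hot
    label_encoder_dict[col] = le
    onehotencoder_dict[col] = ohe

# Persist both dicts so prediction code can encode new rows identically.
with open('encoders.pkl', 'wb') as f:
    pickle.dump({'label': label_encoder_dict,
                 'onehot': onehotencoder_dict}, f)
```

Note that recent scikit-learn versions let OneHotEncoder consume string columns directly, so the LabelEncoder step can be dropped; it is kept here to match the answer's description.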
Now I save this onehotencoder_dict and label_encoder_dict and use them later for encoding.
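To make the round trip concrete, here is a self-contained sketch of the whole flow under the same assumptions: fit the per-column encoders and the model, pickle the encoder dicts, then reload them and encode a raw ["Alex", "USA"] row before calling predict. The helper name and file name are illustrative, not from the answer.

```python
import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# --- training time: fit encoders + model, save the encoder dicts ---
columns = ['Name', 'Country']
rows = [['Alex', 'USA'], ['Bob', 'India'], ['Clarke', 'SriLanka']]
target = [0, 1, 0]

label_encoder_dict, onehotencoder_dict = {}, {}
encoded = []
for i, col in enumerate(columns):
    values = [r[i] for r in rows]
    le = LabelEncoder()
    codes = le.fit_transform(values)                 # strings -> integer codes
    ohe = OneHotEncoder()
    encoded.append(ohe.fit_transform(codes.reshape(-1, 1)).toarray())
    label_encoder_dict[col], onehotencoder_dict[col] = le, ohe

x_train = np.hstack(encoded)                         # six one-hot columns
logreg = LogisticRegression().fit(x_train, target)

with open('encoders.pkl', 'wb') as f:
    pickle.dump({'label': label_encoder_dict, 'onehot': onehotencoder_dict}, f)

# --- prediction time: reload encoders and encode a raw row the same way ---
with open('encoders.pkl', 'rb') as f:
    enc = pickle.load(f)

def encode_row(row):
    """One-hot encode a raw [Name, Country] row with the saved encoders."""
    parts = []
    for col, value in zip(columns, row):
        code = enc['label'][col].transform([value])
        parts.append(enc['onehot'][col].transform(code.reshape(-1, 1)).toarray().ravel())
    return np.concatenate(parts).reshape(1, -1)

x_new = encode_row(['Alex', 'USA'])
print(logreg.predict(x_new))
```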