How can I work with my own dataset in scikit-learn

Posted 2019-03-26 07:31

How can I work with my own dataset in scikit-learn? The scikit-learn tutorials always load one of the bundled datasets as an example (the digits dataset, the iris flower dataset...)

http://scikit-learn.org/stable/datasets/index.html, for example: from sklearn.datasets import load_iris

I have my own images and I have no idea how to create a new dataset from them.

In particular, to get started, I am using this example I found (it uses the OpenCV library):

import cv2
import numpy as np

# Load the image (OpenCV reads it in BGR channel order)
img = cv2.imread('telamone.jpg')

# Convert it to grayscale
imgg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# SURF extraction (cv2.SURF is the legacy OpenCV 2.x interface)
surf = cv2.SURF()
kp, descriptors = surf.detect(imgg, None, useProvidedKeypoints=False)

# Setting up samples and responses for kNN
samples = np.array(descriptors)
responses = np.arange(len(kp), dtype=np.float32)

I would like to extract features from a set of images, in a way useful to implement a machine learning algorithm!

1 answer
手持菜刀,她持情操
#2 · 2019-03-26 08:03

You would first need to clearly define what you are trying to achieve: extracting features from a set of images "in a way useful to implement a machine learning algorithm" is much too vague for anyone to give you guidance.

Are you trying to do:

  • image classification of the picture as a whole (e.g. indoor scene vs outdoor scene)?

  • object recognition (e.g. recognizing several instances of the same object in different pictures) inside sub-parts of a set of pictures, maybe using a scanning procedure with windows of various sizes?

  • object detection and class-based categorization (e.g. finding all occurrences of cars or pedestrians in pictures, with a bounding box around each instance of those classes)?

  • full-picture semantic parsing, a.k.a. segmentation of the pixels + class categorization of each segment (buildings, roads, people, trees...)?

Each of those tasks will require a different pipeline (a combination of feature extraction and machine learning models).
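To illustrate only the simplest of these cases (classifying a picture as a whole), here is a minimal sketch of how a set of images could be turned into the arrays scikit-learn estimators expect: a 2-D array X of shape (n_samples, n_features) and a label vector y of shape (n_samples,). Everything in it is an assumption made for illustration: the helper names resize_nearest and build_dataset are hypothetical, random arrays stand in for images that would really be loaded with something like cv2.imread(path, cv2.IMREAD_GRAYSCALE), the class names are made up, and raw pixel intensities are a crude baseline feature rather than a recommendation.

```python
import numpy as np


def resize_nearest(image, size):
    """Resize a 2-D grayscale image to (height, width) by nearest-neighbour sampling."""
    height, width = size
    # Pick, for every output row/column, the index of the source row/column to copy.
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[np.ix_(rows, cols)]


def build_dataset(images, labels, size=(32, 32)):
    """Stack images into X of shape (n_samples, n_features) and labels into y."""
    # Every image must end up with the same number of features,
    # so each one is resized to a common shape before being flattened.
    X = np.stack([resize_nearest(img, size).ravel() for img in images])
    X = X.astype(np.float64) / 255.0  # scale 8-bit pixel values to [0, 1]
    y = np.asarray(labels)
    return X, y


# Stand-in data (assumption): four random "images" of different heights.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(40 + i, 60), dtype=np.uint8) for i in range(4)]
labels = ["indoor", "outdoor", "indoor", "outdoor"]

X, y = build_dataset(images, labels)
print(X.shape, y.shape)  # (4, 1024) (4,)
```

Arrays built this way can be passed to the fit(X, y) method of a scikit-learn classifier. Note that local descriptors such as SURF give a different number of rows per image, so they would first have to be aggregated into one fixed-length vector per image before they fit this layout.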

You should probably start by reading a book on the subject, for instance: http://szeliski.org/Book/

Also, as a side note, Stack Overflow is probably not the best place to ask such open-ended questions.
