How do I use wit.ai with existing rows of data?

Posted 2019-06-07 07:01

Question:

I have a lot of existing data that I would like to use as training data for a wit.ai chatbot. The data is stored in a CSV file where each row has a statement/question and a response to that statement/question.

I know that wit.ai requires you to assign intents to the messages users send, so I'm wondering whether there is a way to simply send over the data I have and let the chatbot start learning intents on its own.

Thanks!

Answer 1:

"Teaching" wit.ai is not exactly what some might think it is.

You will have to create stories for your "user says" column. The replies are irrelevant, to be honest. You can't "teach" wit.ai to reply; replies are defined in the story or in your code.

What wit.ai might need from your data are keywords and key phrases, which improve its entity recognition.

Here is the simplest example:

The entity "color" is recognized based on the keywords listed for it. So if you have a lot of data as examples of user input, you can first break it down into which entities each user input should produce, and then pull the keywords out of those inputs.
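To make that breakdown concrete, here is a minimal Python sketch that tags each user input with the entities whose keywords it contains. The entity names and keyword lists are made up for illustration, and this is plain local keyword matching to help you sort your own data, not how wit.ai itself recognizes entities.

```python
import re
from collections import defaultdict

# Hypothetical entity -> keyword lists; replace with ones derived from your data.
ENTITY_KEYWORDS = {
    "color": ["red", "green", "blue"],
    "time_request": ["time", "late"],
}

def tag_entities(utterance, entity_keywords=ENTITY_KEYWORDS):
    """Return {entity: [matched keywords]} for a single user input."""
    # Lowercase and split into words so matching is case-insensitive.
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    matches = {}
    for entity, keywords in entity_keywords.items():
        found = [kw for kw in keywords if kw in words]
        if found:
            matches[entity] = found
    return matches

def group_by_entity(utterances, entity_keywords=ENTITY_KEYWORDS):
    """Group user inputs under every entity they would produce."""
    grouped = defaultdict(list)
    for text in utterances:
        for entity in tag_entities(text, entity_keywords):
            grouped[entity].append(text)
    return dict(grouped)
```

Running `group_by_entity` over the "user says" column gives you, per entity, the list of inputs you would later want to cover in wit.ai.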

Using your data for "teaching" would be a little difficult, since it requires you to create a lot of stories in wit.ai to cover the possible user inputs and entity identification. But you can still do it like this:

(rough example)

  1. Make one story about the user asking for the time, for example.
  2. Mark in the user input which entities should be derived from that input.
  3. Sort through your list to collect all the possible ways of asking for the time:
    • How late is it?
    • Can you tell me the time?
    • I wonder what's the time now?
  4. Use a script (Python) to "shoot" all these user inputs at your story.
  5. Once done, go to the Understanding tab of wit.ai and go through all the inputs, correcting or adding the entities you defined.
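Step 4 could look roughly like the sketch below. It assumes the user input sits in the first CSV column, that you have a server access token for your app, and that you send each input through wit.ai's HTTP `GET /message` endpoint; the version string, file name, and function names are placeholders chosen for this example.

```python
import csv
import json
import urllib.parse
import urllib.request

WIT_MESSAGE_URL = "https://api.wit.ai/message"
API_VERSION = "20190607"  # assumption: use the API version date your app targets

def read_user_inputs(csv_file, column=0):
    """Yield the non-empty user-input cell from each CSV row."""
    for row in csv.reader(csv_file):
        if len(row) > column and row[column].strip():
            yield row[column].strip()

def build_request(text, token, version=API_VERSION):
    """Build (but do not send) the request for one user input."""
    query = urllib.parse.urlencode({"v": version, "q": text})
    req = urllib.request.Request(f"{WIT_MESSAGE_URL}?{query}")
    req.add_header("Authorization", f"Bearer {token}")
    return req

def send_all(texts, token, opener=urllib.request.urlopen):
    """Send every input and collect the parsed JSON responses."""
    results = []
    for text in texts:
        with opener(build_request(text, token)) as resp:
            results.append(json.loads(resp.read().decode("utf-8")))
    return results
```

A typical call would be `send_all(read_user_inputs(open("data.csv", newline="")), "YOUR_TOKEN")`. For a large file you would probably want to add a short pause between requests and some error handling, which this sketch leaves out.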

This process will "teach" wit.ai the entities, whether they are keyword-based or rely on some other algorithm.

That's the best I can think of for how to use your existing data. Wit.ai is different from other language-processing toolsets, and "teaching" it with existing data is somewhat "puzzling" :)



Answer 2:

Thanks for posting. We know this is not perfect yet, but we released an import/export feature a few days ago. Looking at the structure of the JSON export, one can probably feed it with existing data fairly easily. It would require creating one story per statement/question and response. More info here: https://wit.ai/docs/recipes#copyexportversion-my-app