I'm testing the Watson Conversation API with a possible dialog my company wants to create. We are developing in Brazilian Portuguese. Since Portuguese is a rich language and users sometimes make mistakes, we want to anticipate these errors, mainly with special characters and accents.
For example, users may write the word produção as produção, producao, produçao, or producão. Is it possible to use a regular expression in intents and entities, something like the picture below? Sometimes another word is added to give it meaning, like produção final, produção geral, produção passada, etc.
Another quick question: is it possible to create intent examples that merge in entity values, using something like @(producao) (as in the image)?
Thank you
You cannot use regular expressions in intents or entities; however, I think you should still be able to cope with variations.
There is currently no built-in handling of typos or accent normalization when matching intents. However, if there are enough features in a sentence to match on, the occasional typo shouldn't cause problems. For very short examples, there may be some value in adding additional examples for common mistakes.
For entities, you can include synonyms, and I have used that to include common mistakes before.
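If you want to generate those misspelled synonyms rather than type them by hand, something like the following might help. It is only a minimal Python sketch run outside the service: the helper names strip_accents and accent_variants are made up for illustration, and it assumes the only mistakes you care about are dropped accents and cedillas.

```python
import itertools
import unicodedata


def strip_accents(ch: str) -> str:
    """Return the character with combining marks removed (e.g. 'ç' -> 'c')."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def accent_variants(word: str) -> list[str]:
    """All spellings obtained by keeping or dropping each accent independently."""
    # Work on the composed form so 'ç' and 'ã' are single characters.
    word = unicodedata.normalize("NFC", word)
    options = []
    for ch in word:
        plain = strip_accents(ch)
        # Accented characters contribute two choices, plain ones just one.
        options.append([ch, plain] if plain != ch else [ch])
    return sorted({"".join(combo) for combo in itertools.product(*options)})


# Candidate synonyms for the entity value "produção".
print(accent_variants("produção"))
```

The resulting list could then be pasted in as synonyms for the entity value. Note that the number of variants doubles with each accented character, so this is only practical for short words.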
You shouldn't try to include a reference to an entity directly in your intents. For example, rather than Qual a @(producao) you should just have Qual a produção, along with other examples of the same intent, perhaps with different entities, or different synonyms for the same entity. For example, I might have the following examples for a #directions intent...
- How do I get to the hotel by car?
- Can you give me directions to the hotel by road?
- Which is the nearest station if I travel by train?
- Which bus route will get me to the hotel?
Along with values like car, bus, train, bicycle, etc. for a @transport entity. (Sorry I can't give a Brazilian Portuguese example!) There's no need to explicitly name the entity/entities you're expecting to find in an intent.
And finally, you can use regular expressions in conditions on dialog nodes, for example...
input.text.matches( 'produ[cç][aã]o' )
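You can check which inputs that pattern accepts before putting it in a dialog node. Below is a rough Python sketch, not the service's own code. It assumes that matches tests the whole input (as Java's String.matches does), which the answer above does not state, so treat that as an assumption; the function names are hypothetical.

```python
import re

# Same pattern as in the dialog node condition.
PATTERN = re.compile(r"produ[cç][aã]o")


def matches_whole_input(text: str) -> bool:
    """True only if the entire input is one of the spellings (assumed 'matches' semantics)."""
    return PATTERN.fullmatch(text) is not None


def contains_word(text: str) -> bool:
    """True if one of the spellings appears anywhere in the input."""
    return PATTERN.search(text) is not None


for text in ["produção", "producao", "produçao", "producão", "produção final"]:
    print(text, matches_whole_input(text), contains_word(text))
```

If the whole-input assumption holds, a phrase such as produção final would need a wider condition, for example a pattern wrapped as '.*produ[cç][aã]o.*'.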
As a complement, and for a bit more background: a few days ago IBM Watson Conversation released a new beta feature called Patterns.
With Patterns on entities, you can use regular expressions.
The Patterns field lets you define specific patterns for an entity value. A pattern must be entered as a regular expression in the field.
As in this example, for entity "ContactInfo", the patterns for phone and email values can be defined as follows:
- localPhone: (\d{3})-(\d{4}), e.g. 426-4968
- fullUSphone: (\d{3})-(\d{3})-(\d{4}), e.g. 800-426-4968
- email: \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b, e.g. test@gmail.com
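To get a feel for what these three patterns capture, you can try them locally. This is a small Python sketch using the standard re module, which is not the engine the Conversation service uses, so behavior there may differ; find_literals is a hypothetical helper that returns the matched text per value, roughly what the literal would be.

```python
import re

# The three patterns from the "ContactInfo" example.
CONTACT_PATTERNS = {
    "localPhone": r"(\d{3})-(\d{4})",
    "fullUSphone": r"(\d{3})-(\d{3})-(\d{4})",
    "email": r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",
}


def find_literals(text: str) -> dict[str, str]:
    """Map each value name to the first piece of text its pattern matches."""
    found = {}
    for name, pattern in CONTACT_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            found[name] = match.group(0)
    return found


print(find_literals("my email is test@gmail.com"))
print(find_literals("call 800-426-4968"))
```

In this local test the full US number also satisfies the localPhone pattern (on its last seven digits), so overlapping patterns are worth checking for when you define several values on one entity.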
Often when using pattern entities, it will be necessary to store the text that matches the pattern in a context variable (or action variable), from within your dialog tree.
Imagine a case where you are asking a user for their email address. The dialog node condition will contain a condition similar to @contactInfo:email. In order to assign the user-entered email as a context variable, the following syntax can be used to capture the pattern match within the dialog node's response section:
{
"context" : {
"email": "@contactInfo.literal"
}
}
Note: The pattern matching engine employed by the Conversation service has some syntax limitations, which are necessary to avoid the performance problems that can occur with other regular expression engines. Notably, entity patterns may not contain:
- Positive repetitions (e.g., x*+)
- Backreferences (e.g., \g1)
- Conditional branches (e.g., (?(cond)true))
See more in Defining Entities in the Watson Conversation documentation (step 7 in particular).
You don't need to worry about accents, plurals, or misspelled words. Watson, LUIS, API.AI and so on treat these as features and work on each word. For example:
Cartão de Crédito > Kartão de Crédito > cartao de crebito
All of these work fine!