I'm new to Python and natural language processing, and I'm trying to learn using the NLTK book. I'm doing the exercises at the end of Chapter 2, and there is a question I'm stuck on: "In the discussion of comparative wordlists, we created an object called translate which you could look up using words in both German and Italian in order to get corresponding words in English. What problem might arise with this approach? Can you suggest a way to avoid this problem?"
The book had me use the swadesh corpus to create a 'translator', as follows:
```python
from nltk.corpus import swadesh
fr2en = swadesh.entries(['fr', 'en'])
de2en = swadesh.entries(['de', 'en'])
es2en = swadesh.entries(['es', 'en'])
translate = dict(fr2en)
translate.update(dict(de2en))
translate.update(dict(es2en))
```
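To check my understanding of what this builds, here is a minimal sketch that doesn't need the corpus. The word pairs are hand-written stand-ins I made up for illustration, not actual output of `swadesh.entries`; I'm only assuming that it returns a list of (foreign, English) tuples, which is how the book uses it.

```python
# Hand-written stand-ins for swadesh.entries([...]); not real corpus output.
fr2en = [("chien", "dog"), ("main", "hand")]
de2en = [("Hund", "dog"), ("Hand", "hand")]

# dict() turns the list of pairs into a mapping: foreign word -> English word.
translate = dict(fr2en)

# update() merges in the second mapping; if a key already exists,
# its value is overwritten by the newer one.
translate.update(dict(de2en))

print(translate["chien"])  # dog
print(translate["Hund"])   # dog
print(len(translate))      # 4
```

So as far as I can tell, `translate` is just an ordinary dict whose keys are the foreign words exactly as they are spelled in the word list.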
One problem I saw is that lookups are case-sensitive. When you translate the German word for dog (Hund) to English, only the capitalized form works: `translate['Hund']` returns `'dog'`, while `translate['hund']` raises `KeyError: 'hund'`.
Is there a way to make the translator look up words regardless of case? I've been playing around with it, trying things like `translate.update(dict(de2en.lower))`, to no avail. I feel like I'm missing something obvious. Could anyone help me?
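For reference, this is roughly the behaviour I'm after, as a rough sketch rather than a finished solution. The pairs are made-up stand-ins for the corpus entries, and `lookup` is a hypothetical helper name of my own. My guess is that `de2en.lower` fails because `de2en` is a list of tuples, and `lower()` is a method of individual strings, not of the list.

```python
# Toy stand-in for swadesh.entries(['de', 'en']); not real corpus output.
de2en = [("Hund", "dog"), ("Hand", "hand")]

# A list has no .lower(), so lowercase each key one at a time
# while building the dictionary.
translate = {foreign.lower(): english for foreign, english in de2en}

def lookup(word):
    """Hypothetical helper: lowercase the query so case no longer matters."""
    return translate[word.lower()]

print(lookup("Hund"))  # dog
print(lookup("hund"))  # dog
```

One thing I'm unsure about with this approach: two entries that differ only in capitalization would collapse into a single key, and the later one would win.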
Thanks!