I need to process Arabic words in Python, and I need to link Arabic WordNet with Python so that I can call something like:
wn.synset('جميل')
I found Multilingual Lexicons: AWN - ArabicWN:
http://www.talp.upc.edu/index.php/technology/resources/multilingual-lexicons-and-machine-translation-resources/multilingual-lexicons/72-awn
and I tried to run "A set of basic python functions for accessing the database":
http://nlp.lsi.upc.edu/awn/AWNDatabaseManagement.py.gz
But when I run the code (AWNDatabaseManagement.py), this error occurs:
processing file E:/usuaris/horacio/arabicWN/AWNdatabase/upc_db.xml
file E:/usuaris/horacio/arabicWN/AWNdatabase/upc_db.xml not correct
Traceback (most recent call last):
File "/Users/s/Desktop/arab", line 403, in <module>
wn.compute_index_w()
NameError: global name 'wn' is not defined
Any idea?
AWNDatabaseManagement.py should be given the -i argument, whose value is the path to the Arabic WordNet database. If the argument is not specified, the script uses a default path, E:/usuaris/horacio/arabicWN/AWNdatabase/upc_db.xml. That file does not exist on your machine, which is presumably why the file is reported as "not correct" and wn is never defined.
To resolve that, download the XML database of Arabic WordNet, upc_db.xml. I suggest placing it in the same folder as the script AWNDatabaseManagement.py. Then run the script with -i upc_db.xml. When I ran it that way, I got no errors.
You can also change line 320 so that the default path points to your copy of upc_db.xml, and then run the script without -i.
You can then load it from Python. If that fails, check that you put the XML resource in the right path.
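To rule out a path problem before loading, a small standalone check can help. This is only a sketch and is not part of AWNDatabaseManagement.py: the helper name check_xml_resource is made up for illustration, and it assumes Python 3 and that upc_db.xml is an ordinary XML file that the standard xml.sax parser can read.

```python
import os
import xml.sax


def check_xml_resource(path):
    """Return (ok, message) saying whether `path` exists and is well-formed XML.

    Hypothetical helper for illustration; it does not load Arabic WordNet.
    """
    if not os.path.isfile(path):
        return False, "file not found: " + path
    try:
        # A bare ContentHandler ignores all content, so only parse errors matter.
        xml.sax.parse(path, xml.sax.ContentHandler())
    except xml.sax.SAXParseException as exc:
        return False, "not well-formed XML: " + str(exc)
    return True, "ok"
```

Calling check_xml_resource("upc_db.xml") from the folder that holds the script should return (True, "ok") if the download is intact and sits where the script will look for it.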
Now, to get something like wn.synset('جميل'): Arabic WordNet has a function, wn.get_synsets_from_word(word), but it returns offsets. It also accepts words only as they are vocalized in the database. For example, you should use جَمِيل, not جميل. The offset it returns, 300218842, is the offset of the synset of جميل.
I suggest using the following method instead. First, list the words together with their ids.
Choose your word and pick one of its ids. IDs are written in Buckwalter romanization. Several ids for one word mean that the word has different meanings. Describe the chosen word, and you will have its synsets list. More information about each synset can then be looked up from that list.
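Separately, on the vocalization and Buckwalter points above: because the database stores vocalized forms and romanized ids, it can be handy to match an unvocalized word such as جميل against vocalized entries and to see its Buckwalter spelling. The following is a standalone sketch, assuming Python 3. The function names are hypothetical and are not part of AWNDatabaseManagement.py, the table is the standard Buckwalter scheme, and the ids in the database may use a variant of it.

```python
import unicodedata


def _table(start, symbols):
    """Map consecutive code points beginning at `start` to the given symbols."""
    return {chr(start + i): s for i, s in enumerate(symbols)}


# Standard Buckwalter transliteration (assumption: the database may use a variant).
BUCKWALTER = {}
BUCKWALTER.update(_table(0x0621, "'|>&<}AbptvjHxd*rzs$SDTZEg"))  # hamza .. ghain
BUCKWALTER.update(_table(0x0640, "_fqklmnhwYyFNKaui~o"))         # tatweel .. sukun
BUCKWALTER["\u0670"] = "`"  # dagger alif
BUCKWALTER["\u0671"] = "{"  # alif wasla


def strip_diacritics(word):
    """Drop combining marks (Arabic short vowels, shadda, sukun, tanwin)."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


def to_buckwalter(word):
    """Romanize Arabic text; characters outside the table pass through unchanged."""
    return "".join(BUCKWALTER.get(ch, ch) for ch in word)


def find_vocalized(plain, vocalized_words):
    """Return the vocalized entries whose unvocalized form equals `plain`."""
    return [w for w in vocalized_words if strip_diacritics(w) == plain]
```

Given the word list from the database, find_vocalized('جميل', words) returns the vocalized spellings to pass to wn.get_synsets_from_word, and to_buckwalter on one of those spellings gives a romanized string to compare against the listed ids.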