I have a Java application that should load and print data containing French special characters from a .dbf (dBase III) file, but it doesn't work: the characters don't show up.
I first asked this question thinking that the problem was only related to printing. As you can see in the comments, I have since figured out that the problem is related to the database and not to the printing: when I add a special character to my JTextPane directly, it prints normally. I also tried changing the character set of the JTextPane, but the problem remained.
To complicate the question even more, for those out there who love solving difficult problems: when I open my .dbf file in MS Access, the characters are there. So I'm thinking the error probably happens while loading the data from the database. By the way, to fetch the data I'm using an API called xBaseJ, which doesn't use SQL but its own implementation.
I hope I have given all the necessary details, and I'd really appreciate any help. Any idea could help me figure out the solution (and the problem too).
Edit: Now, thanks to Ethan Furman's answer, we know that the problem is related to the encoding of the database, which is plain old ASCII, and not to the xBaseJ API.
So the question should now be: is it possible to change the encoding of a dBase database, and how can I do it? Thank you @Ethan Furman, and thanks in advance for any help related to this question.
Finally, I found the answer...
First of all, and as mentioned, thanks to Ethan Furman I figured out that the problem was related to the encoding of the dbf database and not to the xBaseJ API.
Then I had to search for hours for a tool that could help me change the charset of the database, which was ASCII. I found out that OpenOffice from Apache does that, but I don't have OpenOffice on my Windows machine. I tried to download it 5 or 6 times, but every time the download was interrupted, since my internet connection is really, really bad (it downloads at 6 to 7 KB/s) and the .exe file is 209 MB. So I had to search even more for other software to do the job. I don't know how I found it, but DBF Commander does more than just change the charset. Anyway, I downloaded the trial version, which does everything but shows a window telling you to buy it every time you do anything :D.
Finally, I changed the charset from ASCII (850 International MS-DOS or something) to 1252 Windows ANSI... aaaaand boom! It works!
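To illustrate why switching the code page from 850 to 1252 can make the accented characters appear, here is a minimal Java sketch that decodes the same raw byte under both charsets. It is only an illustration, not what DBF Commander or xBaseJ do internally; the class and method names are made up for this example, and it assumes the JDK in use ships the `IBM850` charset (full JDKs normally do).

```java
import java.nio.charset.Charset;

class CodePageDemo {
    // Decode one raw byte using the named charset (hypothetical helper for this example).
    static String decode(int rawByte, String charsetName) {
        return new String(new byte[] {(byte) rawByte}, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        int raw = 0xE9; // the byte Windows-1252 uses for 'e' with an acute accent

        // The byte on disk never changes; only the charset used to read it does.
        System.out.println("as windows-1252: " + decode(raw, "windows-1252"));
        System.out.println("as IBM850:       " + decode(raw, "IBM850"));
    }
}
```

If the data was written by a Windows program but the file claims code page 850, every accented letter is decoded as a different character, which matches the kind of symptom described above.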
I still think there's a difference between the terms "code page", "charset" and "encoding", and I'm using them interchangeably. But at least now I know they exist, and that's something new I learned.
Anyway, thank you again Ethan Furman, and I'd also like to thank Google for making this possible :D!
I could be wrong but try setting your database to UTF-8. I'm guessing this problem has to do with character encoding.
dbf files all use encodings, and not utf-8. Which encoding was used is part of the metadata stored in the first few bytes of the file. You are facing one of two scenarios:
The encoding is stored properly in the dbf file
If this is happening then MS Access is properly using that information to decode the raw dbf data into unicode, and xBaseJ is not.
The encoding is not stored properly in the file
If this is happening then MS Access is getting a lucky guess on the encoding, and xBaseJ is refusing to guess.
You need to find a tool that will examine the dbf file and tell you which encoding was stored in it. If you don't know of any, and you don't mind having Python on your machine, you can use a dbf module I wrote to figure it out; it will print out the encoding, number of fields, size of a record, field names, etc.
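If installing Python is not an option, the stored code page can also be checked with a few lines of plain Java. The sketch below is not part of the dbf module or of xBaseJ. It assumes the common dBase IV / FoxPro header layout, where the language driver ID (code page mark) sits at byte offset 29; older dBase III files often leave that byte at 0. The class name, method names and the deliberately partial ID table are illustrative only.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class DbfCodePage {
    // Assumed offset of the language driver ID in the dbf header.
    static final int LANGUAGE_DRIVER_OFFSET = 29;

    // Partial mapping of language driver IDs; many more values exist.
    static String describe(int id) {
        switch (id) {
            case 0x00: return "not set";
            case 0x01: return "cp437 (US MS-DOS)";
            case 0x02: return "cp850 (International MS-DOS)";
            case 0x03: return "cp1252 (Windows ANSI)";
            default:   return String.format("not covered by this sketch (0x%02X)", id);
        }
    }

    // Read the single header byte that records the code page, without touching the data.
    static int readLanguageDriver(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            file.seek(LANGUAGE_DRIVER_OFFSET);
            return file.readUnsignedByte();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DbfCodePage.java <file.dbf>");
            return;
        }
        int id = readLanguageDriver(args[0]);
        System.out.println("language driver: " + describe(id));
    }
}
```

A result of "not set" would point to the second scenario above (the reader has to guess), while a value that does not match how the data was actually written would explain wrongly decoded characters.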
Note on installing (which can be such a pain)
Ideally, you should be able to install pip, and then do a
pip install enum34 dbf --upgrade
which will put the latest versions of those two libraries in the correct spot on your system.
Failing that, you'll want to grab both enum34 and dbf from PyPI and put enum.py and dbf.py in your Python's site-packages folder.
Update
If, after doing all that, you discover that the codepage/encoding was never set in the file (it's amazing how often this happens), then you can also use dbf to change it (if you know what it should be).

You can try this library: xbase4j. As I learned, in many DBF files the "language" flag is set incorrectly or is not set at all. To solve this problem, specify the proper language before opening the DBF file.
Feel free to contact me if you need some help.