I want to convert all the .odt, .doc, .xls, and .pdf files to .txt files.
I want to convert these files to text files using a shell script or a Perl script.
For Word documents, you can try antiword, at least on Linux. It's a command-line utility that takes a Word document as an argument and writes the text from that document (as best as it can figure out) to standard output. Maybe you can specify an output file too. I can't remember the details of how it works, since I haven't used it in a while. I'm not sure if it can handle OpenOffice documents.
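As a rough sketch of how that could be scripted, the snippet below loops over the .doc files in a directory and redirects antiword's standard output into a matching .txt file. It assumes antiword is installed and on the PATH. The helper names `txt_name` and `convert_docs` are made up for illustration, and it only covers .doc files, not the other formats.

```shell
#!/bin/sh
# Sketch only: batch-convert .doc files to .txt using antiword.
# Assumption: antiword is installed; it prints extracted text to stdout,
# so each run is redirected into a file.

# txt_name (hypothetical helper): swap the last extension for .txt,
# e.g. report.doc -> report.txt
txt_name() {
    printf '%s.txt\n' "${1%.*}"
}

# convert_docs (hypothetical helper): convert every .doc file in the
# directory given as $1 (defaults to the current directory).
convert_docs() {
    dir=${1:-.}
    if ! command -v antiword >/dev/null 2>&1; then
        echo "antiword not found; nothing converted" >&2
        return 1
    fi
    for f in "$dir"/*.doc; do
        [ -e "$f" ] || continue    # the glob matched no files
        antiword "$f" > "$(txt_name "$f")"
    done
}
```

To try it, source the snippet and run something like `convert_docs ~/Documents`. Each .txt file lands next to its .doc source and overwrites any existing file of the same name.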