Is there a programmatic solution to this that does not involve having Office on the server?
Update: This solution will be deployed in a .NET shop, so for now PHP and Java approaches aren't on the table (though I was impressed with the libraries themselves).
We will be receiving documents in .csv, .xls, and .xlsx formats that need to be parsed and their data shoved into a DB. We're planning on using the OpenXML SDK for all of the parsing goodness and want to operate over only one file type.
You can achieve this using the Apache POI library for Java.
I've used it to read in a complete mix of .xls and .xlsx files, and I always output .xlsx. For .csv files, import using the Super CSV library and export using the Apache POI library above.
For csv files I would recommend a combination of http://kbcsv.codeplex.com/ to read the csv file into a DataTable and EPPlus, using its LoadFromDataTable method, to convert it to an xlsx file. It works great for me and is very fast. For reading xls files there is no free implementation that I know of :(
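To make the csv-to-xlsx step less of a black box, here is a rough, language-neutral sketch of what the libraries mentioned above do under the hood. It is written in Python with only the standard library, so it is not a .NET answer and not a replacement for EPPlus or POI. The names csv_to_xlsx_bytes and column_letters are made up for this illustration. It assumes a single sheet, stores every value as an inline string (no number or date typing), and does not filter characters that are illegal in XML.

```python
import csv
import io
import zipfile
from xml.sax.saxutils import escape

XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
DOC_REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
SHEET_CT = "application/vnd.openxmlformats-officedocument.spreadsheetml"

# The fixed package parts that every minimal .xlsx needs.
CONTENT_TYPES = (
    XML_DECL
    + '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    + '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    + '<Default Extension="xml" ContentType="application/xml"/>'
    + '<Override PartName="/xl/workbook.xml" ContentType="%s.sheet.main+xml"/>' % SHEET_CT
    + '<Override PartName="/xl/worksheets/sheet1.xml" ContentType="%s.worksheet+xml"/>' % SHEET_CT
    + "</Types>"
)
ROOT_RELS = (
    XML_DECL
    + '<Relationships xmlns="%s">' % PKG_REL_NS
    + '<Relationship Id="rId1" Type="%s/officeDocument" Target="xl/workbook.xml"/>' % DOC_REL_NS
    + "</Relationships>"
)
WORKBOOK = (
    XML_DECL
    + '<workbook xmlns="%s" xmlns:r="%s">' % (MAIN_NS, DOC_REL_NS)
    + '<sheets><sheet name="Sheet1" sheetId="1" r:id="rId1"/></sheets>'
    + "</workbook>"
)
WORKBOOK_RELS = (
    XML_DECL
    + '<Relationships xmlns="%s">' % PKG_REL_NS
    + '<Relationship Id="rId1" Type="%s/worksheet" Target="worksheets/sheet1.xml"/>' % DOC_REL_NS
    + "</Relationships>"
)


def column_letters(index):
    """Convert a zero-based column index to spreadsheet letters (0 is A, 26 is AA)."""
    letters = ""
    index += 1
    while index:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def csv_to_xlsx_bytes(csv_text):
    """Parse CSV text and return the bytes of a single-sheet .xlsx package."""
    rows = csv.reader(io.StringIO(csv_text))
    parts = [XML_DECL, '<worksheet xmlns="%s"><sheetData>' % MAIN_NS]
    for row_number, row in enumerate(rows, start=1):
        cells = "".join(
            '<c r="%s%d" t="inlineStr"><is><t xml:space="preserve">%s</t></is></c>'
            % (column_letters(col), row_number, escape(value))
            for col, value in enumerate(row)
        )
        parts.append('<row r="%d">%s</row>' % (row_number, cells))
    parts.append("</sheetData></worksheet>")

    # An .xlsx file is a zip archive containing these XML parts.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("xl/workbook.xml", WORKBOOK)
        package.writestr("xl/_rels/workbook.xml.rels", WORKBOOK_RELS)
        package.writestr("xl/worksheets/sheet1.xml", "".join(parts))
    return buffer.getvalue()
```

Writing the returned bytes to a file with an .xlsx extension should give you something a spreadsheet reader can open, but for real workloads a maintained library is the safer route since it handles typing, styles, and edge cases this sketch ignores.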
Or use PHPExcel ( http://www.phpexcel.net ) if you want a PHP solution rather than Java.
You can also use it to parse columns.
You can use the method below for .csv, .xlsx, and .txt files.