My current workflow involves using AppleScript to essentially delimit Excel data and format it into plain text files. We're pushing towards an all-Swift environment, but I haven't yet found any sort of kits for parsing my Excel data into Swift.
The only thing I can think of is to use C or something and wrap it, but that's not ideal. Any better suggestions for parsing this data for use in Swift?
The goal is to eliminate AppleScript, but I'm not sure if that will be possible while still interacting with Excel files. Scripting Excel via AppleScript seems to be the only method.
EDIT: I don't have the option of eliminating Excel from this workflow. This is how the data will be coming to the application, thus I have to include it.
Being able to streamline the process of parsing this data and then processing it will be paramount. I know AppleScript has been good in the past with helping me to process it; however, it's getting a little too closed-off for me.
I've been looking at writing something in Swift/Cocoa, but that still might require the data to be extracted with an AppleScript, right?
A big plus for pushing Swift is the readability. I don't know Objective-C all that well, and Swift would be an easier transition, I feel.
My workflow on PC has been using the COM object, which as has been said, isn't available in the Mac Excel app. I'm only looking for data extraction at this point. Some previous apps did processing within the app, but I'm looking to make this very self-contained, thus all processing within the app I'm developing. Once the data is extracted from the .XLS or .XLSX files, I'll be doing some text editing via RegEx and perhaps a little number crunching. Nothing too crazy. As of now, it will run on the client side, but I'm looking to extend this to a server process.
In Mac OS X 10.6 Snow Leopard, Apple introduced the AppleScriptObjC framework, which makes it very easy to interact between Cocoa and AppleScript. AppleScript code and an Objective-C-like syntax can be used in the same source file. It's much more convenient than Scripting Bridge and NSAppleScript.
AppleScriptObjC cannot be used directly in Swift because the command loadAppleScriptObjectiveCScripts of NSBundle is not bridged to Swift.
However, you can use an Objective-C bridge class, for example
ASObjC.h
ASObjC.m
Create an AppleScript source file from the AppleScriptObjC template
ASExcel.applescript
Link to the AppleScriptObjC framework if necessary.
Create the Bridging Header and import
ASObjC.h
Then you can call AppleScriptObjC from Swift with
or
There is no need to export Excel files to CSV for Swift, as you can use an existing open-source library for parsing XLSX files. If you use CocoaPods or Swift Package Manager for integrating 3rd-party libraries, CoreXLSX supports those. After the library is integrated, you can use it like this:
This will open file.xlsx and print all cells within that file. You can also filter cells by references and access only the cell data that you need for your automation.
It's somewhat unclear if you're trying to eliminate Excel as a dependency (which is not unreasonable: it costs money and not everyone has it) or AppleScript as a language (totally understandable, but a bad practical move as Apple's alternatives for application automation all suck).
There are third-party Excel-parsing libraries available for other languages, e.g. I've used Python's openpyxl (for .xlsx files) and xlrd (for .xls) libraries successfully in my own projects. And I see through the magicks of Googles that someone's written an ObjC framework, DHlibxls, which [assuming no dynamic trickery] should be usable directly from Swift, but I've not used it myself so can't tell you anything more.
1. Export to plaintext CSV
If all you're trying to do is extract data from Excel to use elsewhere, as opposed to capturing Excel formulas and formatting, then you probably should not try to read the .xls file. XLS is a complex format. It's good for Excel, not for general data interchange.
Similarly, you probably don't need to use AppleScript or anything else to integrate with Excel, if all you want to do is save the data as plaintext. Excel already knows how to save data as plaintext. Just use Excel's "Save As" command. (That's what it's called on the Mac. I don't know about PCs.)
The question is then what plaintext format to use. One obvious choice for this is a plaintext comma-separated value file (CSV) because it's a simple de facto standard (as opposed to a complex official standard like XML). This will make it easy to consume in Swift, or in any other language.
2. Export in UTF-8 encoding if possible, otherwise as UTF-16
So how do you do that exactly? Plaintext is wonderfully simple, but one subtlety that you need to keep track of is the text encoding. A text encoding is a way of representing characters in a plaintext file. Unfortunately, you cannot reliably tell the encoding of a file just by inspecting the file, so you need to choose an encoding when you save it and remember to use that encoding when you read it. If you mess this up, accented characters, typographer's quotation marks, dashes, and other non-ASCII characters will get mangled. So what text encoding should you use? The short answer is, you should always use UTF-8 if possible.
But if you're working with an older version of Excel, then you may not be able to use UTF-8. In that case, you should use UTF-16. In particular, UTF-16 is, I believe, the only export option in Excel 2011 for Mac which produces a predictable result which will not depend in surprising ways on obscure locale settings or Microsoft-specific encodings.
So if you're on Excel 2011 for Mac, for instance, choose "UTF-16 Unicode Text" from Excel's Save As command.
This will cause Excel to save the file so that every row is a line of text, and every column is separated by a tab character. (So technically, this is a tab-separated value file, rather than a comma-separated value file.)
3. Import with Swift
Now you have a plaintext file, which you know was saved in a UTF-8 (or UTF-16) encoding. So now you can read it and parse it in Swift.
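As a minimal sketch of the reading step, the file can be loaded with the encoding stated explicitly instead of relying on auto-detection. The function name readExportedText is hypothetical, made up for illustration; String(contentsOf:encoding:) is the Foundation initializer doing the work.

```swift
import Foundation

/// Reads an exported text file using an explicitly chosen encoding.
/// `readExportedText` is a hypothetical helper name used for illustration.
/// Throws if the file cannot be read or is not valid in that encoding.
func readExportedText(at url: URL, encoding: String.Encoding) throws -> String {
    // Passing the encoding explicitly avoids guessing, which is what
    // mangles accented characters and typographer's quotes.
    return try String(contentsOf: url, encoding: encoding)
}
```

For a file saved as "UTF-16 Unicode Text" you would pass .utf16; for a UTF-8 export, .utf8.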
If your Excel data is complicated, you may need a full-featured CSV parser. The best choice is probably CHCSVParser.
Using CHCSV, you can parse the file with the following code:
(You could also call it from Swift, of course.)
On the other hand, if your data is relatively simple (for instance, it has no escaped characters), then you might not need to use an external library at all. You can write some Swift code that parses tab-separated values just by reading in the file as a string, splitting on newlines, and then splitting on tabs.
This function will take a String representing TSV data and return an array of dictionaries:
So you only need to read the file into a string and pass it to this function. That snippet comes from this gist for a tsv-to-json converter. And if you need to know more about which text encodings Microsoft products produce, and which ones Cocoa can auto-detect, then this repo on text encoding contains the research on export specimens which led to the conclusion that UTF-16 is the way to go for old Microsoft products on the Mac.
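For illustration, here is a rough sketch of a function of that shape. It is not the code from the gist. The name parseTSV is hypothetical, and the sketch assumes that the first line is a header row and that no field contains a quoted or escaped tab or newline.

```swift
import Foundation

/// Parses tab-separated text whose first line is a header row.
/// Returns one dictionary per data row, keyed by header name.
/// Assumption: no field contains a quoted or escaped tab or newline.
func parseTSV(_ text: String) -> [[String: String]] {
    // Normalize CRLF and bare CR line endings to LF before splitting.
    let normalized = text
        .replacingOccurrences(of: "\r\n", with: "\n")
        .replacingOccurrences(of: "\r", with: "\n")
    // Split on newlines; blank lines (such as a trailing one) are dropped.
    let lines = normalized.split(separator: "\n", omittingEmptySubsequences: true)
    guard let headerLine = lines.first else { return [] }
    // Keep empty fields so that columns stay aligned with the headers.
    let headers = headerLine
        .split(separator: "\t", omittingEmptySubsequences: false)
        .map(String.init)
    return lines.dropFirst().map { line in
        let fields = line
            .split(separator: "\t", omittingEmptySubsequences: false)
            .map(String.init)
        var row: [String: String] = [:]
        for (index, header) in headers.enumerated() {
            // Rows shorter than the header get empty strings for the missing columns.
            row[header] = index < fields.count ? fields[index] : ""
        }
        return row
    }
}
```

If two columns share a header name, the later column overwrites the earlier one in each dictionary, so data with duplicate headers would need an array-of-arrays result instead.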
(I realize I'm linking to my own repos here. Apologies?)
You can use ScriptingBridge or NSAppleScript to interact with Apple Scriptable stuff
ScriptingBridge can generate a header file from the Apple Script dictionary.
NSAppleScript can execute any AppleScript for you by passing a
String