Before I start, note that I'm using the Linux shell (via subprocess.call() from Python), and I am using OpenFst.
I've been sifting through documents and questions about OpenFst, but I cannot seem to find an answer to this question: how does one actually give input to an FST that has been defined, compiled and composed with OpenFst? Where does the output go? Do I simply execute 'fstproject'? If so, how would I, say, give it a string to transduce, and print the various transductions when the end state(s) have been reached?
I apologize if this question seems obvious. I'm not very familiar with OpenFst as of yet.
The example from Paul Dixon is great. As the OP uses Python, I thought I'd add a quick example of how you can "run" transducers with OpenFst's Python wrapper. It's a shame that OpenFst offers no ready-made way to create "linear chain automata", but it's simple to automate as seen below:
Let's define a simple transducer that uppercases the letter "a":
Now we can simply apply the transducer using:
Output:
To see how to use it with an acceptor, look at my other answer.
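For readers without the wrapper installed, here is a library-free sketch of what "applying" a transducer to a string means. It is plain Python, not the OpenFst Python API: the `ARCS` table, the two-letter alphabet and the `apply_transducer` helper are all hypothetical names chosen for illustration. It walks the input left to right, which is what composing with a linear chain automaton achieves.

```python
# Library-free illustration; this is NOT the OpenFst Python API.
# Arcs of a one-state transducer:
#   (state, input symbol) -> list of (output symbol, next state)
ARCS = {
    (0, "a"): [("A", 0)],  # uppercase "a"
    (0, "b"): [("b", 0)],  # pass "b" through unchanged
}
FINAL_STATES = {0}


def apply_transducer(text, arcs=ARCS, finals=FINAL_STATES, start=0):
    """Return every output string the transducer produces for `text`."""
    results = []
    # Each stack entry: (position in the input, current state, output so far).
    stack = [(0, start, "")]
    while stack:
        pos, state, out = stack.pop()
        if pos == len(text):
            # Input consumed: the path counts only if it ends in a final state.
            if state in finals:
                results.append(out)
            continue
        for out_sym, nxt in arcs.get((state, text[pos]), []):
            stack.append((pos + 1, nxt, out + out_sym))
    return results


print(apply_transducer("abba"))  # ['AbbA']
print(apply_transducer("abc"))   # [] because "c" has no arc in this toy alphabet
```

An empty result corresponds to an empty composition: the input string is simply not accepted by the machine.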
One way is to create a machine that performs the transformation. A very simple example would be to uppercase a string.
M.wfst
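As a rough sketch of what such a file could contain, the Python snippet below generates the text-format description of a one-state uppercasing transducer. The three-letter alphabet and the helper name `uppercase_fst_text` are assumptions made for illustration; each arc line follows the OpenFst text layout "source destination input output", and a line holding only a state number marks that state as final.

```python
def uppercase_fst_text(alphabet="abc"):
    """Text-format transducer with one state that maps each letter to uppercase.

    The alphabet is an assumption for this example.
    """
    lines = [f"0 0 {ch} {ch.upper()}" for ch in alphabet]
    lines.append("0")  # a lone state id marks state 0 as final
    return "\n".join(lines) + "\n"


print(uppercase_fst_text(), end="")
# 0 0 a A
# 0 0 b B
# 0 0 c C
# 0
```

Writing the returned string to M.wfst gives a file that fstcompile can read, provided the symbols file lists every symbol used.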
The accompanying symbols file contains a line for each symbol of the alphabet. Note that 0 is reserved for null (epsilon) transitions and has special meaning in many of the operations.
M.syms
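A symbols file of this kind can be produced with a few lines of Python. This is only a sketch: the alphabet a, b, c and the epsilon name `<eps>` are assumptions (any name works as long as it is mapped to 0), and `symbols_text` is a hypothetical helper. Each line pairs a symbol with its integer id.

```python
def symbols_text(alphabet="abc", epsilon="<eps>"):
    """Symbol table text: one 'symbol id' pair per line, epsilon first at id 0."""
    symbols = [epsilon] + list(alphabet) + [ch.upper() for ch in alphabet]
    return "".join(f"{sym} {idx}\n" for idx, sym in enumerate(symbols))


print(symbols_text(), end="")
# <eps> 0
# a 1
# b 2
# c 3
# A 4
# B 5
# C 6
```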
Then compile the machine
For an input string "abc", create a linear chain automaton: a left-to-right chain with an arc for each character. This is an acceptor, so we only need a column for the input symbols.
I.wfst
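Since this file has to be rebuilt for every input string, it is convenient to generate it. The sketch below does that in Python; `linear_chain_text` is a hypothetical helper name, and it assumes every element of the input is a symbol listed in the symbols file.

```python
def linear_chain_text(symbols):
    """Text-format acceptor: state i --symbol--> state i+1, last state final."""
    lines = [f"{i} {i + 1} {sym}" for i, sym in enumerate(symbols)]
    lines.append(str(len(symbols)))  # the state after the last arc is final
    return "\n".join(lines) + "\n"


print(linear_chain_text("abc"), end="")
# 0 1 a
# 1 2 b
# 2 3 c
# 3
```

Passing a list instead of a string (for example `["hello", "world"]`) works the same way when the symbols are whole words rather than characters.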
Compile as an acceptor
Then compose the machines and print
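Because the question mentions driving the shell from Python, here is a hedged sketch of the compile, compose and print steps using the subprocess module. It assumes the OpenFst command-line tools are on the PATH and that M.wfst, I.wfst and M.syms exist in the working directory; the helper names are hypothetical. Only flags already used in this thread appear (--isymbols, --osymbols, --acceptor).

```python
import subprocess


def compile_cmd(text_path, binary_path, symbols_path, acceptor=False):
    """Build the fstcompile argument list for a text-format machine."""
    cmd = ["fstcompile", f"--isymbols={symbols_path}"]
    if acceptor:
        cmd.append("--acceptor")  # acceptors have no output-label column
    else:
        cmd.append(f"--osymbols={symbols_path}")
    return cmd + [text_path, binary_path]


def print_cmd(symbols_path):
    """Build the fstprint argument list; with no file argument it reads stdin."""
    return ["fstprint", f"--isymbols={symbols_path}", f"--osymbols={symbols_path}"]


def transduce(machine_txt="M.wfst", input_txt="I.wfst", symbols="M.syms"):
    """Compile both machines, compose input with machine, return printed text."""
    subprocess.run(compile_cmd(machine_txt, "M.ofst", symbols), check=True)
    subprocess.run(compile_cmd(input_txt, "I.ofst", symbols, acceptor=True), check=True)
    # The input acceptor goes on the left so its symbols feed the machine's input side.
    composed = subprocess.run(
        ["fstcompose", "I.ofst", "M.ofst"], stdout=subprocess.PIPE, check=True
    )
    printed = subprocess.run(
        print_cmd(symbols), input=composed.stdout, stdout=subprocess.PIPE, check=True
    )
    return printed.stdout.decode()
```

Calling `transduce()` returns the same text that the shell pipeline prints, ready to be parsed in Python.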
This will give the output
The output of fstcompose is a lattice of all transductions of the input string (in this case there is only one). If M.ofst is more complicated, fstshortestpath can be used to extract the n best strings using the flags --unique --nshortest=n. This output is again a transducer; you could either scrape the output of fstprint, or use C++ code and the OpenFst library to run a depth-first search to extract the strings.
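The scraping route can be sketched in Python as follows. This is an illustration under stated assumptions, not a robust parser: it expects transducer-format fstprint text printed with symbol tables (arc lines "source destination input output [weight]", final-state lines "state [weight]"), takes the source state of the first line as the start state, assumes the lattice is acyclic, and treats `<eps>` and `<epsilon>` as epsilon names. The function names are hypothetical.

```python
def parse_fstprint(text):
    """Parse transducer-format fstprint text into (arcs, finals)."""
    arcs, finals = {}, set()
    for line in text.splitlines():
        cols = line.split()
        if not cols:
            continue
        if len(cols) >= 4:  # arc: src dst ilabel olabel [weight]
            src, dst, _ilabel, olabel = cols[:4]
            arcs.setdefault(src, []).append((dst, olabel))
        else:  # final state: state [weight]
            finals.add(cols[0])
    return arcs, finals


def output_strings(text, epsilons=("<eps>", "<epsilon>")):
    """Depth-first search over the lattice, collecting output-label strings."""
    tokens = text.split()
    if not tokens:
        return []
    arcs, finals = parse_fstprint(text)
    results = []
    stack = [(tokens[0], [])]  # the first line's source state is the start state
    while stack:
        state, out = stack.pop()
        if state in finals:
            results.append("".join(out))
        for dst, olabel in arcs.get(state, []):
            # Epsilon output labels contribute nothing to the string.
            nxt = out if olabel in epsilons else out + [olabel]
            stack.append((dst, nxt))
    return results


lattice = "0\t1\ta\tA\n1\t2\tb\tB\n2\t3\tc\tC\n3\n"
print(output_strings(lattice))  # ['ABC']
```

Weights are ignored here; if ranking matters, running fstshortestpath before printing keeps the search small.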
Inserting fstproject --project_output into the pipeline before fstprint converts the output to an acceptor containing only the output labels.
Gives the following
This is an acceptor because the input and output labels are the same; the --acceptor option of fstprint can be used to generate more succinct output.
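To make the effect of these two steps concrete without running OpenFst, the sketch below mimics them on the printed text itself. It is only an imitation for illustration: `project_output_text` is a hypothetical helper, it assumes transducer-format lines as input, and the real tools operate on the binary FST rather than on text.

```python
def project_output_text(printed, acceptor=False):
    """Mimic output projection on fstprint text.

    Each arc keeps only its output label; with acceptor=True the label is
    printed once, as acceptor-style printing would do.
    """
    lines = []
    for line in printed.splitlines():
        cols = line.split()
        if len(cols) >= 4:  # arc: src dst ilabel olabel [weight]
            src, dst, _ilabel, olabel = cols[:4]
            labels = [olabel] if acceptor else [olabel, olabel]
            lines.append("\t".join([src, dst] + labels + cols[4:]))
        elif cols:  # final state line, copied through
            lines.append("\t".join(cols))
    return "\n".join(lines) + "\n"


composed = "0\t1\ta\tA\n1\t2\tb\tB\n2\t3\tc\tC\n3\n"
print(project_output_text(composed), end="")
print(project_output_text(composed, acceptor=True), end="")
```

The first call repeats each output label in both columns; the second prints a single label column per arc.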