I am trying to create a generic formatter/parser combination.
Example scenario:
- I have a string for string.Format(), e.g.
var format = "{0}-{1}"
- I have an array of objects (strings) for the input, e.g.
var arr = new[] { "asdf", "qwer" }
- I am formatting the array using the format string, e.g.
var res = string.Format(format, arr)
What I am trying to do is convert the formatted string back into the array of objects (strings). Something like (pseudocode):
var arr2 = string.Unformat(format, res)
// when: res = "asdf-qwer"
// arr2 should be equal to arr
Does anyone have experience doing something like this? I'm thinking about using regular expressions (modify the original format string, then pass it to Regex.Matches to get the array) and running that for each placeholder in the format string. Is this feasible, or is there a more efficient solution?
It's simply not possible in the generic case. Some information will be "lost" (string boundaries) in the Format method. Assume:
How would you "Unformat" it?
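A minimal sketch of the kind of ambiguity meant here. The inputs and the AmbiguityDemo class are hypothetical, chosen only to show two different argument arrays collapsing into the same output:

```csharp
using System;

public static class AmbiguityDemo
{
    // Formats two different argument pairs with the same format string.
    // Both calls produce "a-b-c", so the output alone cannot tell them apart.
    public static (string First, string Second) FormatBoth()
    {
        const string format = "{0}-{1}";
        string first = string.Format(format, "a-b", "c");
        string second = string.Format(format, "a", "b-c");
        return (first, second);
    }
}
```

Given only "a-b-c" and "{0}-{1}", an Unformat method has no way to know which of the two arrays it should return.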
Assuming "-" is not in the original strings, can you not just use Split?
Note that this only applies to the presented example with an assumption. Any reverse algorithm is dependent on the kind of formatting employed; an inverse operation may not even be possible, as noted by the other answers.
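For the example in the question, the Split approach could look roughly like this. SplitUnformat is a hypothetical helper name, and the sketch relies on the stated assumption that the separator never appears inside the original values:

```csharp
using System;

public static class SplitUnformat
{
    // Assumption: 'separator' never occurs inside any of the original values.
    public static string[] Unformat(string formatted, string separator)
    {
        return formatted.Split(new[] { separator }, StringSplitOptions.None);
    }
}
```

Calling SplitUnformat.Unformat("asdf-qwer", "-") gives back the two original strings. It only covers formats that are a plain join on one separator.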
While the comments about lost information are valid, sometimes you just want to get the string values out of a string with known formatting.
One method is this blog post written by a friend of mine. He implemented an extension method called string[] ParseExact(), akin to DateTime.ParseExact(). Data is returned as an array of strings, but if you can live with that, it is terribly handy.
You can't unformat because information is lost.
String.Format is a "destructive" algorithm, which means you can't (always) go back.
Create a new class inheriting from string, where you add a member that keeps track of the "{0}-{1}" and the { "asdf", "qwer" }, override ToString(), and modify your code a little.
If it becomes too tricky, just create the same class, but not inheriting from string, and modify your code a little more.
IMO, that's the best way to do this.
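A rough sketch of the second variant (a class that does not inherit from string). System.String is sealed in .NET, so inheriting from it will not compile and only the wrapper route is available. The FormattedValue name and its members are made up for illustration:

```csharp
using System;

// Keeps the format string and the arguments together, so nothing has to be
// recovered from the formatted text afterwards.
public sealed class FormattedValue
{
    public string Format { get; }
    public object[] Args { get; }

    public FormattedValue(string format, params object[] args)
    {
        Format = format;
        Args = args;
    }

    // The formatted text is produced on demand from the stored parts.
    public override string ToString() => string.Format(Format, Args);
}
```

With this, new FormattedValue("{0}-{1}", "asdf", "qwer") prints as "asdf-qwer" while Args still holds the original array. It only helps when you control the code that does the formatting.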
A simple solution might be to turn format into a regular expression, replacing each placeholder with a non-greedy capture group.
This would resolve the ambiguities to the shortest possible match.
(I'm not good at RegEx, so please correct me, folks :))
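One way the regex idea could be sketched is shown below. RegexUnformat is a hypothetical helper, and it assumes the format string only contains plain {n} placeholders: no alignment or format specifiers such as {0,10} or {0:N2}, and no escaped braces. A placeholder index that appears twice is matched with a backreference.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexUnformat
{
    // Assumption: only simple placeholders like {0}, {1} are supported.
    private static readonly Regex Placeholder = new Regex(@"\{(\d+)\}");

    // Returns the recovered values indexed by placeholder number,
    // or null when the input does not fit the format.
    public static string[] Unformat(string format, string input)
    {
        var pattern = new StringBuilder(@"\A");
        var seen = new HashSet<int>();
        int last = 0, maxIndex = -1;

        foreach (Match m in Placeholder.Matches(format))
        {
            // Literal text between placeholders must match exactly.
            pattern.Append(Regex.Escape(format.Substring(last, m.Index - last)));
            int index = int.Parse(m.Groups[1].Value);
            // First use: non-greedy named group. Repeated use: backreference.
            pattern.Append(seen.Add(index) ? $"(?<g{index}>.*?)" : $@"\k<g{index}>");
            maxIndex = Math.Max(maxIndex, index);
            last = m.Index + m.Length;
        }
        pattern.Append(Regex.Escape(format.Substring(last))).Append(@"\z");

        Match match = Regex.Match(input, pattern.ToString(), RegexOptions.Singleline);
        if (!match.Success) return null;

        var result = new string[maxIndex + 1];
        for (int i = 0; i <= maxIndex; i++)
        {
            Group g = match.Groups["g" + i];
            result[i] = g.Success ? g.Value : null; // null for indexes the format skips
        }
        return result;
    }
}
```

RegexUnformat.Unformat("{0}-{1}", "asdf-qwer") returns { "asdf", "qwer" }. For an ambiguous input such as "a-b-c" it returns { "a", "b-c" }, which is the shortest match for the first placeholder and not necessarily what was originally passed in.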
After formatting, you can put the resulting string and the array of objects into a dictionary with the string as key:
and in the Unformat method, you can simply pass in a string, look it up, and return the array that was used:
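A minimal sketch of both steps together, assuming a static in-memory dictionary. FormatRegistry and its member names are hypothetical:

```csharp
using System;
using System.Collections.Generic;

public static class FormatRegistry
{
    // Maps each formatted string to the arguments that produced it.
    private static readonly Dictionary<string, object[]> Seen =
        new Dictionary<string, object[]>();

    public static string Format(string format, params object[] args)
    {
        string result = string.Format(format, args);
        Seen[result] = args; // a later identical result overwrites the earlier entry
        return result;
    }

    // Returns the stored arguments, or null if the string was never formatted here.
    public static object[] Unformat(string formatted)
    {
        return Seen.TryGetValue(formatted, out var args) ? args : null;
    }
}
```

This only works for strings produced through FormatRegistry.Format in the same process. When two different argument arrays format to the same text, the last one wins, and the dictionary grows for as long as entries are never removed.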