I have a string coming from a UI that contains control characters such as line feeds and carriage returns.
I would like to do something like this:
String input = uiString.replaceAll(<regex for all control characters>, "");
Surely this has been done before!?
Something like this should do the trick:
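A minimal sketch of that idea, assuming the goal is to strip the ASCII control range U+0000 to U+001F plus DEL (U+007F). The class and method names here are hypothetical, chosen only for illustration:

```java
final class AsciiControlStripper {
    // Assumed range: U+0000..U+001F plus DEL (U+007F). Adjust the character
    // class if some control characters (e.g. tab or newline) should be kept.
    static String strip(String input) {
        return input.replaceAll("[\\x00-\\x1F\\x7F]", "");
    }

    public static void main(String[] args) {
        // Carriage return, line feed and tab are all removed.
        System.out.println(strip("line one\r\nline two\ttabbed"));
    }
}
```

Note that this also removes tabs and newlines, which may or may not be what you want for text coming from a UI.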
Using Guava, probably more efficient than using the full regex engine, and certainly more readable...
Alternately, just using regexes, albeit not quite as readably or efficiently...
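As a rough sketch of the trade-off between the two approaches using only the JDK: the regex version below precompiles the pattern so it is not re-parsed on every call, and the loop version skips the regex engine entirely by testing each char with Character.isISOControl (which, as far as I know, is the same predicate Guava's ISO-control matcher is documented in terms of). The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class IsoControlStripper {
    // Compiled once; without the (?U) flag, \p{Cntrl} covers only the
    // ASCII control characters [\x00-\x1F\x7F].
    private static final Pattern CONTROL = Pattern.compile("\\p{Cntrl}");

    static String stripWithRegex(String input) {
        return CONTROL.matcher(input).replaceAll("");
    }

    // Regex-free variant. Character.isISOControl is true for
    // U+0000..U+001F and U+007F..U+009F, so it also drops the C1 controls.
    static String stripWithLoop(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!Character.isISOControl(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

The two methods are not exactly equivalent: the loop also removes characters in U+0080 to U+009F, which the plain regex leaves alone.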
To remove just ASCII control characters, use the Cntrl character class, \p{Cntrl}.
To remove all 65 of the characters that Unicode refers to as "control characters", use the Cntrl character class in UNICODE_CHARACTER_CLASS mode, with the (?U) flag: (?U)\p{Cntrl}.
To additionally remove Unicode "format" characters (things like the control characters for making text go right-to-left, or the soft hyphen), also nuke the Cf character class: (?U)\p{Cntrl}|\p{Cf}.
The Guava CharMatcher.JAVA_ISO_CONTROL is deprecated; use javaIsoControl() instead, for example CharMatcher.javaIsoControl().removeFrom(string).
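The three regex variants described above (ASCII-only Cntrl, Cntrl under the (?U) flag, and Cntrl plus Cf) can be compared side by side. This is only a sketch: the class name, constant names and sample string are made up for illustration.

```java
import java.util.regex.Pattern;

final class UnicodeControlStripper {
    // ASCII control characters only: [\x00-\x1F\x7F].
    static final Pattern ASCII_CONTROL = Pattern.compile("\\p{Cntrl}");

    // (?U) enables UNICODE_CHARACTER_CLASS, so Cntrl means general category Cc
    // (the 65 Unicode control characters, including U+0080..U+009F).
    static final Pattern UNICODE_CONTROL = Pattern.compile("(?U)\\p{Cntrl}");

    // Additionally matches general category Cf ("format" characters).
    static final Pattern CONTROL_OR_FORMAT = Pattern.compile("(?U)\\p{Cntrl}|\\p{Cf}");

    static String strip(Pattern pattern, String input) {
        return pattern.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // tab (ASCII control), U+0085 (C1 control), U+00AD (soft hyphen, Cf),
        // U+200F (right-to-left mark, Cf)
        String sample = "a\tb\u0085c\u00ADd\u200Fe";
        System.out.println(strip(ASCII_CONTROL, sample).length());     // 8
        System.out.println(strip(UNICODE_CONTROL, sample).length());   // 7
        System.out.println(strip(CONTROL_OR_FORMAT, sample).length()); // 5
    }
}
```

Printing lengths rather than the strings themselves makes the difference visible, since the remaining characters are invisible in most terminals.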