I'd like to write a method that converts CamelCase into a human-readable name.
Here's the test case:
public void testSplitCamelCase() {
    assertEquals("lowercase", splitCamelCase("lowercase"));
    assertEquals("Class", splitCamelCase("Class"));
    assertEquals("My Class", splitCamelCase("MyClass"));
    assertEquals("HTML", splitCamelCase("HTML"));
    assertEquals("PDF Loader", splitCamelCase("PDFLoader"));
    assertEquals("A String", splitCamelCase("AString"));
    assertEquals("Simple XML Parser", splitCamelCase("SimpleXMLParser"));
    assertEquals("GL 11 Version", splitCamelCase("GL11Version"));
}
The following Regex can be used to identify the capitals inside words:
It matches every capital letter that is either preceded by a non-capital letter or a digit, or followed by a lowercase letter, and every digit that follows a letter.
How to insert a space before them is beyond my Java skills =)
Edited to include the digit case and the PDF Loader case.
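For the "insert a space" part in Java, one option is to express the rule above as zero-width lookarounds and hand them to String.replaceAll, so the match consumes no characters and the replacement is just the space. This is a minimal sketch, not the regex from the original post: the class name CamelCaseRegex and the exact pattern are my own assumptions, and it only considers ASCII letters and digits.

```java
final class CamelCaseRegex {

    // Each alternative matches an empty position between two characters.
    private static final String BOUNDARY =
            "(?<=[a-z0-9])(?=[A-Z])"        // capital after a lowercase letter or digit
            + "|(?<=[A-Z])(?=[A-Z][a-z])"   // last capital of an acronym, followed by lowercase
            + "|(?<=[A-Za-z])(?=[0-9])";    // digit after a letter

    private CamelCaseRegex() {
    }

    /** Inserts a space at every boundary position; assumes ASCII input. */
    static String splitCamelCase(String s) {
        return s.replaceAll(BOUNDARY, " ");
    }
}
```

Because every alternative starts with a lookbehind, nothing can match at the start of the string, so "Class" and "HTML" come back unchanged.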
This works in .NET; optimize it to your liking. I added comments so you can see what each piece is doing (regular expressions can be hard to read).
http://code.google.com/p/inflection-js/
You could chain the String.underscore().humanize() methods to convert a CamelCase string into a human-readable string.
I took the Regex from polygenelubricants and turned it into an extension method on objects:
This turns everything into a readable sentence. It calls ToString on the object passed in, splits the resulting string with the Regex given by polygenelubricants, and then lowercases each word except for the first word and any acronyms. I thought it might be useful for someone out there.
You can use org.modeshape.common.text.Inflector.
Specifically:
Maven artifact is: org.modeshape:modeshape-common:2.3.0.Final
on JBoss repository: https://repository.jboss.org/nexus/content/repositories/releases
Here's the JAR file: https://repository.jboss.org/nexus/content/repositories/releases/org/modeshape/modeshape-common/2.3.0.Final/modeshape-common-2.3.0.Final.jar
I think you will have to iterate over the string and detect changes from lowercase to uppercase, uppercase to lowercase, alphabetic to numeric, and numeric to alphabetic. On every change you detect, insert a space, with one exception: on a change from uppercase to lowercase, insert the space one character earlier.
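A rough Java sketch of that loop could look like the following. The class name CamelCaseScanner and the Kind enum are made up for illustration, and I am assuming the input contains only letters and digits (anything that is neither uppercase nor a digit is treated as lowercase). One detail the description leaves implicit: for the uppercase-to-lowercase case, the space is skipped when the capital is the first character or already has a space in front of it, otherwise "MyClass" would get a leading or doubled space.

```java
final class CamelCaseScanner {

    // Hypothetical character classes used to detect a change of kind.
    private enum Kind { UPPER, LOWER, DIGIT }

    private CamelCaseScanner() {
    }

    private static Kind kindOf(char c) {
        if (Character.isUpperCase(c)) {
            return Kind.UPPER;
        }
        if (Character.isDigit(c)) {
            return Kind.DIGIT;
        }
        return Kind.LOWER; // assumption: input is letters and digits only
    }

    static String splitCamelCase(String s) {
        StringBuilder out = new StringBuilder();
        Kind prev = null;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Kind kind = kindOf(c);
            if (prev != null && kind != prev) {
                if (prev == Kind.UPPER && kind == Kind.LOWER) {
                    // Exception: the space belongs before the preceding capital,
                    // unless that capital starts the string or already follows a space.
                    int capital = out.length() - 1;
                    if (capital > 0 && out.charAt(capital - 1) != ' ') {
                        out.insert(capital, ' ');
                    }
                } else {
                    out.append(' ');
                }
            }
            out.append(c);
            prev = kind;
        }
        return out.toString();
    }
}
```

Tracing "PDFLoader": the buffer holds "PDFL" when the lowercase "o" arrives, so the space is inserted before the "L", giving "PDF Loader".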