The code below prints an Array of file names:
val pdfFileArray = getFiles()
for(fileName <- pdfFileArray){
println(fileName)
}
I'm trying to convert this Array (pdfFileArray) into an array which contains unique file name extensions.
Is something like the following the correct way to do this in Scala?
Set<String> fileNameSet = new HashSet<String>
val pdfFileArray = getFiles()
for(fileName <- pdfFileArray){
String extension = fileName.substring(fileName.lastIndexOf('.'));
fileNameSet.add(extension)
}
You could do this:
val fileNameSet = pdfFileArray.groupBy(_.split('.').last).keys
This assumes that all your filenames have an extension and that you only want the last one, e.g. something.html.erb has the extension 'erb'.
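To make that assumption concrete, here is a small sketch using a made-up file list in place of getFiles() (the sample names are hypothetical). A name with no dot comes back whole, because split returns the entire string as its only element:

```scala
// Hypothetical sample data standing in for getFiles()
val pdfFileArray = Array("foo.jpg", "bar.jpg", "something.html.erb", "README")

// Group by the text after the last dot and keep only the group keys
val fileNameSet = pdfFileArray.groupBy(_.split('.').last).keys.toSet

// "README" has no dot, so split returns Array("README") and the
// whole name ends up in the result as if it were an extension
println(fileNameSet)
```

So if some files may lack an extension, they need to be filtered out separately.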
This properly handles files with no extension (by ignoring them):
val extensions = getFiles().map{_.split('.').tail.lastOption}.flatten.distinct
so
Array("foo.jpg", "bar.jpg", "baz.png", "foobar")
becomes
Array("jpg", "png")
Scala's collections have a method called distinct, which removes all duplicate entries from the collection. For instance:
scala> List(1, 2, 3, 1, 2).distinct
res3: List[Int] = List(1, 2, 3)
Is that what you're looking for?
For the sake of completeness:
List("foo.jpg", "bar.jpg").map(_.takeRight(3)).toSet
Here I'm assuming that all extensions are 3 characters long. Converting to a Set, like the .distinct method in the other answers (which uses a mutable set underneath, by the way), gives you unique items.
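A quick sketch of where the fixed-length assumption breaks, again with hypothetical sample names. The second expression is one possible way around it, not the only one:

```scala
// Hypothetical sample data; "photo.jpeg" has a four-character extension
val files = List("foo.jpg", "bar.jpg", "photo.jpeg")

// Fixed length: the last three characters of "photo.jpeg" are "peg"
val byLength = files.map(_.takeRight(3)).toSet

// Taking everything after the last dot avoids the length assumption.
// A name without a dot would come back whole (lastIndexOf returns -1).
val byLastDot = files.map(name => name.substring(name.lastIndexOf('.') + 1)).toSet
```

Here byLength contains "jpg" and "peg", while byLastDot contains "jpg" and "jpeg".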
You can also do it with regex, which gives a more general solution because you can redefine the expression to match anything you want:
val R = """.*\.(.+)""".r
getFiles.collect{ case R(x) => x }.distinct
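As a rough illustration of redefining the expression, the sketch below runs the pattern above on a made-up file list and then swaps in a second pattern that captures everything after the first dot. The sample names and the FromFirstDot pattern are assumptions for illustration only:

```scala
// Hypothetical sample data standing in for getFiles
val files = Array("foo.jpg", "bar.jpg", "baz.png", "foobar", "archive.tar.gz")

// Greedy .* pushes the match to the last dot, so only the final
// extension is captured; "foobar" has no dot and is skipped by collect
val R = """.*\.(.+)""".r
val extensions = files.collect { case R(x) => x }.distinct

// Redefined pattern: [^.]* stops at the first dot, so the capture
// group holds everything after it (e.g. "tar.gz")
val FromFirstDot = """[^.]*\.(.+)""".r
val fullExtensions = files.collect { case FromFirstDot(x) => x }.distinct
```

With this input, extensions holds "jpg", "png" and "gz", and fullExtensions holds "jpg", "png" and "tar.gz".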