How to create a collection of unique values based

Posted 2019-08-07 05:45

Question:

The code below prints an Array of file names.

  val pdfFileArray = getFiles()
  for(fileName <- pdfFileArray){
    println(fileName)
  }

I'm trying to convert this Array (pdfFileArray) into an array which contains unique file name extensions.

Is something like the code below the correct way of doing this in Scala?

  Set<String> fileNameSet = new HashSet<String>
  val pdfFileArray = getFiles()
  for(fileName <- pdfFileArray){
    String extension = fileName.substring(fileName.lastIndexOf('.'));
    fileNameSet.add(extension)
  }

Answer 1:

You could do this:

val fileNameSet = pdfFileArray.groupBy(_.split('.').last).keys

This assumes that all your filenames have an extension and that you only want the last one, i.e. something.html.erb has the extension 'erb'. A filename with no dot at all is returned whole as its own "extension".
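
To make that assumption concrete, here is a minimal sketch of the same expression run on sample data. The object name and the file names are hypothetical stand-ins for getFiles(); `.toSet` is added only so the keys come back as a plain Set[String].

```scala
object GroupByExtensions {
  // Hypothetical sample data standing in for getFiles().
  val files: Array[String] = Array("a.pdf", "b.pdf", "page.html.erb", "README")

  // Same groupBy expression as above; the map keys are the unique "extensions".
  val extensions: Set[String] = files.groupBy(_.split('.').last).keys.toSet
}
```

Note that "README" ends up in the result, because splitting a name without a dot yields the whole name as its only (and last) element.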



Answer 2:

This will properly handle files with no extension (by ignoring them):

val extensions = getFiles().map{_.split('.').tail.lastOption}.flatten.distinct

so

Array("foo.jpg", "bar.jpg", "baz.png", "foobar")

becomes

Array("jpg", "png")


Answer 3:

Scala's collections have a method called distinct, which removes all duplicate entries from the collection. For instance:

scala> List(1, 2, 3, 1, 2).distinct
res3: List[Int] = List(1, 2, 3)

Is that what you're looking for?
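
Applied to the question, distinct could be combined with the substring approach from the original post. This is only a sketch: the object name and sample list are hypothetical, and it assumes every file name contains a dot.

```scala
object DistinctDemo {
  // Hypothetical sample input standing in for getFiles().
  val files: List[String] = List("foo.jpg", "bar.png", "baz.jpg")

  // Take the text after the last dot (assumes a dot is present), then drop duplicates.
  val extensions: List[String] =
    files.map(name => name.substring(name.lastIndexOf('.') + 1)).distinct
}
```

Unlike a Set, distinct keeps the elements in the order in which they first appear.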



Answer 4:

For the sake of completeness:

List("foo.jpg", "bar.jpg").map(_.takeRight(3)).toSet

Here I'm assuming that all extensions are 3 characters long. Converting to a Set gives you unique items, just like the .distinct method in the other answers (which uses a mutable set underneath, by the way).
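
If the 3-character assumption does not hold, takeRight(3) gives wrong results: "page.html" yields "tml" and "notes.md" yields ".md". A length-independent variant might look like the sketch below. The helper extensionOf and the object name are hypothetical, not part of the answer above.

```scala
object ExtensionSet {
  // Hypothetical helper: the text after the last dot, or None when there is
  // no dot or the name ends with one.
  def extensionOf(fileName: String): Option[String] = {
    val dot = fileName.lastIndexOf('.')
    if (dot >= 0 && dot < fileName.length - 1) Some(fileName.substring(dot + 1))
    else None
  }

  // Names without an extension are skipped; toSet removes duplicates.
  def uniqueExtensions(files: Seq[String]): Set[String] =
    files.flatMap(f => extensionOf(f)).toSet
}
```

For example, ExtensionSet.uniqueExtensions(Seq("foo.jpg", "page.html", "notes.md", "foobar")) would contain "jpg", "html" and "md".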



Answer 5:

You can also do it with regex, which gives a more general solution because you can redefine the expression to match anything you want:

val R = """.*\.(.+)""".r
getFiles().collect{ case R(x) => x }.distinct
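
As a rough illustration of how this pattern behaves, the sketch below wraps it in a small function and feeds it sample names. The object name, the function name and the inputs are hypothetical.

```scala
object RegexExtensions {
  // Same pattern as above. The leading ".*" is greedy, so the group captures
  // whatever follows the last dot; the match must cover the whole string.
  private val Ext = """.*\.(.+)""".r

  // Names that do not match (no dot, or nothing after the dot) are dropped by collect.
  def uniqueExtensions(files: Seq[String]): Seq[String] =
    files.collect { case Ext(x) => x }.distinct
}
```

With the input Seq("foo.jpg", "bar.jpg", "archive.tar.gz", "foobar", ".bashrc") this yields "jpg", "gz" and "bashrc". A dotfile such as ".bashrc" is treated as having an extension, so the expression would need tightening if that is not wanted.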


Tags: scala