How to write to HDFS using Scala

Posted 2019-02-17 00:36


I am learning Scala and I need to write a custom file to HDFS. I have my own HDFS running on a Cloudera image using VMware Fusion on my laptop.

This is my actual code:

package org.glassfish.samples

import java.io.PrintWriter

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path

/**
 * @author ${}
 */
object App {

  def main(args: Array[String]): Unit = {
    println("Trying to write to HDFS...")
    val conf = new Configuration()
    val fs = FileSystem.get(conf)
    val output = fs.create(new Path("hdfs://quickstart.cloudera:8020/tmp/mySample.txt"))
    val writer = new PrintWriter(output)
    try {
      writer.write("this is a test")
    } finally {
      writer.close()
    }
  }
}


And I am getting this exception:

Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://quickstart.cloudera:8020/tmp, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(
at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(
at org.apache.hadoop.fs.ChecksumFileSystem.create(
at org.apache.hadoop.fs.ChecksumFileSystem.create(
at org.apache.hadoop.fs.FileSystem.create(
at org.apache.hadoop.fs.FileSystem.create(
at org.apache.hadoop.fs.FileSystem.create(
at org.apache.hadoop.fs.FileSystem.create(
at org.glassfish.samples.App$.main(App.scala:19)
at org.glassfish.samples.App.main(App.scala)
... 6 more

I can access HDFS using the terminal and Hue:

[cloudera@quickstart ~]$ hdfs dfs -ls /tmp
Found 3 items
drwxr-xr-x   - hdfs     supergroup          0 2015-06-09 17:54 /tmp/hadoop-yarn
drwx-wx-wx   - hive     supergroup          0 2015-08-17 15:24 /tmp/hive
drwxr-xr-x   - cloudera supergroup          0 2015-08-17 16:50 /tmp/labdata

This is my pom.xml

I ran the project using the command:

mvn clean package scala:run

What am I doing wrong? Thank you in advance!

EDIT after @jeroenr's advice

This is the current code:

package org.glassfish.samples

import java.io.PrintWriter

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path

/**
 * @author ${}
 */
object App {

  //def foo(x : Array[String]) = x.foldLeft("")((a,b) => a + b)

  def main(args: Array[String]): Unit = {
    println("Trying to write to HDFS...")
    val conf = new Configuration()
    //conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020")
    conf.set("fs.defaultFS", "hdfs://")
    val fs = FileSystem.get(conf)
    val output = fs.create(new Path("/tmp/mySample.txt"))
    val writer = new PrintWriter(output)
    try {
      writer.write("this is a test")
    } finally {
      writer.close()
    }
  }
}



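Have a look at this example here. I think the problem is that you don't configure the default file system. Without it (and without a core-site.xml on the classpath that sets it), FileSystem.get(conf) returns the local file system, which is why the exception says "expected: file:///" and the stack trace goes through RawLocalFileSystem. Set the default file system using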

conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020")

and pass the path without the hdfs:// scheme and host, so that it is resolved against the default file system, like so:

fs.create(new Path("/tmp/mySample.txt"))
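
To make the "Wrong FS" message easier to picture, here is a small, self-contained model of the scheme check. It is only a simplified illustration and not Hadoop's actual implementation; the names WrongFsSketch and accepts are made up for this example, and it uses nothing but java.net.URI.

```scala
import java.net.URI

object WrongFsSketch {

  // Simplified model of the check (hypothetical helper, not a Hadoop API):
  // a path is accepted if it has no scheme, or if its scheme matches the
  // scheme of the file system it is handed to.
  def accepts(fsUri: URI, path: String): Boolean = {
    val scheme = new URI(path).getScheme
    scheme == null || scheme.equalsIgnoreCase(fsUri.getScheme)
  }

  def main(args: Array[String]): Unit = {
    val localFs = new URI("file:///")                        // default when fs.defaultFS is not set
    val hdfs    = new URI("hdfs://quickstart.cloudera:8020") // NameNode address from the question

    // An hdfs:// path handed to the local file system is rejected.
    println(accepts(localFs, "hdfs://quickstart.cloudera:8020/tmp/mySample.txt")) // false
    // A scheme-less path is resolved against whatever file system receives it.
    println(accepts(hdfs, "/tmp/mySample.txt"))                                   // true
    // A fully qualified path is fine once the file system is HDFS.
    println(accepts(hdfs, "hdfs://quickstart.cloudera:8020/tmp/mySample.txt"))    // true
  }
}
```

In the question, the first case applies: the file system object was the local one, while the path carried the hdfs:// scheme.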

To write to the file, call 'write' directly on the output stream returned by fs.create, then close the stream so the data is flushed, like so:

val os = fs.create(new Path("/tmp/mySample.txt"))
os.write("This is a test".getBytes)
os.close()
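
Putting these pieces together, a complete program might look roughly like the sketch below. It assumes the Hadoop client libraries are on the classpath and that the NameNode is reachable at the address used in the question; the names HdfsWriteSketch and writeText are made up for this example.

```scala
import java.nio.charset.StandardCharsets

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsWriteSketch {

  // Hypothetical helper: writes `text` to `path` on the given file system.
  // fs.create(path) replaces an existing file at that path.
  def writeText(fs: FileSystem, path: Path, text: String): Unit = {
    val os = fs.create(path)
    try os.write(text.getBytes(StandardCharsets.UTF_8))
    finally os.close() // closing flushes the data and completes the file
  }

  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    // Assumption: NameNode address taken from the question; adjust for your cluster.
    conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020")
    val fs = FileSystem.get(conf)

    // The scheme-less path is resolved against fs.defaultFS.
    writeText(fs, new Path("/tmp/mySample.txt"), "This is a test\n")
    println("Done")
  }
}
```

Keeping the write in a helper that takes the FileSystem as a parameter means the same code can be tried against the local file system (for example via FileSystem.getLocal) before pointing it at the cluster.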