How can I process a file uploaded through an HTML

Posted 2019-04-06 11:59

Question:

I have an application written using Spray, and I have a page which has an <input type="file" name="foo"> form element that gets POSTed to /fileUpload.

I have a Spray route set up to listen to the path /fileUpload using this code:

path("fileUpload") {
  get { ctx => {
    val request: HttpRequest = ctx.request
    //Process the file, somehow with request?
    ctx.complete("File Uploaded")
  }}
}

I can't figure out how to get the POST body and get a handle on the file, and I can't find any examples online.

It must be possible to receive a file and process it with Spray or even through simple Scala or Java, but I don't know how to do it.

Can anybody help?
Thanks!

Answer 1:

It's possible with Spray, although I haven't checked if streaming works properly. I fiddled a bit and got this working:

  post {
    content(as[MultipartFormData]) {
      // Concatenate the content of every body part into a single string
      def extractOriginalText(formData: MultipartFormData): String = {
        formData.parts.mapValues { (bodyPart) =>
          bodyPart.content.map{
            (content) => new String(content.buffer)
          }
        }.values.flatten.foldLeft("")(_ + _)
      }
      formData =>
        _.complete(
          extractOriginalText(formData)
        )
    }
  }

If you upload a plain text file to a service that has this code in it, it coughs the original text back up. I've got it running together with an AJAX upload; it should also work with an old-fashioned file upload form.

It seems to me that there must be an easier way to do this; in particular, the deep nesting of the content is rather clunky. Let me know if you find a simplification.

UPDATE (thx akauppi):

entity(as[MultipartFormData]) { formData =>
  // Concatenate the text of all fields into one string
  complete( formData.fields.map(_.entity.asString).mkString )
}


Answer 2:

I ended up using the code below. Wasn't too hard, but there really should have been a Spray sample available somewhere about this.

Forms must always use multipart/form-data (instead of the traditional application/x-www-form-urlencoded) when binary uploads are involved.
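
To make the difference concrete, here is a minimal sketch of what a multipart/form-data body looks like on the wire: each part has its own headers and carries its bytes unmodified, with parts separated by a boundary line. The names MultipartSketch, Part and render are hypothetical helpers made up for this illustration (they are not part of Spray), and the boundary string is arbitrary.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical helper for illustration only; not part of Spray.
object MultipartSketch {
  final case class Part(name: String, filename: Option[String], contentType: String, data: Array[Byte])

  // Assembles a multipart/form-data body from the given parts.
  def render(boundary: String, parts: Seq[Part]): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    def line(s: String): Unit = out.write((s + "\r\n").getBytes(UTF_8))

    parts.foreach { p =>
      line("--" + boundary)
      val fn = p.filename.map(f => "; filename=\"" + f + "\"").getOrElse("")
      line("Content-Disposition: form-data; name=\"" + p.name + "\"" + fn)
      line("Content-Type: " + p.contentType)
      line("")            // blank line separates part headers from part content
      out.write(p.data)   // raw bytes, written as-is (no URL-encoding)
      line("")
    }
    line("--" + boundary + "--") // closing delimiter
    out.toByteArray
  }
}
```

The matching request header would be Content-Type: multipart/form-data; boundary=<the same boundary string>.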

My requirements were:

  • need to upload binary files of reasonably large size
  • wanted to have metadata as fields (not embedded in URL or the upload's filename)

Some questions:

  • is the way I'm managing errors the "best" way?

It is in the spirit of REST API design to treat the client as a "human" (when debugging, we are one) and to return meaningful error messages when something is wrong with the request.

post {
  // Multipart form
  //
  // To exercise this:
  //  $ curl -i -F "file=@filename.bin" -F "computer=MYPC" http://localhost:8080/your/route; echo
  //
  entity(as[MultipartFormData]) { formData =>
    // Note: We cannot use a regular 'return' to provide a routing mid-way. The last item matters, but we can
    // have a 'var' which collects the correct success / error info into it. It's a write-once variable.
    // It is declared inside this per-request closure so that its value is not shared between requests.
    //
    var ret: Option[Route] = None

    val file = formData.get("file")
      // e.g. Some(
      //        BodyPart( HttpEntity( application/octet-stream, ...binary data...,
      //                    List(Content-Type: application/octet-stream, Content-Disposition: form-data; name=file; filename=<string>)))
    log.debug( s".file: $file")

    val computer = formData.get("computer")
    // e.g. Some( BodyPart( HttpEntity(text/plain; charset=UTF-8,MYPC), List(Content-Disposition: form-data; name=computer)))
    log.debug( s"computer: $computer" )

    // Note: The types are mentioned below simply to make code comprehension easier. Scala could deduce them.
    //
    for( file_bodypart: BodyPart <- file;
        computer_bodypart: BodyPart <- computer ) {
      // BodyPart: http://spray.io/documentation/1.1-SNAPSHOT/api/index.html#spray.http.BodyPart

      val file_entity: HttpEntity = file_bodypart.entity
        //
        // HttpEntity: http://spray.io/documentation/1.1-SNAPSHOT/api/index.html#spray.http.HttpEntity
        //
        // HttpData: http://spray.io/documentation/1.1-SNAPSHOT/api/index.html#spray.http.HttpData

      log.debug( s"File entity length: ${file_entity.data.length}" )

      val file_bin= file_entity.data.toByteArray
      log.debug( s"File bin length: ${file_bin.length}" )

      val computer_name = computer_bodypart.entity.asString    //note: could give encoding as parameter
      log.debug( s"Computer name: $computer_name" )

      // We have the data here, pass it on to an actor and return OK
      //
      // ...left out, application specific...

      ret = Some(complete("Got the file, thanks.")) // the string doesn't actually matter, we're just being polite
    }

    ret.getOrElse(
      complete( BadRequest, "Missing fields, expecting file=<binary>, computer=<string>" )
    )
  }
}
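
On the error-handling question: one alternative to the write-once 'var' is to validate the optional fields into an Either and decide on the response only at the end. This is a sketch in plain Scala under my own assumptions, not the answer's code: UploadValidation, Upload and validate are hypothetical names, and the extracted values are passed in as plain Options rather than Spray BodyParts.

```scala
// Hypothetical names for illustration; the field names match the curl example above.
object UploadValidation {
  final case class Upload(file: Array[Byte], computer: String)

  // Returns the validated upload, or an error message naming exactly which fields are missing.
  def validate(file: Option[Array[Byte]], computer: Option[String]): Either[String, Upload] =
    (file, computer) match {
      case (Some(f), Some(c)) => Right(Upload(f, c))
      case _ =>
        val missing = Seq("file" -> file.isEmpty, "computer" -> computer.isEmpty)
          .collect { case (name, true) => name }
        Left("Missing fields: " + missing.mkString(", "))
    }
}
```

Inside the route, a Left would then presumably map to complete(BadRequest, message) and a Right to the success response, so the client learns which field it forgot.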


Answer 3:

OK, after trying to write a Spray Unmarshaller for multipart form data, I decided to just write a Scala HttpServlet that would receive the form submission, and used Apache's FileUpload library to process the request:

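import javax.servlet.http.{HttpServlet, HttpServletRequest, HttpServletResponse}
import org.apache.commons.fileupload.MultipartStream

class FileUploadServlet extends HttpServlet {

  override def doPost(request: HttpServletRequest, response: HttpServletResponse): Unit = {
    val contentType = request.getContentType
    // Everything after "boundary=" is taken as the multipart boundary
    val boundary = contentType.substring(contentType.indexOf("boundary=")+9)
    // MultipartStream expects the boundary as a byte array
    val multipartStream = new MultipartStream(request.getInputStream, boundary.getBytes)

    // Do work with the multipart stream
  }

}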
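
One caveat with taking everything after "boundary=": if the header carries further parameters, quotes the boundary, or has no boundary at all, the substring picks up the wrong text. A slightly more defensive extraction could look like the sketch below. BoundaryParser and boundaryOf are hypothetical names of my own, and this only covers the common header shapes, not every corner of the MIME grammar.

```scala
// Hypothetical helper for illustration; uses only the Scala standard library.
object BoundaryParser {
  // Returns the boundary parameter of a multipart Content-Type header, if there is one.
  def boundaryOf(contentType: String): Option[String] =
    contentType.split(';').toList
      .drop(1)                 // skip the media type itself, e.g. "multipart/form-data"
      .map(_.trim)
      .collectFirst {
        // parameter names are matched case-insensitively
        case p if p.toLowerCase.startsWith("boundary=") =>
          p.substring("boundary=".length).trim.stripPrefix("\"").stripSuffix("\"")
      }
      .filter(_.nonEmpty)
}
```

A caller can then reject the request when the result is None instead of handing a bogus boundary to the multipart parser.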


Answer 4:

To grab the posted (possibly binary) file, and stick it somewhere temporarily, I used this:

  post {
    entity(as[MultipartFormData]) {
      formData => {
        val ftmp = File.createTempFile("upload", ".tmp", new File("/tmp"))
        val output = new FileOutputStream(ftmp)
        try {
          // Note: this writes the content of every form field, not only the file part
          formData.fields.foreach(f => output.write(f.entity.data.toByteArray ) )
        } finally {
          output.close()   // close the stream even if a write fails
        }
        complete("done, file in: " + ftmp.getName())
      }
    }
  }
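
If hard-coding /tmp is a concern, the file-writing step can be sketched with java.nio instead, which uses the platform's default temp directory. TempUpload and save are hypothetical names for this illustration; the byte array stands in for whatever f.entity.data.toByteArray returns for the file part.

```scala
import java.nio.file.{Files, Path}
import scala.util.control.NonFatal

// Hypothetical helper for illustration; uses only the JDK and the Scala standard library.
object TempUpload {
  // Writes the bytes to a fresh temp file in the default temp directory and returns its path.
  def save(bytes: Array[Byte], prefix: String = "upload", suffix: String = ".tmp"): Path = {
    val target = Files.createTempFile(prefix, suffix)
    try {
      Files.write(target, bytes) // opens, writes and closes the file in one call
      target
    } catch {
      case NonFatal(e) =>
        Files.deleteIfExists(target) // do not leave a partial file behind
        throw e
    }
  }
}
```

Note that this still holds the whole upload in memory, just like the code above, so it is only suitable for files of moderate size.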