Groovy - File handling from http URL

Posted 2019-02-11 02:40

The files on one of our servers can be accessed via HTTP. When we open a URL like the following, we get a list of the files and directories at that location:

http://mytestserver/files/

From this list, I need to select only the files whose names match a regular expression.

If this were a location on disk, I could use the eachFileMatch method to filter the files I need.

How can we do the same thing for an HTTP URL?

4 answers
smile是对你的礼貌
#2 · 2019-02-11 02:56

I would think it would be far better to put an FTP server there, if you want to serve files.

Unless your HTTP server supports a known file serving protocol such as WebDAV, you're going to have to jump through some hoops to use it as a file server.

You would need to use an HTTP client, such as the Groovy HTTPBuilder.

When you make a request to that URL, your HTTP server returns a response. If directory listings are enabled, most HTTP servers return an HTML page with links to the files and subdirectories in that directory.

You would need to parse that HTML response, perhaps using some regular expressions to extract the file links that you want from it.

But every HTTP server returns such listings in its own format, so you would have to adapt it to the format used by your server.
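As a rough illustration of the regex approach, here is a minimal sketch that pulls the href values out of a listing page and splits them into files and directories. The sample HTML and the extractLinks name are made up for the example; a real server's markup will differ, and a proper HTML parser is more robust than a regex.

```groovy
// Hypothetical directory-listing page; real servers format this differently.
def html = '''<html><body><pre>
<a href="../">../</a>
<a href="0.6/">0.6/</a>
<a href="report-2019.csv">report-2019.csv</a>
<a href="notes.txt">notes.txt</a>
</pre></body></html>'''

// Collect every double-quoted href value. Each match is a list of
// [whole match, group 1], so index 1 is the link target.
def extractLinks = { String page ->
    (page =~ /href="([^"]+)"/).collect { match -> match[1] }
}

def links = extractLinks(html)

// By convention, directory links in a listing end with a slash.
def dirs = links.findAll { it.endsWith('/') }
def files = links.findAll { !it.endsWith('/') }

println "DIRECTORIES : ${dirs}"
println "FILES : ${files}"
```

In practice the html string would be the body of the HTTP response, and the pattern would have to be adapted to whatever quoting and layout the server uses.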

Lonely孤独者°
#3 · 2019-02-11 02:56

An expanded version of Grooveek's code that uses HTTPS and passes a session cookie to reach a WebDAV server behind a login:

@Grab(group='org.jsoup', module='jsoup', version='1.7.3')
import org.jsoup.Jsoup

import javax.net.ssl.HostnameVerifier
import javax.net.ssl.HttpsURLConnection
import javax.net.ssl.SSLContext
import javax.net.ssl.TrustManager
import javax.net.ssl.X509TrustManager

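// WARNING: this trust manager accepts every certificate, which disables
// certificate verification. Use it only against test servers you control.
def nullTrustManager = [
    checkClientTrusted: { chain, authType -> },
    checkServerTrusted: { chain, authType -> },
    getAcceptedIssuers: { null }
]

// Accepts any hostname, regardless of what the certificate says.
def nullHostnameVerifier = [
    verify: { hostname, session -> true }
]

// Install both as JVM-wide defaults for HTTPS connections.
SSLContext sc = SSLContext.getInstance("SSL")
sc.init(null, [nullTrustManager as X509TrustManager] as TrustManager[], null)
HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory())
HttpsURLConnection.setDefaultHostnameVerifier(nullHostnameVerifier as HostnameVerifier)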

def (doc, files, dirs) =
    [Jsoup.connect('https://webdav/address').cookie('JSESSIONID', 'XYZsessionid').get(), [], []]
doc.select("[href]").each { href ->
    def filename = href.text()
    def path = href.attr('href')
    path.endsWith("/") ? dirs.add(filename) : files.add(filename)
}
println """DIRECTORIES :
${dirs.join('\n')}
FILES : 
${files.join('\n')}
"""
男人必须洒脱
#4 · 2019-02-11 02:58

Another version of @tim_yates's answer, using jsoup:

@Grab(group='org.jsoup', module='jsoup', version='1.7.3')
import org.jsoup.Jsoup
def (doc,files, dirs) = [Jsoup.connect('http://central.maven.org/maven2/com/bloidonia/groovy-stream/').get(),[],[]]
doc.select("pre > a").each{href ->
    def filename = href.text()
    filename.endsWith("/")?dirs.add(filename):files.add(filename)
}
println """DIRECTORIES : 
${dirs.join('\n')}
FILES : 
${files.join('\n')}
"""
Melony?
#5 · 2019-02-11 03:09

No, you'll need to do some parsing of the returned HTML.

Given this page as an example: http://central.maven.org/maven2/com/bloidonia/groovy-stream/

We'd need to do something like:

@Grab( 'org.ccil.cowan.tagsoup:tagsoup:1.2.1' )

def url = 'http://central.maven.org/maven2/com/bloidonia/groovy-stream/'.toURL()

new XmlSlurper( new org.ccil.cowan.tagsoup.Parser() ).parseText( url.text )
                                                     .body
                                                     .pre
                                                     .a
                                                     .each { link ->
    if( link.@href.text().endsWith( '/' ) ) {
        println "FOLDER : ${link.text()}"
    }
    else {
        println "FILE   : ${link.text()}"
    }
}

Which prints out:

FOLDER : ../
FOLDER : 0.5.1/
FOLDER : 0.5.2/
FOLDER : 0.5.3/
FOLDER : 0.5.4/
FOLDER : 0.6/
FOLDER : 0.6.1/
FOLDER : 0.6.2/
FILE   : maven-metadata.xml
FILE   : maven-metadata.xml.md5
FILE   : maven-metadata.xml.sha1

Obviously, you'd need to adjust the body.pre.a path to match the markup your web server produces for directory listings.
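To get back to the original question of selecting only the names that match a regex, the link texts can be filtered with grep once they have been collected. This is only a sketch: the names list reuses the sample output above instead of fetching anything, and the pattern is an arbitrary example. grep relies on Pattern.isCase, which requires the whole name to match, the same test that eachFileMatch applies to file names.

```groovy
// Link texts as a listing parse might return them (taken from the sample output).
def names = ['../', '0.6/', 'maven-metadata.xml',
             'maven-metadata.xml.md5', 'maven-metadata.xml.sha1']

// Example filter (an assumption for illustration): keep only checksum files.
def wanted = ~/.*\.(md5|sha1)/

// grep keeps the elements for which wanted.isCase(name) is true,
// i.e. names that the pattern matches in full.
def matches = names.grep(wanted)

matches.each { println "MATCH : ${it}" }
```

The same grep call could replace the if/else inside the each loop if only the matching names are of interest.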
