Is it possible to read only first N bytes from the

Here is the question.

Given the URL http://www.example.com, can we read only the first N bytes of the page?

Using wget, we can download the whole page.
Using curl, there is the -r option: -r 0-499 requests the first 500 bytes. That seems to solve the problem.

You should also be aware that many HTTP/1.1 servers do not have this feature enabled, so that when you attempt to get a range, you'll instead get the whole document.
Using urllib in Python, there is a similar question here, but is Konstantin's comment there really true?

Last time I tried this technique it failed because it was actually impossible to read from the HTTP server only specified amount of data, i.e. you implicitly read all HTTP response and only then read first N bytes out of it. So at the end you ended up downloading the whole 1Gb malicious response.

So the question is: how can we read the first N bytes from an HTTP server in practice?

Regards & Thanks

Tags: linux http url command

5 answers

beautiful°

#2 · 2019-01-23 14:12

You can do it natively with the following curl command (no need to download the whole document). According to the curl man page:

RANGES

HTTP 1.1 introduced byte-ranges. Using this, a client can request to get only one or more subparts of a specified document. curl supports this with the -r flag.

Get the first 100 bytes of a document:
    curl -r 0-99 http://www.get.this/

Get the last 500 bytes of a document:  
    curl -r -500 http://www.get.this/

`curl` also supports simple ranges for FTP files as well.
Then you can only specify start and stop position.

Get the first 100 bytes of a document using FTP:
    curl -r 0-99 ftp://www.get.this/README

It works for me even with a Java web app deployed to GigaSpaces.
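
For the urllib case raised in the question, the same range request might be sketched in Python as follows. This is only a minimal sketch under assumptions: the helper name `fetch_range` and its parameters are made up for illustration, and it assumes a plain HTTP(S) URL. It sends a `Range` header and returns the status code next to the data, because 206 means the server honored the range while 200 means it ignored it.

```python
import urllib.request
from typing import Tuple


def fetch_range(url: str, n: int, timeout: float = 10.0) -> Tuple[int, bytes]:
    """Ask for the first n bytes of url (hypothetical helper, for illustration)."""
    # Same byte range that `curl -r 0-<n-1>` would request.
    request = urllib.request.Request(url, headers={"Range": f"bytes=0-{n - 1}"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 Partial Content: the server honored the range.
        # 200 OK: the server ignored it and is sending the whole document,
        # so the read is capped at n bytes either way.
        return response.status, response.read(n)
```

Calling `fetch_range("http://www.example.com", 500)` would correspond to `curl -r 0-499`; checking the returned status tells you whether the server actually supports ranges.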


倾城　Initia

#3 · 2019-01-23 14:15

You should also be aware that many HTTP/1.1 servers do not have this feature enabled, so that when you attempt to get a range, you'll instead get the whole document.

If the server ignores the range, it will send the whole page anyway, so you can fetch the page with curl and pipe it to head, for example.

From the head man page:

    -c, --bytes=[-]N
        print the first N bytes of each file; with the leading '-', print all but the last N bytes of each file
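
A rough Python counterpart of piping curl into head might look like the sketch below. The helper name `read_prefix` is hypothetical and the 4096-byte step is an arbitrary choice. It reads the response as a stream, stops once n bytes have arrived, and closes the connection without asking for the rest. The server may still have pushed somewhat more than n bytes into network buffers before the connection closes, so this limits the transfer rather than cutting it at exactly n bytes on the wire.

```python
import urllib.request


def read_prefix(url: str, n: int, timeout: float = 10.0) -> bytes:
    """Return at most the first n bytes of the response body (illustrative helper)."""
    chunks = []
    remaining = n
    with urllib.request.urlopen(url, timeout=timeout) as response:
        while remaining > 0:
            # Never ask for more than is still needed.
            chunk = response.read(min(remaining, 4096))
            if not chunk:  # the body ended before n bytes arrived
                break
            chunks.append(chunk)
            remaining -= len(chunk)
    # Leaving the with-block closes the connection; the rest of the body is not read.
    return b"".join(chunks)
```

If the body is shorter than n, the function simply returns the whole body.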


Rolldiameter

#4 · 2019-01-23 14:15

Make a socket connection. Read the bytes you want. Close, and you're done.
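
The socket approach could be sketched in Python roughly like this. The function name `read_first_bytes_raw` and its parameters are assumptions for illustration. It speaks plain HTTP/1.0 over an unencrypted connection, so it would not work for HTTPS as written. Note that the bytes returned are the first n bytes on the wire, which start with the status line and headers rather than the page body.

```python
import socket


def read_first_bytes_raw(host: str, path: str, n: int,
                         port: int = 80, timeout: float = 10.0) -> bytes:
    """Send a minimal GET and return the first n bytes received (illustrative helper)."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    received = bytearray()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while len(received) < n:
            # Ask only for what is still missing, so no more than n bytes are read.
            chunk = sock.recv(n - len(received))
            if not chunk:  # the server closed the connection early
                break
            received.extend(chunk)
    # The socket is closed here; whatever the server sends afterwards is dropped.
    return bytes(received)
```

To get N bytes of the body instead, one would first have to read up to the blank line that ends the headers and count from there.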


萌系小妹纸

#5 · 2019-01-23 14:25

I came here looking for a way to measure the server's processing time, which I thought I could do by telling curl to stop downloading after 1 byte or so.

For me, the better solution turned out to be to do a HEAD request, since this usually lets the server process the request as normal but does not return any response body:

    time curl --head <URL>
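
The same measurement might be sketched in Python with `http.client`. The helper name `time_head_request` is hypothetical, and the sketch assumes plain HTTP. The elapsed time includes connection setup as well as the server's processing, so it is an upper bound on the processing time rather than an exact figure.

```python
import http.client
import time
from typing import Tuple


def time_head_request(host: str, path: str = "/", port: int = 80,
                      timeout: float = 10.0) -> Tuple[int, float]:
    """Return (status, seconds) for a HEAD request (illustrative helper)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("HEAD", path)      # connects, then sends the request
        response = conn.getresponse()   # waits for the status line and headers
        response.read()                 # a HEAD response has no body to read
        elapsed = time.perf_counter() - start
        return response.status, elapsed
    finally:
        conn.close()
```

As with `curl --head`, this only helps if the server handles HEAD the same way it handles GET.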


爷的心禁止访问

#6 · 2019-01-23 14:30

    curl <url> | head -c 499

or

    curl <url> | dd bs=1 count=499

should do.

Also, there are simpler utilities with perhaps broader availability, like

    netcat host 80 <<"HERE" | dd bs=1 count=499 of=output.fragment
    GET /urlpath/query?string=more&bloddy=stuff

    HERE

