Regex to strip out image URLs?

I need to extract a bunch of image URLs from a document in which each image is associated with a name, like this:

bellpepper = "http://images.com/bellpepper.jpg"
cabbage = "http://images.com/cabbage.jpg"
lettuce = "http://images.com/lettuce.jpg"
pumpkin = "http://images.com/pumpkin.jpg"

I assume I can detect the start of a link with:

/http:[^ ,]+/i

But how can I get all of the links separated from the document?

EDIT: To clarify the question: I just want to strip out the URLs from the file minus the variable name, equals sign and double quotes so I have a new file that is just a list of URLs, one per line.

Tags: regex url parsing image

4 Answers

放我归山

#2 · 2019-07-25 14:38

If the format is constant, then this should work (Python):

import re
s = """bellpepper = "http://images.com/bellpepper.jpg" (...) """
# Capture everything from http:// up to the next double quote
print(re.findall(r'"(http://.+?)"', s))

Note: this is not a general "find an image in a file" regexp, just an answer to the question :)
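To get the one-URL-per-line list the question asks for, the same pattern could be wrapped in a small helper. This is only a sketch: extract_urls is a made-up name for illustration, and it assumes every URL sits inside double quotes and starts with http://.

```python
import re

# Assumption: every URL is wrapped in double quotes and starts with http://
URL_IN_QUOTES = re.compile(r'"(http://.+?)"')

def extract_urls(text):
    """Return the quoted http:// URLs found in text, in order of appearance."""
    return URL_IN_QUOTES.findall(text)

sample = '''bellpepper = "http://images.com/bellpepper.jpg"
cabbage = "http://images.com/cabbage.jpg"
'''

# One URL per line, ready to be written to a new file
urls_text = "\n".join(extract_urls(sample))
print(urls_text)
```

Writing urls_text to a new file then gives the plain list, without the names, equals signs or quotes.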


爱情/是我丢掉的垃圾

#3 · 2019-07-25 14:46

Try this. It matches http:// followed by any run of letters, digits, slashes, backslashes and dots:

(http://)([a-zA-Z0-9\/\\.])*
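As a rough illustration of how this pattern behaves, here it is applied in Python to one line from the question. Python is just my choice for the demo; the answer does not name a language. Note that characters such as "-" or "_" are not in the character class, so a URL containing them would be cut short.

```python
import re

# The pattern from this answer, unchanged
pattern = re.compile(r'(http://)([a-zA-Z0-9\/\\.])*')

line = 'lettuce = "http://images.com/lettuce.jpg"'

# group(0) is the whole match; the second group only keeps the last
# character matched by the repeated class, so it is not useful on its own
match = pattern.search(line)
url = match.group(0)
print(url)

# A hyphen is outside the character class, so the match stops in front of it
truncated = pattern.search('x = "http://images.com/bell-pepper.jpg"').group(0)
print(truncated)
```

So with this pattern, take the whole match rather than the individual groups.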


Anthone

#4 · 2019-07-25 14:49

Do you mean that your document has that kind of format and you just want to get the http part? You can simply split on the " = " delimiter, without a regex:

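<?php
$f = fopen("file", "r");
if ($f) {
    // Read line by line; fgets() returns false at end of file
    while (($line = fgets($f)) !== false) {
        // Split into name and value at the first " = " only
        $s = explode(" = ", trim($line), 2);
        if (count($s) === 2) {
            // Drop the double quotes and print one URL per line
            print str_replace('"', '', $s[1]) . "\n";
        }
    }
    fclose($f);
}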

On the command line:

#php5 myscript.php > newfile.ext

If you are using a language other than PHP, there are similar string-splitting methods you can use, e.g. split() in Python or Perl. Check your language's documentation for details.
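For instance, a Python version of the same splitting idea might look roughly like this. It is a sketch only: urls_by_split is a hypothetical name, and it assumes the delimiter is exactly " = " as in the question's sample.

```python
def urls_by_split(lines):
    """Yield the part after ' = ' on each line, with the double quotes removed."""
    for line in lines:
        # Split at the first " = " only, so the value itself is never split
        parts = line.strip().split(" = ", 1)
        if len(parts) == 2:  # skip blank lines or lines without the delimiter
            yield parts[1].strip('"')

sample = [
    'lettuce = "http://images.com/lettuce.jpg"\n',
    '\n',
    'pumpkin = "http://images.com/pumpkin.jpg"\n',
]
print("\n".join(urls_by_split(sample)))
```

An open file object can be passed in place of sample, since iterating over a file yields its lines.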


聊天终结者

#5 · 2019-07-25 14:54

You may try this, if your tool supports positive lookbehind:

/(?<=")[^"\n]+/
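Python's re module is one tool that supports this kind of lookbehind, so a quick check could look like the sketch below (the sample text is taken from the question). One caveat: if anything follows the closing quote on the same line, such as a semicolon, that trailing text would be matched as well.

```python
import re

text = '''cabbage = "http://images.com/cabbage.jpg"
pumpkin = "http://images.com/pumpkin.jpg"
'''

# (?<=") requires a double quote right before the match without consuming it;
# [^"\n]+ then takes everything up to the next quote or the end of the line
urls = re.findall(r'(?<=")[^"\n]+', text)
print("\n".join(urls))
```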
