Regex matching image URL with spaces

Posted 2019-04-17 15:19

Question:

I need to match an image URL like this:

http://site.com/site.com/files/images/img (5).jpg

Something like this works fine:

.replace(/(http:\/\/([ \S]+\.(jpg|png|gif)))/ig, "<div style=\"background: url($1)\"></div>")

Except if I have something like this:

http://site.com/site.com/files/audio/audiofile.mp3 http://site.com/site.com/files/images/img (5).jpg

How do I match only the image?

Thanks!

Edit: And I'm using JavaScript.

Answer 1:

Proper URLs should not contain literal spaces; a space should be percent-encoded as %20 (a plus sign '+' stands for a space only in form-encoded query strings, not in the path). If your URLs were written that way, matching would be much easier.
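
As a rough illustration of that point, here is a minimal sketch that assumes you can encode each URL before it ends up in the text you search. The sample strings and variable names are made up for the example; encodeURI is the standard built-in.

```javascript
// Hypothetical sample: one raw URL that contains a literal space.
const rawUrl = "http://site.com/site.com/files/images/img (5).jpg";

// encodeURI percent-encodes the space and leaves ":", "/", "(" and ")" untouched.
const encodedUrl = encodeURI(rawUrl);

// Once no URL contains a literal space, \S+ cannot run from one URL into the next.
const imagePattern = /http:\/\/\S+\.(?:jpg|png|gif)/gi;

const text =
  "http://site.com/site.com/files/audio/audiofile.mp3 " + encodedUrl;

// Only the image URL is returned; the .mp3 URL is skipped.
const matches = text.match(imagePattern);
console.log(matches);
```

This only helps if you control how the URLs are written; it does nothing for text that already contains unencoded spaces.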



Answer 2:

Assuming images will always be in the 'images' directory, try:

http://.*/images/(.*?)\.(jpe?g|gif|png)

If you can't assume an images directory:

http://.*/(.*?)\.(jpe?g|gif|png)

Groups 1 and 2 should contain what you want (file name and extension). Note that the dot before the extension must be escaped, otherwise it matches any character.

I tested the regular expression here and here and it appears to do what you want.
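
For what it's worth, a quick sketch of the 'images' directory variant in JavaScript, with the slashes escaped for a regex literal and the dot before the extension escaped. The input is the string from the question; the variable names are just for illustration.

```javascript
const input =
  "http://site.com/site.com/files/audio/audiofile.mp3 " +
  "http://site.com/site.com/files/images/img (5).jpg";

// Greedy .* runs to the last "/images/", then the lazy group takes the file name.
const pattern = /http:\/\/.*\/images\/(.*?)\.(jpe?g|gif|png)/i;

const m = input.match(pattern);
const fileName = m[1];   // file name without extension
const extension = m[2];  // extension without the dot
const wholeMatch = m[0]; // starts at the FIRST http://, so it includes the .mp3 URL

console.log(fileName, extension);
```

So the groups give you the file name and extension, but the whole match is not the image URL on its own, which matters if you use $1-style replacement on the full URL as in the question.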



Answer 3:

Why not:

/([^/]+\.(jpg|png|gif))$
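
A small sketch of how this behaves in JavaScript, assuming the image is the last thing in the string (that is what the $ anchor requires). In a regex literal the slashes have to be escaped; the reversed input is a made-up counterexample.

```javascript
const input =
  "http://site.com/site.com/files/audio/audiofile.mp3 " +
  "http://site.com/site.com/files/images/img (5).jpg";

// Last path segment ending in an image extension, anchored to the end of the string.
const pattern = /\/([^\/]+\.(jpg|png|gif))$/i;

const m = input.match(pattern);
const fileWithExtension = m[1];

// Counterexample: if the image URL is not last, the anchored pattern finds nothing.
const reversed =
  "http://site.com/site.com/files/images/img (5).jpg " +
  "http://site.com/site.com/files/audio/audiofile.mp3";
const reversedMatch = reversed.match(pattern);

console.log(fileWithExtension, reversedMatch);
```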


Answer 4:

Using

http:\/\/.*\/(.*)\.(jpg|png|gif)

should do the trick if all you want is the name of the image. The first group is the file name and the second group is the file extension.



Answer 5:

Can you assume that the URLs will be space-delimited or newline-delimited?

As in, can you assume this input?

site.com/images/images/lol (5).jpg
site.com/images/other/radio.mp3
site.com/images/images/copter (3).jpg

If your delimiter can also appear inside the strings you want to return, things get tricky. What kind of volume are you talking about here? Could you do it semi-manually at all, or does the process have to be automated?
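
If the input really is newline-delimited like the sample above, a sketch along these lines might be enough. The second half covers the space-delimited string from the question and rests on a further assumption that is not guaranteed anywhere in the thread: that every URL starts with "http://", so a split point can be recognised by what follows the whitespace.

```javascript
// Case 1: one URL per line (the sample input from this answer).
const lines = [
  "site.com/images/images/lol (5).jpg",
  "site.com/images/other/radio.mp3",
  "site.com/images/images/copter (3).jpg",
].join("\n");

// No "g" flag, so test() keeps no state between calls.
const imageExtension = /\.(?:jpe?g|png|gif)$/i;

const imageLines = lines
  .split(/\r?\n/)
  .map((line) => line.trim())
  .filter((line) => imageExtension.test(line));

// Case 2 (assumption: every URL begins with "http://").
// Split only on whitespace that is directly followed by "http://",
// so the space inside "img (5).jpg" is left alone.
const spaceDelimited =
  "http://site.com/site.com/files/audio/audiofile.mp3 " +
  "http://site.com/site.com/files/images/img (5).jpg";

const imageUrls = spaceDelimited
  .split(/\s+(?=http:\/\/)/)
  .filter((url) => imageExtension.test(url));

console.log(imageLines, imageUrls);
```

Splitting first and filtering afterwards keeps each regex small, at the cost of depending on the delimiter assumption holding for all of your data.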



Answer 6:

This would be an approach:

^((\w+):)?\/\/((\w|\.)+(:\d+)?)[^:]+\.(jpe?g|gif|png)$

This matches on the colon (:), which in this case is only accepted after the protocol and before the optional port.

This will not match:

http://site.com/site.com/files/audio/audiofile.mp3 http://site.com/site.com/files/images/img (5).jpg

This will match (colon in the second http:// removed):

http://site.com/site.com/files/audio/audiofile.mp3 http//site.com/site.com/files/images/img (5).jpg

Here "/audiofile.mp3 http/" counts as a folder inside "/audio/".

It's not foolproof. There are other characters that are not allowed in file names (* | " < >).
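
To make the behaviour above concrete, here is a short check of the pattern in JavaScript against the strings discussed in this answer, plus the single image URL from the question. The variable names are illustrative only.

```javascript
// The pattern from this answer, written as a JavaScript regex literal.
const pattern = /^((\w+):)?\/\/((\w|\.)+(:\d+)?)[^:]+\.(jpe?g|gif|png)$/;

const single = "http://site.com/site.com/files/images/img (5).jpg";

const twoUrls =
  "http://site.com/site.com/files/audio/audiofile.mp3 " +
  "http://site.com/site.com/files/images/img (5).jpg";

// Same as above, but with the colon removed from the second "http://".
const colonRemoved =
  "http://site.com/site.com/files/audio/audiofile.mp3 " +
  "http//site.com/site.com/files/images/img (5).jpg";

const singleOk = pattern.test(single);             // a lone image URL is accepted
const twoUrlsOk = pattern.test(twoUrls);           // rejected: [^:]+ cannot cross the second colon
const colonRemovedOk = pattern.test(colonRemoved); // accepted: the rest is read as one long path

console.log(singleOk, twoUrlsOk, colonRemovedOk);
```

Because of the ^ and $ anchors, this validates a whole string rather than pulling an image URL out of a longer one.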