How to grep for a URL in a file?

Posted 2019-08-03 06:53

For example, I have a huge HTML file that contains an img URL: http://ex.example.com/hIh39j+ud9wr4/Uusfh.jpeg

I want to extract this URL, assuming it is the only URL in the entire file.

cat file.html | grep -o 'http://ex[a-zA-Z.-]*/[a-zA-Z.-]*/[a-zA-Z.,-]*'

This works only if the URL contains no plus signs.

How do I make it work with + signs as well?

Answer 1:

You missed 0-9 in the character classes (and that is a useless use of cat):

grep -o 'http://ex[a-zA-Z.-]*/[a-zA-Z0-9+-]*/[a-zA-Z0-9.,+-]*' file.html
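To see the effect of adding 0-9 and +, the pattern from the question and a widened one can be run side by side on the sample sentence. This is only a minimal sketch: the temporary file and the variable names are made up for illustration, and the hyphen is kept last inside each bracket expression so that it is read as a literal character rather than as a range.

```shell
# Sample input: the sentence from the question, written to a temporary file.
# The file and variable names here are hypothetical.
sample=$(mktemp)
printf '%s\n' 'I have a huge HTML file that contains img URL: http://ex.example.com/hIh39j+ud9wr4/Uusfh.jpeg' > "$sample"

# Pattern from the question: the second path segment allows no digits or '+',
# so the second '/' is never reached and grep finds nothing.
# "|| true" keeps the script going when grep exits with status 1 (no match).
before=$(grep -o 'http://ex[a-zA-Z.-]*/[a-zA-Z.-]*/[a-zA-Z.,-]*' "$sample" || true)

# Widened pattern: 0-9 and '+' added, '-' kept last so it stays literal.
after=$(grep -o 'http://ex[a-zA-Z.-]*/[a-zA-Z0-9+-]*/[a-zA-Z0-9.,+-]*' "$sample" || true)

printf 'before: [%s]\nafter:  [%s]\n' "$before" "$after"
rm -f "$sample"
```

With this input, before stays empty and after holds the full URL.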

A slight improvement: use -i for case insensitivity and match only .jpg or .jpeg images.

grep -iEo 'http://ex[a-z.-]*/[a-z0-9+-]*/[a-z0-9.,+-]*\.jpe?g' file.html
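A bracket expression matches exactly one character from its set, so a file extension has to be spelled out as a sequence instead. The sketch below assumes extended regular expressions (-E), where `e?` means an optional e and `\.` is a literal dot. The three sample URLs are invented for illustration.

```shell
# Three made-up URLs (hypothetical sample data): .jpeg, .JPG and .png.
urls='http://ex.example.com/hIh39j+ud9wr4/Uusfh.jpeg
http://ex.example.com/a1+b2/Photo.JPG
http://ex.example.com/a1+b2/icon.png'

# -E: extended regex, so "e?" is an optional "e" and "\." is a literal dot.
# -i: ignore case; -o: print only the matched part of each line.
images=$(printf '%s\n' "$urls" | grep -iEo 'http://ex[a-z.-]*/[a-z0-9+-]*/[a-z0-9.,+-]*\.jpe?g' || true)

printf '%s\n' "$images"
```

Only the .jpeg and .JPG URLs are printed; the .png line produces no match.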

Or how about just:

grep -iEo 'http://ex.example.*\.jpe?g' file.html

The following fixes your regex for this particular case (it adds digits and plus signs):

http://ex[a-zA-Z.-]*/[a-zA-Z0-9.+-]*/[a-zA-Z0-9.+-]*

A quick check, with the sentence from the question saved as file.html:

echo "For example, I have a huge HTML file that contains img URL: http://ex.example.com/hIh39j+ud9wr4/Uusfh.jpeg" > file.html

cat file.html | grep -o 'http://ex[a-zA-Z.-]*/[a-zA-Z0-9.+-]*/[a-zA-Z0-9.+-]*'

Output:

http://ex.example.com/hIh39j+ud9wr4/Uusfh.jpeg

This does not extract every valid URL. There are many other answers on this site about URL matching.
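For a looser match that is not tied to this one host and path layout, one common approach is to take everything after the scheme up to the first quote, angle bracket, or whitespace. This is a rough sketch rather than a real URL validator, and the HTML fragment is invented for illustration.

```shell
# Hypothetical HTML fragment used only for illustration.
html='<img src="http://ex.example.com/hIh39j+ud9wr4/Uusfh.jpeg"> <a href="https://example.org/page?id=7">link</a>'

# Match http:// or https://, then any run of characters that are not
# a double quote, a single quote, an angle bracket, or whitespace.
found=$(printf '%s\n' "$html" | grep -Eo "https?://[^\"'<>[:space:]]+" || true)

printf '%s\n' "$found"
```

Stopping at quotes and angle brackets works for URLs that sit inside HTML attributes; URLs in free text may still pick up trailing punctuation.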

Source: How to grep for a URL in a file?