How to get a list of all paths/files on a webpage

Posted 2019-09-01 04:47

别忘想泡老子

Question:

I run wget -p $url to download all the files a webpage uses, so that I can build a list of them. For some URLs, however, wget fetches only index.html. Is there a way to get a list of the files at a specific URL with wget or cURL? Do I need to inspect the request and response headers?

Answer 1:

Some servers do not allow directory listings to be browsed. Even where listings are enabled, a default document in the directory (such as index.html) is served in place of the listing, so you cannot browse the directory in that case either.

You need to implement a spider that parses the HTML for every path, file, and link it declares or uses, and builds a directory structure from them. You can then download those files.
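
As a rough illustration of the parsing step of such a spider, here is a minimal Python sketch using only the standard library. It is not a full crawler: it assumes the HTML has already been fetched, it looks only at href and src attributes (an illustrative choice, not an exhaustive one), and the names LinkCollector and list_referenced_urls are hypothetical helpers made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse

# Attributes that commonly point at other files. This set is an assumption
# for the sketch; real pages also reference files via srcset, CSS url(), etc.
LINK_ATTRS = {"href", "src"}


class LinkCollector(HTMLParser):
    """Collects absolute http(s) URLs referenced by href/src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in LINK_ATTRS and value:
                # Resolve relative paths against the page URL and drop "#fragment".
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                # Skip non-web schemes such as mailto: or javascript:.
                if urlparse(absolute).scheme in ("http", "https"):
                    self.urls.add(absolute)


def list_referenced_urls(html, base_url):
    """Return a sorted list of the files and links declared in the HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return sorted(collector.urls)


if __name__ == "__main__":
    sample = (
        '<a href="/docs/a.html#top">a</a>'
        '<img src="img/logo.png">'
        '<a href="mailto:someone@example.com">mail</a>'
    )
    for url in list_referenced_urls(sample, "http://example.com/dir/index.html"):
        print(url)
```

To turn this into a spider, the page body could be fetched first (for example with urllib.request.urlopen) and the function applied again to each same-site HTML page it returns, keeping a set of already-visited URLs. Files referenced only from JavaScript or CSS would not be found by this sketch.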