Why does Chrome request a robots.txt?

Posted 2019-04-06 22:01

Question:

I have noticed in my logs that Chrome requested a robots.txt file alongside the requests I expected.

[...]
2017-09-17 15:22:35 - (sanic)[INFO]: Goin' Fast @ http://0.0.0.0:8080
2017-09-17 15:22:35 - (sanic)[INFO]: Starting worker [26704]
2017-09-17 15:22:39 - (network)[INFO][127.0.0.1:36312]: GET http://localhost:8080/  200 148
2017-09-17 15:22:39 - (sanic)[ERROR]: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/sanic/app.py", line 493, in handle_request
    handler, args, kwargs, uri = self.router.get(request)
  File "/usr/local/lib/python3.5/dist-packages/sanic/router.py", line 307, in get
    return self._get(request.path, request.method, '')
  File "/usr/local/lib/python3.5/dist-packages/sanic/router.py", line 356, in _get
    raise NotFound('Requested URL {} not found'.format(url))
sanic.exceptions.NotFound: Requested URL /robots.txt not found

2017-09-17 15:22:39 - (network)[INFO][127.0.0.1:36316]: GET http://localhost:8080/robots.txt  404 42
[...]

I am running Chromium:

60.0.3112.113 (Developer Build) Built on Ubuntu, running on Ubuntu 16.04 (64-bit)

Why is this happening? Can someone elaborate?

Answer 1:

There is a possibility that it was not your website requesting the robots.txt file, but one of your Chrome extensions (such as the Wappalyzer you mentioned). That would explain why it only happens in Chrome.

To know for sure, you could check the Network tab of Chrome's DevTools to see at which point the request is made and whether it comes from one of your scripts.
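Independently of which extension turns out to be responsible, you can stop the NotFound traceback from cluttering the Sanic log by serving a stub robots.txt yourself. Below is a minimal sketch; the app name, handler name, and the permissive rules are illustrative assumptions, not part of the original setup.

from sanic import Sanic
from sanic.response import text

app = Sanic("robots_demo")  # illustrative app name

@app.route("/robots.txt")
async def robots_txt(request):
    # Serve a permissive stub so the request returns 200 instead of raising NotFound.
    return text("User-agent: *\nDisallow:\n")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

With a route like this in place, the GET /robots.txt seen in the log above would be answered normally rather than producing the 404 traceback.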



Answer 2:

For Chrome, there is an extension (SeeRobots) that checks whether a robots.txt defines rules for search engines, etc. Perhaps you have this extension installed?

https://chrome.google.com/webstore/detail/seerobots/hnljoiodjfgpnddiekagpbblnjedcnfp?hl=de