I am building a crawler application. I want to crawl websites and determine the depth of each webpage retrieved. I have read about various crawling and parsing tools, but none of them seem to provide support for calculating depth. I am also unsure which crawler tool comes closest to the desired functionality. Any help is appreciated.
The most important thing is probably how you model your domain, not which parser you use. If you represent the crawled pages as a tree (see the Wikipedia article on tree data structures), calculating the depth of a URL is straightforward: a breadth-first crawl assigns each page the length of the shortest link path from the start URL, which is exactly its minimum depth. A sketch of this approach follows below.
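Here is a minimal sketch of that idea in Python, using only the standard library. The `max_depth` limit, the same-host filter, and the naive HTML decoding are assumptions on my part, not requirements from your question; a production crawler would also need robots.txt handling, politeness delays, and better error handling:

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href values of all <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_depths(start_url, max_depth=3):
    """Breadth-first crawl from start_url; returns {url: depth}.

    Because BFS reaches every URL for the first time along a
    shortest link path, the recorded depth is the minimum depth.
    """
    depths = {start_url: 0}
    queue = deque([start_url])
    host = urlparse(start_url).netloc

    while queue:
        url = queue.popleft()
        depth = depths[url]
        if depth >= max_depth:
            continue  # don't expand pages at the depth limit
        try:
            with urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except Exception:
            continue  # skip pages that fail to load
        parser = LinkExtractor()
        parser.feed(parser.unescape(html) if False else html)
        for href in parser.links:
            child = urljoin(url, href).split("#")[0]
            # stay on the same site (assumption: depth is per-site)
            if urlparse(child).netloc != host:
                continue
            if child not in depths:  # first visit = shortest path
                depths[child] = depth + 1
                queue.append(child)
    return depths


if __name__ == "__main__":
    for url, d in crawl_depths("https://example.com").items():
        print(d, url)
```

The key design point is that the `depths` dictionary doubles as the visited set: a URL is assigned a depth exactly once, on first discovery, so cycles in the link graph can't inflate the result. If you prefer an off-the-shelf tool, Scrapy tracks the same notion of depth for you via its DepthMiddleware, exposed as `response.meta['depth']`.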
Hope this helps.