parsing HTML on the iPhone [closed]

Posted 2018-12-31 15:13

Can anyone recommend a C or Objective-C library for HTML parsing? It needs to handle messy HTML code that won't quite validate.

Does such a library exist, or am I better off just trying to use regular expressions?

9 answers
深知你不懂我心
#2 · 2018-12-31 16:10

I wrote a lightweight wrapper around libxml which may be useful:

Objective-C-HMTL-Parser

浪荡孟婆
#3 · 2018-12-31 16:11

In case anyone got here by googling for a nice XPath parser and went with TFHpple: note that TFHpple uses XPathQuery, which is pretty good but has a memory leak.

In the function *PerformXPathQuery, if the nodes are found to be nil, it jumps out before cleaning up.

So where you see this bit of code, add the two cleanup lines shown under the /* Cleanup */ comment:

  xmlNodeSetPtr nodes = xpathObj->nodesetval;
  if (!nodes)
    {
      NSLog(@"Nodes was nil.");
      /* Cleanup */
      xmlXPathFreeObject(xpathObj);
      xmlXPathFreeContext(xpathCtx);
      return nil;
    }

If you are doing a LOT of parsing, it's a vicious leak. Now.... how do I get my night back :-)

泛滥B
#4 · 2018-12-31 16:11

You may want to check out ElementParser. It provides "just enough" parsing of HTML and XML. Nice interfaces make walking around XML / HTML documents very straightforward. http://touchtank.wordpress.com/
