I'm trying, for the first time in my life, to contribute to open source software. Therefore I'm trying to help out on this ticket, as it seems to be a good "beginner ticket".
I have successfully retrieved the string from the Twitter API; however, it's in this format:
<a href="http://twitter.com" rel="nofollow">Tweetie for Mac</a>
What I want to extract from this string is the URL (http://twitter.com) and the name of the Twitter client (Tweetie for Mac). How can I do this in Objective-C? As the URLs aren't the same, I can't search for a specified index, and the same applies to the client name.
I haven't looked at the Adium source, but you should check whether there are any categories available that extend e.g. NSString with methods for parsing HTML/XML into more usable structures, such as a node tree. Then you could simply walk the tree and search for the required attributes.
If not, you could parse it yourself by dividing the string into tokens (tag open, tag close, tag attributes, quoted strings and so on) and then looking for the required attributes. Alternatively, you could even use a regular expression if the strings always consist of a single HTML anchor element.
I know it's been discussed many times that regular expressions simply don't work for HTML parsing, but this is a specific scenario where using one is actually reasonable, and better than running a full-blown, generic HTML/XML parser. That would be, as slycrel said, overkill.
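For what it's worth, here is a rough sketch of the regular-expression route. It assumes NSRegularExpression is available on your deployment target (it only exists on newer OS releases) and that the string is always a single anchor element like the one in the question. ParseSourceAnchor and the dictionary keys "url" and "name" are made-up names for illustration, not anything from Adium or the Twitter API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Adium): returns a dictionary with the keys
// @"url" and @"name", or nil if the input is not a single anchor element of
// the expected shape.
static NSDictionary *ParseSourceAnchor(NSString *html) {
    if (html == nil) return nil;

    // Group 1: the value of the href attribute.
    // Group 2: the text between the opening tag and </a>.
    NSString *pattern = @"<a\\s[^>]*href=\"([^\"]*)\"[^>]*>([^<]*)</a>";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) return nil;

    NSTextCheckingResult *match =
        [regex firstMatchInString:html
                          options:0
                            range:NSMakeRange(0, html.length)];
    if (match == nil) return nil;

    return @{ @"url"  : [html substringWithRange:[match rangeAtIndex:1]],
              @"name" : [html substringWithRange:[match rangeAtIndex:2]] };
}
```

Calling ParseSourceAnchor with the string from the question should give you the URL under @"url" and the client name under @"name". A nil result means the input didn't look like a single anchor, so you can fall back to showing the raw string.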
Assuming you already have the HTML link and aren't parsing an entire HTML page, you know that the start of the string, up to the first ", will always be the same, so what you really want is to search for the first " and for the closing > that ends the opening a tag.
The easiest way to do this would be to find what is in the quotes (see this link for how to search NSStrings) and then get the text between the second-to-last > and the closing </a> for your actual name.
You could also use an NSXMLParser, as that works on XML specifically, but that may be overkill for this case.
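The plain string-search approach could look roughly like this. It is only a sketch: ExtractLinkParts is a hypothetical name, and it assumes the href value is the first quoted string in the input and that the client name contains no > character.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: fills *outURL and *outName and returns YES on success,
// or returns NO if the string does not have the expected shape.
static BOOL ExtractLinkParts(NSString *html, NSString **outURL, NSString **outName) {
    if (html == nil) return NO;

    // URL: the text between the first and the second quotation mark.
    NSRange openQuote = [html rangeOfString:@"\""];
    if (openQuote.location == NSNotFound) return NO;
    NSUInteger urlStart = NSMaxRange(openQuote);
    NSRange closeQuote = [html rangeOfString:@"\""
                                     options:0
                                       range:NSMakeRange(urlStart, html.length - urlStart)];
    if (closeQuote.location == NSNotFound) return NO;

    // Name: the text between the second-to-last '>' (the one that ends the
    // opening tag) and the start of the closing tag.
    NSRange closeTag = [html rangeOfString:@"</" options:NSBackwardsSearch];
    if (closeTag.location == NSNotFound) return NO;
    NSRange tagEnd = [html rangeOfString:@">"
                                 options:NSBackwardsSearch
                                   range:NSMakeRange(0, closeTag.location)];
    if (tagEnd.location == NSNotFound) return NO;
    NSUInteger nameStart = NSMaxRange(tagEnd);

    // The opening tag must end after the quoted URL, otherwise the input is
    // not shaped the way this sketch expects.
    if (nameStart < NSMaxRange(closeQuote)) return NO;

    if (outURL != NULL) {
        *outURL = [html substringWithRange:
                      NSMakeRange(urlStart, closeQuote.location - urlStart)];
    }
    if (outName != NULL) {
        *outName = [html substringWithRange:
                       NSMakeRange(nameStart, closeTag.location - nameStart)];
    }
    return YES;
}
```

You would pass the addresses of two NSString variables and check the return value before using them. If the function returns NO, the string wasn't a link of this form and you can display it unchanged.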