iOS: How to find the full RSS feed link with NSScanner

Posted 2019-07-01 10:19


I am working on a project that fetches data from RSS feeds. From searching on Google, I found that the RSS link generally appears in this format in the HTML source:

<link rel="alternate" type="application/rss+xml" title="RSS Feed" href="" />

So I have to use the NSScanner class to find the RSS feed link in the HTML source, but I don't know the proper pattern or what to pass to scanUpToString:, characterSetWithCharactersInString:, and so on. Please help me find the full link of the RSS feed.

Here is my try:

- (void)viewDidLoad {
    [super viewDidLoad];
    NSString *googleString = @"";
    NSURL *googleURL = [NSURL URLWithString:googleString];
    NSError *error = nil;
    NSString *googlePage = [NSString stringWithContentsOfURL:googleURL
                                                    encoding:NSASCIIStringEncoding
                                                       error:&error];

    NSLog(@"%@", [self yourStringArrayWithHTMLSourceString:googlePage]); // returns an NSMutableArray
}

- (NSMutableArray *)yourStringArrayWithHTMLSourceString:(NSString *)html {
    NSString *from = @"<a href=\"";
    NSString *to = @"</a>";
    NSMutableArray *array = [[NSMutableArray alloc] init];

    NSScanner *scanner = [NSScanner scannerWithString:html];

    [scanner scanUpToString:@"<link" intoString:nil];
    if (![scanner isAtEnd]) {
        NSString *url = nil;

        [scanner scanUpToString:@"RSS Feed" intoString:nil];
        NSCharacterSet *charset = [NSCharacterSet characterSetWithCharactersInString:@"/>"];
        [scanner scanUpToCharactersFromSet:charset intoString:nil];
        [scanner scanCharactersFromSet:charset intoString:nil];
        [scanner scanUpToCharactersFromSet:charset intoString:&url];
        // "url" is expected to hold the RSS feed URL
        if (url) {
            [array addObject:url];
        }
    }

    return array;
}

Currently I am able to find only part of the link with this code.


But the full link is:


That is because scanUpToCharactersFromSet: stops at the first character that belongs to the set, and

[NSCharacterSet characterSetWithCharactersInString:@"/>"];

contains the character "/", which is the last character of http:// and also the character right after
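To make the truncation concrete, here is a minimal Swift sketch that replays the same scanning steps on a single tag. The example.com URL is a placeholder assumed for illustration (the real feed URL is not shown in the question), and the calls use the `Scanner` API available in Foundation on iOS 13 / macOS 10.15 and later.

```swift
import Foundation

// Hypothetical tag; the example.com URL is a placeholder, not the asker's feed.
let tag = "<link rel=\"alternate\" type=\"application/rss+xml\" title=\"RSS Feed\" href=\"http://example.com/feed.xml\" />"

let scanner = Scanner(string: tag)
let stopSet = CharacterSet(charactersIn: "/>")

_ = scanner.scanUpToString("RSS Feed")                   // move to the title text
_ = scanner.scanUpToCharacters(from: stopSet)            // stops at the first "/" of "http://"
_ = scanner.scanCharacters(from: stopSet)                // consumes "//"
let partial = scanner.scanUpToCharacters(from: stopSet)  // stops at the next "/" inside the URL

print(partial ?? "nil")  // prints "example.com"
```

Only the host survives, because every "/" in the URL counts as a stop character.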

Edit: Here's a playground that shows the approach you could take. (Not fully tested.)

It's in Swift but the API is the same in Obj-C.

import Foundation

let str = "<link rel=\"alternate\" type=\"application/rss+xml\" title=\"RSS Feed\" href=\"\" />"

let scanner = NSScanner(string: str)
var result: NSString? = nil

// Skip to the href attribute, step over its opening quote, then read up to the end of the tag.
scanner.scanUpToString("href=\"", intoString: nil)
scanner.scanString("href=\"", intoString: nil)
scanner.scanUpToString("\" />", intoString: &result)
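In current Swift the class is spelled `Scanner`, and on iOS 13 / macOS 10.15 and later the scan methods return optionals instead of filling an out-parameter. As a rough sketch of how the same idea could be applied to a whole page, the hypothetical helper `rssFeedLinks(in:)` below (the name is an assumption, not part of any API) walks every `<link` tag, keeps only those mentioning `application/rss+xml`, and reads the `href` up to its closing quote rather than up to `" />`, so it does not depend on the exact spacing at the end of the tag. It assumes double-quoted attributes and no `>` inside attribute values.

```swift
import Foundation

/// Returns the href of every <link> tag whose type is application/rss+xml.
/// `rssFeedLinks(in:)` is a hypothetical helper name used for illustration.
func rssFeedLinks(in html: String) -> [String] {
    var links: [String] = []
    let scanner = Scanner(string: html)
    scanner.charactersToBeSkipped = nil  // treat whitespace as ordinary text

    while !scanner.isAtEnd {
        // Jump to the next "<link"; stop when there are none left.
        _ = scanner.scanUpToString("<link")
        guard scanner.scanString("<link") != nil else { break }

        // Capture the rest of this tag, up to (not including) ">".
        guard let tag = scanner.scanUpToString(">") else { continue }
        guard tag.contains("application/rss+xml") else { continue }

        // Inside the tag, read from href=" up to the closing quote.
        let tagScanner = Scanner(string: tag)
        tagScanner.charactersToBeSkipped = nil
        _ = tagScanner.scanUpToString("href=\"")
        guard tagScanner.scanString("href=\"") != nil else { continue }
        // An empty href yields nil here and is skipped.
        if let href = tagScanner.scanUpToString("\"") {
            links.append(href)
        }
    }
    return links
}
```

Called on a page with one stylesheet link and one RSS link, this should return an array holding just the RSS URL.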


Use the approach from this reference, but match "link" tags instead of "a" tags.

Reference : Regular expression in ios to extract href url and discard rest of anchor tag
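For comparison, a regular-expression version adapted to `link` tags might look like the sketch below. The pattern and the helper name `rssLinksUsingRegex(in:)` are assumptions made for illustration and are not taken from the referenced answer. The pattern expects double-quoted attributes with `type` appearing before `href`, as in the tag shown in the question, so it will miss tags written in a different attribute order.

```swift
import Foundation

/// Hypothetical helper: pulls href values out of RSS <link> tags with a regex.
/// Assumes double-quoted attributes, with `type` appearing before `href`.
func rssLinksUsingRegex(in html: String) -> [String] {
    let pattern = "<link\\b[^>]*type=\"application/rss\\+xml\"[^>]*href=\"([^\"]*)\""
    guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
        return []
    }
    let range = NSRange(html.startIndex..., in: html)
    return regex.matches(in: html, options: [], range: range).compactMap { match -> String? in
        // Capture group 1 holds the text between the href quotes.
        guard let hrefRange = Range(match.range(at: 1), in: html) else { return nil }
        return String(html[hrefRange])
    }
}
```

Regular expressions are fragile on real-world HTML, so this is only reasonable when the markup is known to follow the format above.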