Remove HTML Tags from an NSString on the iPhone

Posted 2018-12-31 09:09

There are a couple of different ways to remove HTML tags from an NSString in Cocoa.

One way is to render the string into an NSAttributedString and then grab the rendered text.

Another way is to use NSXMLDocument's -objectByApplyingXSLTString:arguments:error: method to apply an XSLT transform that does it.

Unfortunately, the iPhone doesn't support NSAttributedString or NSXMLDocument. There are too many edge cases and malformed HTML documents for me to feel comfortable using regex or NSScanner. Does anyone have a solution to this?

One suggestion has been to simply look for opening and closing tag characters, but this method won't work except for very trivial cases.

For example, these cases (from the Perl Cookbook chapter on the same subject) would break this method:

<IMG SRC = "foo.gif" ALT = "A > B">

<!-- <A comment> -->

<script>if (a<b && a>c)</script>

<![INCLUDE CDATA [ >>>>>>>>>>>> ]]>
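
To make the failure concrete, here is a minimal sketch of the "look for opening and closing tag characters" idea run against the cases above. It assumes Foundation's built-in regex search option (NSRegularExpressionSearch) is available; NaiveStrip is a hypothetical helper name used only for this illustration.

```objc
#import <Foundation/Foundation.h>

// Naive approach: delete everything from a '<' up to the next '>'.
// NaiveStrip is a hypothetical helper, not part of any library.
static NSString *NaiveStrip(NSString *html) {
    return [html stringByReplacingOccurrencesOfString:@"<[^>]+>"
                                           withString:@""
                                              options:NSRegularExpressionSearch
                                                range:NSMakeRange(0, html.length)];
}

int main(void) {
    @autoreleasepool {
        // The four problem cases listed above.
        NSArray<NSString *> *samples = @[
            @"<IMG SRC = \"foo.gif\" ALT = \"A > B\">",
            @"<!-- <A comment> -->",
            @"<script>if (a<b && a>c)</script>",
            @"<![INCLUDE CDATA [ >>>>>>>>>>>> ]]>",
        ];
        for (NSString *sample in samples) {
            // Print each input next to what the naive pattern leaves behind.
            NSLog(@"%@  =>  %@", sample, NaiveStrip(sample));
        }
    }
    return 0;
}
```

Each case leaves debris behind. In the first one, for instance, the match stops at the > inside the ALT attribute, so the tail of the tag ( B">) survives in the output.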

22 answers
忆尘夕之涩
#2 · 2018-12-31 09:25

If you want to get the content of a web page (an HTML document) without the HTML tags, use this code inside UIWebView's webViewDidFinishLoad: delegate method.

  NSString *myText = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.textContent"];
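
For projects that use WKWebView rather than UIWebView, the same idea might look roughly like the sketch below. This is an assumption about how you would adapt it, not part of the original answer: PageTextViewController is a hypothetical class name, and the JavaScript call is asynchronous, so the text arrives in a completion handler.

```objc
#import <UIKit/UIKit.h>
#import <WebKit/WebKit.h>

// Hypothetical view controller that acts as the navigation delegate of its WKWebView.
@interface PageTextViewController : UIViewController <WKNavigationDelegate>
@end

@implementation PageTextViewController

// Called by WebKit once the page has finished loading.
- (void)webView:(WKWebView *)webView didFinishNavigation:(WKNavigation *)navigation {
    [webView evaluateJavaScript:@"document.documentElement.textContent"
              completionHandler:^(id result, NSError *error) {
        if ([result isKindOfClass:[NSString class]]) {
            // The page text with the markup removed.
            NSLog(@"%@", (NSString *)result);
        } else {
            NSLog(@"Could not read page text: %@", error);
        }
    }];
}

@end
```

Note that textContent also includes the text inside script and style elements, so the result may need further cleanup depending on the page.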
谁念西风独自凉
#3 · 2018-12-31 09:28
#import "RegexKitLite.h"

NSString *text = [html stringByReplacingOccurrencesOfRegex:@"<[^>]+>" withString:@""];
公子世无双
#4 · 2018-12-31 09:30
UITextView *textview = [[UITextView alloc] initWithFrame:CGRectMake(10, 130, 250, 170)];
NSString *str = @"This is <font color='red'>simple</font>";
// contentToHTMLString is not a documented UITextView property; it is set here through key-value coding.
[textview setValue:str forKey:@"contentToHTMLString"];
textview.textAlignment = NSTextAlignmentLeft;
textview.editable = NO;
textview.font = [UIFont fontWithName:@"Verdana" size:20.0];
[self.view addSubview:textview]; // assumes this code runs inside a view controller

This works fine for me.

荒废的爱情
#5 · 2018-12-31 09:30

This extends m.kocikowski's and Dan J's answers with more explanation for newcomers.

1# First, create an Objective-C category on NSString so the code is usable from any class.

.h

@interface NSString (NAME_OF_CATEGORY)

- (NSString *)stringByStrippingHTML;

@end

.m

@implementation NSString (NAME_OF_CATEGORY)

- (NSString *)stringByStrippingHTML
{
    // Work on a mutable copy so the receiver is left untouched.
    NSMutableString *outString = [[NSMutableString alloc] initWithString:self];
    NSRange r;

    // Repeatedly delete the first substring that looks like a tag until none remain.
    while ((r = [outString rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
    {
        [outString deleteCharactersInRange:r];
    }

    return outString;
}

@end

2# Then import the .h file of the category you've just created, e.g.

#import "NSString+NAME_OF_CATEGORY.h"

3# Call the method.

NSString* sub = [result stringByStrippingHTML];
NSLog(@"%@", sub);

Here, result is the NSString I want to strip the tags from.

何处买醉
#6 · 2018-12-31 09:30

Another way:

Interface:

- (NSString *)stringByStrippingHTML:(NSString *)inputString;

Implementation:

- (NSString *)stringByStrippingHTML:(NSString *)inputString
{
    // Let the HTML importer parse the markup, then read back the plain text.
    NSAttributedString *attrString = [[NSAttributedString alloc] initWithData:[inputString dataUsingEncoding:NSUTF8StringEncoding] options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)} documentAttributes:nil error:nil];
    NSString *str = [attrString string];

    // Add further replacements here as needed.
    // NSString is immutable, so each result must be assigned back to str.
    str = [str stringByReplacingOccurrencesOfString:@"[" withString:@""];
    str = [str stringByReplacingOccurrencesOfString:@"]" withString:@""];
    str = [str stringByReplacingOccurrencesOfString:@"\n" withString:@""];

    return str;
}

Usage:

cell.exampleClass.text = [self stringByStrippingHTML:[exampleJSONParsingArray valueForKey: @"key"]];

or simply:

NSString *myClearStr = [self stringByStrippingHTML:rudeStr];
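
The implementation above passes nil for the error parameter, so a failed conversion goes unnoticed. A minimal sketch that surfaces the error instead might look like this. PlainTextFromHTML is a hypothetical helper name, and the main-thread restriction in the comment reflects my reading of Apple's documentation for the HTML importer, so treat it as an assumption to verify.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: returns the plain text of an HTML string, or nil on failure.
// As I understand Apple's documentation, the HTML importer should be called
// on the main thread, not from a background thread.
static NSString *PlainTextFromHTML(NSString *html, NSError **outError) {
    NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
    if (data == nil) {
        return nil; // nil input or a string that could not be encoded
    }

    NSDictionary *options = @{
        NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
        NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)
    };

    // On failure the initializer returns nil and fills in *outError (if provided).
    NSAttributedString *attributed =
        [[NSAttributedString alloc] initWithData:data
                                         options:options
                              documentAttributes:nil
                                           error:outError];
    return attributed.string;
}
```

A caller could then write:

```objc
NSError *error = nil;
NSString *plain = PlainTextFromHTML(rudeStr, &error);
if (plain == nil) {
    NSLog(@"HTML conversion failed: %@", error);
}
```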

无与为乐者.
#7 · 2018-12-31 09:30

Here's a blog post that discusses a couple of libraries available for stripping HTML: http://sugarmaplesoftware.com/25/strip-html-tags/ Note the comments, where other solutions are offered.
