NSData to NSString conversion problem

Posted 2019-04-08 13:41

I'm getting an HTML file as NSData and need to extract some parts of it. To do that, I need to convert it to an NSString using UTF-8 encoding. The conversion fails, probably because the NSData contains bytes that are invalid in UTF-8. I have tried getting the byte array of the data and iterating over it, but each time I come across a non-ASCII character (Hebrew letters, for example) I get gibberish.

Help will be appreciated.

UPDATE:

To Gordon - the NSData is generated like this:

    NSData *theData = [NSURLConnection sendSynchronousRequest:theRequest returningResponse:&theResponse error:&theError];

When I say that the conversion fails I mean that

    [[NSString alloc] initWithData:temp encoding:NSUTF8StringEncoding]

returns nil

To Ed - here is my code (I got the Byte array from the NSData, found what I need, constructed another Byte array from that, turned it into NSData, and then attempted to convert it to NSString... sounds kinda complicated...)

    - (NSString *)UTF8StringFromData:(NSData *)theData {
        Byte *arr = (Byte *)[theData bytes];
        NSUInteger begin1 = [self findIndexOf:@"<li>" bArr:arr size:[theData length]] + 4;
        NSUInteger end1 = [self findIndexOf:@"</li></ol>" bArr:arr size:[theData length]];
        // Copy the bytes between the two markers into a new, NUL-terminated buffer.
        Byte *arr1 = (Byte *)malloc(sizeof(Byte) * (end1 - begin1 + 1));
        NSLog(@"%lu %lu", (unsigned long)begin1, (unsigned long)end1);
        NSUInteger j = 0;
        for (NSUInteger i = begin1; i < end1; i++) {
            arr1[j] = arr[i];
            j++;
        }
        arr1[j] = '\0';
        NSData *temp = [NSData dataWithBytes:arr1 length:j];
        free(arr1);  // dataWithBytes:length: copies the bytes, so the buffer can be released

        return [[NSString alloc] initWithData:temp encoding:NSUTF8StringEncoding];
    }

4 Answers
戒情不戒烟
#2 · 2019-04-08 13:54

Have you checked the charset= in the HTTP headers and/or the document itself? The most likely reason for the conversion to fail is that the bytes don't represent a valid UTF-8 string.
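One way to confirm this is to scan the raw bytes and see where they stop being well-formed UTF-8. Below is a minimal sketch in plain C (which also compiles inside an Objective-C file). The function name first_invalid_utf8 is made up for illustration and is not part of any Apple API. A page served in a legacy single-byte Hebrew encoding such as ISO-8859-8 or windows-1255 would typically fail at the first Hebrew letter, since those encodings place the letters at bytes 0xE0 to 0xFA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first byte that breaks
   UTF-8 well-formedness, or len if the whole buffer is valid UTF-8. */
size_t first_invalid_utf8(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t need;       /* continuation bytes expected after the lead byte */
        unsigned int min;  /* smallest code point allowed for this length */
        unsigned int cp;   /* decoded code point */

        if (b < 0x80) { i++; continue; }  /* plain ASCII */
        else if ((b & 0xE0) == 0xC0) { need = 1; min = 0x80;    cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { need = 2; min = 0x800;   cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { need = 3; min = 0x10000; cp = b & 0x07; }
        else return i;  /* stray continuation byte or invalid lead byte */

        if (need > len - i - 1) return i;  /* sequence cut off by end of data */
        for (size_t k = 1; k <= need; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80) return i;  /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        /* Reject overlong forms, surrogates, and values past U+10FFFF. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return i;
        i += need + 1;
    }
    return len;
}
```

From Objective-C it could be called as first_invalid_utf8([theData bytes], [theData length]). A result equal to the length means the whole buffer is valid UTF-8; anything smaller is the offset worth inspecting in the debugger.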

smile是对你的礼貌
#4 · 2019-04-08 14:08

I'm not sure if you're aware, but you don't really need to copy the array to another array before putting it into the new NSData object.

    - (NSString *)UTF8StringFromData:(NSData *)theData {
        Byte *arr = (Byte *)[theData bytes];
        NSUInteger begin1 = [self findIndexOf:@"<li>" bArr:arr size:[theData length]] + 4;
        NSUInteger end1 = [self findIndexOf:@"</li></ol>" bArr:arr size:[theData length]];
        // Point into the original buffer instead of copying it byte by byte.
        Byte *arr1 = arr + begin1;
        NSData *temp = [NSData dataWithBytes:arr1 length:end1 - begin1];
        return [[NSString alloc] initWithData:temp encoding:NSUTF8StringEncoding];
    }

As for your particular problem, I would try looking through the data manually using the debugger. Put a breakpoint after you have your array (arr1). When you hit it, open up the GDB console and try this:

    print (char *)arr1

With your code, it should print out the string you're trying to get. (With the code I gave above, it won't stop at the end of the item, because arr1 points into the original buffer and nothing terminates it there. It'll just keep going.)

If the result is not what you expect, then there's something wrong with the data, or perhaps with your begin1 and end1 boundaries.
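To rule out the boundaries, it helps to check that both markers were actually found and that the closing one comes after the opening one. The question doesn't show findIndexOf:bArr:size:, so the sketch below uses a hypothetical plain-C stand-in (find_bytes) and a helper (range_between) that reports failure instead of handing back an unusable range. Both names, and the NOT_FOUND sentinel, are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NOT_FOUND ((size_t)-1)

/* Hypothetical stand-in for findIndexOf: offset of the first occurrence of
   needle in hay at or after `from`, or NOT_FOUND if there is none. */
size_t find_bytes(const unsigned char *hay, size_t hay_len, size_t from,
                  const char *needle)
{
    assert(hay != NULL && needle != NULL);
    size_t n = strlen(needle);
    if (n == 0 || n > hay_len || from > hay_len - n) return NOT_FOUND;
    for (size_t i = from; i <= hay_len - n; i++) {
        if (memcmp(hay + i, needle, n) == 0) return i;
    }
    return NOT_FOUND;
}

/* Finds the bytes between open_tag and the next close_tag after it.
   On success stores the range in *begin / *end (end is exclusive) and
   returns 1; returns 0 if either marker is missing. */
int range_between(const unsigned char *buf, size_t len,
                  const char *open_tag, const char *close_tag,
                  size_t *begin, size_t *end)
{
    assert(begin != NULL && end != NULL);
    size_t o = find_bytes(buf, len, 0, open_tag);
    if (o == NOT_FOUND) return 0;

    size_t b = o + strlen(open_tag);
    /* Search for the closing marker only after the opening one, so that
       end - begin can never wrap around. */
    size_t e = find_bytes(buf, len, b, close_tag);
    if (e == NOT_FOUND) return 0;

    *begin = b;
    *end = e;
    return 1;
}
```

If range_between returns 0, there is nothing to convert and the method can return nil early. Note that this searches for the closing marker after the opening one, which differs from the code in the question, where both searches start at the beginning of the data.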

#5 · 2019-04-08 14:12

I know this is an old topic but it came up when I was looking for the solution today. I've solved it now so I'm just posting it for others who might run into this page looking for a solution.

Here's what I do in an asynchronous request:

I first store the text encoding name in connection:didReceiveResponse: using

    // textEncodingName is nil when the server sends no charset.
    encodingName = [[response textEncodingName] copy];

Then later in my connectionDidFinishLoading: method I use

    // An unrecognised name yields kCFStringEncodingInvalidId, so check cfEncoding before relying on it.
    CFStringEncoding cfEncoding = CFStringConvertIANACharSetNameToEncoding((__bridge CFStringRef)encodingName);
    NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding);
    NSString *payloadAsString = [[NSString alloc] initWithData:receivedData encoding:encoding];
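
If the server doesn't send a charset, textEncodingName returns nil, and the encoding name has to come from somewhere else, for example the charset= declaration inside the HTML itself. Here is a rough plain-C sketch of that lookup (it also compiles in an Objective-C file). sniff_charset is a hypothetical helper, not an Apple API, and it only does a naive search for the first charset= rather than real HTML parsing.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies the value that follows the first "charset="
   (matched case-insensitively) in buf into out, NUL-terminated.
   Returns 1 if a non-empty name was copied, 0 otherwise. */
int sniff_charset(const unsigned char *buf, size_t len,
                  char *out, size_t out_size)
{
    static const char key[] = "charset=";
    const size_t klen = sizeof key - 1;
    assert(buf != NULL && out != NULL && out_size > 0);

    for (size_t i = 0; i + klen <= len; i++) {
        size_t k = 0;
        while (k < klen && tolower(buf[i + k]) == key[k]) k++;
        if (k < klen) continue;  /* no match at this offset */

        size_t p = i + klen;
        if (p < len && (buf[p] == '"' || buf[p] == '\'')) p++;  /* optional quote */

        /* Copy characters that can appear in a charset name. */
        size_t n = 0;
        while (p < len && n + 1 < out_size &&
               (isalnum(buf[p]) || buf[p] == '-' || buf[p] == '_' ||
                buf[p] == '.' || buf[p] == ':')) {
            out[n++] = (char)buf[p++];
        }
        out[n] = '\0';
        return n > 0;
    }
    out[0] = '\0';
    return 0;
}
```

The name it finds could then be wrapped in an NSString and passed through the same CFStringConvertIANACharSetNameToEncoding call as above. Since charset names are plain ASCII, the search works on the raw bytes before the real encoding is known.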