c# and Encoding.ASCII.GetString

#2 · 2019-01-14 20:02

In this case you'd be better off comparing the byte arrays directly rather than converting them to a string.

If you must convert to a string, I suggest using Latin-1 (also known as ISO-8859-1 or code page 28591), because it maps every byte value in the range 0-255 to the Unicode character with the same value, which is convenient for this scenario. Any of the following will get this encoding:

Encoding.GetEncoding(28591)
Encoding.GetEncoding("Latin1")
Encoding.GetEncoding("ISO-8859-1")
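As a rough illustration of that one-to-one mapping, here is a minimal sketch. The class name Latin1Demo and its two methods are made up for this example; only Encoding.GetEncoding(28591) comes from the answer above.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper, for illustration only.
public static class Latin1Demo
{
    // ISO-8859-1 (code page 28591): each byte 0x00-0xFF decodes to the
    // Unicode character with the same numeric value.
    private static readonly Encoding Latin1 = Encoding.GetEncoding(28591);

    // Decodes the bytes and returns the numeric value of each resulting char.
    public static int[] DecodeToCharValues(byte[] data)
    {
        return Latin1.GetString(data).Select(c => (int)c).ToArray();
    }

    // True if decoding and then re-encoding gives back the original bytes.
    public static bool RoundTrips(byte[] data)
    {
        string text = Latin1.GetString(data);
        return Latin1.GetBytes(text).SequenceEqual(data);
    }
}
```

For the JPEG header bytes 0xFF, 0xD8, DecodeToCharValues should return 255 and 216, and RoundTrips should return true.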

▲ chillily

#3 · 2019-01-14 20:21

If you then wrote:

Console.WriteLine(ascii)

and expected "FFD8" to print out, that's not how GetString works. For that, you would need:

string ascii = String.Format("{0:X2}{1:X2}", header[0], header[1]);
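If the header can be longer than two bytes, the same idea generalizes. This is only a sketch; HexDemo.ToHex is a name invented for the example, not something from the question.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class HexDemo
{
    // Formats every byte as two uppercase hex digits, with no separator.
    public static string ToHex(byte[] data)
    {
        var sb = new StringBuilder(data.Length * 2);
        foreach (byte b in data)
        {
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString();
    }
}
```

HexDemo.ToHex(new byte[] { 0xFF, 0xD8 }) gives "FFD8". The built-in BitConverter.ToString(header) does much the same but separates the bytes with hyphens ("FF-D8").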

爷、活的狠高调

#4 · 2019-01-14 20:27

Are you sure "????" is the result?

What is the result of:

(int)ascii[0]
(int)ascii[1]

On the other hand, pure ASCII is 0-127 only...

欢心

#5 · 2019-01-14 20:28

Yes, that's because ASCII is only 7-bit - it doesn't define any values above 127. Encodings typically decode unknown binary values to '?' (although this can be changed using DecoderFallback).
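To make the fallback behaviour concrete, here is a small sketch. AsciiFallbackDemo and its method names are invented for the example; the strict variant assumes you would rather get a failure than a silent '?'.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class AsciiFallbackDemo
{
    // Default behaviour: each byte above 0x7F decodes to '?' (U+003F).
    public static string DecodeLenient(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }

    // Strict variant: reports failure instead of substituting '?'.
    // On failure, text is set to the empty string.
    public static bool TryDecodeStrict(byte[] data, out string text)
    {
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            text = strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = string.Empty;
            return false;
        }
    }
}
```

With the bytes 0xFF, 0xD8, DecodeLenient returns "??" (so (int)ascii[0] is 63), while TryDecodeStrict returns false.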

If you're about to mention "extended ASCII" I suspect you actually want Encoding.Default which is "the default code page for the operating system"... code page 1252 on most Western systems, I believe.

What characters were you expecting?

EDIT: As per the accepted answer (I suspect the question was edited after I added my answer; I don't recall seeing anything about JPEG originally) you shouldn't convert binary data to text unless it's genuinely encoded text data. JPEG data is binary data - so you should be checking the actual bytes against the expected bytes.

Any time you convert arbitrary binary data (such as images, music or video) into text using a "plain" text encoding (such as ASCII, UTF-8 etc) you risk data loss. If you have to convert it to text, use Base64 which is nice and safe. If you just want to compare it with expected binary data, however, it's best not to convert it to text at all.
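For the Base64 route, a minimal sketch using the framework's Convert methods; the wrapper class Base64Demo is just a name for this example.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class Base64Demo
{
    // Binary -> text without losing any byte values.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Text -> the original binary data.
    public static byte[] Decode(string text)
    {
        return Convert.FromBase64String(text);
    }

    // True if encoding and then decoding gives back the original bytes.
    public static bool RoundTrips(byte[] data)
    {
        return Decode(Encode(data)).SequenceEqual(data);
    }
}
```

Base64Demo.Encode(new byte[] { 0xFF, 0xD8 }) gives "/9g=", and decoding that string returns the same two bytes.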

EDIT: Okay, here's a class to help with image detection for a given byte array. I haven't made it HTTP-specific; I'm not entirely sure whether you should really fetch the InputStream, read just a bit of it, and then fetch the stream again. I've ducked the issue by sticking to byte arrays :)

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public sealed class SignatureDetector
{
    public static readonly SignatureDetector Png =
        new SignatureDetector(0x89, 0x50, 0x4e, 0x47);

    public static readonly SignatureDetector Bmp =
        new SignatureDetector(0x42, 0x4d);

    public static readonly SignatureDetector Gif =
        new SignatureDetector(0x47, 0x49, 0x46);

    public static readonly SignatureDetector Jpeg =
        new SignatureDetector(0xff, 0xd8);

    public static readonly IEnumerable<SignatureDetector> Images =
        new ReadOnlyCollection<SignatureDetector>(new[]{Png, Bmp, Gif, Jpeg});

    private readonly byte[] bytes;

    public SignatureDetector(params byte[] bytes)
    {
        if (bytes == null)
        {
            throw new ArgumentNullException("bytes");
        }
        this.bytes = (byte[]) bytes.Clone();
    }

    public bool Matches(byte[] data)
    {
        if (data == null)
        {
            throw new ArgumentNullException("data");
        }
        if (data.Length < bytes.Length)
        {
            return false;
        }
        for (int i=0; i < bytes.Length; i++)
        {
            if (data[i] != bytes[i])
            {
                return false;
            }
        }
        return true;
    }    

    // Convenience method
    public static bool IsImage(byte[] data)
    {
        return Images.Any(detector => detector.Matches(data));
    }        
}
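If you do start from a stream, one possible way to get the leading bytes is sketched below. HeaderReader.ReadPrefix is a hypothetical helper, not part of the class above, and it does not settle whether the HTTP input stream can be read again afterwards; that depends on whether the stream supports seeking (Stream.CanSeek).

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class HeaderReader
{
    // Reads up to 'count' bytes from the stream's current position.
    // Returns a shorter array if the stream ends early.
    public static byte[] ReadPrefix(Stream stream, int count)
    {
        if (stream == null)
        {
            throw new ArgumentNullException("stream");
        }
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException("count");
        }

        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            // Read may return fewer bytes than requested, so loop.
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
            {
                break; // end of stream
            }
            total += read;
        }
        if (total < count)
        {
            Array.Resize(ref buffer, total);
        }
        return buffer;
    }
}
```

The returned array can then be passed to SignatureDetector.IsImage. Eight bytes would be enough for the signatures listed above, since the longest of them is four bytes.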

Rolldiameter

#6 · 2019-01-14 20:29

I once wrote a custom encoder/decoder that mapped bytes 0-255 to Unicode characters 0-255 and back again.

It was only really useful for using string functions on something that isn't actually a string.
