Regex to parse image data URI

2019-03-18 03:44发布

问题:

If I have :

<img src="data:image/gif;base64,R0lGODlhtwBEANUAAMbIypOVmO7v76yusOHi49AsSDY1N2NkZvvs6VVWWPDAutZOWJ+hpPPPyeqmoNlcYXBxdNTV1nx+gN51c4iJjEdHSfbc19M+UOeZk7m7veSMiNtpauGBfu2zrc4RQSMfIP///wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAAAAAAALAAAAAC3AEQAAAb/QJBwSCwaj8ikcslsOp/QqHRKrVqv2Kx2y+16v+CweEwum8/otHrNbrvf8Lh8Tq/b7/i8fs" />

How can I parse the data part into:

  • Mime type (image/gif)
  • Encoding (base64)
  • Image data (the binary data)

回答1:

EDIT: expanded to show usage

var regex = new Regex(@"data:(?<mime>[\w/\-\.]+);(?<encoding>\w+),(?<data>.*)", RegexOptions.Compiled);

var match = regex.Match(input);

var mime = match.Groups["mime"].Value;
var encoding = match.Groups["encoding"].Value;
var data = match.Groups["data"].Value;

NOTE: The regex applies to the input shown in question. If there was a charset specified too, it would not work and would have to be rewritten.



回答2:

Actually, you don't need a regex for that. According to Wikipedia, the data URI format is

data:[<MIME-type>][;charset=<encoding>][;base64],<data>

so just do the following:

byte[] imagedata = Convert.FromBase64String(imageSrc.Substring(imageSrc.IndexOf(",") + 1));


回答3:

Here is my regular expression where I had to separate the mime-type (image/jpg) as well.

^data:(?<mimeType>(?<mime>\w+)\/(?<extension>\w+));(?<encoding>\w+),(?<data>.*)