Comparing two base64 image strings and removing ma

Posted 2019-08-05 05:36

Question:

I'm not sure whether what I'm trying to do will work, or is even possible. Basically, I'm creating a remote-desktop-type app that captures the screen as a JPEG image and sends it to the client app for display.
I want to reduce the amount of data sent each time by comparing the new image to the previous one and sending only the differences. For example:

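using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

// Capture region; assumed here to be a fixed 1024x720 area at the top-left of the screen.
var bounds = new Rectangle(0, 0, 1024, 720);
var bitmap = new Bitmap(bounds.Width, bounds.Height);

string oldBase = "";

using (var stream = new MemoryStream())
using (var graphics = Graphics.FromImage(bitmap))
{
    // Copy the screen region into the bitmap, then encode it as JPEG.
    graphics.CopyFromScreen(bounds.X, bounds.Y, 0, 0, bounds.Size);
    bitmap.Save(stream, ImageFormat.Jpeg);
    string newBase = Convert.ToBase64String(stream.ToArray());

    // ! Do compare/replace stuff here with newBase and oldBase !

    // Keep the new image's base64 string as the "old" one for the next comparison.
    oldBase = newBase;
}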

Using something like this I could compare both base64 strings and replace any matches. The matched text could be replaced with something like:

[number of characters replaced]

That way, on the client side I know where to keep the old data and where to add the new. Again, I'm not sure if this would even work, so anyone's thoughts on this would be very much appreciated. :) If it is possible, could you point me in the right direction? Thanks.

Answer 1:

You can do this by comparing the bitmap bits directly. Look into Bitmap.LockBits, which returns a BitmapData object whose Scan0 pointer gives you access to the pixel data. You can then compare the pixels for each scan line and encode the changes in whatever format you want to use for transport.

Note that a scan line's length in bytes is always a multiple of 4. So unless you're using 32-bit color, you have to take into account the padding that might be at the end of the scan line. That's what the Stride property is for in the BitmapData structure.
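To make the padding point concrete, here is a minimal sketch of the stride arithmetic. The class and method names (ScanLines, Stride, Padding, PixelOffset) are made up for illustration; in real code you would normally just read BitmapData.Stride rather than compute it. The sketch assumes a top-down buffer with a positive stride; as far as I know, Stride can be negative for bottom-up bitmaps, which this does not handle.

```csharp
using System;

static class ScanLines
{
    // Bytes per scan line, rounded up to the next multiple of 4.
    public static int Stride(int widthInPixels, int bitsPerPixel)
    {
        return ((widthInPixels * bitsPerPixel + 31) / 32) * 4;
    }

    // Number of unused padding bytes at the end of each scan line.
    public static int Padding(int widthInPixels, int bitsPerPixel)
    {
        int usedBytes = (widthInPixels * bitsPerPixel + 7) / 8;
        return Stride(widthInPixels, bitsPerPixel) - usedBytes;
    }

    // Byte offset of pixel (x, y); assumes whole-byte pixel formats (24 or 32 bpp).
    public static int PixelOffset(int x, int y, int stride, int bytesPerPixel)
    {
        return y * stride + x * bytesPerPixel;
    }
}
```

With 24-bit color, a 1022-pixel-wide line uses 3066 bytes but has a stride of 3068, so the last two bytes of every line are padding and should be skipped when comparing.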

Doing things on a per-scanline basis is easier, but potentially not as efficient (in terms of reducing the amount of data sent) as treating the bitmap as one contiguous block of data. Your transport format should look something like:

<start marker>
// for each scan line
<scan line marker><scan line number>
<pixel position><number of pixels><pixel data>
<pixel position><number of pixels><pixel data>
...
// next scan line
<scan line marker><scan line number>
...
<end marker>

Each <pixel position><number of pixels><pixel data> entry is a run of changed pixels. If a scan line has no changed pixels, you can choose not to send it. Or you can just send the scan line marker and number, followed immediately by the next scan line.
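As a rough sketch of the per-scan-line step, the code below finds runs of changed pixels in one row and writes them as position/count/data entries. Everything named here (ChangedRun, RunFinder, FindChangedRuns, EncodeRuns) is hypothetical, the rows are assumed to be plain byte arrays already copied out of the locked bitmap, and the scan line and start/end markers are left out because the format above doesn't fix their values.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// One run of changed pixels on a scan line (hypothetical type for this sketch).
struct ChangedRun
{
    public int Position; // index of the first changed pixel
    public int Count;    // number of pixels in the run
}

static class RunFinder
{
    // Compares two rows pixel by pixel and returns the runs that differ.
    public static List<ChangedRun> FindChangedRuns(byte[] oldRow, byte[] newRow,
                                                   int widthInPixels, int bytesPerPixel)
    {
        var runs = new List<ChangedRun>();
        int runStart = -1; // -1 means "not currently inside a run"
        for (int x = 0; x < widthInPixels; x++)
        {
            bool changed = false;
            int offset = x * bytesPerPixel;
            for (int b = 0; b < bytesPerPixel; b++)
            {
                if (oldRow[offset + b] != newRow[offset + b]) { changed = true; break; }
            }
            if (changed && runStart < 0)
            {
                runStart = x;
            }
            else if (!changed && runStart >= 0)
            {
                runs.Add(new ChangedRun { Position = runStart, Count = x - runStart });
                runStart = -1;
            }
        }
        if (runStart >= 0)
        {
            runs.Add(new ChangedRun { Position = runStart, Count = widthInPixels - runStart });
        }
        return runs;
    }

    // Writes each run as <pixel position><number of pixels><pixel data>,
    // with two bytes each for position and count (little-endian via BinaryWriter).
    public static byte[] EncodeRuns(List<ChangedRun> runs, byte[] newRow, int bytesPerPixel)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            foreach (var run in runs)
            {
                writer.Write((ushort)run.Position);
                writer.Write((ushort)run.Count);
                writer.Write(newRow, run.Position * bytesPerPixel, run.Count * bytesPerPixel);
            }
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

The client would do the reverse: read a position and count, then copy that many pixels into its copy of the previous frame at that position.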

Two bytes will be enough for the <pixel position> field and for the <number of pixels> field. So you have an overhead of four bytes for each block. An optimization you might be interested in, after you have the simplest version working, would be to combine blocks of changed/unchanged pixels if there are small runs. For example, if you have uucucuc, where u is an unchanged pixel and c is a changed pixel, you'll probably want to encode the cucuc as one run of five changed pixels. That will reduce the amount of data you have to transmit.
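The merging idea can be sketched like this, working on a mask of changed/unchanged pixels. RunMerger, MergeRuns, ParsePattern and the maxGap parameter are names invented for this example. Given the four bytes of overhead per block, a gap of one unchanged pixel costs no more to send than it does to start a new block at 24-bit or 32-bit color, so maxGap = 1 is a reasonable starting point, but treat that as an assumption to tune.

```csharp
using System;
using System.Collections.Generic;

static class RunMerger
{
    // Returns runs as { start, count } pairs. Changed pixels separated by at most
    // maxGap unchanged pixels are folded into the same run.
    public static List<int[]> MergeRuns(bool[] changed, int maxGap)
    {
        var runs = new List<int[]>();
        int start = -1;       // start of the current run, or -1 if none
        int lastChanged = -1; // index of the most recent changed pixel
        for (int x = 0; x < changed.Length; x++)
        {
            if (!changed[x]) continue;
            if (start >= 0 && x - lastChanged - 1 > maxGap)
            {
                // The gap is too wide: close the current run and start a new one.
                runs.Add(new int[] { start, lastChanged - start + 1 });
                start = -1;
            }
            if (start < 0) start = x;
            lastChanged = x;
        }
        if (start >= 0) runs.Add(new int[] { start, lastChanged - start + 1 });
        return runs;
    }

    // Helper for experimenting: 'c' is a changed pixel, anything else is unchanged.
    public static bool[] ParsePattern(string pattern)
    {
        var mask = new bool[pattern.Length];
        for (int i = 0; i < pattern.Length; i++) mask[i] = pattern[i] == 'c';
        return mask;
    }
}
```

For the uucucuc example, MergeRuns(ParsePattern("uucucuc"), 1) gives a single run starting at pixel 2 with five pixels, whereas maxGap = 0 gives three separate one-pixel runs.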

Note that this isn't the best way to do things, but it's simple, effective, and relatively easy to implement.

In any case, once you've encoded things, you can run the data through the built-in GZip compressor (although doing so might not help much) and then push it down the pipe to the client, which would decompress it and interpret the result.
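For the compression step, a minimal sketch using System.IO.Compression.GZipStream might look like the following. The FrameCompression class and its method names are made up here; the encoded frame is assumed to already be a byte array.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class FrameCompression
{
    // Compresses an encoded frame before sending it.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the result,
            // otherwise the final compressed block may not be flushed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Restores the encoded frame on the client side.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

It is worth measuring the size before and after on real frames; for small diffs the GZip header and trailer can outweigh the savings.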

It would be easiest to build this on a single machine, using two windows to verify the results. Once that's working, you can hook up the network transport piece. Debugging the initial cut by having that transport step in the middle could prove very frustrating.



Answer 2:

We're currently working on something very similar. Basically, what you're trying to implement is a video codec (a very simple Motion JPEG). Some approaches are simple and some are very complicated.

  1. The simplest approach is to compare consecutive frames and send only the differences. You may try to compare color differences between the frames in RGB space or YCbCr space and send only the pixels that changed with some metadata.
  2. The more complicated solution is to compare the pictures after DCT transformation but before entropy coding. That would give you better comparisons and remove some ugly artifacts.
  3. Read up on JPEG, Motion JPEG, and H.264. You may be able to borrow some of the methods these codecs use, or simply use an existing codec if possible.
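A minimal sketch of the first approach, comparing consecutive frames with a tolerance instead of exact equality, could look like this. It compares only the luma (Y) component of YCbCr, using the usual JPEG/JFIF weights. ColorDiff, ChangedMask and the threshold value are assumptions for illustration, and frames are assumed to be arrays of packed 0xRRGGBB integers of the same length.

```csharp
using System;

static class ColorDiff
{
    // Luma (the Y of YCbCr) with the weights used by JPEG/JFIF.
    public static double Luma(byte r, byte g, byte b)
    {
        return 0.299 * r + 0.587 * g + 0.114 * b;
    }

    // Luma of a pixel packed as 0xRRGGBB.
    public static double Luma(int rgb)
    {
        return Luma((byte)((rgb >> 16) & 0xFF), (byte)((rgb >> 8) & 0xFF), (byte)(rgb & 0xFF));
    }

    // Marks a pixel as changed when its luma moved by more than the threshold.
    // Ignoring chroma is a simplification made for this sketch.
    public static bool[] ChangedMask(int[] oldFrame, int[] newFrame, double lumaThreshold)
    {
        var mask = new bool[newFrame.Length];
        for (int i = 0; i < newFrame.Length; i++)
        {
            mask[i] = Math.Abs(Luma(newFrame[i]) - Luma(oldFrame[i])) > lumaThreshold;
        }
        return mask;
    }
}
```

A tolerance like this matters most if the frames have been through lossy compression or the screen contains noise; the pixels flagged in the mask are then the ones you send along with their positions.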


Answer 3:

This won't work for a JPEG. You need to use BMP, or possibly uncompressed TIFF.

I think if it were me, I'd use BMP, scan the pixels for changes, and construct a PNG in which everything except the changes is transparent.

First, this would reduce your transmission size, because PNG compression is quite good, especially for repeating pixels.

Second, it makes display on the receiving end very easy, since you can simply paint the new image over the top of the old image.
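The overlay idea can be sketched on raw pixel arrays, leaving out the actual PNG encoding. Overlay, BuildOverlay and ApplyOverlay are hypothetical names, and frames are assumed to be fully opaque pixels packed as 0xAARRGGBB. In a real app the overlay would go into a 32-bit ARGB bitmap saved as PNG, and the client would draw it over the previous frame.

```csharp
using System;

static class Overlay
{
    const uint Transparent = 0x00000000;

    // Unchanged pixels become fully transparent; changed pixels keep the new
    // color with full alpha.
    public static uint[] BuildOverlay(uint[] oldFrame, uint[] newFrame)
    {
        var overlay = new uint[newFrame.Length];
        for (int i = 0; i < newFrame.Length; i++)
        {
            overlay[i] = oldFrame[i] == newFrame[i]
                ? Transparent
                : (newFrame[i] | 0xFF000000);
        }
        return overlay;
    }

    // "Paints" the overlay on top of the old frame: transparent pixels leave
    // the old pixel in place, opaque ones replace it.
    public static uint[] ApplyOverlay(uint[] oldFrame, uint[] overlay)
    {
        var result = new uint[oldFrame.Length];
        for (int i = 0; i < oldFrame.Length; i++)
        {
            result[i] = (overlay[i] >> 24) == 0 ? oldFrame[i] : overlay[i];
        }
        return result;
    }
}
```

Applying the overlay to the old frame should reproduce the new frame exactly, which makes a handy self-check while developing.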