Algorithm for slicing planes (in place) out of an

2019-02-09 21:22发布

I've got a flat array of byte RGB values that goes R1 G1 B1 R2 G2 B2 R3 G3 B3 ... Rn Gn Bn. So my data looks like:

char imageData[WIDTH * HEIGHT * 3];

But I want to pass a WIDTH*HEIGHT array to an existing C library that expects a single plane of this data. That would be a sequence of just the R values (or just the G, or just the B).

It's easy enough to allocate a new array and copy the data (duh). But the images are very large. If it weren't a C library but took some kind of iteration interface to finesse the "slicing" traversal, that would be great. But I can't edit the code I'm calling...it wants a plain old pointer to a block of sequential memory.

HOWEVER I have write access to this array. It is viable to create a routine that would sort it into color planes. I'd also need a reverse transformation that would put it back, but by definition the same method that sorted it into planes could be applied to unsort it.

How efficiently can I (in place) turn this array into R1 R2 R3 ... Rn G1 G2 G3 ... Gn B1 B2 B3 ... Bn and then back again? Any non-naive algorithms?

3条回答
一纸荒年 Trace。
2楼-- · 2019-02-09 21:51

If you only need one plane, this seems pretty easy. If you need all 3 you will probably have better luck with a more sophisticated algorithm.

void PlanarizeR(char * imageData, int width, int height)
{
    char *in = imageData;
    int pixelCount = width * height;
    for (int i = 0;  i < pixelCount;  ++i, in+=3)
        std::swap(*in, imageData[i])
}

It shouldn't be too hard to run the loop backwards from high to low to reverse the process.

查看更多
何必那么认真
3楼-- · 2019-02-09 22:12

This paper "A Simple In-Place Algorithm for In-Shuffle" describes how to transpose matrix of 2*N and gives a hint how to do it for other cases, so 3*N may be also possible. This answer to other question shows that it is indeed possible.

Or use an algorithm which writes each value to its transposed place, then does the same for the value from that place, and so on until cycle is connected. Flag processed values in a bit vector. And continue until this vector is all 1s.

Both algorithms are not cache-friendly. Probably some clever use of PREFETCH instruction can improve this.

Edit:

C++, RGB to single planes, not optimized:

#include <iostream>
#include <bitset>
#include <vector>

enum {N = 8};

void transpose(std::vector<char>& a)
{
  std::bitset<3*N> b;

  for (int i = 1; i < 3*N; ++i)
  {
    if (b[i])
      continue;

    int ptr = i;
    int next;
    char nextVal = a[i];

    do {
      next = ptr/3 + N*(ptr%3);
      char prevVal = nextVal;
      nextVal = a[next];
      a[next] = prevVal;
      ptr = next;
      b[ptr] = true;
    }
    while (ptr != i);
  }
}

int main()
{
  std::vector<char> a(3*N);

  for (int i = 0; i != 3*N; ++i)
    a[i] = i;

  transpose(a);

  for (int i = 0; i != 3*N; ++i)
    std::cout << (int)a[i] << std::endl;

  return 0;
}

My original intent is to use a bit vector of size WIDTHHEIGHT, which gives overhead of WIDTHHEIGHT/8. But it is always possible to sacrifice speed for space. The bit vector may be of size WIDTH or HEIGHT or any desirable value, even 0. The trick is to maintain a pointer to the cell, before which all values are transposed. The bit vector is for cells, starting from this pointer. After it is all 1s, It is moved to next position, then all the algorithm steps are performed except actual data movement. And the bit vector is ready to continue transposing. This variant is O(N^2) instead of O(N).

Edit2:

PREFITCH optimization is not difficult to implement: just calculate indexes, invoke PREFETCH, and put indexes to a queue (ringbuffer), then get indexes from the queue and move data.

Edit3:

The idea of other algorithm, which is O(1) size, O(N*log(N)) time, is cache-friendly and may be faster than "cycle" algorithm (for image sizes < 1Gb):

  • Split N*3 matrix to several 3*3 matrices of char and transpose them
  • Split the result to 3*3 matrices of char[3] and transpose them
  • Continue while matrices size is less than the array size
  • Now we have up to 3*2*log3(N) ordered pieces. Join them.
  • First join pieces of equal size. Very simple "cycles" of length 4 may be used.
  • Join unequal-sized pieces with reverse(reverse(a), reverse(b))
查看更多
时光不老,我们不散
4楼-- · 2019-02-09 22:12
char *imageData = malloc (WIDTH * HEIGHT * 3 * sizeof(char));

this function do this R1 R2 R3 ... Rn G1 G2 G3 ... Gn B1 B2 B3 ... Bn,,

char *toRGB(char *imageData, int WIDTH, int HEIGHT){
    int len = WIDTH * HEIGHT;
    char *RGB = malloc (len * sizeof(char));
    int i, j = 0,flag = 0;

    for(i=0 ; i<=len ; i++, j+=3){
        if(j<len)
            RGB[i] = imageData[j];
        else{
            switch(flag){
                case 0: j=-2; flag=1; break;   // j=-2 the next iteration will start at j=1
                case 1: j=-1; break;   // j=-1 the next iteration will start at j=2
            }
        }
    }
    return RGB;
}
查看更多
登录 后发表回答