C++ valarray vs. vector

2楼-- · 2019-01-05 08:44

valarray is kind of an orphan that was born in the wrong place at the wrong time. It's an attempt at optimization, fairly specifically for the machines that were used for heavy-duty math when it was written -- specifically, vector processors like the Crays.

For a vector processor, what you generally wanted to do was apply a single operation to an entire array, then apply the next operation to the entire array, and so on until you'd done everything you needed to do.

Unless you're dealing with fairly small arrays, however, that tends to work poorly with caching. On most modern machines, what you'd generally prefer (to the extent possible) would be to load part of the array, do all the operations on it you're going to, then move on to the next part of the array.

valarray is also supposed to eliminate any possibility of aliasing, which (at least theoretically) lets the compiler improve speed because it's more free to store values in registers. In reality, however, I'm not at all sure that any real implementation takes advantage of this to any significant degree. I suspect it's rather a chicken-and-egg sort of problem -- without compiler support it didn't become popular, and as long as it's not popular, nobody's going to go to the trouble of working on their compiler to support it.

There's also a bewildering (literally) array of ancillary classes to use with valarray. You get slice, slice_array, gslice and gslice_array to play with pieces of a valarray, and make it act like a multi-dimensional array. You also get mask_array to "mask" an operation (e.g. add items in x to y, but only at the positions where z is non-zero). To make more than trivial use of valarray, you have to learn a lot about these ancillary classes, some of which are pretty complex and none of which seems (at least to me) very well documented.

Bottom line: while it has moments of brilliance, and can do some things pretty neatly, there are also some very good reasons that it is (and will almost certainly remain) obscure.

Edit (eight years later, in 2017): Some of the preceding has become obsolete to at least some degree. For one example, Intel has implemented an optimized version of valarray for their compiler. It uses the Intel Integrated Performance Primitives (Intel IPP) to improve performance. Although the exact performance improvement undoubtedly varies, a quick test with simple code shows around a 2:1 improvement in speed, compared to identical code compiled with the "standard" implementation of valarray.

So, while I'm not entirely convinced that C++ programmers will be starting to use valarray in huge numbers, there are least some circumstances in which it can provide a speed improvement.

0人赞添加讨论(0) 举报

放我归山

3楼-- · 2019-01-05 08:44

I know valarrays have some syntactic sugar

I have to say that I don't think std::valarrays have much in way of syntactic sugar. The syntax is different, but I wouldn't call the difference "sugar." The API is weird. The section on std::valarrays in The C++ Programming Language mentions this unusual API and the fact that, since std::valarrays are expected to be highly optimized, any error messages you get while using them will probably be non-intuitive.

Out of curiosity, about a year ago I pitted std::valarray against std::vector. I no longer have the code or the precise results (although it shouldn't be hard to write your own). Using GCC I did get a little performance benefit when using std::valarray for simple math, but not for my implementations to calculate standard deviation (and, of course, standard deviation isn't that complex, as far as math goes). ~~I suspect that operations on each item in a large std::vector play better with caches than operations on std::valarrays.~~ (NOTE, following advice from musiphil, I've managed to get almost identical performance from vector and valarray).

In the end, I decided to use std::vector while paying close attention to things like memory allocation and temporary object creation.

Both std::vector and std::valarray store the data in a contiguous block. However, they access that data using different patterns, and more importantly, the API for std::valarray encourages different access patterns than the API for std::vector.

For the standard deviation example, at a particular step I needed to find the collection's mean and the difference between each element's value and the mean.

For the std::valarray, I did something like:

std::valarray<double> original_values = ... // obviously I put something here
double mean = original_values.sum() / original_values.size();
std::valarray<double> temp(mean, original_values.size());
std::valarray<double> differences_from_mean = original_values - temp;

I may have been more clever with std::slice or std::gslice. It's been over five years now.

For std::vector, I did something along the lines of:

std::vector<double> original_values = ... // obviously, I put something here
double mean = std::accumulate(original_values.begin(), original_values.end(), 0.0) / original_values.size();

std::vector<double> differences_from_mean;
differences_from_mean.reserve(original_values.size());
std::transform(original_values.begin(), original_values.end(), std::back_inserter(differences_from_mean), std::bind1st(std::minus<double>(), mean));

Today I would certainly write that differently. If nothing else, I would take advantage of C++11 lambdas.

It's obvious that these two snippets of code do different things. For one, the std::vector example doesn't make an intermediate collection like the std::valarray example does. However, I think it's fair to compare them because the differences are tied to the differences between std::vector and std::valarray.

When I wrote this answer, I suspected that subtracting the value of elements from two std::valarrays (last line in the std::valarray example) would be less cache-friendly than the corresponding line in the std::vector example (which happens to also be the last line).

It turns out, however, that

std::valarray<double> original_values = ... // obviously I put something here
double mean = original_values.sum() / original_values.size();
std::valarray<double> differences_from_mean = original_values - mean;

Does the same thing as the std::vector example, and has almost identical performance. In the end, the question is which API you prefer.

0人赞添加讨论(0) 举报

兄弟一词,经得起流年.

4楼-- · 2019-01-05 08:47

During the standardization of C++98, valarray was designed to allow some sort of fast mathematical computations. However, around that time Todd Veldhuizen invented expression templates and created blitz++, and similar template-meta techniques were invented, which made valarrays pretty much obsolete before the standard was even released. IIRC, the original proposer(s) of valarray abandoned it halfway into the standardization, which (if true) didn't help it either.

ISTR that the main reason it wasn't removed from the standard is that nobody took the time to evaluate the issue thoroughly and write a proposal to remove it.

Please keep in mind, however, that all this is vaguely remembered hearsay. Take this with a grain of salt and hope someone corrects or confirms this.

0人赞添加讨论(0) 举报

爷的心禁止访问

5楼-- · 2019-01-05 08:47

valarray was supposed to let some FORTRAN vector-processing goodness rub off on C++. Somehow the necessary compiler support never really happened.

The Josuttis books contains some interesting (somewhat disparaging) commentary on valarray (here and here).

However, Intel now seem to be revisiting valarray in their recent compiler releases (e.g see slide 9); this is an interesting development given that their 4-way SIMD SSE instruction set is about to be joined by 8-way AVX and 16-way Larrabee instructions and in the interests of portability it'll likely be much better to code with an abstraction like valarray than (say) intrinsics.

0人赞添加讨论(0) 举报

手持菜刀，她持情操

6楼-- · 2019-01-05 08:50

I found one good usage for valarray. It's to use valarray just like numpy arrays.

auto x = linspace(0, 2 * 3.14, 100);
plot(x, sin(x) + sin(3.f * x) / 3.f + sin(5.f * x) / 5.f);

We can implement above with valarray.

valarray<float> linspace(float start, float stop, int size)
{
    valarray<float> v(size);
    for(int i=0; i<size; i++) v[i] = start + i * (stop-start)/size;
    return v;
}

std::valarray<float> arange(float start, float step, float stop)
{
    int size = (stop - start) / step;
    valarray<float> v(size);
    for(int i=0; i<size; i++) v[i] = start + step * i;
    return v;
}

string psstm(string command)
{//return system call output as string
    string s;
    char tmp[1000];
    FILE* f = popen(command.c_str(), "r");
    while(fgets(tmp, sizeof(tmp), f)) s += tmp;
    pclose(f);
    return s;
}

string plot(const valarray<float>& x, const valarray<float>& y)
{
    int sz = x.size();
    assert(sz == y.size());
    int bytes = sz * sizeof(float) * 2;
    const char* name = "plot1";
    int shm_fd = shm_open(name, O_CREAT | O_RDWR, 0666);
    ftruncate(shm_fd, bytes);
    float* ptr = (float*)mmap(0, bytes, PROT_WRITE, MAP_SHARED, shm_fd, 0);
    for(int i=0; i<sz; i++) {
        *ptr++ = x[i];
        *ptr++ = y[i];
    }

    string command = "python plot.py ";
    string s = psstm(command + to_string(sz));
    shm_unlink(name);
    return s;
}

Also, we need python script.

import sys, posix_ipc, os, struct
import matplotlib.pyplot as plt

sz = int(sys.argv[1])
f = posix_ipc.SharedMemory("plot1")
x = [0] * sz
y = [0] * sz
for i in range(sz):
    x[i], y[i] = struct.unpack('ff', os.read(f.fd, 8))
os.close(f.fd)
plt.plot(x, y)
plt.show()

0人赞添加讨论(0) 举报

\"骚年 ilove

7楼-- · 2019-01-05 08:55

The C++11 standard says:

The valarray array classes are defined to be free of certain forms of aliasing, thus allowing operations on these classes to be optimized.

See C++11 26.6.1-2.

0人赞添加讨论(0) 举报

C++ valarray vs. vector

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间