C++ correct way to return pointer to array from fu

2019-01-07 10:53发布

问题:

I am fairly new to C++ and have been avoiding pointers. From what I've read online I cannot return an array but I can return a pointer to it. I made a small code to test it and was wondering if this was the normal / correct way to do this:

#include <iostream>
using namespace std;

int* test (int in[5]) {
    int* out = in;
    return out;
}

int main() {
    int arr[5] = {1, 2, 3, 4, 5};
    int* pArr = test(arr);
    for (int i = 0; i < 5; i++) cout<<pArr[i]<<endl;
    cout<<endl;
    return 0;
}

Edit: This seems to be no good. How should I rewrite it?

int* test (int a[5], int b[5]) {
    int c[5];
    for (int i = 0; i < 5; i++) c[i] = a[i]+b[i];
    int* out = c;
    return out;
}

Edit (again): Thank you, there are a lot of helpful answers here :) I ended up with this as the function:

int* test (int* a, int* b) {
    int* c = new int[5];
    for (int i = 0; i < 5; i++) c[i] = a[i]+b[i];
    return c;
}

and this to call it:

int m1[5] = {1, 2, 3, 4, 5};
int m2[5] = {6, 7, 8, 9, 10};
int* m3 = test(m1, m2);
for (int i = 0; i < 5; i++) cout<<m3[i]<<endl;
cout<<endl;
delete[] m3;

回答1:

Your code as it stands is correct but I am having a hard time figuring out how it could/would be used in a real world scenario. With that said, please be aware of a few caveats when returning pointers from functions:

  • When you create an array with syntax int arr[5];, it's allocated on the stack and is local to the function.
  • C++ allows you to return a pointer to this array, but it is undefined behavior to use the memory pointed to by this pointer outside of its local scope. Read this great answer using a real world analogy to get a much clear understanding than what I could ever explain.
  • You can still use the array outside the scope if you can guarantee that memory of the array has not be purged. In your case this is true when you pass arr to test().
  • If you want to pass around pointers to a dynamically allocated array without worrying about memory leaks, you should do some reading on std::unique_ptr/std::shared_ptr<>.

Edit - to answer the use-case of matrix multiplication

You have two options. The naive way is to use std::unique_ptr/std::shared_ptr<>. The Modern C++ way is to have a Matrix class where you overload operator * and you absolutely must use the new rvalue references if you want to avoid copying the result of the multiplication to get it out of the function. In addition to having your copy constructor, operator = and destructor, you also need to have move constructor and move assignment operator. Go through the questions and answers of this search to gain more insight on how to achieve this.

Edit 2 - answer to appended question

int* test (int a[5], int b[5]) {
    int *c = new int[5];
    for (int i = 0; i < 5; i++) c[i] = a[i]+b[i];
    return c;
}

If you are using this as int *res = test(a,b);, then sometime later in your code, you should call delete []res to free the memory allocated in the test() function. You see now the problem is it is extremely hard to manually keep track of when to make the call to delete. Hence the approaches on how to deal with it where outlined in the answer.



回答2:

Your code is OK. Note though that if you return a pointer to an array, and that array goes out of scope, you should not use that pointer anymore. Example:

int* test (void)
{
    int out[5];
    return out;
}

The above will never work, because out does not exist anymore when test() returns. The returned pointer must not be used anymore. If you do use it, you will be reading/writing to memory you shouldn't.

In your original code, the arr array goes out of scope when main() returns. Obviously that's no problem, since returning from main() also means that your program is terminating.

If you want something that will stick around and cannot go out of scope, you should allocate it with new:

int* test (void)
{
    int* out = new int[5];
    return out;
}

The returned pointer will always be valid. Remember do delete it again when you're done with it though, using delete[]:

int* array = test();
// ...
// Done with the array.
delete[] array;

Deleting it is the only way to reclaim the memory it uses.



回答3:

New answer to new question:

You cannot return pointer to automatic variable (int c[5]) from the function. Automatic variable ends its lifetime with return enclosing block (function in this case) - so you are returning pointer to not existing array.

Either make your variable dynamic:

int* test (int a[5], int b[5]) {
    int* c = new int[5];
    for (int i = 0; i < 5; i++) c[i] = a[i]+b[i];
    return c;
}

Or change your implementation to use std::array:

std::array<int,5> test (const std::array<int,5>& a, const std::array<int,5>& b) 
{
   std::array<int,5> c;
   for (int i = 0; i < 5; i++) c[i] = a[i]+b[i];
   return c;
}

In case your compiler does not provide std::array you can replace it with simple struct containing an array:

struct array_int_5 { 
   int data[5];
   int& operator [](int i) { return data[i]; } 
   int operator const [](int i) { return data[i]; } 
};

Old answer to old question:

Your code is correct, and ... hmm, well, ... useless. Since arrays can be assigned to pointers without extra function (note that you are already using this in your function):

int arr[5] = {1, 2, 3, 4, 5};
//int* pArr = test(arr);
int* pArr = arr;

Morever signature of your function:

int* test (int in[5])

Is equivalent to:

int* test (int* in)

So you see it makes no sense.

However this signature takes an array, not pointer:

int* test (int (&in)[5])


回答4:

A variable referencing an array is basically a pointer to its first element, so yes, you can legitimately return a pointer to an array, because thery're essentially the same thing. Check this out yourself:

#include <assert.h>

int main() {
  int a[] = {1, 2, 3, 4, 5}; 

  int* pArr = a;
  int* pFirstElem = &(a[0]);

  assert(a == pArr);
  assert(a == pFirstElem);

  return 0;
}

This also means that passing an array to a function should be done via pointer (and not via int in[5]), and possibly along with the length of the array:

int* test(int* in, int len) {
    int* out = in;
    return out;
}

That said, you're right that using pointers (without fully understanding them) is pretty dangerous. For example, referencing an array that was allocated on the stack and went out of scope yields undefined behavior:

#include <iostream>

using namespace std;

int main() {
  int* pArr = 0;
  {
    int a[] = {1, 2, 3, 4, 5};
    pArr = a; // or test(a) if you wish
  }
  // a[] went out of scope here, but pArr holds a pointer to it

  // all bets are off, this can output "1", output 1st chapter
  // of "Romeo and Juliet", crash the program or destroy the
  // universe
  cout << pArr[0] << endl; // WRONG!

  return 0;
}

So if you don't feel competent enough, just use std::vector.

[answer to the updated question]

The correct way to write your test function is either this:

void test(int* a, int* b, int* c, int len) {
  for (int i = 0; i < len; ++i) c[i] = a[i] + b[i];
}
...
int main() {
   int a[5] = {...}, b[5] = {...}, c[5] = {};
   test(a, b, c, 5);
   // c now holds the result
}

Or this (using std::vector):

#include <vector>

vector<int> test(const vector<int>& a, const vector<int>& b) {
  vector<int> result(a.size());
  for (int i = 0; i < a.size(); ++i) {
    result[i] = a[i] + b[i];
  }
  return result; // copy will be elided
}


回答5:

In a real app, the way you returned the array is called using an out parameter. Of course you don't actually have to return a pointer to the array, because the caller already has it, you just need to fill in the array. It's also common to pass another argument specifying the size of the array so as to not overflow it.

Using an out parameter has the disadvantage that the caller may not know how large the array needs to be to store the result. In that case, you can return a std::vector or similar array class instance.



回答6:

Your code (which looks ok) doesn't return a pointer to an array. It returns a pointer to the first element of an array.

In fact that's usually what you want to do. Most manipulation of arrays are done via pointers to individual elements, not via pointers to the array as a whole.

You can define a pointer to an array, for example this:

double (*p)[42];

defines p as a pointer to a 42-element array of doubles. A big problem with that is that you have to specify the number of elements in the array as part of the type -- and that number has to be a compile-time constant. Most programs that deal with arrays need to deal with arrays of varying sizes; a given array's size won't vary after it's been created, but its initial size isn't necessarily known at compile time, and different array objects can have different sizes.

A pointer to the first element of an array lets you use either pointer arithmetic or the indexing operator [] to traverse the elements of the array. But the pointer doesn't tell you how many elements the array has; you generally have to keep track of that yourself.

If a function needs to create an array and return a pointer to its first element, you have to manage the storage for that array yourself, in one of several ways. You can have the caller pass in a pointer to (the first element of) an array object, probably along with another argument specifying its size -- which means the caller has to know how big the array needs to be. Or the function can return a pointer to (the first element of) a static array defined inside the function -- which means the size of the array is fixed, and the same array will be clobbered by a second call to the function. Or the function can allocate the array on the heap -- which makes the caller responsible for deallocating it later.

Everything I've written so far is common to C and C++, and in fact it's much more in the style of C than C++. Section 6 of the comp.lang.c FAQ discusses the behavior of arrays and pointers in C.

But if you're writing in C++, you're probably better off using C++ idioms. For example, the C++ standard library provides a number of headers defining container classes such as <vector> and <array>, which will take care of most of this stuff for you. Unless you have a particular reason to use raw arrays and pointers, you're probably better off just using C++ containers instead.

EDIT : I think you edited your question as I was typing this answer. The new code at the end of your question is, as you observer, no good; it returns a pointer to an object that ceases to exist as soon as the function returns. I think I've covered the alternatives.



回答7:

you can (sort of) return an array

instead of

int m1[5] = {1, 2, 3, 4, 5};
int m2[5] = {6, 7, 8, 9, 10};
int* m3 = test(m1, m2);

write

struct mystruct
{
  int arr[5];
};


int m1[5] = {1, 2, 3, 4, 5};
int m2[5] = {6, 7, 8, 9, 10};
mystruct m3 = test(m1,m2);

where test looks like

struct mystruct test(int m1[5], int m2[5])
{
  struct mystruct s;
  for (int i = 0; i < 5; ++i ) s.arr[i]=m1[i]+m2[i];
  return s;
}

not very efficient since one is copying it delivers a copy of the array