indexx() Numerical Recipes (C) index sorting algor

2019-09-13 01:52发布

问题:

I am trying to use the indexx() algorithm from Numerical Recipes (NR) in C and have found a very strange bug.

(NR is publicly available here: http://www2.units.it/ipl/students_area/imm2/files/Numerical_Recipes.pdf page 338, section 8.4)

The function should output an array of indices that correspond to elements of the input array of floats, sorted from low to high.

Below is a minimal working example, showing that the algorithm appears to ignore the first two elements. The output array first two elements are always 0 followed by the length of the array (9 in this example). The remaining elements appear to be sorted correctly.

Oh and I tried to ask on the NR forums but have been waiting a long time for my account to be activated... Many thanks in advance!

[Edited array names]

#include "nr_c/nr.h"

#include <stdio.h>
#include <stdlib.h>


int main()
{
    float unsorted[9] = {4., 5., 2., 6., 3., 8., 1., 9., 7.};
    long unsigned int sort[9];

    printf("Unsorted input array:\n");
    for (int i=0; i<9; i++) {
        printf("%.1f  ", unsorted[i]);
    }
    printf("\n\n");

    indexx(9, unsorted, sort);

    printf("Indexx output array:\n");
    for (int i=0; i<9; i++) {
        printf("%d    ", sort[i]);
    }
    printf("\n\n");

    printf("Should-be-sorted array:\n");
    for (int i=0; i<9; i++) {
        printf("%.1f  ", unsorted[sort[i]]);
    }
    printf("\n\n");

    return 0;
}

Output:

Unsorted input array:
4.0  5.0  2.0  6.0  3.0  8.0  1.0  9.0  7.0  

Indexx output array:
0    9    6    2    4    1    3    8    5    

Should-be-sorted array:
4.0  0.0  1.0  2.0  3.0  5.0  6.0  7.0  8.0 

回答1:

The Numerical Recipes C code uses 1-based indexing. (because of its FORTRAN origin, the first version was written in FORTRAN, and fortran uses 1-based indexing for arrays and matrices. The C version was probably based on the output of f2c ...) In the original code in the question, the indexx() function ignores the first element of both the unsorted[] and sort[] arrays. Plus: it accesses one beyond the arrays's last elements (resulting in UB) As a result, both 0 and 9 are present in the index-array. (the initial 0 is in fact uninitialised memory, but it has not been touched by the indexx() function)


If my hypothesis is correct, the following should work:


#include "nr_c/nr.h"

#include <stdio.h>
#include <stdlib.h>


int main()
{
    float unsorted[9] = {4., 5., 2., 6., 3., 8., 1., 9., 7.};
    long unsigned int sort[9];

    printf("Unsorted input array:\n");
    for (int i=0; i<9; i++) {
        printf("%.1f  ", unsorted[i]);
    }
    printf("\n\n");

    indexx(9, unsorted-1, sort-1); // <<-- HERE

    printf("Indexx output array:\n");
    for (int i=0; i<9; i++) {
        printf("%d    ", sort[i]);
    }
    printf("\n\n");

    printf("Should-be-sorted array:\n");
    for (int i=0; i<9; i++) {
        printf("%.1f  ", unsorted[sort[i]-1]); // <<-- AND HERE
    }
    printf("\n\n");

    return 0;
}

The numerical recipes in C code assumes 1-based indexing: an N sized array has indexes 1..N. This was done to not confuse the mathematicians. (as a result, a whole generation of programmers has been confused) The alloccation functions are wrappers around malloc(), which cheat by returning a pointer to the the -1th member of the allocated space. For a float vector this could be like:


#include <stdlib.h>

float * float_vector(unsigned size)
{
float * array;
array = calloc( size, sizeof *array);
if (!array) return NULL;
return array -1;
}