Sorting data in binary file without using arrays

2019-09-03 13:14发布

问题:

Hello I have an assignment where I am supposed to generate random numbers. Then write them to a binary file. After that I am supposed to read the data print it to the screen. Then finally, I have to sort the data and output. I have to do that without using arrays. I was able to do the first two parts. However, I couldn't do the last part correctly, so I went to google and youtube looking for an illustration on how to do that, but I was out of luck. I would really love to know what is it that I am doing wrong, thank you.

I think I am supposed to use recursion fseek and casting to avoid using arrays; however, I don't know how to use them.

Here is my code along with the output:

#include <iostream>
#include <fstream>
#include <iomanip>
#include <stdlib.h> 

using namespace std;

//Function prototypes
void Write_File(int num);
void Read_File(int num);
void Compare(int num);

//**********************

int main()
{
    int num = 0;

    cout << "Enter a positive number to generate random numbers ";
    cin >> num;

    Write_File(num);
    Read_File(num);
    Compare(num);


    return 0;
}

//***************

// Functions Definitions
void Write_File(int num)
{
    ofstream fout("sortfile.dat", ios::out | ios::binary | ios::beg);
    int number = 0;

    if (fout.is_open())
    {
        for (int x = 0; x < num; x++)
        {
            number = rand();
            fout << number << endl;
        }

        fout.close();
    }
    else
    {
        cout << "Error, File not opened" << endl;
    }
}

void Read_File(int num)
{
    ifstream fin("sortfile.dat", ios::in | ios::binary | ios::beg);
    int number = 0;

    cout << endl << "randomly generated numbers befor sorting\n";
    if (fin.is_open())//check first if open already
    {
        for (int i = 0; i < num; i++)
        {
            fin >> number;
            cout << number << endl;
        }


            fin.close();
        }
        else
            cout << "File not opened in read phase." << endl;
}


void Compare(int num) 
{
    int number = 0;
    int temp = 0;
    int hold = 0;
    ifstream fin("sortfile.dat", ios::in | ios::binary | ios::beg);

    if (fin.is_open())//check first if open already
    {
        for (int i = 0; i < num; i++)
        {
            fin >> number;
            fin >> temp;
            if (number > temp)
            {


                hold = temp;
                temp = number;
                number = hold;

            }
            else
            {
                ofstream fout("sortfile.dat", ios::out | ios::binary | ios::beg);
                fout << number;
                fout << temp;
            }

            ofstream fout("sortfile.dat", ios::out | ios::binary | ios::beg);
            fout << number;
            fout << temp;
        }
        cout << endl << "Nums after sort\n";
        for (int i = 0; i < num; i++)
        {
            fin >> number;
            cout << number << endl;
        }

        fin.close();
    }
    else
        cout << "File not opened in read phase." << endl;
}

Output:

Enter a positive number to generate random numbers 4

randomly generated numbers befor sorting
41
18467
6334
26500

Nums after sort
6334
6334
6334
6334
Press any key to continue . . .

回答1:

Ok, there are several things wrong here. First, you are not going to be able to do a one pass sort by just swapping adjacent elements. That would be an O(n) sort, and if you managed to implement that, you would be an instant computer science celebrity. Your professor said you can't use arrays? If by "arrays", he/she meant any kind of in-memory buffer, then that's just silly. If you want a sort that doesn't use arrays, in a literal sense, just use a std::map. Insert the numbers into the map as you read them from the file. Then, use an iterator to retrieve the numbers in sorted order, printing them out as you do so.

Secondly, you are doing some things with file io that are probably not what you want. In your loop where you read pairs of numbers from the file and then possibly swap them, your control flow allows for the pair of numbers to be written twice to the file, in the event that they are in sorted order. Also, you construct one, or possibly two, ofstream objects in each loop iteration. Each new output stream starts at the beginning of the file, so you're repeatedly writing to the beginning of the file. You also don't seek your input stream to the beginning of the file before printing out the "sorted" contents. The reason it's printing out the same number 4 times is that it's reached the end of the file, and isn't assigning any new value to "number".

My advice would be to really avoid trying to read and write to the file at the same time. Read the numbers in (not to an array, mind you ;) ), sort them, then output them. (From the assignment spec, it doesn't seem like you have to write them to the file again, but I'm not sure).



回答2:

From what I can tell you keep recreating your output stream which means you keep writing to the beginning of your file.

In your compare function:

for (int i = 0; i < num; i++)
{
      ...

     // for every iteration you create an output stream at the beginning of the file
      ofstream fout("sortfile.dat", ios::out | ios::binary | ios::beg);
      fout << number;
      fout << temp;
}


回答3:

Hmm, to sort a bunch of elements, whatever algorithm you use, there will be a time where you have at least to store one element (or its index) in a specified place in its storage => what ordinary programmers called using an array (or array or indexes).

Your requirement could be understood in two ways:

  • you cannot use arrays but are allowed to used other C++ containers. Fine, take a vector (that you use like an array once filled), or a map as suggested by @c_dubs. Maybe, but if I gave a requirement for not to use arrays and not explicitely allow C++ containers, I would not be pleased with that solution
  • you cannot use in memory storage at all. You must directly sort you file (or a copy of the file) on disk. If you follow that path, just use fseek to move pointer on file, and you will be able to read and write elements directly at a defined position on disk. That is enough to use any sort algorythm.

Beware: my advice is to use direct disk sorting for that assignement, but never do that in real world! Anyway it can be useful where sorting large number of elements, to split the initial data in bunches small enough to be processed in memory sort them and store them back to disk and then merge those sorted files. It is the only foolproof way to sort data that could not be loaded in memory.



回答4:

My guess is that you're allowed to use variables to hold numbers from the file and multiple files to hold temp data (or a single file that you random access as if it was 4 files). You could implement a bottom up 2 way merge sort using 4 files. It would be similar to the 4 tape drive sort mentioned in the wiki article on merge sort.

http://en.wikipedia.org/wiki/Merge_sort#Use_with_tape_drives