Char* vs String Speed in C++

2019-05-06 03:04发布

I have a C++ program that will read in data from a binary file and originally I stored data in std::vector<char*> data. I have changed my code so that I am now using strings instead of char*, so that std::vector<std::string> data. Some changes I had to make was to change from strcmp to compare for example.

However I have seen my execution time dramatically increase. For a sample file, when I used char* it took 0.38s and after the conversion to string it took 1.72s on my Linux machine. I observed a similar problem on my Windows machine with execution time increasing from 0.59s to 1.05s.

I believe this function is causing the slow down. It is part of the converter class, note private variables designated with_ at the end of variable name. I clearly am having memory problems here and stuck in between C and C++ code. I want this to be C++ code, so I updated the code at the bottom.

I access ids_ and names_ many times in another function too, so access speed is very important. Through the use of creating a map instead of two separate vectors, I have been able to achieve faster speeds with more stable C++ code. Thanks to everyone!

Example NewList.Txt

2515    ABC 23.5    32  -99 1875.7  1  
1676    XYZ 12.5    31  -97 530.82  2  
279  FOO 45.5    31  -96  530.8  3  

OLD Code:

void converter::updateNewList(){
    FILE* NewList;
    char lineBuffer[100];
    char* id = 0;
    char* name = 0;

    int l = 0;
    int n;

    NewList = fopen("NewList.txt","r");
    if (NewList == NULL){
        std::cerr << "Error in reading NewList.txt\n";
        exit(EXIT_FAILURE);
    } 

    while(!feof(NewList)){
        fgets (lineBuffer , 100 , NewList); // Read line    
        l = 0;
        while (!isspace(lineBuffer[l])){
            l = l + 1;
        }

        id = new char[l];
        switch (l){
            case 1: 
                n = sprintf (id, "%c", lineBuffer[0]);
                break;
            case 2:
                n = sprintf (id, "%c%c", lineBuffer[0], lineBuffer[1]);
                break;
            case 3:
                n = sprintf (id, "%c%c%c", lineBuffer[0], lineBuffer[1], lineBuffer[2]);        
                break;
            case 4:
                n = sprintf (id, "%c%c%c%c", lineBuffer[0], lineBuffer[1], lineBuffer[2],lineBuffer[3]);
                break;
            default:
                n = -1;
                break;
        }
        if (n < 0){
            std::cerr << "Error in processing ids from NewList.txt\n";
            exit(EXIT_FAILURE);
        }

        l = l + 1;
        int s = l;
        while (!isspace(lineBuffer[l])){
            l = l + 1;
        }
        name = new char[l-s];
        switch (l-s){
            case 2:
                n = sprintf (name, "%c%c", lineBuffer[s+0], lineBuffer[s+1]);
                break;
            case 3:
                n = sprintf (name, "%c%c%c", lineBuffer[s+0], lineBuffer[s+1], lineBuffer[s+2]);
                break;
            case 4:
                n = sprintf (name, "%c%c%c%c", lineBuffer[s+0], lineBuffer[s+1], lineBuffer[s+2],lineBuffer[s+3]);
                break;
            default:
                n = -1;
                break;
        }
        if (n < 0){
            std::cerr << "Error in processing short name from NewList.txt\n";
            exit(EXIT_FAILURE);
        }


        ids_.push_back ( std::string(id) );
        names_.push_back(std::string(name));
    }

    bool isFound = false;
    for (unsigned int i = 0; i < siteNames_.size(); i ++) {
        isFound = false;
        for (unsigned int j = 0; j < names_.size(); j ++) {
            if (siteNames_[i].compare(names_[j]) == 0){
                isFound = true;
            }
        }
    }

    fclose(NewList);
    delete [] id;
    delete [] name;
}

C++ CODE

void converter::updateNewList(){
    std::ifstream NewList ("NewList.txt");

    while(NewList.good()){
        unsigned int id (0);
        std::string name;

        // get the ID and name
        NewList >> id >> name;

        // ignore the rest of the line
        NewList.ignore( std::numeric_limits<std::streamsize>::max(), '\n');

        info_.insert(std::pair<std::string, unsigned int>(name,id));

    }

    NewList.close();
}

UPDATE: Follow up question: Bottleneck from comparing strings and thanks for the very useful help! I will not be making these mistakes in the future!

8条回答
小情绪 Triste *
2楼-- · 2019-05-06 03:48

You can use a profiler to find out where your code consumes most time. If you are for example using gcc, you can compile your program with -pg. When you run it, it saves profiling results in a file. You can the run gprof on the binary to get human readable results. Once you know where most time is consumed you can post that piece of code for further questions.

查看更多
对你真心纯属浪费
3楼-- · 2019-05-06 03:50

I believe the main issue here is that your string version is copying things twice -- first into dynamically allocated char[] called name and id, and then into std::strings, while your vector<char *> version probably does not do that. To make the string version faster, you need to read directly into the strings and get rid of all the redundant copies

查看更多
登录 后发表回答