Accessing the first Character of a String with no

2019-07-01 23:17发布

问题:

I am implementing a suffix trie in C++. The implementation of the Trie contructor can be seen below.

#include <iostream>
#include <cstring>
#include "Trie.hpp"
using namespace std;

Trie::Trie(string T){   
    T += "#";                           //terminating character     
    this->T = T;

    nodes.reserve(T.length() * (T.length() + 1) / 2);   //The number of nodes is bounded above by n(n+1)/2. The reserve prevents reallocation (http://stackoverflow.com/questions/41557421/vectors-and-pointers/41557463) 

    vector<string> suffix;              //vector of suffixes
    for(unsigned int i = 0; i < T.length(); i++)
        suffix.push_back(T.substr(i, T.length()-i));

    //Create the Root, and start from it
    nodes.push_back(Node(""));          //root has blank label
    Node* currentNode = &nodes[0];

    //While there are words in the array of suffixes
    while(!suffix.empty()){

        //If the character under consideration already has an edge, then this will be its index. Otherwise, it's -1.
        int edgeIndex = currentNode->childLoc(suffix[0].at(0));     

        //If there is no such edge, add the rest of the word
        if(edgeIndex == -1){
            addWord(currentNode, suffix[0]);                //add rest of word
            suffix.erase(suffix.begin());                   //erase the suffix from the suffix vector
        }

        //if there is
        else{
            currentNode = (currentNode->getEdge(edgeIndex))->getTo();       //current Node is the next Node
            suffix[0] = suffix[0].substr(1, suffix[0].length());            //remove first character
        }           
    }   
}

//This function adds the rest of a word
void Trie::addWord(Node* parent, string word){  
    for(unsigned int i = 0; i < word.length(); i++){                //For each remaining letter
        nodes.push_back(Node(parent->getLabel()+word.at(i)));       //Add a node with label of parent + label of edge
        Edge e(word.at(i), parent, &nodes.back());                  //Create an edge joining the parent to the node we just added
        parent->addEdge(e);                                         //Join the two with this edge   
    }
}

I am using two data structures, Node and Edge which have some getters and setters and properties you would expect. The method childLoc() returns the location of an edge (if it exists) representing a given character.

The code compiles just fine, but for some reason I get this error at runtime:

terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::at: __n (which is 0) >= this->size() (which is 0)
Aborted (core dumped)

I've been told that this error means I am accessing the first character of an empty string, but I can't see where this is happening in the code.

回答1:

I see two code parts that are potentially responsible for std::out_of_range:

First: The following expression might access an empty string at position 0. This might happen as (as shown in the second part), you shrink strings contained in the suffix-vector:

int edgeIndex = currentNode->childLoc(suffix[0].at(0));

Second, you operate on the entries in suffix-vector with the risk that the strings are to short:

suffix[0] = suffix[0].substr(1, suffix[0].length()); 

Operation substr will also yield std::out_of_range if first operand (i.e. pos-argument) exceeds the array length (cf. string::substr):

pos: Position of the first character to be copied as a substring. If this is equal to the string length, the function returns an empty string. If this is greater than the string length, it throws out_of_range. Note: The first character is denoted by a value of 0 (not 1).

For finding out which of these expressions is actually responsible for the exception, I'd suggest to consult your debugger :-)