What's the difference between these two methods of reading an input file?
1) Using 'ifstream.get()'
and
2) Using a vector<char>
with ifstreambuf_iterator<char>
(less understood by me!)
(other than the obvious answer of having nifty vector methods to work with)
The input file is XML, and as you see below, immediately parsed into a rapidxml document. (initialized elsewhere, see example main function.)
First, let me show you two ways to write the 'load_config' function, one using ifstream.get()
and one using vector<char>
Method 1 ifstream.get()
provides working code, and a safe rapidXML document object:
rapidxml::xml_document<> *load_config(rapidxml::xml_document<> *doc){
ifstream myfile("inputfile");
//read in config file
char ch;
char buffer[65536];
size_t chars_read = 0;
while(myfile.get(ch) && (chars_read < 65535)){
buffer[chars_read++] = ch;
}
buffer[chars_read++] = '\0';
cout<<"clearing old doc"<<endl;
doc->clear();
doc->parse<0>(buffer);
//debug returns as expected here
cout << "load_config: Name of my first node is: " << doc->first_node()->name() << "\n";
return doc;
}
Method 2 results in a cloberred rapidXML document by another library - specifically, a call to curl_global_init(CURL_GLOBAL_SSL) [see main code below] - but I'm not blaming it on curl_global_init just yet.
rapidxml::xml_document<> *load_config(rapidxml::xml_document<> *doc){
ifstream myfile("inputfile");
vector<char> buffer((istreambuf_iterator<char>(inputfile)),
istreambuf_iterator<char>( ));
buffer.push_back('\0');
cout<<"file looks like:"<<endl; //looks fine
cout<<&buffer[0]<<endl;
cout<<"clearing old doc"<<endl;
doc->clear();
doc->parse<0>(&buffer[0]);
//debug prints as expected
cout << "load_config: Name of my first node is: " << doc->first_node()->name() << "\n";
return doc;
}
main code:
int main(void){
rapidxml::xml_document *doc;
doc = new rapidxml::xml_document;
load_config(doc);
// this works fine:
cout << "Name of my first node is: " << doc->first_node()->name() << "\n";
curl_global_init(CURL_GLOBAL_SSL); //Docs say do this first.
// debug broken object instance:
// note a trashed 'doc' here if using vector<char> method
// - seems to be because of above line... name is NULL
// and other nodes are now NULL
// causing segfaults down stream.
cout << "Name of my first node is: " << doc->first_node()->name() << "\n";
I am pretty darn sure this is all executed in a single thread, but maybe there is something going on beyond my understanding.
I'm also worried that I only fixed a symptom, not a cause... by simply changing my file load function. Looking to the community for help here!
Question: Why would moving away from the vector to a character array fix this?
Hint: I'm aware that rapidXML uses some clever memory management that actually accesses the input string directly.
Hint: The main function above creates a dynamic (new) xml_document. This was not in the original code, and is an artifact of debugging changes. The original (failing) code declared it and did not dynamically allocate it, but identical problems occurred.
Another Hint for full disclosure (although I don't see why it matters) - there is another instance of a vector in this mess of code that is populated by the data in the rapidxml::xml_document object.