I'm saving data into the bongo as bulk insert. The data that's an array of JSON object contain date, numeric, alphanumeric data all saved as string.
Sample Data
[{
"CARDNO": "661",
"HOLDERNO": "661",
"HOLDERNAME": "S",
"IODATE": "4/1/2012",
"IOTIME": "00:03:27",
"IOGATENO": "01-3",
"IOGATENAME": "FWork",
"IOSTATUS": "Entry",
"DEPARTMENTNO": "1",
"UPDATE_STATUS": "1"
}, {
"CARDNO": "711",
"HOLDERNO": "711",
"HOLDERNAME": "P",
"IODATE": "4/1/2012",
"IOTIME": "04:35:33",
"IOGATENO": "01-7",
"IOGATENAME": "FDWork",
"IOSTATUS": "Exit",
"DEPARTMENTNO": "3",
"UPDATE_STATUS": "1"
}]
My Query
var start = new Date(2012, 4, 15);
var end = new Date(2012, 4, 1);
collection.find({
"IODATE": {
$gte: start,
$lt: end
}
}).toArray(function (err, data) {
if (err) {
console.log(err);
} else {
console.log(data.length)
}
//res.send(data.length);
res.send(JSON.stringify(data));
});
It's not returning result, I think it is because the value of "IODATE"
is in string inside db.
How to work around this issue? I may need to do bulk insert since the data can be of 200 million or so.
One last try at this, because you don't have a good record of accepting good advice.
Your date formats as they stand are going to bite you. Even where trying to work around them. Here are the problems:
The format is not lexical. Which means that even with a string comparison operators like $gte, $lte are just not going to work. A lexical date would be "2012-01-04" in "yyyy-mm-dd" format. That would work with the operators.
You could look at $substr (and it's complete lack of documentation, search on SO for real usage) within aggregate but your date format is lacking the double digit form of day
and month
ie "04/01/2012", so that is going to blow up the positional nature of the operator. Also you would have to transform before any $match which means you blow up any chance of reducing your pipeline input, so you are stuck with not being able to filter your large resultset by date
.
It's a horrible case, but there really is no other practical solution to the large data problem here than to convert your dates. Strings in the form that you have just do not cut it. Either, in order of preference convert to:
- BSON date
- epoch timestamp as long (whatever)
- Lexical string representation (as described)
Your main case seems to be filtering, so updating the dataset is the only pratical alternative. Otherwise you are stuck with "paging" results and doing a lot of manual work, that could otherwise be done server side.