How do you query mongodb to find the length of a particular string/text field? Additionally how would you find the max length of a query set?
问题:
回答1:
Unfortunately the aggregation framework doesn't support a "len" operator to automatically convert strings to their length while you do a query. So you have to solve this in your own code. You could
- use a MapReduce function to calculate string lengths
- query for the strings and calculate their length on the application layer
The difference between these method is that the first runs on the database while the latter runs on your application server. I would recommend the latter option, because MapReduce can be very slow and cumbersome to use.
回答2:
Sky is the limit! No, actually it is 16 MB
for a document in mongodb
. This can be the maximum length of a string in the records.
In order to find the maximum length in a query set, you can do this work around:
- Keep the length of text together with the text itself.
- Sort your result set descending on the length.
- Get the first element which has maximum length.
回答3:
how about use regular expression instead.
> db.apps.find({$where:"(this.id.length gt 6) && (this.id.length lt 15) " } ).count(); 2548 > db.apps.find({$where:" (this.id.length gt 6) && (this.id.length lt 15) " } ).explain(); { "cursor" : "BasicCursor", "isMultiKey" : false, "n" : 2548, "nscannedObjects" : 88736, "nscanned" : 88736, "nscannedObjectsAllPlans" : 88736, "nscannedAllPlans" : 88736, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 1, "nChunkSkips" : 0, "millis" : 1523, "indexBounds" : { }, "server" : "shuhaimac.local:27017" }
> db.apps.find({id:/\w{7,16}/i}).count(); 2548 > db.apps.find({id:/\w{7,16}/i}).explain(); { "cursor" : "BtreeCursor id_1 multi", "isMultiKey" : false, "n" : 2548, "nscannedObjects" : 2548, "nscanned" : 88736, "nscannedObjectsAllPlans" : 2548, "nscannedAllPlans" : 88736, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 122, "indexBounds" : { "id" : [ [ "", { } ], [ /\w{7,16}/i, /\w{7,16}/i ] ] }, "server" : "shuhaimac.local:27017" }
回答4:
So, I hope this helps. :-) I ran into the same problem - and it took me a while to get the map-reduce working.
$response = $Mongo->yourdb->command(array(
"mapreduce" => "yourcollection",
"map" => new MongoCode(" function() { emit( this.groupbykey, this.thestring.length ); } "),
"reduce" => new MongoCode(" function(k, vals) { return Math.max.apply(null, vals); } "),
"query" => array("groupbykey" => "somevalue"),
"out" => array("inline" => 0)
));
Response will hold the map-reduce result
Array
(
[results] => Array
(
[0] => Array
(
[_id] => groupbykeyvalue
[value] => 106
)
)
[counts] => Array
(
[input] => 7341
[emit] => 7341
[reduce] => 76
[output] => 1
)
[timeMillis] => 189
[timing] => Array
(
[shardProcessing] => 171
[postProcessing] => 17
)
[shardCounts] => Array
(
[someshard:27017] => Array
Good luck, let me know if you need a different variant!
回答5:
Unlike SQL MongoDB doesn't really know the length of fields as such. At most when it indexes it knows that the field is either under 1024 bytes or not.
As such this is something you will probably have to fix client side. You could use a $where
here but I think your looking at this wrong if you want to do that.
You could also use and MR here as @Philipp states but again you are probably looking into the wrong thing here.
A query in MongoDB is in fact a BSON document. As such the maximum length of a query set (depending on what you define as a "query set") is always 16MB (at the moment).
Many drivers provide a means by which to encode a structure (hash or dict or whatever) to BSON allowing you to judge the length of the encoded string to understand the size of your query.