I am using Java with Morphia to work with MongoDB. The default ObjectId is not very convenient to use from the Java layer. I would like to make the key a String while still generating it with ObjectId, say _id = new ObjectId().toString().
Are there any side effects of doing it this way? For example, will it hurt database performance or cause key conflicts in any way? Will it affect a sharded environment ...
You can use any type of value for an _id field (except arrays). If you choose not to use ObjectId, you'll have to guarantee the uniqueness of the values yourself (converting an ObjectId to a string will do). If you try to insert a duplicate key, an error will occur and you'll have to deal with it.
I'm not sure what effect it will have on a sharded cluster when you attempt to insert two documents with the same _id into different shards. I suspect that it will let you insert them, but this will bite you later. (I'll have to test this.)
That said, you should have no trouble with _id = (new ObjectId()).toString().
I actually did the same thing because I was having some problems converting the ObjectId to JSON.
I then did something like this:
    @Id
    private String id;

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }
Everything worked fine until I decided to update a previously inserted document. I fetched the object by id, sent it to the page as JSON, received the updated object back in a JSON POST, and then called the save function on the Datastore. Instead of updating the existing document, it inserted a new one.
Even worse, the new document had the same id as the previously inserted one, something I thought was impossible.
Anyway, I changed the private field to an ObjectId and left the getter and setter as String, and then it worked as expected. Not sure that helps in your case, though.
    @Id
    private ObjectId id;

    public String getId() {
        return id.toString();
    }

    public void setId(String id) {
        this.id = new ObjectId(id);
    }
Yes, you can use a string as your _id.
I'd recommend it only if you have some value (in the document) that is naturally a good unique key. I used this design in one collection where there was a string geo-tag of the form "xxxxyyyy"; this unique-per-document field was going to HAVE to be in the document and I had to build an index on it... so why not use it as the key? (This avoided one extra key-value pair, AND avoided a second index on the collection, since MongoDB naturally builds an index on "_id". Given the size of the collection, both of these added up to some serious space savings.)
However, from the tone of your question ("ObjectIDs are not very convenient"), if the only reason you want to use a string is you don't want to be bothered with figuring out how to neatly manage ObjectIDs... I'd suggest it is worth your time to get your head around them. I'm sure they are no trouble... once you've figured out your trouble with them.
Otherwise: what are your options? Will you concoct string IDs EVERY TIME you use MongoDB in the future?
I would like to add that it is not always a good idea to use the automatically generated BSON ObjectID as a unique identifier, if it gets passed to the application: it can potentially be manipulated by the user.
ObjectIDs appear to be generated sequentially, so if you fail to implement the necessary authorization mechanisms, a malicious user could simply increment the value they hold to access resources they should not have access to.
Therefore using UUID type identifiers will provide a layer of security-through-obscurity. Of course, Authorization (is this user allowed to access requested resource) is a must, but you should be aware of the aforementioned ObjectID feature.
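To see why ObjectIDs look sequential, here is a minimal Java sketch that decodes the predictable parts of a 24-character ObjectId hex string. It assumes the documented layout (a 4-byte timestamp, a 5-byte per-process random value, and a 3-byte incrementing counter); the class name, method names, and the sample id are hypothetical, and it deliberately avoids the MongoDB driver so it runs on its own.

```java
import java.time.Instant;

// Sketch only: splits an ObjectId hex string into its predictable parts.
// Assumed layout: 4-byte timestamp | 5-byte random value | 3-byte counter.
final class ObjectIdParts {

    // Seconds since the Unix epoch, stored in the first 4 bytes (8 hex characters).
    static long timestampSeconds(String hex) {
        requireObjectIdHex(hex);
        return Long.parseLong(hex.substring(0, 8), 16);
    }

    // Incrementing counter, stored in the last 3 bytes (6 hex characters).
    static int counter(String hex) {
        requireObjectIdHex(hex);
        return Integer.parseInt(hex.substring(18), 16);
    }

    // The value someone might guess next: same prefix, counter + 1 (wrapping at 2^24).
    static String nextGuess(String hex) {
        int next = (counter(hex) + 1) & 0xFFFFFF;
        return hex.substring(0, 18) + String.format("%06x", next);
    }

    private static void requireObjectIdHex(String hex) {
        if (hex == null || !hex.matches("[0-9a-fA-F]{24}")) {
            throw new IllegalArgumentException("expected 24 hexadecimal characters");
        }
    }

    public static void main(String[] args) {
        String id = "507f1f77bcf86cd799439011"; // sample value, for illustration only
        System.out.println(Instant.ofEpochSecond(timestampSeconds(id)));
        System.out.println(counter(id));
        System.out.println(nextGuess(id));
    }
}
```

Ids created by the same process within the same second differ only in the counter, which is what makes guessing a neighbouring id feasible when authorization checks are missing.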
To get the best of both worlds, generate a UUID-style random value that matches the ObjectID length (12 bytes, which is 24 hexadecimal characters) and use it to create your own _id of ObjectID type.
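One way this could look in Java is sketched below. A standard UUID is 16 bytes, so it does not fit into an ObjectID directly; this sketch instead draws 12 random bytes from SecureRandom and hex-encodes them. The class and method names are hypothetical. The resulting string should be accepted by the driver's `new ObjectId(String)` constructor, but that step is left as a comment so the example runs without the MongoDB driver.

```java
import java.security.SecureRandom;

// Sketch only: produces a random 24-character hex string with the same
// length as an ObjectId (12 bytes). Not part of any MongoDB API.
final class RandomIdHex {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    static String newRandomIdHex() {
        byte[] bytes = new byte[12];
        RANDOM.nextBytes(bytes);

        // Encode each byte as two lowercase hex characters.
        char[] out = new char[24];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;
            out[2 * i] = HEX[v >>> 4];
            out[2 * i + 1] = HEX[v & 0x0F];
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String hex = newRandomIdHex();
        System.out.println(hex);
        // With the driver on the classpath (assumption, not executed here):
        // ObjectId id = new ObjectId(hex);
    }
}
```

Keep in mind that an id built this way no longer carries a meaningful creation timestamp, so anything that reads the time back out of the ObjectID will return an arbitrary value.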