Subset looks like an interesting, thin MongoDB wrapper.
In one of the examples given, there are Tweets and Users. However, User is a subdocument of Tweet. In classical SQL, this would be normalized into two separate tables with a foreign key from Tweet to User. In MongoDB, this wouldn't necessitate a DBRef; storing the user's ObjectId would be sufficient.
In both Subset and Salat, this would result in these case classes:
case class Tweet(_id: ObjectId, content: String, userId: ObjectId)
case class User(_id: ObjectId, name: String)
So there's no guarantee that the ObjectId in Tweet actually resolves to a User (making it less typesafe). I also have to write the same query for each class that references User (or move it to some trait).
So what I'd like to achieve is to have case class Tweet(_id: ObjectId, content: String, userId: User) in code, and the ObjectId in the database. Is this possible, and if so, how? What are good alternatives?
Yes, it's possible. Actually it's even simpler than having a "user" sub-document in a "tweet". When "user" is a reference, it is just a scalar value, so there are no subdocument fields for MongoDB or "Subset" to query.
I've prepared a simple REPLable snippet of code for you (it assumes you have two collections -- "tweets" and "users").
Preparations...
import org.bson.types.ObjectId
import com.mongodb._
import com.osinka.subset._
import Document.DocumentId
val db = new Mongo("localhost") getDB "test"
val tweets = db getCollection "tweets"
val users = db getCollection "users"
Our User case class:
case class User(_id: ObjectId, name: String)
A number of fields for tweets and users:
val content = "content".fieldOf[String]
val user = "user".fieldOf[User]
val name = "name".fieldOf[String]
Here more complicated things start to happen. What we need is a ValueReader that's capable of getting an ObjectId based on the field name, but then goes to another collection and reads an object from there.
This can be written as a single piece of code that does everything at once (you may see such a variant in the answer history), but it would be more idiomatic to express it as a combination of readers. Suppose we have a ValueReader[User] that reads from a DBObject:
val userFromDBObject = ValueReader({
  case DocumentId(id) ~ name(name) => User(id, name)
})
What's left is a generic ValueReader[T] that expects an ObjectId and retrieves an object from a specific collection using the supplied underlying reader:
class RefReader[T](val collection: DBCollection, val underlying: ValueReader[T]) extends ValueReader[T] {
  override def unpack(o: Any): Option[T] =
    o match {
      case id: ObjectId =>
        Option(collection findOne id) flatMap {underlying.unpack _}
      case _ =>
        None
    }
}
Then, we may say our type class for reading Users from references is merely
implicit val userReader = new RefReader[User](users, userFromDBObject)
(I am grateful to you for this question, since this use case is quite rare and I had no real motivation to develop a generic solution. I think I should finally include this kind of helper in "Subset". I would appreciate your feedback on this approach.)
And this is how you would use it:
import collection.JavaConverters._
tweets.find.iterator.asScala foreach {
  case Document.DocumentId(id) ~ content(content) ~ user(u) =>
    println("%s - %s by %s".format(id, content, u))
}
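The reader-composition idea above can also be tried without a running MongoDB. The following is only a sketch under stated assumptions: Reader is a minimal stand-in for Subset's ValueReader, a Map[String, Any] stands in for DBObject, a Map keyed by id stands in for DBCollection, and ids are plain strings rather than ObjectId. None of these names come from Subset itself.

```scala
// Database-free sketch of "a reference reader built on top of a document reader".
// Reader, Doc and the in-memory `users` map are illustrative stand-ins, not Subset API.
object RefReaderSketch {
  type Doc = Map[String, Any]

  // Minimal stand-in for a value reader: try to read a T from a raw value.
  trait Reader[T] { def unpack(o: Any): Option[T] }

  case class User(id: String, name: String)

  // Reads a User from a document; missing or mistyped fields yield None.
  val userFromDoc: Reader[User] = new Reader[User] {
    def unpack(o: Any): Option[User] = o match {
      case d: Map[_, _] =>
        val doc = d.asInstanceOf[Doc]
        (doc.get("_id"), doc.get("name")) match {
          case (Some(id: String), Some(name: String)) => Some(User(id, name))
          case _ => None
        }
      case _ => None
    }
  }

  // Expects an id, looks the document up, then delegates to the underlying reader.
  class RefReader[T](collection: Map[String, Doc], underlying: Reader[T]) extends Reader[T] {
    def unpack(o: Any): Option[T] = o match {
      case id: String => collection.get(id).flatMap(doc => underlying.unpack(doc))
      case _ => None
    }
  }

  // Stand-in for the "users" collection.
  val users: Map[String, Doc] = Map("u1" -> Map("_id" -> "u1", "name" -> "alice"))
  val userRef = new RefReader[User](users, userFromDoc)
}
```

A dangling id or a value of the wrong type simply produces None, which mirrors how the RefReader above falls through to None when the lookup fails.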
Alexander Azarov's answer probably works fine, but I would personally not do it this way.
What you have is a Tweet that only has an ObjectId reference to the user.
You want to load the user when the tweet is loaded, because that is probably easier to manipulate in your domain. In any case, unless you use subdocuments (not always a good choice), you have to query the DB again to retrieve the user data, and this is what Alexander Azarov's code does.
You would be better off writing a function that transforms a Tweet into a TweetWithUser, or something like that:
def transform(tweet: Tweet) = TweetWithUser(tweet._id, tweet.content, findUserWithId(tweet.userId))
I don't really see why you would expect a framework to resolve something that you could have done yourself very easily in a single line of code.
And remember that in some parts of your application you don't even need the whole User object, so querying the database twice is an expense you don't always have to pay. You should only use the case class with the full User data when you really need that data, rather than always loading the full user simply because it seems more convenient.
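To make the transformation idea concrete, here is a minimal sketch. It assumes plain string ids instead of ObjectId, and usersById is a hypothetical in-memory map standing in for a query against the users collection. In this sketch findUserWithId returns an Option, so a reference that does not resolve is visible to the caller instead of being hidden.

```scala
// Explicit, on-demand resolution of a user reference.
// All names below except Tweet, User and TweetWithUser are hypothetical helpers.
object TransformSketch {
  case class User(id: String, name: String)
  case class Tweet(id: String, content: String, userId: String)
  case class TweetWithUser(id: String, content: String, user: User)

  // In-memory stand-in for the users collection.
  val usersById: Map[String, User] = Map("u1" -> User("u1", "alice"))

  // Stand-in for the second database query.
  def findUserWithId(id: String): Option[User] = usersById.get(id)

  // Resolve the reference only when the caller asks for it; a dangling id yields None.
  def transform(tweet: Tweet): Option[TweetWithUser] =
    findUserWithId(tweet.userId).map(u => TweetWithUser(tweet.id, tweet.content, u))
}
```

Code that only needs the tweet keeps working with Tweet and never pays for the lookup; only the call sites that need the user call transform.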
Or, if you want to manipulate User objects anyway, you could have a User proxy on which the id attribute is accessible directly, while any other access triggers a DB query. In Java/SQL, Hibernate does this with lazy loading of relationships, but I'm not sure it's a good idea to use that with MongoDB, and it breaks immutability.
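For illustration, a proxy of that kind might look roughly like the sketch below, using a Scala lazy val. The assumptions are the same as before: string ids, and a hypothetical in-memory usersById map in place of the users collection. The lookups counter exists only to make the laziness observable.

```scala
// Lazy user proxy: the id is free, everything else triggers one cached lookup.
// UserRef, usersById and lookups are hypothetical names for this sketch.
object LazyUserSketch {
  case class User(id: String, name: String)

  // Counts lookups so the deferred query can be observed.
  var lookups = 0
  val usersById: Map[String, User] = Map("u1" -> User("u1", "alice"))

  // Stand-in for a database query.
  def findUserWithId(id: String): Option[User] = { lookups += 1; usersById.get(id) }

  class UserRef(val id: String) {
    // Evaluated on first access only, then cached by the lazy val.
    lazy val resolved: Option[User] = findUserWithId(id)
    def name: Option[String] = resolved.map(_.name)
  }
}
```

Note that the proxy carries hidden state (whether the lookup has already happened) and performs a side effect on first access, which is the kind of compromise mentioned above.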