I'm having a little trouble coming up with an architecture for tag-driven software I'm designing.
What I want to do is store plain text in a database, linked to an owner and other entities. The plain text is filled with tags, just like Twitter's hashtags, and should be searchable/indexable. The tag extraction can be done application-side, and as a result I'm going to have tons of small chunks of data that need to be processed for business intelligence.
No one is going to read the plain text; it's only about the analysis, which doesn't need to be consistent and can run asynchronously.
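To make the application-side part concrete, here is a minimal sketch of what I have in mind. The tag syntax (`#` followed by word characters), the function names `extract_tags` and `build_tag_index`, and the `(note_id, owner_id, text)` record shape are all assumptions for illustration, not an existing design.

```python
import re
from collections import defaultdict

# Assumption: a tag is '#' followed by letters, digits, or underscores.
TAG_PATTERN = re.compile(r"#(\w+)")


def extract_tags(text):
    """Return the distinct lowercased tags in text, in first-seen order."""
    seen = []
    for match in TAG_PATTERN.finditer(text):
        tag = match.group(1).lower()
        if tag not in seen:
            seen.append(tag)
    return seen


def build_tag_index(notes):
    """Map each tag to the set of (note_id, owner_id) pairs that mention it.

    `notes` is an iterable of hypothetical (note_id, owner_id, text) records.
    """
    index = defaultdict(set)
    for note_id, owner_id, text in notes:
        for tag in extract_tags(text):
            index[tag].add((note_id, owner_id))
    return index
```

Each `(tag, note_id, owner_id)` entry is one of the small chunks mentioned above. Since the analysis can run asynchronously, the extraction could happen in a background job after the plain text is stored.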
I know that Twitter uses several databases: Gizzard and Cassandra for tweets and FlockDB for relations.
I don't feel like using a hybrid setup to handle relations, and I don't want to build the next social network either. What I do need, though, is analytics over all tags in relation to other entities.
How can I solve the hashtag problem, or how can I process the text to make it work?
I'm really searching for a nice solution, not just any solution. I already know how to create a schema for SQL.
Thanks for helping me through that database jungle.