In my project (similar to mediafire
and rapidshare
), clients can upload files to the server. I am using DB2 database
and IBM WAS web server
and JSP as server side scripting
. I am creating my own encryption algorithm, as it is the main aim of the project.
I need suggestion whether files themselves should be stored in the database or if only the location of the files should be stored. Which approach is best?
There are Pros and Cons for storing BLOBs in the database.
Advantages
- DBMS support for BLOBs is very good nowadays
- JDBC driver support for BLOBs is very good
- access to the "documents" can happen inside a transaction. No need to worry about manual cleanup or "housekeeping". If the row is deleted, so is the BLOB data
- Don't have to worry about filesystem limits. Filesystems are typically not very good at storing million of files in a single directory. You will have to distribute your files across several directories.
- Everything is backed up together. If you take a database backup you have everything, no need to worry about an additional filesystem backup (but see below)
- Easily accessible through SQL (no FTP or other tools necessary). That access is already there and under control.
- Same access controls as for the rest of the data. No need to set up OS user groups to limit access to the BLOB files.
Disadvantages
- Not accessible from the OS directly (problem if you need to manipulate the files using commandline tools)
- Cannot be served by e.g. a webserver directly (that could be performance problem)
- Database backup (and restore) is more complicated (because of size). Incremental backups are usually more efficient in the filesystem
- DBMS cache considerations
- Not suited for high-write scenarios
You need to judge for yourself which advantage and which disadvantage is more important for you.
I don't share the wide-spread assumption that storing BLOBs in a database is always a bad idea. It depends - as with many other decisions.
It's general knowledge that storing files in the database -especially big ones- it's generally a bad idea. There are brilliant explanations in these questions:
Storing a file in a database as opposed to the file system?
Storing Images in DB - Yea or Nay?
And I'd like to highlight some points myself:
- Storing files in your DBMS will make your data very big, and big databases are a maintaining hell (specially backups)
- Portability becomes an issue, as every DBMS vendor makes its own implementation of BLOB files
- There's a performance lost related to SELECT sentences to BLOB fields, compared to disk access
Well my Opinion would be to store the relevant information like path, name, description, etc... in the database and keep the file evtl. encrypted on the filesystem, it would be cheaper to scale your system adding a webserver than adding a database one as webspace is cheap comparing with databases, all you will need then is to add an IP column to your database or server name so you can address teh new webserver.