If I have a static database consisting of folders and files, would access and manipulation be faster than SQL server type databases, considering this would be used in a CGI script?
When working with files and folders, what are the tricks to better performance?
As others have said, it depends: on the size and nature of the data and the operations you're planning to run on it.
Particularly for a CGI script, you're going to incur a performance hit for connecting to a database server on every page view. However, if you take a naive file-based approach, you could easily create worse performance problems ;-)
As well as a Berkeley DB file solution, you could also consider using SQLite. This puts a SQL interface on top of a database stored in a local file. You can access it with DBI and SQL, but there's no server, configuration or network protocol. This allows easier migration if a database server becomes necessary in the future (example: if you decide to have multiple front-end servers but need to share state).
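A minimal sketch of the DBI + SQLite approach (this assumes the DBD::SQLite module is installed; the database file, table and column names are just placeholders):

```perl
use strict;
use warnings;
use DBI;

# Connect to (or create) a database held in a single local file.
# No server process, no network protocol -- just a file on disk.
my $dbh = DBI->connect("dbi:SQLite:dbname=app.db", "", "",
                       { RaiseError => 1, AutoCommit => 1 });

$dbh->do("CREATE TABLE IF NOT EXISTS articles (id INTEGER PRIMARY KEY, title TEXT)");
$dbh->do("INSERT INTO articles (title) VALUES (?)", undef, "Hello");

my ($title) = $dbh->selectrow_array(
    "SELECT title FROM articles WHERE id = ?", undef, 1);
print "$title\n";

$dbh->disconnect;
```

If you later outgrow the single file, the migration path is mostly a matter of changing the DSN in `DBI->connect` to point at a server-based driver; the DBI/SQL code itself can stay largely unchanged.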
Without knowing any details, I'd suggest using a SQLite/DBI solution then reviewing the performance. This will give flexibility with a reasonably simple start up and decent performance.
As a general rule, databases are slower than files.
If you require indexing of your files, a hand-rolled access path built on customised indexing structures will always have the potential to be faster, provided you implement it correctly.
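To illustrate what such a hand-rolled index might look like: a sketch using the core SDBM_File module to map a record key to a byte offset in a flat data file (the file names, keys and offsets here are purely illustrative):

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;

# Hand-rolled index: map a record key to its byte offset in a flat
# data file. SDBM_File ships with core Perl, so there is nothing to
# install, but you are now responsible for keeping it consistent.
tie my %index, 'SDBM_File', 'articles.idx', O_RDWR | O_CREAT, 0640
    or die "Cannot open index: $!";

$index{'article:42'} = 1024;    # record 42 starts at byte 1024

# Later: seek straight to the record instead of scanning the file.
my $offset = $index{'article:42'};
print "offset = $offset\n";     # offset = 1024

untie %index;
```

The speed comes from skipping everything a general-purpose database does that you don't need; the cost is that consistency, locking and recovery are entirely on you.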
But 'performance' is not the goal when choosing a database over a file-based solution.
You should ask yourself whether your system needs any of the benefits that a database would provide. If so, then the small performance overhead is quite acceptable.
So: choose based on the features you need, not on raw speed.
Basically, the question is which would be easier to develop. The performance difference between the two is not worth wasting dev time on.
As others have pointed out: it depends!
If you really need to find out which is going to be more performant for your purposes, you may want to generate some sample data to store in each format and then run some benchmarks. The Benchmark.pm module comes with Perl, and makes it fairly simple to do a side-by-side comparison with something like this:
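A sketch of such a side-by-side comparison with the core Benchmark module (the two subs are hypothetical stand-ins; you would replace their bodies with your real flat-file read and database query):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-ins: replace these with your actual flat-file
# lookup and your actual database query.
sub via_flat_file { my $sum = 0; $sum += $_ for 1 .. 500; return $sum }
sub via_database  { my $sum = 0; $sum += $_ for 1 .. 500; return $sum }

# A negative count means "run each sub for at least that many CPU
# seconds"; cmpthese then prints a comparison table of rates.
cmpthese(-3, {
    flat_file => \&via_flat_file,
    database  => \&via_database,
});
```

Make sure the sample data you benchmark against is close in size and shape to what production will see; benchmarks on tiny data sets routinely favour the wrong design.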
You can type
perldoc Benchmark
to get more complete documentation.

It depends on what your information is and what your access patterns and scale are. Two of the biggest benefits of a relational database are:
Caching. Unless you're very clever, you can't write a cache as good as the one a DB server gives you for free.
Optimizer. The query planner works out an efficient access path for each query, so you don't have to hand-code one.
However, for certain specialized applications, neither of these two benefits manifests itself compared to a files+folders data store, so the answer is a resounding "it depends".
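To illustrate the caching point: a naive in-process cache is trivial to write, but unlike a DB server's cache it is unbounded, has no eviction policy, and is invisible to other processes (a sketch, with hypothetical names):

```perl
use strict;
use warnings;

my %cache;    # naive: unbounded, no eviction, per-process only

# Hypothetical expensive lookup -- stands in for a file read or query.
sub fetch_record {
    my ($key) = @_;
    return $cache{$key} //= do {
        # ...the expensive work would go here...
        uc $key;    # placeholder result
    };
}

print fetch_record('abc'), "\n";    # computed on first call: ABC
print fetch_record('abc'), "\n";    # second call served from %cache
```

A DB server's cache handles sizing, invalidation and sharing across connections for you, which is exactly the hard part this sketch skips.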
As for files/folders, the main tricks are caching hot data in memory and keeping your directory structure shallow so lookups stay cheap.
Using files instead of the db is very useful for images, if your site structure is suitable. Create folders that mirror your data and place the images inside. For example, on an article site you store the articles in the db, but you don't have to store image paths there: name the folders after your primary keys (1, 2, 3, ...) and put the images inside. E-books, music files, videos: this approach works for any media file. The same logic works for XML files, as long as you won't need to search inside them.
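A sketch of that layout, assuming articles keyed by an integer primary key (the directory names are illustrative):

```perl
use strict;
use warnings;
use File::Spec;

# Derive the image directory from the article's primary key --
# no path column is needed in the database at all.
sub image_dir_for {
    my ($article_id) = @_;
    return File::Spec->catdir('htdocs', 'images', 'articles', $article_id);
}

# On Unix this prints: htdocs/images/articles/42
print image_dir_for(42), "\n";
```

Because the path is derived rather than stored, renaming or moving the image tree is a one-line change instead of a database migration.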
I'm going to give you the same answer everyone else gave you: it depends.
In a simple scenario with a single server that only returns data (read-only), yes, a file system will be great and easy to manage.
But when you have more than one server, you'll have to manage a distributed file system like GlusterFS, Ceph, etc.
A database is a tool that manages all of that for you: distributed storage, compression, reads/writes, locks, etc.
Hope that's helpful.