Is it faster to access data from files or a databa

2019-01-21 12:53发布

If I have a static database consisting of folders and files, would access and manipulation be faster than SQL server type databases, considering this would be used in a CGI script?

When working with files and folders, what are the tricks to better performance?

10条回答
够拽才男人
2楼-- · 2019-01-21 13:27

From my little bit of experience, server-based databases (even those served on the local machine) tend to to have very slow throughput compared to local filesystems. However, this depends on some things, one of which being asymptotic complexity. Comparing scanning a big list of files against using a database with an index to look up an item, the database wins.

My little bit of experience is with PostgreSQL. I had a table with three million rows, and I went to update a mere 8,000 records. It took 8 seconds.

As for the quote "Premature optimization is the root of all evil.", I would take that with a grain of salt. If you write your application using a database, then find it to be slow, it might take a tremendous amount of time to switch to a filesystem-based approach or something else (e.g. SQLite). I would say your best bet is to create a very simple prototype of your workload, and test it with both approaches. I believe it is important to know which is faster in this case.

查看更多
放我归山
3楼-- · 2019-01-21 13:29

It depends on the profile of the data and what logic you are going to be using to access it. If you simply need to save and fetch named nodes then a filesystem-based database may be faster and more efficient. (You could also have a look at Berkeley DB for that purpose.) If you need to do index-based searches, and especially if you need to join different sets of data based on keys, then an SQL database is your best bet.

I would just go with whatever solution seems the most natural for your application.

查看更多
叼着烟拽天下
4楼-- · 2019-01-21 13:29

To quickly access files, depending on what you are doing, an mmap can be very handy. I just wrote about this in the Effective Perl blog as Memory-map files instead of slurping them.

However, I expect that a database server would be much faster. It's difficult to say what would be faster for you when we have no idea what you are doing, what sort of data you need to access, and so on.

查看更多
啃猪蹄的小仙女
5楼-- · 2019-01-21 13:30

I'll add to the it depends crowd.

This is the kind of question that has no generic answer but is heavily dependent on the situation at hand. I even recently moved some data from a SQL database to a flat file system because the overhead of the DB, combined with some DB connection reliability issues, made using flat files a better choice.

Some questions I would ask myself when making the choice include:

  1. How am I consuming the data? For example will I just be reading from the beginning to the end rows in the order entered? Or will I be searching for rows that match multiple criteria?

  2. How often will I be accessing the data during one program execution? Will I go once to get all books with Salinger as the author or will I go several times to get several different authors? Will I go more than once for several different criteria?

  3. How will I be adding data? Can I just append a row to the end and that's perfect for my retrieval or will it need to be resorted?

  4. How logical will the code look in six months? I emphasize this because I think this is too often forgotten in designing things (not just code, this hobby horse is actually from my days as a Navy mechanic cursing mechanical engineers). In six months when I have to maintain your code (or you do after working another project) which way of storing and retrieving data will make more sense. If going from flat files to a DB results in a 1% efficiency improvement but adds a week of figuring things out when you have to update the code have you really improved things.

查看更多
登录 后发表回答