DB with best inserts/sec performance?

Posted 2019-01-21 21:15

Question:

We deploy an (AJAX-based) instant messenger that is serviced by a Comet server. We are required to store the sent messages in a DB for long-term archival purposes in order to meet legal retention requirements.

Which DB engine provides the best performance for this write-once, read-never (with rare exceptions) requirement?

We need at least 5,000 inserts/sec. I am assuming neither MySQL nor PostgreSQL can meet this requirement.

Any proposals for a higher performance solution? HamsterDB, SQLite, MongoDB ...?

Answer 1:

If you are never going to query the data, then I wouldn't store it in a database at all; you will never beat the performance of simply writing the messages to a flat file.
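
As a minimal sketch of the flat-file approach (the file name, message fields and JSON-lines layout here are just assumptions, not anything from the question):

import json
import time

def archive_message(log, sender, recipient, text):
    # One JSON object per line; an append-only file needs no coordination
    # beyond having a single writer per file.
    record = {"ts": time.time(), "from": sender, "to": recipient, "text": text}
    log.write(json.dumps(record) + "\n")

with open("messages-archive.log", "a", encoding="utf-8") as log:
    archive_message(log, "alice", "bob", "hello")
    # Only flush/fsync per message if you need per-message durability;
    # leaving writes buffered is what makes this faster than a database.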

What you might want to consider are the scaling issues: what happens when it becomes too slow to write the data to a flat file? Will you invest in faster disks, or something else?

Another thing to consider is how to scale the service so that you can add more servers without having to coordinate the logs of each server and consolidate them manually.

Edit: You wrote that you want to have it in a database, so I would also consider the security issues of keeping the data online. What happens when your service gets compromised? Do you want your attackers to be able to alter the history of what has been said?

It might be smarter to store it temporarily in a file and then dump it to an off-site location that isn't accessible if your Internet-facing front ends get hacked.



Answer 2:

Please ignore the earlier benchmark; we had a bug in it.

We inserted 1M records with the following columns: id (int), status (int), message (140 chars, random). All tests were done with the C++ driver on a desktop PC (i5, 500 GB SATA disk).

Benchmark with MongoDB:

1M Records Insert without Index

time: 23s, insert/s: 43478

1M Records Insert with Index on Id

time: 50s, insert/s: 20000

Next we added another 1M records to the same table, which already had the index and 1M records:

time: 78s, insert/s: 12820

All of that resulted in close to 4 GB of files on the filesystem.
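
The original test used the C++ driver; a rough Python/pymongo equivalent of the same workload (collection name, connection string and batch size are assumptions) would look roughly like this:

import random
import string
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
coll = client.benchmark.messages

def random_message(n=140):
    return "".join(random.choice(string.ascii_letters) for _ in range(n))

# Insert 1M rows of (id, status, message), batched to amortize round trips.
batch, batch_size = [], 10000
for i in range(1_000_000):
    batch.append({"id": i, "status": i % 4, "message": random_message()})
    if len(batch) == batch_size:
        coll.insert_many(batch, ordered=False)
        batch = []

# The "with index" variant simply creates a secondary index on id first:
# coll.create_index("id")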

Benchmark with MySQL:

1M Records Insert without Index

time: 49s, insert/s: 20408

1M Records Insert with Index

time: 56s, insert/s: 17857

Next we added another 1M records to the same table, which already had the index and 1M records:

time: 56s, insert/s: 17857

Exactly the same performance; no loss on MySQL as the table grows.

We saw that Mongo consumed around 384 MB of RAM during this test and loaded 3 CPU cores; MySQL was happy with 14 MB and loaded only 1 core.
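
A comparable sketch for the MySQL side, again only illustrative (table name, credentials and batch size are assumptions rather than the original setup):

import random
import string
import mysql.connector

cnx = mysql.connector.connect(user="bench", password="bench", database="bench")
cur = cnx.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS messages (
        id INT, status INT, message CHAR(140)
    ) ENGINE=InnoDB
""")

def random_message(n=140):
    return "".join(random.choice(string.ascii_letters) for _ in range(n))

sql = "INSERT INTO messages (id, status, message) VALUES (%s, %s, %s)"
batch, batch_size = [], 10000
for i in range(1_000_000):
    batch.append((i, i % 4, random_message()))
    if len(batch) == batch_size:
        cur.executemany(sql, batch)  # batched by the connector into multi-row INSERTs
        cnx.commit()                 # one commit per batch, not per row
        batch = []
cnx.close()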

Edorian was on the right track with his proposal. I will do some more benchmarking, and I'm sure we can reach 50K inserts/sec on a 2x quad-core server.

I think MySQL will be the right way to go.



Answer 3:

If you don't need to do queries, then a database is not what you need. Use a log file.



Answer 4:

"it's only stored for legal reasons."

And what about the detailed requirements? You mention the NoSQL solutions, but these can't promise the data is really stored on disk. In PostgreSQL everything is transaction-safe, so you're 100% sure the data is on disk and available (just don't turn off fsync).

Speed has a lot to do with your hardware, your configuration and your application. PostgreSQL can insert thousands of records per second on good hardware with a correct configuration; it can be painfully slow on the same hardware with a plain stupid configuration and/or the wrong approach in your application. A single INSERT is slow, many INSERTs in a single transaction are much faster, prepared statements are even faster, and COPY does magic when you need speed. It's up to you.
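
As a hedged illustration of that difference (the table, columns and connection string below are made up), compare per-row commits, one big transaction, and COPY with psycopg2:

import io
import psycopg2

conn = psycopg2.connect("dbname=archive")
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS messages (id int, body text)")
conn.commit()

rows = [(i, "message %d" % i) for i in range(100_000)]

# Slowest: one INSERT and one commit per row.
# for i, body in rows:
#     cur.execute("INSERT INTO messages VALUES (%s, %s)", (i, body))
#     conn.commit()

# Faster: many INSERTs inside a single transaction.
for i, body in rows:
    cur.execute("INSERT INTO messages VALUES (%s, %s)", (i, body))
conn.commit()

# Fastest: COPY from an in-memory buffer.
buf = io.StringIO("".join("%d\t%s\n" % r for r in rows))
cur.copy_from(buf, "messages", columns=("id", "body"))
conn.commit()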



Answer 5:

Firebird can easily handle 5,000 inserts/sec if the table doesn't have indices.



Answer 6:

I don't know why you would rule out MySQL. It can handle a high number of inserts per second. If you really want high insert rates, use the BLACKHOLE table type with replication. It's essentially writing to a log file that eventually gets replicated to a regular database table. You could even query the slave without affecting insert speeds.
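
A rough sketch of that setup (the names are invented, and the replication wiring itself, i.e. binary logging on the master and a configured replica, is assumed to already exist):

import mysql.connector

master = mysql.connector.connect(user="app", password="secret", database="chat")
cur = master.cursor()

# On the master the archive table uses the BLACKHOLE engine: inserts are
# discarded locally but still written to the binary log, so a replica that
# defines the same table on a real engine (e.g. InnoDB) ends up with the rows.
cur.execute("""
    CREATE TABLE IF NOT EXISTS message_archive (
        id BIGINT, sent_at DATETIME, body VARCHAR(255)
    ) ENGINE=BLACKHOLE
""")

cur.execute(
    "INSERT INTO message_archive (id, sent_at, body) VALUES (%s, NOW(), %s)",
    (1, "hello"),
)
master.commit()
# Archive queries are then run against the replica only, so they never
# compete with the insert path on the master.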



Answer 7:

Depending on your system setup, MySQL can easily handle over 50,000 inserts per second.

In tests on a current system I am working on, we got to over 200k inserts per second with 100 concurrent connections on 10 tables (with just a few values per row).

I'm not saying this is the best choice, since other systems like CouchDB could make replication/backups/scaling easier, but dismissing MySQL solely on the assumption that it can't handle such minor amounts of data is a little too harsh.

I guess there are better (read: cheaper, easier to administer) solutions out there.
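
Purely as an illustration of that kind of concurrent load (connection details, table layout and thread count are assumptions, not the original test setup):

import threading
import mysql.connector

# Assumes a table already exists: CREATE TABLE events (thread_id INT, seq INT)

def worker(thread_id, rows_per_thread=100_000, batch_size=1000):
    cnx = mysql.connector.connect(user="bench", password="bench", database="bench")
    cur = cnx.cursor()
    sql = "INSERT INTO events (thread_id, seq) VALUES (%s, %s)"
    batch = []
    for seq in range(rows_per_thread):
        batch.append((thread_id, seq))
        if len(batch) == batch_size:
            cur.executemany(sql, batch)
            cnx.commit()
            batch = []
    cnx.close()

# Each thread gets its own connection; the server handles them concurrently.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()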



Answer 8:

If money plays no role, you can use Oracle TimesTen: http://www.oracle.com/timesten/index.html

A complete in-memory database, with amazing speed.



Answer 9:

I would use a log file for this, but if you must use a database, I highly recommend Firebird. I just tested the speed; it inserts about 10k records per second on quite average hardware (a 3-year-old desktop computer). The table has one compound index, so I guess it would work even faster without it:

milanb@kiklop:~$ fbexport -i -d test -f test.fbx -v table1 -p **
Connecting to: 'LOCALHOST'...Connected.
Creating and starting transaction...Done.
Create statement...Done.
Doing verbatim import of table: TABLE1
Importing data...
SQL: INSERT INTO TABLE1 (AKCIJA,DATUM,KORISNIK,PK,TABELA)  VALUES (?,?,?,?,?)
Prepare statement...Done.
Checkpoint at: 1000 lines.
Checkpoint at: 2000 lines.
Checkpoint at: 3000 lines.
...etc.
Checkpoint at: 20000 lines.
Checkpoint at: 21000 lines.
Checkpoint at: 22000 lines.

Start   : Thu Aug 19 10:43:12 2010
End     : Thu Aug 19 10:43:14 2010
Elapsed : 2 seconds.
22264 rows imported from test.fbx.

Firebird is open source, and completely free even for commercial projects.



Answer 10:

I believe the answer will also depend on the hard disk type (SSD or not) and the size of the data you insert. I was inserting single-field data into MongoDB on a dual-core Ubuntu machine and was hitting over 100 records per second. Once I introduced some quite large data into a field, it dropped down to about 9 records per second, with the CPU running at about 175%! The box doesn't have an SSD, so I wonder whether I'd have gotten better results with one.

I also ran MySQL, and it was taking 50 seconds just to insert 50 records into a table with 20m records (and about 4 decent indexes too), so with MySQL it will also depend on how many indexes you have in place.
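
One common way to blunt that index cost during large bulk loads, sketched here with invented table and index names (whether it pays off depends on the engine and data volume), is to drop secondary indexes, load, and recreate them afterwards:

import mysql.connector

cnx = mysql.connector.connect(user="bench", password="bench", database="bench")
cur = cnx.cursor()

# Assumed: a large `messages` table with a secondary index `idx_sent_at`.
cur.execute("DROP INDEX idx_sent_at ON messages")

# ... perform the bulk insert here (batched, one commit per batch) ...

# Rebuilding the index once at the end is usually cheaper than maintaining it
# row by row during the load, at the cost of the index being unavailable
# in the meantime.
cur.execute("CREATE INDEX idx_sent_at ON messages (sent_at)")
cnx.commit()
cnx.close()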