I have read that:
The blockchain database isn’t stored in any single location, meaning the records it keeps are truly public and easily verifiable. No centralized version of this information exists for a hacker to corrupt. Hosted by millions of computers simultaneously, its data is accessible to anyone on the internet.
So my question is: can we store a blockchain in, for example, SQL? Or can it only be stored in a database that works on its own?
Currently, decentralized blockchain applications have only a few options for storing data. The decentralized storage options are:
- Storing everything in the blockchain itself
- Peer-to-peer file systems, such as IPFS
- Decentralized cloud file storage, such as Storj, Sia, Ethereum Swarm, etc.
- Distributed databases, such as Apache Cassandra, RethinkDB, etc.
- BigChainDB
- Ties DB
Let’s consider them all in detail:
- Storing everything in the blockchain itself: This is the simplest solution, and most simple decentralized applications currently work exactly this way. However, the approach has significant drawbacks. First, blockchain transactions are slow to confirm. That may be fast enough for a money transfer (anyone can wait a minute), but it is extremely slow for the data flow of a rich application, which may require many thousands of transactions per second. Second, the blockchain is immutable. Immutability is a strength that makes the blockchain highly robust, but it is a weakness for data storage: a user may change their profile or replace their photo, yet all the previous data will sit in the blockchain forever and can be seen by anyone. Immutability leads to one more drawback: capacity. If every application kept its data in the blockchain, the chain would grow rapidly and could exceed the capacity of commonly available hard drives. Full nodes might then require special hardware, which could lead to dangerous centralization of the blockchain. That is why storing data only in the blockchain is not a good option for a rich decentralized application.
- Peer-to-peer file systems, such as the InterPlanetary File System (IPFS): IPFS lets users share files from their own computers and unites those files in a global file system. The technology draws on ideas from the BitTorrent protocol and uses a distributed hash table. It has several advantages. It is truly peer to peer: to share anything, you first put it on your own computer, and it is downloaded only if someone needs it. It is content addressable, so content at a given address cannot be forged. Popular files can be downloaded very quickly thanks to BitTorrent-style swarming. However, it also has drawbacks. You have to stay online if you want to share your files, at least until someone becomes interested and downloads them from you. It serves only static files: once uploaded, they cannot be modified or removed. And of course you cannot search these files by their meaningful content.
- Decentralized cloud file storage: There are also decentralized cloud file storage services that lift some of the limitations of IPFS. From the user's point of view, they are just cloud storage, like Dropbox. The difference is that the content is hosted on the computers of users who rent out their hard drive space, rather than in data centers. There are plenty of such projects nowadays, for example Sia, Storj, and Ethereum Swarm. You no longer need to stay online to share your files: just upload a file and it is available in the cloud. These services are highly reliable, fast enough, and have enormous capacity. Still, they serve static files only, offer no content search, and, since they are built on rented hardware, are not free.
- Distributed databases: Since we need to store structured data and want advanced query capabilities, we may look at distributed NoSQL databases. Why NoSQL? Because of the CAP theorem: when the network partitions, a distributed database must sacrifice either consistency or availability, so a strictly transactional SQL database cannot stay both consistent and available once it is truly distributed. Many NoSQL databases choose availability and replace strict consistency with so-called “eventual consistency”, where all the database nodes in the network become consistent some time later. There are many mature distributed NoSQL databases, for example MongoDB, Apache Cassandra, RethinkDB, and so on. They are very good: fast, scalable, fault tolerant, and they support rich query languages. But they have a fatal drawback for our application: they are not Byzantine fault tolerant. All the nodes of the cluster fully trust each other, so any malicious node can destroy the whole database.
- BigChainDB: Another project, BigChainDB, claims to solve the data storage and transaction speed problems. It is also a blockchain, but with enormous data capacity and really fast transactions. Let us see how that is possible. BigChainDB is built on a RethinkDB cluster, the NoSQL database mentioned above, and uses it to store all the blocks and transactions. That is why it shows such high throughput: it is the throughput of the underlying NoSQL database. All the BigChainDB nodes are connected to the cluster and have full write access to the database. Here comes the problem: BigChainDB as a whole is not Byzantine fault tolerant! Any malicious BigChainDB node can destroy the RethinkDB cluster. The BigChainDB team is aware of this problem and promises to solve it sometime in the future; however, it is the cornerstone of the architecture, and changing it may not be possible. Anyway, BigChainDB may be good for a private blockchain. But in my opinion, to avoid confusion it should have been named BigPrivateBlockchain. It is not an option for public storage.
- Ties DB: None of the currently available options makes a good public database. The closest to the ideal are the NoSQL databases; the only thing they lack is Byzantine fault tolerance. The Ties.Network database (ties.network) is a deep modification of the Cassandra database and offers a preferable solution: TiesDB inherits the majority of its features from the underlying NoSQL database and adds Byzantine fault tolerance and incentives. With these features it can become a public database and enable feature-rich applications on Ethereum and other blockchains with smart contracts. The database is writable by any user, but users are identified by their public keys and all requests are signed. Once created, a record remembers its creator, who becomes the owner of the record. After that, the record can be modified only by its owner. Everyone can read all records, because the database is public. All permissions are checked on request and on replication. Additional permissions can be managed via a smart contract.
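
To make the "content addressable" point from the IPFS item more concrete, here is a minimal sketch of the idea. It is not how IPFS is actually implemented (IPFS uses its own content identifier format rather than a bare hex digest); the `ContentStore` class and its method names are made up for illustration. The address of a blob is simply the SHA-256 hash of its bytes, so a reader can always check that what it received matches what it asked for.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self._blobs = {}  # address (hex digest) -> raw bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content, not chosen by the uploader.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on read: a mismatch means the stored bytes were altered.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
addr_v1 = store.put(b"profile photo v1")
addr_v2 = store.put(b"profile photo v2")  # an "edit" lands at a new address
```

This also shows why such a store serves static files only: changing the content changes the address, so an "updated" file is really a new file and the old address keeps pointing at the old bytes.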
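
The ownership rule described for TiesDB (anyone may create a record, only its creator may modify it, everyone may read) can be sketched roughly as follows. This is not TiesDB's API; `OwnedRecordStore`, `PermissionDenied`, and the method names are hypothetical, and the sketch assumes that the signature on each request has already been verified elsewhere, so `signer_public_key` is trusted to identify the requester.

```python
class PermissionDenied(Exception):
    """Raised when a non-owner tries to modify a record."""


class OwnedRecordStore:
    """Toy model of 'creator becomes owner' permissions (hypothetical)."""

    def __init__(self):
        self._records = {}  # key -> (owner_public_key, value)

    def write(self, key: str, value: str, signer_public_key: str) -> None:
        # Assumption: the request signature was checked before this call.
        if key in self._records:
            owner, _ = self._records[key]
            if owner != signer_public_key:
                raise PermissionDenied(f"{key!r} belongs to another key")
            self._records[key] = (owner, value)
        else:
            # First write: the signer becomes the owner of the record.
            self._records[key] = (signer_public_key, value)

    def read(self, key: str) -> str:
        # Reads are open to everyone because the database is public.
        return self._records[key][1]


db = OwnedRecordStore()
db.write("profile:alice", "hello", signer_public_key="alice-pubkey")
db.write("profile:alice", "hello again", signer_public_key="alice-pubkey")
```

In a real Byzantine fault tolerant setting, every node would repeat this check when a record is replicated to it, rather than trusting the node that accepted the write.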
source: here