The BASE acronym is used to describe the properties of certain databases, usually NoSQL databases. It's often referred to as the opposite of ACID.
There are only few articles that touch upon the details of BASE, whereas ACID has plenty of articles that elaborate on each of the atomicity, consistency, isolation and durability properties. Wikipedia only devotes a few lines to the term.
This leaves me with some questions about the definition:
Basically Available, Soft state, Eventual consistency
I have interpreted these properties as follows, using this article and my imagination:
Basically available could refer to the perceived availability of the data. If a single node fails, part of the data won't be available, but the entire data layer stays operational.
- Is this interpretation correct, or does it refer to something else?
- Update: deducing from Mau's answer, could it mean the entire data layer is always accepting new data, i.e. there are no locking scenarios that prevent data from being inserted immediately?
Soft state: All I could find was the concept of data needing a period refresh. Without a refresh, the data will expire or be deleted.
- Automatic deletion of data in a database seems strange to me.
- Expired or stale data makes more sense. But this concept would apply to any type of redundant data storage, not just NoSQL. Does it describe something else then?
Eventual consistency means that updates will eventually ripple through to all servers, given enough time.
- This property is clear to me.
Can someone explain these properties in detail?
Or is it just a far-fetched and meaningless acronym that refers to the concepts of acids and bases as found in chemistry?
The BASE acronym was defined by Eric Brewer, who is also known for formulating the CAP theorem.
The CAP theorem states that a distributed computer system cannot guarantee all of the following three properties at the same time:
- Consistency
- Availability
- Partition tolerance
A BASE system gives up on consistency.
- Basically available indicates that the system does guarantee availability, in terms of the CAP theorem.
- Soft state indicates that the state of the system may change over time, even without input. This is because of the eventual consistency model.
- Eventual consistency indicates that the system will become consistent over time, given that the system doesn't receive input during that time.
Brewer does admit that the acronym is contrived:
I came up with [the BASE] acronym with my students in their office earlier that year. I agree it is contrived a bit, but so is "ACID" -- much more than people realize, so we figured it was good enough.
It has to do with BASE: the BASE jumper kind is always Basically Available (to new relationships), in a Soft state (none of his relationship last very long) and Eventually consistent (one day he will get married).
It could just be because ACID is one set of properties that substances show( in Chemistry) and BASE is a complement set of them.So it could be just to show the contrast between the two that the acronym was made up and then 'Basically Available Soft State Eventual Consistency' was decided as it's full-form.
To add to the other answers, I think the acronyms were derived to show a scale between the two terms to distinguish how reliable transactions or requests where between RDMS versus Big Data.
From this article acid vs base
In Chemistry, pH measures the relative basicity and acidity of an
aqueous (solvent in water) solution. The pH scale extends from 0
(highly acidic substances such as battery acid) to 14 (highly alkaline
substances like lie); pure water at 77° F (25° C) has a pH of 7 and is
neutral.
Data engineers have cleverly borrowed acid vs base from chemists and
created acronyms that while not exact in their meanings, are still apt
representations of what is happening within a given database system
when discussing the reliability of transaction processing.
One other point, since I work with Big Data using Elasticsearch. To clarify, an instance of Elasticsearch is a node and a group of nodes form a cluster.
To me from a practical standpoint, BA (Basically Available), in this context, has the idea of multiple master nodes to handle the Elasticsearch cluster and it's operations.
If you have 3 master nodes and the currently directing master node goes down, the system stays up, albeit in a less efficient state, and another master node takes its place as the main directing master node. If two master nodes go down, the system still stays up and the last master node takes over.