Can ArangoDB scale like MongoDB or CouchDB?

I have been reading about ArangoDB and find it very interesting, but I cannot find anything in the documentation about how ArangoDB scales. Does ArangoDB scale, and can it use sharding like MongoDB or CouchDB?



2 answers


As I understand it, ArangoDB does not support sharding (up to version 2.0), only replication. From the linked page:



AvocadoDB makes replication easy. We like the zero-admin principle. Replication with AvocadoDB is very simple: enter the IP address and click!

The following types of replication are planned for version 2:

  • master-master synchronous,
  • master-master asynchronous,
  • master-slave synchronous,
  • master-slave asynchronous


EDIT

ArangoDB has supported sharding since version 2.0.

Version 3.0 will bring VelocyPack, which is a binary JSON representation optimized for compactness and for being fast to read and build. It replaces the shaped JSON / shapes concept.

/ EDIT


I am the chief architect of ArangoDB.

monkegjinni is right: ArangoDB does not support sharding, only replication. Why?

Short version:

Supporting fairly complex data models such as graphs and documents conflicts with how sharding works. However, given the efficiency of modern SSDs and computers, we believe that almost all projects no longer need sharding. Today, a single computer can easily store all the data in one place. What these projects do need is replication for load balancing, which ArangoDB supports.

Long version:

There are actually two scaling problems.

The first problem is distributing queries across multiple servers in order to balance the query load.



ArangoDB will support this through synchronous replication of writes and distribution of read queries.

Note that most database systems follow a very similar path: they either support query distribution with limited consistency guarantees, or they allow writes on only one node and distribute the read queries. They have this limitation because it is not possible to distribute write queries and maintain full consistency in any efficient way. Doing so would be so inefficient that it would negate the gain we wanted to achieve through distribution.
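To make the idea concrete, here is a minimal toy sketch of synchronous write replication combined with read distribution. It is not ArangoDB code: the `ReplicatedStore` class, its methods, and the round-robin read policy are all assumptions made for illustration, and each "server" is just an in-memory dict.

```python
from itertools import cycle


class ReplicatedStore:
    """Toy model: every write goes to all replicas, reads rotate among them."""

    def __init__(self, replica_count):
        # Each replica is a plain dict standing in for a full database server.
        self.replicas = [{} for _ in range(replica_count)]
        self._next_reader = cycle(range(replica_count))

    def write(self, key, value):
        # "Synchronous" here means: return only after every replica has the value.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Round-robin over the replicas to spread the read load.
        index = next(self._next_reader)
        return index, self.replicas[index].get(key)


store = ReplicatedStore(replica_count=3)
store.write("user/1", {"name": "Ada"})

# Six reads are served by the three replicas in turn.
served_by = [store.read("user/1")[0] for _ in range(6)]
print(served_by)
```

Because every replica holds the full dataset, any of them can answer any read, which is what makes the load balancing trivial. The cost is that each write has to reach every replica before it completes.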

The second problem is the distribution of data across multiple servers, allowing for larger datasets.

ArangoDB does not support distributing data across multiple servers.

We made this decision because distributing data across multiple servers always comes at a price.

This price can be very explicit. For example, the data model may be very restricted. This is the route taken by key-value stores such as Dynamo or Riak. Here, the data model and the supported queries are so simple that it is always possible to route a query to the server (or the small number of servers) that holds the requested value.
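As a rough illustration of why key-only access is easy to route, here is a small sketch that maps a key to one server by hashing it. The server names and the `server_for` helper are hypothetical, and plain modulo hashing is used for brevity; Dynamo and Riak use consistent hashing, which this sketch does not implement.

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical node names


def server_for(key, servers=SERVERS):
    """Pick the one server responsible for a key by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an integer and
    # reduce it to an index into the server list.
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


# The same key always maps to the same server, so a lookup by key
# needs to contact exactly one node.
target = server_for("user/42")
print(target)
```

The routing decision needs nothing but the key itself, which is why this only works when every query names the key it wants.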

Please note that we believe this approach is appropriate for some applications (for example, Amazon's databases). But we think that the number of applications that really need to store so much data that they must distribute it over a large number of servers, and must therefore restrict the access pattern to lookups by key, is very small.

Or the price may be hidden. This is the case, for example, if the data is distributed and the database system still allows general queries. In this case, a query has to be sent to all servers (because the data you are looking for can live on any of them). This makes the queries inefficient.
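The hidden cost can be shown with a toy sharded store: a lookup by key touches one shard, while a query on an arbitrary attribute has to visit every shard. The `ShardedStore` class and its methods are assumptions invented for this sketch, not any real database's API.

```python
import zlib


class ShardedStore:
    """Toy sharded store that reports how many shards each operation contacts."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_index(self, key):
        # Deterministic hash of the key decides which shard owns it.
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def put(self, key, doc):
        self.shards[self._shard_index(key)][key] = doc

    def get(self, key):
        # Lookup by key: exactly one shard is contacted.
        return self.shards[self._shard_index(key)].get(key), 1

    def find(self, predicate):
        # Query on arbitrary attributes: every shard must be scanned,
        # because a matching document could live anywhere.
        hits = []
        for shard in self.shards:
            hits.extend(doc for doc in shard.values() if predicate(doc))
        return hits, len(self.shards)


store = ShardedStore(shard_count=4)
for i in range(10):
    city = "Cologne" if i % 2 == 0 else "Berlin"
    store.put(f"user/{i}", {"id": i, "city": city})

doc, contacted_by_key = store.get("user/3")
hits, contacted_by_query = store.find(lambda d: d["city"] == "Cologne")
print(contacted_by_key, contacted_by_query, len(hits))
```

Adding more shards does not make the attribute query cheaper: the number of servers contacted grows with the cluster.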

ArangoDB's approach is rather to squeeze the most out of a single server (ArangoDB does support multiple servers, but for the sake of availability). To do this, it uses two main strategies.

One strategy is to use SSDs. Note that SSD storage capacity is growing at an incredible rate (you can buy a terabyte of SSD for a lot less than a second server would cost). Endurance (the total amount of data that can be written to an SSD) is approaching petabytes now that manufacturers are finally getting wear-leveling algorithms right, so SSD reliability is no longer an issue. And the performance of these SSDs is very good (closer to main memory than to conventional hard drives).

Another strategy is to store data efficiently. ArangoDB uses shapes to store documents: a shape is the information about which attributes and attribute types a document has, and all documents with the same shape share one representation of this information. This means documents can be stored in less space than their JSON or BSON representation would require.
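Here is a minimal sketch of that idea: the shared description of attributes and their types (called a shape in ArangoDB) is stored once, and each document keeps only a reference to it plus its values. This is not ArangoDB's actual storage format; the `ShapeStore` class is hypothetical and handles flat documents only.

```python
class ShapeStore:
    """Toy store: attribute names and types are kept once per shape."""

    def __init__(self):
        self.shape_ids = {}  # shape -> shape id
        self.shapes = []     # shape id -> shape
        self.rows = []       # one (shape id, values) pair per document

    def insert(self, doc):
        names = sorted(doc)
        # A shape is the ordered list of (attribute name, type name) pairs.
        shape = tuple((name, type(doc[name]).__name__) for name in names)
        shape_id = self.shape_ids.get(shape)
        if shape_id is None:
            # First document with this shape: register the shape once.
            shape_id = len(self.shapes)
            self.shape_ids[shape] = shape_id
            self.shapes.append(shape)
        # The row holds only the values, not the attribute names.
        self.rows.append((shape_id, tuple(doc[name] for name in names)))
        return len(self.rows) - 1

    def load(self, row_index):
        # Rebuild the document by pairing the shape's names with the values.
        shape_id, values = self.rows[row_index]
        shape = self.shapes[shape_id]
        return {name: value for (name, _type), value in zip(shape, values)}


store = ShapeStore()
first = store.insert({"name": "Ada", "age": 36})
second = store.insert({"age": 41, "name": "Bob"})  # same shape as the first
third = store.insert({"name": "Eve"})              # a new shape

print(len(store.shapes), store.load(second))
```

With many documents per shape, the attribute names are paid for once instead of once per document, which is where the space saving over plain JSON comes from.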
