ArangoDB

Summary

ArangoDB is a graph database system developed by ArangoDB Inc. It is a multi-model database system: it supports three data models (graphs, JSON documents, key/value)[1] with one database core and a unified query language, AQL (ArangoDB Query Language). AQL is mainly a declarative language[2] and allows different data access patterns to be combined in a single query.[3]

ArangoDB
Developer(s): ArangoDB GmbH
Initial release: 2011
Stable release: 3.11.5 / November 9, 2023
Repository: github.com/arangodb/arangodb
Written in: C++, JavaScript
Type: Multi-model database, Graph database, Document-oriented database, Key/Value database, Full-text Search Engine
License: Business Source License 1.1 and Arango Community License
Website: arangodb.com

ArangoDB is a NoSQL database system,[4] but AQL is similar in many ways to SQL.[5] It uses RocksDB as its storage engine.

History

ArangoDB Inc. was founded in 2015 by Claudius Weinberger and Frank Celler.[6] They originally called the database system "A Versatile Object Container", or AVOC for short, which led them to name the database AvocadoDB.[7][8][9] Later, they changed the name to ArangoDB.[10] The word "arango" refers to a little-known avocado variety grown in Cuba.[11]

In January 2017, ArangoDB raised a seed round investment of 4.2 million euros led by Target Partners. In March 2019, it raised 10 million dollars in Series A funding[12] led by Bow Capital. In October 2021, it raised 27.8 million dollars in Series B funding led by Iris Capital.[13]

Release history

3.11 (first release 2023-05-30; latest minor version 3.11.5, released 2023-11-09)
  • Faster query performance across search and graph.
  • Data science and analytics operational enhancements.
  • Improved user experience for database administration and management.
  Reference: Release Notes

3.10 (first release 2022-10-04; latest minor version 3.10.11, released 2023-10-19)
  • Native ARM support, including native support for Apple Silicon.
  • Support for computed values (persistent document attributes that are generated when a document is created or updated).
  • Parallelism for sharded graphs.
  • A graph traversal algorithm to query for all paths of the shortest length between two documents.
  Reference: Release Notes

3.9 (first release 2022-02-15; latest minor version 3.9.12, released 2023-08-23)
  • Collections replicated on all cluster nodes can be combined with graphs sharded by document attributes to enable more local execution of graph queries ("Hybrid SmartGraphs", "Hybrid Disjoint SmartGraphs").
  • Language-agnostic tokenization of text ("Segmentation Analyzer").
  Reference: Release Notes

3.8 (first release 2021-07-29; latest minor version 3.8.9, released 2023-03-27)
  • Graph traversal algorithms to enumerate all paths between two vertices ("k Paths") and to emit paths in order of increasing edge weights ("Weighted Traversals").
  • Support for sliding window queries to aggregate adjacent documents, value ranges and time intervals.
  • Geo-spatial queries can be combined with full-text search.
  • Flexible data field pre-processing with custom queries ("AQL Analyzer") and the ability to chain built-in and custom analyzers ("Pipeline Analyzer").
  • Hardware-accelerated on-disk encryption.
  Reference: Release Notes

3.7 (first release 2020-09-16; latest minor version 3.7.17, released 2022-02-01)
  • Graphs replicated on all cluster nodes to execute graph traversals locally ("SatelliteGraphs").
  • Document validation using JSON Schema.
  • Wildcard and fuzzy search support for full-text search.
  • Key rotation for superuser JWT tokens, TLS certificates, and on-disk encryption keys.
  Reference: Release Notes

3.6 (first release 2020-01-08; latest minor version 3.6.16, released 2021-09-06)
  • Option to store all collections of a database on a single cluster node, to combine the performance of a single server and ACID semantics with a fault-tolerant cluster setup ("OneShard").
  • Parallel execution of queries on several cluster nodes.
  • Late document materialization to only fetch the relevant documents from SORT/LIMIT queries and early pruning of non-matching documents in full collection scans.
  • Inlining of certain subqueries to improve execution time.
  Reference: Release Notes

3.5 (first release 2019-08-21; latest minor version 3.5.7, released 2020-12-30)
  • Multi-document transactions with individual begin and commit / abort commands ("Stream Transactions").
  • Time-based removal of expired documents ("Time-to-live Index").
  • Stop condition support for graph traversals ("Pruning in Traversals").
  • Graph traversal algorithm to get multiple shortest paths ("k Shortest Paths").
  • Co-located joins in a cluster using identically sharded collections ("SmartJoins").
  • Consistent snapshot backup in cluster mode.
  • Custom text pre-processors for full-text search ("Configurable Analyzers").
  • Data masking capabilities for attributes containing sensitive data / PII when creating backups.
  Reference: Release Notes

3.4 (first release 2018-12-06; latest minor version 3.4.11, released 2020-09-09)
  • Integrated full-text search and information retrieval engine ("ArangoSearch").
  • Improved geo-spatial index with GeoJSON support.
  • Insert operations can be turned into a replace automatically if the target document already exists ("Repsert").
  • Round-robin load-balancer support for cloud environments.
  • Query profiling to show detailed runtime information.
  • Cluster-distributed aggregation queries.
  • Native implementations in C++ of all built-in query functions.
  • Multi-threaded dump and restore operations.
  Reference: Release Notes

3.3 (first release 2017-12-22; latest minor version 3.3.25, released 2020-02-28)
  • Datacenter to Datacenter Replication for disaster recovery ("DC2DC").
  • Encrypted backups.
  • Deployment mode for single servers with automatic failover.
  Reference: Release Notes

3.2 (first release 2017-07-20; latest minor version 3.2.18, released 2019-02-02)
  • Distributed iterative graph processing with Pregel in single server and cluster.
  • Collections replicated on all cluster nodes to execute joins with sharded data locally ("SatelliteCollections").
  • Fault-tolerant microservices.
  • Support for composable, distance-based geo-queries.
  • Export utility for multiple formats.
  • Encryption of on-disk data.
  • LDAP authentication.
  Reference: Release Notes

3.1 (first release 2016-11-03; latest minor version 3.1.29, released 2018-06-23)
  • Value-based sharding of large graph datasets for better data locality when traversing graphs ("SmartGraphs").
  • Support for vertex-centric indexes for more efficient graph traversals with filter conditions.
  • New viewer for large graphs, supporting WebGL.
  • Binary wire format ("VelocyStream").
  • Low-latency request handling using a boost-ASIO server infrastructure.
  • Improved query editor and query explain output.
  • Audit logging.
  Reference: Release Notes

3.0 (first release 2016-07-23; latest minor version 3.0.12, released 2016-11-23)
  • Cluster support with synchronous replication and automatic failover.
  • Binary storage format ("VelocyPack").
  • Persistent indexes that are stored on disk for faster restarts.
  Reference: Release Notes

Features

  • JSON: ArangoDB uses JSON as its default storage format,[14] but internally it uses VelocyPack, a fast and compact binary format for serialization and storage.[15] ArangoDB can natively store a nested JSON object as a data entry inside a collection, so there is no need to disassemble JSON objects: the stored data simply inherits the tree structure of the JSON data.
  • Predictable performance: ArangoDB is written mainly in C++[16] and manages its own memory to avoid unpredictable performance arising from garbage collection.
  • Scaling: ArangoDB provides scaling through clustering.[17]
  • Reliability: ArangoDB provides datacenter-to-datacenter replication.[18]
  • Kubernetes: ArangoDB runs on Kubernetes, including cloud-based Kubernetes services Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), and Microsoft Azure Kubernetes Service (AKS).[19]
  • Microservices: ArangoDB provides integration with native JavaScript microservices directly on top of the DBMS using the Foxx framework.[20]
  • Multiple query languages: The database has its own query language, AQL (ArangoDB Query Language), and also provides GraphQL to write flexible native web services directly on top of the DBMS.[21]
  • Search: ArangoDB's search engine combines boolean retrieval capabilities with generalized ranking components allowing for data retrieval based on a precise vector space model.[22]
  • Pregel algorithm: Pregel is a system for large scale graph processing.[23] Pregel is implemented in ArangoDB and can be used with predefined algorithms, e.g. PageRank, Single-Source Shortest Path and Connected components.[24]
  • Transactions: ArangoDB supports user-definable transactions. Transactions in ArangoDB are atomic, consistent, isolated, and durable (ACID), but only if data is not sharded.[25]
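
The Pregel model named in the list above is vertex-centric: computation proceeds in rounds ("supersteps") in which every vertex sends messages along its outgoing edges and then updates its own value from the messages it received. The following is a minimal single-process Python sketch of PageRank written in that style. It is an illustration only, not ArangoDB's implementation or API; the function name, the parameters, and the sample graph are all assumptions made for this example.

```python
from collections import defaultdict


def pregel_pagerank(edges, supersteps=50, damping=0.85):
    """Illustrative vertex-centric PageRank (hypothetical helper, not an ArangoDB API)."""
    # Build outgoing adjacency lists and the vertex set from (source, target) pairs.
    out = defaultdict(list)
    vertices = set()
    for src, dst in edges:
        out[src].append(dst)
        vertices.update((src, dst))

    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(supersteps):
        # Message phase: each vertex sends an equal share of its rank to its neighbours.
        inbox = defaultdict(float)
        dangling = 0.0
        for v in vertices:
            if out[v]:
                share = rank[v] / len(out[v])
                for dst in out[v]:
                    inbox[dst] += share
            else:
                # Vertices without outgoing edges spread their rank over all vertices.
                dangling += rank[v]

        # Compute phase: each vertex derives its new rank from the messages it received.
        rank = {
            v: (1 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in vertices
        }
    return rank


# Hypothetical four-vertex graph: a cycle a -> b -> c -> a plus an extra edge d -> a.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
ranks = pregel_pagerank(edges)
```

In a real Pregel system the vertices are partitioned across workers and the message exchange happens over the network; the two phases inside the loop correspond to what each worker would do locally per superstep.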

AQL (ArangoDB Query Language) is the SQL-like query language[26] used in ArangoDB. It supports CRUD operations for both documents (nodes) and edges, but it is not a data definition language (DDL). AQL also supports geospatial queries.

AQL is JSON-oriented:

// Return every document in a collection
FOR doc IN collection
  RETURN doc

// Count the number of documents in a collection
FOR doc IN collection
  COLLECT WITH COUNT INTO length
  RETURN length

// Add a new document to the collection
INSERT { _key: "john", name: "John", age: 45 } INTO collection

// Update the document with the key "john" to have age 46
UPDATE { _key: "john", age: 46 } IN collection

// Set numberOfLogins to 0 for all users whose active attribute is true
FOR u IN users
  FILTER u.active == true
  UPDATE u WITH { numberOfLogins: 0 } IN users
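
For readers more familiar with general-purpose languages, the effect of the queries above can be modelled roughly in Python on an in-memory dictionary. This is only a sketch of the semantics for flat documents, not how ArangoDB executes AQL; the `insert` and `update` helpers, the second document "jane", and the `active` attribute on the sample data are assumptions added for illustration.

```python
# Hypothetical in-memory stand-in for a collection: documents indexed by _key.
collection = {}


def insert(doc):
    """Model of INSERT: store a new document, rejecting duplicate keys."""
    if doc["_key"] in collection:
        raise KeyError("duplicate _key: " + doc["_key"])
    collection[doc["_key"]] = dict(doc)


def update(patch):
    """Model of UPDATE: merge the given attributes into the existing document."""
    collection[patch["_key"]].update(patch)


insert({"_key": "john", "name": "John", "age": 45, "active": True})
insert({"_key": "jane", "name": "Jane", "age": 31, "active": False})

# UPDATE { _key: "john", age: 46 } changes age only; other attributes are kept.
update({"_key": "john", "age": 46})

# FOR doc IN collection RETURN doc
all_docs = list(collection.values())

# FOR doc IN collection COLLECT WITH COUNT INTO length RETURN length
length = len(collection)

# FOR u IN users FILTER u.active == true UPDATE u WITH { numberOfLogins: 0 } IN users
for doc in collection.values():
    if doc.get("active") is True:
        doc["numberOfLogins"] = 0
```

The point of the comparison is that UPDATE patches a document rather than replacing it, which is why "name" survives the age change in this model.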

Editions

  • Community Edition: ArangoDB Community Edition is a graph database with native multi-model capabilities, written mainly in C++. It was available under an open-source license (Apache 2) until October 2023, when the license was changed to the ArangoDB Community License, which limits its use for commercial purposes and imposes a 100 GB limit on dataset size within a single cluster.[27]
  • Commercial self-managed: ArangoDB Enterprise is a paid subscription that includes graph-aware sharding (called “SmartGraphs”)[28] and collection replication (called “Satellite Collections”) to reduce query times,[29] and increased security.[30]
  • Cloud: ArangoDB is offered as a cloud service called Oasis, which provides ArangoDB as a Database as a Service (DBaaS). ArangoDB Oasis provides the functionality of an ArangoDB cluster deployment while minimizing the administrative effort required.[31] It runs on multiple cloud service providers, including AWS, Azure, and Google Cloud.[32]

References

  1. "Advantages of native multi-model in ArangoDB". ArangoDB. Retrieved 2022-07-26.
  2. "ArangoDB Query Language (AQL) Introduction | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-07-26.
  3. "AQL Query Patterns & Examples | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-07-26.
  4. Celler, Frank (2012-03-07). "ArangoDB's design objectives". ArangoDB. Retrieved 2022-07-26.
  5. "ArangoDB Query Language (AQL) Introduction | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-07-26.
  6. "Variety Database". www.avocadosource.com. Retrieved 2022-07-27.
  7. Ortell, Bill (2021-03-08), AvocadoDB, retrieved 2022-07-27
  8. AvocadoDB explained, retrieved 2022-07-27
  9. AvocadoDB Query Language Jan Steemann in english, retrieved 2022-07-27
  10. "'AvocadoDB' becomes 'ArangoDB'". ArangoDB. 2012-05-09. Retrieved 2022-07-27.
  11. "Variety Database". www.avocadosource.com. Retrieved 2022-08-05.
  12. Weinberger, Claudius (2019-03-14). "ArangoDB receives Series A Funding led by Bow Capital". ArangoDB. Retrieved 2022-07-27.
  13. "ArangoDB Announces $27.8 Million Series B Investment to Accelerate Development of Next-Generation Graph ML, Providing Advanced Analytics and AI Capabilities at Enterprise Scale". ArangoDB. Retrieved 2022-07-27.
  14. AvocadoDB explained, retrieved 2022-08-05
  15. AvocadoDB Query Language Jan Steemann in english, retrieved 2022-08-05
  16. ArangoDB, ArangoDB, 2022-08-05, retrieved 2022-08-05
  17. "Cluster | ArangoDB Deployment Modes | Architecture | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.
  18. "DC2DC Replication | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.
  19. "Kubernetes | Tutorials | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.
  20. "Foxx Microservices | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.
  21. ArangoDB, ArangoDB, 2022-08-05, retrieved 2022-08-05
  22. "ArangoSearch - Full-text search engine including similarity ranking capabilities". ArangoDB. Retrieved 2022-08-05.
  23. "Stanford University Pregel White paper" (PDF).
  24. "Pregel | Data Science | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.
  25. "Transactions | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.
  26. "Cluster | ArangoDB Deployment Modes | Architecture | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-11.
  27. ArangoDB, ArangoDB, 2023-10-13, retrieved 2023-10-13
  28. "ArangoDB SmartGraphs | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-11.
  29. "ArangoDB SatelliteCollections | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-11.
  30. "ArangoDB Enterprise Features". ArangoDB. Retrieved 2022-08-11.
  31. "Getting Started with ArangoDB Oasis | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-11.
  32. "ArangoDB Oasis". ArangoDB Oasis. Retrieved 2022-08-11.