memgraph · Josipmrden · Mar 13, 2025 · Mar 28, 2025 · Mar 28, 2025 · Apr 15, 2025
@@ -13,8 +13,8 @@ export default {
   "database-management": "Database management",
   "deployment": "Deployment",
   "clustering": "Clustering",
+  "memgraph-in-production": "Memgraph in production",
   "data-streams": "Data streams",
   "help-center": "Help center",
   "release-notes": "Release notes",
-  "memgraph-in-production": "Memgraph in production"
 }
@@ -44,14 +44,18 @@ Here are the currently available guides to help you deploy Memgraph effectively:
 ### [General suggestions](/memgraph-in-production/general-suggestions)
 A foundational guide covering universal best practices for any production deployment - recommended reading before anything else.
 
+### [Memgraph in high-throughput workloads](/memgraph-in-production/memgraph-in-high-throughput-workloads)
+Scale your write throughput while keeping up with fast-changing, high-velocity graph data.
+
+### [Memgraph in mission-critical workloads](/memgraph-in-production/memgraph-in-mission-critical-workloads)
+Suggestions on how to bring your Memgraph to production in mission-critical and high-availability workloads.
+
 ### [Memgraph in GraphRAG use cases](/memgraph-in-production/memgraph-in-graphrag)
 Learn how to optimize Memgraph for Retrieval-Augmented Generation (RAG) systems using graph data.
 
 ## 🚧 Guides in construction
 - Memgraph in transactional workloads
 - Memgraph in analytical workloads
-- Memgraph in mission critical workloads
-- Memgraph in high throughput workloads
 - Memgraph in supply chain use cases
 - Memgraph in cyber security use cases
 - Memgraph in fraud detection use cases
@@ -70,7 +74,7 @@ smooth transition to production.
 These guides focus on areas like performance benchmarking, testing, and operational readiness—offering additional tools 
 and frameworks that can help you get the most out of your Memgraph deployment.
 
-### [📊 Evaluating Memgraph](/memgraph-in-production/evaluating-memgraph)
+### [📊 Benchmarking Memgraph](/memgraph-in-production/benchmarking-memgraph)
   Learn how to properly **test Memgraph for performance and scalability**. 
   This guide walks you through performance and stress testing scenarios, benchmarking with real-world data, 
   and identifying key metrics that can help validate Memgraph’s fit for your application needs.

@@ -1,5 +1,7 @@
 export default {
   "general-suggestions": "General suggestions",
   "memgraph-in-graphrag": "Memgraph in GraphRAG use cases",
+  "memgraph-in-high-throughput-workloads": "Memgraph in high-throughput workloads",
+  "memgraph-in-mission-critical-workloads": "Memgraph in mission-critical workloads",
   "benchmarking-memgraph": "Benchmarking Memgraph",
 }
@@ -44,7 +44,7 @@ based on your specific use case.
 8. [Backup considerations](#backup-considerations) <br />
 Learn about how to preserve your data in Memgraph to prevent any data loss.
 
-9. [Importing mechanisms](#importing-mechanisms) <br /> 
+9. [Importing mechanisms](#importing-mechanisms) <br />
 Discover the best methods for importing your dataset into Memgraph, including Cypher queries, bulk loading, and integrations with other data sources.
 
 10. [Enterprise features you might require](#enterprise-features-you-might-require)  <br />

@@ -0,0 +1,236 @@
+---
+title: Memgraph in high-throughput workloads
+description: Suggestions on how to bring your Memgraph to production in high-throughput workloads. 
+---
+
+import { Callout } from 'nextra/components'
+import { CommunityLinks } from '/components/social-card/CommunityLinks'
+
+# Memgraph in high-throughput workloads
+
+<Callout type="info">
+👉 **Start here first**  
+Before diving into this guide, we recommend starting with the [**General suggestions**](/memgraph-in-production/general-suggestions) 
+page. It provides **foundational, use-case-agnostic advice** for deploying Memgraph in production.
+
+This guide builds on that foundation, offering **additional recommendations tailored to specific workloads**. 
+In cases where guidance overlaps, consider the information here as **complementary or overriding**, depending 
+on the unique needs of your use case.
+</Callout>
+
+## Is this guide for you?
+
+This guide is for you if you're working with **high-throughput graph workloads** where performance, consistency,
+and scale are critical.
+You’ll benefit from this content if:
+
+- ⚡ You’re handling **more than a thousand writes per second**, and your graph data is constantly changing at high velocity.  
+- 🔍 You want your **read performance to remain consistent**, even as new data is continuously ingested.  
+- 🔁 You’re dealing with **high volumes of concurrent reads and writes**, and need a database that can handle both without performance degradation.  
+- 🌊 Your data is flowing in from **real-time streaming systems** like **Kafka**, and you need a database that can keep up.  
+
+If this sounds like your use case, this guide will walk you through how to configure and scale Memgraph for **reliable, high-throughput performance** in production.
+
+## Why choose Memgraph for high-throughput use cases?
+
+When your workload involves thousands of writes per second and concurrent access to ever-changing graph data, 
+Memgraph provides the performance and architecture needed to keep up—without compromise. 
+
+Here's why Memgraph is a great fit for high-throughput use cases:
+
+- **In-memory storage engine**: Memgraph operates entirely in-memory, eliminating the need to write to disk on every transaction.
+  This allows it to **scale write throughput far beyond traditional disk-based databases**. Unlike systems that rely on LRU
+  or OS-level caching, where **cache invalidation can degrade read performance during heavy writes**, Memgraph offers
+  **predictable read latency** even under constant data changes.  
+
+  While many graph databases **max out around 1,000 writes per second**, Memgraph can handle **up to 50x more** 
+  (see image below), making it ideal for **high-velocity, write-intensive workloads**.
+
+  ![](/pages/memgraph-in-production/benchmarking-memgraph/realistic-workload.png)
+
+- **Non-blocking reads and writes with MVCC**: Built on multi-version concurrency control (MVCC), 
+  Memgraph ensures that **writes don’t block reads** and **reads don’t block writes**, allowing each to scale independently.
+
+- **Fine-grained locking**: Locking happens at the node and relationship level, enabling **highly concurrent writes** 
+  and minimizing contention across threads.
+
+- **Lock-free skiplist storage**: Memgraph uses **lock-free, concurrent skip list structures** for storing nodes,
+  relationships, and indices, leading to faster data access and minimal coordination overhead between threads.
+
+- **Snapshot isolation by default**: Unlike many databases that rely on **read-committed** isolation 
+  (which can return inconsistent data), Memgraph provides **snapshot isolation**, ensuring data accuracy and 
+  consistency in real-time queries.
+
+- **Inter-query parallelization**: Each read and write query is handled on its own CPU core, meaning Memgraph can 
+  **scale with the number of cores on a single machine**.
+
+- **Horizontal read scaling with high availability**: Memgraph supports [replication](/clustering/replication) and
+  [high availability](/clustering/high-availability), allowing you to distribute **read traffic across multiple replicas**.
+  These replicas can also power **secondary workloads** like GraphRAG, analytics, or ML pipelines, **without affecting
+  the performance of the main write-intensive instance**.
+
+## What is covered?
+
+The suggestions for high-throughput workloads **complement** several key sections in the 
+[general suggestions guide](/memgraph-in-production/general-suggestions). These sections offer important context and
+additional best practices tailored for performance, stability, and scalability in high-throughput systems:
+
+- [Choosing the right Memgraph flag set](#choosing-the-right-memgraph-flag-set) <br />
+  Memgraph offers specific flags to optimize streaming graph updates.
+
+- [Choosing the right Memgraph storage mode](#choosing-the-right-memgraph-storage-mode) <br />
+  Guidance on selecting the optimal **storage mode** for high-throughput use cases, depending on whether your focus is
+  analytical speed or transactional safety.
+
+- [Importing mechanisms](#importing-mechanisms) <br />
+  Learn how to connect your streaming sources to Memgraph and how to avoid write-write conflicts when several writers run concurrently.
+
+- [Enterprise features you might require](#enterprise-features-you-might-require)  <br />
+  Understand which **enterprise features**, such as high availability and dynamic graph algorithms, will keep your real-time use case running smoothly.
+
+- [Queries that best suit your workload](#queries-that-best-suit-your-workload) <br />
+  Learn how to optimize the update queries arriving at the database.
+
+## Choosing the right Memgraph flag set
+
+When streaming data from systems like Kafka, the incoming payload is often **standardized**, 
+meaning that even when a node or relationship is updated, **some property values might remain unchanged**.
+
+By default, Memgraph sets the flag `--storage-delta-on-identical-property-update=true`, which **creates a delta record
+for every property** written during an update, even if the new value is identical to the existing one.
+This can introduce unnecessary write overhead.
+
+To optimize for **higher throughput** in scenarios where most incoming updates do not change all property values,
+it’s recommended to set:
+
+```bash
+--storage-delta-on-identical-property-update=false
+```
+
+With this setting, Memgraph will **only create delta records for properties that have actually changed**,
+reducing internal write operations and improving overall system throughput—especially important in high-velocity
+streaming use cases.
+
+## Choosing the right Memgraph storage mode
+
+High-throughput scenarios in Memgraph can run effectively on both `IN_MEMORY_TRANSACTIONAL` and `IN_MEMORY_ANALYTICAL`
+storage modes, depending on your specific workload requirements.
+
+If your workload meets the following conditions:
+
+- You are **updating properties** on existing nodes and relationships,  
+- You are **appending** new nodes and relationships to the graph,  
+- You are **not performing deletes**,
+
+then it may be worth considering switching to `IN_MEMORY_ANALYTICAL` mode.  
+This mode allows **writes to be multithreaded**, so ingestion can be **parallelized across CPU cores** and write
+throughput scales with the hardware available.
+
+However, keep in mind:
+
+- If you require **replication**, **high availability**, or **ACID guarantees**, you must use `IN_MEMORY_TRANSACTIONAL` mode.  
+- `IN_MEMORY_ANALYTICAL` is optimized for **bulk ingestion** and **read-only analytics**, but it 
+  **does not support transactional rollback**, as it doesn’t create delta objects during writes.  
+  Additionally, **WALs (write-ahead logs) are not generated** in this mode, meaning recovery relies solely on **snapshot creation**.
+
+Learn more about [storage modes](/fundamentals/storage-memory-usage#storage-modes) in our documentation.
+
+## Importing mechanisms
+
+Memgraph natively supports **Apache Kafka** and **Apache Pulsar** consumers to directly ingest 
+[data streams](/data-streams). However, depending on the system architecture and flexibility
+requirements, some users prefer building **custom Kafka consumers** that programmatically push data into Memgraph.
+
+### Handling write-write conflicts
+If you are ingesting data from **multiple topics** or multiple producers, it's important to 
+**handle potential write-write conflicts**—especially when running in `IN_MEMORY_TRANSACTIONAL` mode.  
+You can learn more about these scenarios in our
+[conflicting transactions documentation](/help-center/errors/transactions#conflicting-transactions).
+
+To safely manage these conflicts:
+
+- Use a driver that supports **managed transactions** with automatic retries. For example, **Neo4j drivers**
+  offer an `execute_write()` method that **automatically retries transient errors** (errors that can be safely retried without client-side intervention).
+- Wrap your **write operations inside retryable transactions** if there are **multiple concurrent writers**.
+- If you **only have a single writer**, you generally don’t need to worry about transient errors or retries.
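
If your client cannot use managed transactions, the retry pattern can be written by hand. The following is a minimal sketch, not a driver API: `TransientError` and `execute_with_retry` are hypothetical names used for illustration, and a real driver raises its own error type for conflicting transactions.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a driver's retryable (conflict) error."""


def execute_with_retry(write_fn, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call write_fn until it succeeds or max_attempts is exhausted.

    write_fn is expected to open a transaction, run the write, and commit.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            # Exponential backoff with jitter so competing writers spread out.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
```

The `sleep` parameter is injectable only so the helper can be exercised in tests without waiting; in a consumer you would normally leave it at its default.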
+
+### Idempotency concept
+In addition, when working with streams, **unexpected errors may require replaying the stream**.
+To ensure **data consistency even after multiple replays**, it's crucial to make your ingestion logic **idempotent**.  
+We strongly recommend using **`MERGE` clauses** instead of `CREATE` when inserting data into Memgraph.
+`MERGE` ensures that nodes and relationships are **created only if they don't already exist**, preventing
+duplicates and keeping your graph clean no matter how many times the same event is replayed.
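
To see why this matters on replay, here is a toy in-memory model. It is only an illustration of the semantics, not how Memgraph stores data: `create_node` and `merge_node` are hypothetical helpers that mimic `CREATE` and a keyed `MERGE ... SET n += props`, and the `id` key is an assumption about the payload.

```python
def create_node(nodes, props):
    """CREATE-like behavior: always appends, so replays add duplicates."""
    nodes.append(dict(props))


def merge_node(nodes, key, props):
    """MERGE-like behavior: update the node with the same key, else add it."""
    for node in nodes:
        if node.get(key) == props[key]:
            node.update(props)
            return
    nodes.append(dict(props))


events = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]

created, merged = [], []
for _ in range(2):  # process the stream, then replay it once
    for event in events:
        create_node(created, event)
        merge_node(merged, "id", event)

print(len(created), len(merged))  # prints: 4 2
```

After the replay, the `CREATE`-style list holds every event twice, while the `MERGE`-style list is unchanged.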
+
+## Enterprise features you might require
+
+For production-grade high-throughput deployments, you may need advanced capabilities to ensure
+**availability**, **data management**, and **tenant isolation**. Memgraph offers several enterprise features
+designed to support these needs:
+
+- **Replication, high availability, and automatic failover**  
+  If you require your system to be **available at all times**, Memgraph supports
+  [clustering and high availability](/clustering/high-availability), allowing you to minimize downtime and recover
+  automatically from failures.
+
+- **Node and relationship TTL (time-to-live)**  
+  In high-ingestion environments, you may need to automatically **clean up stale data** after a certain retention period.
+  Memgraph supports [time-to-live (TTL)](/querying/time-to-live) mechanisms for both **nodes** and **relationships**,
+  ensuring your graph remains manageable and efficient over time.
+
+- **Multi-tenancy**  
+  Some workloads require a **separate graph database per customer** to ensure strict data isolation and security.
+  Memgraph supports [multi-tenancy](/database-management/multi-tenancy), enabling you to manage multiple independent
+  graphs within a single Memgraph instance.
+
+## Queries that best suit your workload
+
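+When ingesting data from streams, it's best to keep your Cypher queries **simple, idempotent, and efficient**. A typical ingestion query merges on a unique key (assumed here to be an `id` field in the payload) and then sets the remaining properties. Merging on the label alone would match every existing node with that label:
+
+```cypher
+MERGE (n:Label {id: $row.id}) SET n += $row;
+```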
+
+This approach ensures **idempotency** (safe reprocessing of the same events) and **minimizes query execution time** by keeping the transaction lightweight.  
+Keep in mind that adding **complex business logic** or **customization** to the ingestion queries will **increase query latency**, so it's always a good practice to **profile your queries** using Memgraph’s [`PROFILE` tool](/querying/clauses/profile) to understand and optimize performance.
+
+---
+
+### Dynamic labels and edge types
+
+Memgraph also supports:
+
+- [Dynamic node label creation](/querying/clauses/create#14-creating-node-labels-dynamically)  
+- [Dynamic relationship type creation](/querying/clauses/create#23-creating-relationship-types-dynamically)
+
+However, **dynamic labels and relationship types are only supported in `CREATE` operations**. Using them in
+**`MATCH` or `MERGE` is not supported**.
+
+If your payload contains **dynamic labels or edge types** and you still need **idempotency**, you have two options:
+
+- **Programmatically construct your Cypher query strings** based on the payload to ensure correct label/type usage before sending the query to Memgraph.
+- **Use the [`merge`](/advanced-algorithms/available-algorithms/merge) procedure from MAGE.**
+
+<Callout type="warning">
+Note: While MAGE procedures are **written in C++ and highly optimized**, they still introduce **slightly more overhead**
+compared to **pure Cypher**, as they are executed as external modules. We recommend favoring pure Cypher when
+possible for the **highest performance**.
+</Callout>
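
The first option, constructing the query string programmatically, can be sketched as follows. This is a hedged example: `build_merge_query` is a hypothetical helper, the `id` key is an assumption about the payload, and the allow-list pattern for identifiers is deliberately conservative. Only the label and key name are interpolated into the string; all values stay as query parameters.

```python
import re

# Conservative allow-list for labels and key names (an assumption of this sketch).
_IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_merge_query(label, key="id"):
    """Return a parameterized MERGE query for a label taken from the payload."""
    for name in (label, key):
        if not _IDENTIFIER_RE.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    # Values are never interpolated; they are passed separately as $row.
    return f"MERGE (n:{label} {{{key}: $row.{key}}}) SET n += $row;"
```

Validating the label before interpolation keeps a malformed or malicious payload from injecting arbitrary Cypher into the ingestion query.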
+
+### Using `convert.str2object` for parsing nested properties
+
+When working with streamed data, sometimes your incoming payload contains **serialized JSON strings** that need to be
+transformed into property maps inside your graph.
+Memgraph provides the [`convert.str2object` function](/querying/functions#conversion-functions) to easily handle this scenario.
+
+Example usage:
+
+```cypher
+WITH convert.str2object('{"name":"Alice", "age":30}') AS props
+MERGE (n:Person {name: props.name})
+SET n += props;
+```
+
+This function **parses a JSON-formatted string into a Cypher map**, making it very useful for flexible ingestion pipelines
+where the payload structure might vary slightly or be semi-structured.
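
In a custom consumer, the serialized string is usually passed as a query parameter rather than inlined. The sketch below assumes a hypothetical `to_query_params` helper and a payload keyed by `name`. It checks on the client that the message is a JSON object before handing it to `convert.str2object`.

```python
import json

# The payload arrives as a parameter; `name` as the merge key is an assumption.
INGEST_QUERY = (
    "WITH convert.str2object($payload) AS props "
    "MERGE (n:Person {name: props.name}) "
    "SET n += props;"
)


def to_query_params(raw_message: bytes) -> dict:
    """Check that the message is a JSON object and wrap it as a query parameter."""
    text = raw_message.decode("utf-8")
    if not isinstance(json.loads(text), dict):
        raise ValueError("expected a JSON object")
    return {"payload": text}
```

Rejecting malformed messages on the client keeps invalid input from failing inside the transaction.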
+
+<CommunityLinks/>