Flush to disk on commit 👉🏻 MongoDB durable writes (ACID)
Franck Pachot

Publish Date: Jun 28

A Twitter (𝕏) thread was filled with misconceptions about MongoDB, spreading fear, uncertainty, and doubt (FUD). This led one user to question whether MongoDB acknowledges writes before they are actually flushed to disk:

Doesn't MongoDB acknowledge writes before it's actually flushed them to disk?

MongoDB, like many databases, employs journaling, also known as write-ahead logging (WAL), to ensure durability (the D in ACID) with high performance. Write operations are recorded in the journal, and the journal is flushed to disk before the commit is acknowledged. Further details can be found in the documentation under Write Concern and Journaling.
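To make the idea concrete, here is a minimal Node.js sketch of the write-ahead pattern described above: append a record, force it to disk, and only then acknowledge. This is not MongoDB or WiredTiger code. The Journal class, its commit method, and the file name are hypothetical names chosen for illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical journal (illustration only, not WiredTiger's implementation)
class Journal {
  constructor(file) {
    this.fd = fs.openSync(file, "a"); // append-only, like a log
  }

  // Returns the acknowledgment only after the record has been synced
  commit(record) {
    const line = JSON.stringify(record) + "\n";
    fs.writeSync(this.fd, line); // buffered write: lands in the OS page cache
    fs.fdatasyncSync(this.fd);   // blocks until the data is flushed to the device
    return { acknowledged: true, bytes: Buffer.byteLength(line) };
  }

  close() {
    fs.closeSync(this.fd);
  }
}

// Usage: write one record to a journal in a temporary directory
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "journal-"));
const journalFile = path.join(dir, "journal.log");
const journal = new Journal(journalFile);
const ack = journal.commit({ op: "inc", _id: 1, by: 1 });
journal.close();
console.log(ack.acknowledged, fs.readFileSync(journalFile, "utf8").trim());
```

The order of the two calls in commit is the whole point: the caller never sees the acknowledgment unless fdatasync has returned.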

Here is how you can test it in a lab, with Linux strace and GDB, to debunk the myths.

Start the lab

I created a local MongoDB server. I use a single-node local Atlas deployment here, but you can do the same with replicas:

atlas deployments setup atlas --type local --port 27017 --force


Start it if it was stopped, and connect with MongoDB Shell:

atlas deployment start atlas
mongosh


Trace the system calls with strace

In another terminal, I used strace to display the system calls (-e trace) that write (pwrite64) and sync (fdatasync) files, with the file names (-yy), for the MongoDB server process (-p $(pgrep -d, mongod)) and its threads (-f), with the timestamp and execution time (-tT). The -qq option suppresses strace's informational messages, and -s 0 omits the written data from the output:

strace -tT -fp $(pgrep -d, mongod) -yye trace=pwrite64,fdatasync -qqs 0


Some writes and syncs happen in the background:

[pid 2625869] 08:26:13 fdatasync(11</data/db/WiredTiger.wt>) = 0 <0.000022>                                                                                    
[pid 2625869] 08:26:13 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 384, 19072) = 384 <0.000024>                                             
[pid 2625869] 08:26:13 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.002123>                                                                 
[pid 2625868] 08:26:13 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 128, 19456) = 128 <0.000057>                                             
[pid 2625868] 08:26:13 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.002192>                                                                 
[pid 2625868] 08:26:23 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 384, 19584) = 384 <0.000057>                                             
[pid 2625868] 08:26:23 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.002068>                                                                 
[pid 2625868] 08:26:33 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 384, 19968) = 384 <0.000061>                                             
[pid 2625868] 08:26:33 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.002747>                                                                 
[pid 2625868] 08:26:43 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 384, 20352) = 384 <0.000065>                                             
[pid 2625868] 08:26:43 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.003008>                                                                 
[pid 2625868] 08:26:53 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 384, 20736) = 384 <0.000075>                                             
[pid 2625868] 08:26:53 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.002092>                                                                 
[pid 2625868] 08:27:03 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 384, 21120) = 384 <0.000061>                                             
[pid 2625868] 08:27:03 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.002527>                                                                 
[pid 2625869] 08:27:13 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.000033>                                                                 

Write to the collection

In the MongoDB shell, I created a collection and ran ten updates:

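// Start from a clean collection with a single document
db.mycollection.drop();
db.mycollection.insertOne({ _id: 1, num: 0 });

// Ten autocommit updates, printing the time before and after each one
for (let i = 1; i <= 10; i++) {
  print(` ${i} ${new Date()}`);
  db.mycollection.updateOne({ _id: 1 }, { $inc: { num: 1 } });
  print(` ${i} ${new Date()}`);
}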


strace displayed the following while the loop of ten updates was running:

[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 76288) = 512 <0.000066>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001865>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 76800) = 512 <0.000072>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001812>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 77312) = 512 <0.000056>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001641>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 77824) = 512 <0.000043>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001812>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 78336) = 512 <0.000175>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001944>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 78848) = 512 <0.000043>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001829>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 79360) = 512 <0.000043>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001917>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 79872) = 512 <0.000050>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.002260>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 80384) = 512 <0.000035>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001940>                                                                 
[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 80896) = 512 <0.000054>                                             
[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001984>                                                                 

Each write (pwrite64) to the journal files was followed by a sync to disk (fdatasync). This system call is well documented:

FSYNC(2)                                                         Linux Programmer's Manual                                                         FSYNC(2)

NAME
       fsync, fdatasync - synchronize a file's in-core state with storage device

DESCRIPTION
       fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to
       the disk device (or other permanent storage device) so that all changed information can be retrieved even if the  system  crashes  or  is  rebooted.
       This includes writing through or flushing a disk cache if present.  The call blocks until the device reports that the transfer has completed.
...
       fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to  be  correctly  handled.   For  example,  changes  to  st_atime or st_mtime (respectively, time of last access and time of last modification
...
       The aim of fdatasync() is to reduce disk activity for applications that do not require all metadata to be synchronized with the disk.
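If you capture a longer trace, the pairing of pwrite64 and fdatasync can be checked mechanically rather than by eye. The following is a minimal Node.js sketch that assumes the strace output format shown above. The function names parseLine and unsyncedWrites are hypothetical, and the sample lines are copied from the trace with the trailing padding removed.

```javascript
// Sample lines copied from the strace output above
const trace = [
  '[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 76288) = 512 <0.000066>',
  '[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001865>',
  '[pid 2625868] 08:33:07 pwrite64(13</data/db/journal/WiredTigerLog.0000000010>, ""..., 512, 76800) = 512 <0.000072>',
  '[pid 2625868] 08:33:07 fdatasync(13</data/db/journal/WiredTigerLog.0000000010>) = 0 <0.001812>',
];

// Matches: [pid N] HH:MM:SS syscall(fd<path>...) = ret <seconds>
const LINE = /^\[pid (\d+)\] (\S+) (\w+)\((\d+)<([^>]*)>.*\) = (-?\d+) <([\d.]+)>/;

// Turns one strace line into an object, or null if the line has another shape
function parseLine(line) {
  const m = LINE.exec(line.trim());
  if (!m) return null;
  return { pid: +m[1], time: m[2], call: m[3], fd: +m[4], path: m[5], ret: +m[6], seconds: +m[7] };
}

// Returns the writes that were not followed by a successful fdatasync on the same file
function unsyncedWrites(lines) {
  const pending = new Map(); // file path -> writes not yet synced
  for (const event of lines.map(parseLine).filter(Boolean)) {
    if (event.call === "pwrite64") {
      if (!pending.has(event.path)) pending.set(event.path, []);
      pending.get(event.path).push(event);
    } else if (event.call === "fdatasync" && event.ret === 0) {
      pending.delete(event.path); // everything written so far is now on disk
    }
  }
  return [...pending.values()].flat();
}

console.log(unsyncedWrites(trace).length);
```

An empty result means every journal write in the captured window was covered by a later successful sync. A write at the very end of a capture can legitimately appear as unsynced if strace was stopped before the matching fdatasync.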

Since I displayed both the time of each update in the shell and the timestamp of each system call, you can see that they match. Here is the shell output corresponding to the trace above:

 1 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)                                                                                                    
 2 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)                                                                                                    
 3 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)                                                                                                    
 4 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)                                                                                                    
 5 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)                                                                                                    
 6 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)                                                                                                    
 7 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)                                                                                                    
 8 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)
 9 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)
 10 Sat Jun 28 2025 08:33:07 GMT+0000 (Greenwich Mean Time)

Multi-document transactions

The previous example ran ten autocommit updates, each triggering a synchronization to disk.
In general, with good document data modeling, a document should match the business transaction. However, it is possible to use multi-document transactions, and they are ACID (atomic, consistent, isolated, and durable). Using multi-document transactions also reduces the sync latency, because the sync is required only once per transaction, at commit.

I've run the following with five transactions, each running one update and one insert:


const session = db.getMongo().startSession();
for (let i = 1; i <= 5; i++) {
  session.startTransaction();
  const sessionDb = session.getDatabase(db.getName());
  sessionDb.mycollection.updateOne({ _id: 1 }, { $inc: { num: 1 } });
  print(` ${i} updated ${new Date()}`);
  sessionDb.mycollection.insertOne({ answer: 42 });
  print(` ${i} inserted ${new Date()}`);
  session.commitTransaction();
  print(` ${i} committed ${new Date()}`);
}


Strace still shows ten calls to pwrite64 and fdatasync. I used this multi-document transaction to go further and prove that the commit not only triggers a sync to disk, but also waits for its acknowledgment before returning success to the application.

Inject some latency with gdb

To show that the commit waits for the acknowledgment of fdatasync, I used a GDB breakpoint on the fdatasync call.

I stopped strace, and started GDB with a script that adds a latency of five seconds to fdatasync:

cat > gdb_slow_fdatasync.gdb <<GDB

break fdatasync
commands
  shell sleep 5
  continue
end
continue

GDB

gdb --batch -x gdb_slow_fdatasync.gdb -p $(pgrep mongod)


I ran the five transactions again, each with two writes. GDB shows when it hits the breakpoint:

Thread 31 "JournalFlusher" hit Breakpoint 1, 0x0000ffffa6096eec in fdatasync () from target:/lib64/libc.so.6 

My GDB script automatically waits five seconds and then continues the program until the next call to fdatasync.

Here was the output from my loop with five transactions:

 1 updated Sat Jun 28 2025 08:49:32 GMT+0000 (Greenwich Mean Time)
 1 inserted Sat Jun 28 2025 08:49:32 GMT+0000 (Greenwich Mean Time)
 1 committed Sat Jun 28 2025 08:49:37 GMT+0000 (Greenwich Mean Time)
 2 updated Sat Jun 28 2025 08:49:37 GMT+0000 (Greenwich Mean Time)
 2 inserted Sat Jun 28 2025 08:49:37 GMT+0000 (Greenwich Mean Time)
 2 committed Sat Jun 28 2025 08:49:42 GMT+0000 (Greenwich Mean Time)
 3 updated Sat Jun 28 2025 08:49:42 GMT+0000 (Greenwich Mean Time)
 3 inserted Sat Jun 28 2025 08:49:42 GMT+0000 (Greenwich Mean Time)
 3 committed Sat Jun 28 2025 08:49:47 GMT+0000 (Greenwich Mean Time)
 4 updated Sat Jun 28 2025 08:49:47 GMT+0000 (Greenwich Mean Time)
 4 inserted Sat Jun 28 2025 08:49:47 GMT+0000 (Greenwich Mean Time)
 4 committed Sat Jun 28 2025 08:49:52 GMT+0000 (Greenwich Mean Time)
 5 updated Sat Jun 28 2025 08:49:52 GMT+0000 (Greenwich Mean Time)
 5 inserted Sat Jun 28 2025 08:49:52 GMT+0000 (Greenwich Mean Time)
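The delay between the last write and the commit of each transaction can be computed from this output. Here is a minimal Node.js sketch that assumes the line format printed by my loop. The function name commitLatencies is hypothetical, and the sample contains the lines of the first two transactions copied from the output above.

```javascript
// Lines of the first two transactions, copied from the loop output above
const output = [
  " 1 updated Sat Jun 28 2025 08:49:32 GMT+0000 (Greenwich Mean Time)",
  " 1 inserted Sat Jun 28 2025 08:49:32 GMT+0000 (Greenwich Mean Time)",
  " 1 committed Sat Jun 28 2025 08:49:37 GMT+0000 (Greenwich Mean Time)",
  " 2 updated Sat Jun 28 2025 08:49:37 GMT+0000 (Greenwich Mean Time)",
  " 2 inserted Sat Jun 28 2025 08:49:37 GMT+0000 (Greenwich Mean Time)",
  " 2 committed Sat Jun 28 2025 08:49:42 GMT+0000 (Greenwich Mean Time)",
];

// For each transaction, returns the seconds between its last write and its commit
function commitLatencies(lines) {
  const lastWrite = new Map(); // transaction number -> time of its last write (ms)
  const latencies = [];
  for (const line of lines) {
    const m = /^\s*(\d+) (updated|inserted|committed) (.+)$/.exec(line);
    if (!m) continue;
    const txn = Number(m[1]);
    // Drop the time zone name in parentheses before parsing the date
    const when = Date.parse(m[3].replace(/\s*\(.*\)\s*$/, ""));
    if (m[2] === "committed") {
      latencies.push({ txn, seconds: (when - lastWrite.get(txn)) / 1000 });
    } else {
      lastWrite.set(txn, when);
    }
  }
  return latencies;
}

console.log(commitLatencies(output));
```

With the timestamps above, each transaction reports five seconds between its insert and its commit, which is the delay injected on fdatasync. The printed timestamps have one-second resolution, so this is only a coarse measurement.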

The insert and update operations occur immediately, but the commit itself waits five seconds, because of the latency I injected with GDB. This demonstrates that the commit waits for fdatasync, guaranteeing the flush to persistent storage. For this demo, I used all default settings in MongoDB 8.0, but this behavior can still be tuned through write concern and journaling configurations.
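As an illustration of that tuning, a write can request journal acknowledgment explicitly instead of relying on the defaults. This is a minimal sketch: the variable name journaledMajority is hypothetical, while w and j are standard write concern fields (j: true requests acknowledgment that the write has reached the on-disk journal). The guard on db lets the same snippet load under plain Node.js, where it only builds the options object; in mongosh it runs the update.

```javascript
// Explicit write concern: acknowledged by a majority and written to the on-disk journal
const journaledMajority = { writeConcern: { w: "majority", j: true } };

// Only runs inside mongosh, where the global db object exists
if (typeof db !== "undefined") {
  const result = db.mycollection.updateOne(
    { _id: 1 },
    { $inc: { num: 1 } },
    journaledMajority
  );
  printjson(result);
}
```

Setting j: false instead would allow the acknowledgment to be returned before the journal is synced, which trades durability for latency, so it is worth being deliberate about it.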

I used GDB to examine the call stack. Alternatively, you can inject a delay with strace by adding this option: -e inject=fdatasync:delay_enter=5000000.

Look at the open source code

When calling fdatasync, errors can occur, and this may compromise durability if operations on the file descriptor continue (remember the PostgreSQL fsyncgate). MongoDB uses the open-source WiredTiger storage engine, which implemented the same solution as PostgreSQL to avoid that: panic instead of retry. You can review the os_fs.c code to verify this.
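The "panic instead of retry" policy can be sketched in a few lines of Node.js. This is not the WiredTiger code from os_fs.c, only an illustration of the policy under my own assumptions. The names syncOrPanic and defaultPanic are hypothetical, and the sync function is injectable so that a failure can be simulated without a faulty disk.

```javascript
const fs = require("fs");

// Hypothetical default handler: log the error and abort the process
function defaultPanic(err) {
  console.error(`fatal: fdatasync failed: ${err.message}`);
  process.abort();
}

// Syncs once and treats any failure as fatal (illustration only, not WiredTiger code)
function syncOrPanic(fd, { syncFn = fs.fdatasyncSync, panic = defaultPanic } = {}) {
  try {
    syncFn(fd);
  } catch (err) {
    // No retry: after a failed sync, the kernel may have marked the dirty pages
    // as clean, so a later successful call would not prove that the earlier
    // writes reached the disk.
    panic(err);
  }
}

// Usage: simulate an I/O error and capture it instead of aborting
let panicked = null;
syncOrPanic(42, {
  syncFn: () => { throw new Error("EIO: i/o error"); },
  panic: (err) => { panicked = err.message; },
});
console.log(panicked);
```

After such a panic, the process restarts and recovers from what is actually on disk, rather than continuing on top of a journal whose state is unknown.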

The fdatasync call is in the JournalFlusher thread and here is the backtrace:

#0  0x0000ffffa0b5ceec in fdatasync () from target:/lib64/libc.so.6
#1  0x0000aaaadf5312c0 in __posix_file_sync ()
#2  0x0000aaaadf4f53c8 in __log_fsync_file ()
#3  0x0000aaaadf4f58d4 in __wt_log_force_sync ()
#4  0x0000aaaadf4fb8b8 in __wt_log_flush ()
#5  0x0000aaaadf588348 in __session_log_flush ()
#6  0x0000aaaadf41b878 in mongo::WiredTigerSessionCache::waitUntilDurable(mongo::OperationContext*, mongo::WiredTigerSessionCache::Fsync, mongo::WiredTigerSessionCache::UseJournalListener) ()
#7  0x0000aaaadf412358 in mongo::WiredTigerRecoveryUnit::waitUntilDurable(mongo::OperationContext*) ()
#8  0x0000aaaadfbe855c in mongo::JournalFlusher::run() ()

Here are some entrypoints if you want to look at the code behind this:

Have your opinions based on facts, not myths.

MongoDB began as a NoSQL database that prioritized availability and low latency over strong consistency. However, that was over ten years ago. As technology evolves, experts who refuse to constantly learn risk their knowledge becoming outdated, their skills diminishing, and their credibility suffering.
Today, MongoDB is a general-purpose database that supports transaction atomicity, consistency, isolation, and durability—whether the transaction involves a single document, or multiple documents.

Next time you encounter claims from uninformed critics or detractors suggesting that MongoDB is not consistent or fails to flush committed changes to disk, you can confidently debunk these myths by referring to the official documentation and the open source code, and by conducting your own experiments. MongoDB is similar to PostgreSQL: buffered writes, and a WAL sync to disk on commit.

Comments 4 total

  • pankaj pathak, Jun 30, 2025

    Good one Frank. Thank you 😊 🙏🏼

    This is new for me as well to understand the concept of durable transactions with MongoDB.
    A couple of years ago, in one of my interviews, I said that MongoDB doesn't support transactions and is therefore less likely to be adopted in domains like banking or other transaction-heavy shops.

    But now its potential in those domains is clearer to me.

    What's your take on MongoDB not being used as transactional database in banking domains ?

    Please throw some light on this topic too.

    Thanks again

    • Franck Pachot, Jun 30, 2025

      I'm not sure the real issue is solely about transactions. It's interesting that we often use bank deposits or transfers to illustrate transaction atomicity, but in reality, banks operate differently. They may read and reserve an amount, yet the actual operation is recorded (document databases can be great for that) and applied asynchronously to the account balances.

      Banks tend to be very conservative, primarily because they aim to maintain compatibility with older systems like COBOL, DB2, Oracle, or Sybase. Transitioning to MongoDB and adopting a document model represents a modernization effort—breaking down monolithic applications into smaller, independent services, each with its own database and data model. Interestingly, some older banks are not resistant to modernization. Here is an example:

  • Nathan Tarbert, Jun 30, 2025

    Pretty cool, I’ve enjoyed all of the research you put into this and how you actually showed what’s happening under the hood
