LND: Replacement Stalling Attack

A vulnerability in LND versions 0.18.5 and below allows attackers to steal node funds. Users should immediately upgrade to LND 0.19.0 or later to protect their funds.

Background

LND has a sweeper subsystem for managing transaction batching and fee bumping. When a Lightning channel is force closed, the sweeper kicks into action, collecting HTLC inputs into batched claim transactions to save on mining fees. The sweeper then periodically bumps the fees paid by those transactions until the transactions confirm.

It is critical that the sweeper gets certain HTLC transactions confirmed before the corresponding upstream HTLCs expire, or else the value of those HTLCs can be completely lost. For this reason, a fairly aggressive default fee-bumping strategy is used, and as upstream HTLC deadlines approach, the sweeper is willing to spend up to half the value of those HTLCs in mining fees.
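
To make this concrete, here is a minimal sketch of what a deadline-aware fee function looks like. This is an illustration of the general idea only, not LND’s actual implementation; the function name, parameters, and numbers are all made up:

package main

import "fmt"

// feeForBlock is an illustrative deadline-aware fee function. It linearly
// interpolates between a starting fee and a maximum fee budget (e.g. half
// the HTLC value) as the confirmation deadline approaches.
func feeForBlock(startFee, maxBudget, deadline, elapsed int64) int64 {
  if elapsed >= deadline {
    return maxBudget // deadline reached: spend the entire budget
  }
  return startFee + (maxBudget-startFee)*elapsed/deadline
}

func main() {
  // 100,000-sat HTLC with an 80-block deadline: budget is half the value.
  for _, elapsed := range []int64{0, 20, 40, 60, 80} {
    fmt.Printf("block %d: fee %d sat\n", elapsed, feeForBlock(300, 50000, 80, elapsed))
  }
}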

Sweeper Weaknesses

LND’s aggressive fee bumping could be thwarted, however, due to a couple of weaknesses in the sweeper.

Fee Resets on Reaggregation

If an input to a batched transaction was double-spent by someone else, the sweeper would regroup the remaining inputs into a new transaction and reset the fees paid by that transaction to the minimum value of the fee function. If this happened many times, the sweeper would end up broadcasting transactions with much lower fees than intended, and upstream deadlines could be missed.
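
A toy model of the buggy behavior (illustrative only; the types and names below are not LND’s):

package main

import "fmt"

// batchTx is a toy model of a batched sweep transaction whose feerate
// climbs one step up the fee function per block.
type batchTx struct {
  numInputs int
  feeStep   int // current position on the fee function
}

// regroupAfterDoubleSpend models the pre-0.19.0 behavior: the remaining
// inputs are regrouped into a new transaction, but the fee function
// position is reset, so fee bumping starts over at the minimum.
func regroupAfterDoubleSpend(old batchTx) batchTx {
  return batchTx{
    numInputs: old.numInputs - 1, // one input was double-spent
    feeStep:   0,                 // BUG: discards all accumulated fee bumps
  }
}

func main() {
  tx := batchTx{numInputs: 40, feeStep: 12} // 12 fee bumps accumulated
  tx = regroupAfterDoubleSpend(tx)
  fmt.Printf("after regroup: %d inputs, fee step %d\n", tx.numInputs, tx.feeStep)
}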

Broadcast Delays

Additionally, the regrouping of inputs after a double spend would be delayed until the next block confirmed. So if lots of double spends happened, the sweeper would miss out on 50% of the available opportunities to get its time-sensitive transactions confirmed. Once again, this could cause upstream deadlines to be missed and funds to be lost.

A Basic Replacement Stalling Attack

An attacker could take advantage of these sweeper weaknesses to steal funds. The basic idea is to cause the sweeper to batch many HTLC inputs together, then repeatedly double spend those inputs, causing the sweeper to keep regrouping the remaining inputs into new transactions. Each double spend prevents the sweeper’s transaction from confirming for at least 2 blocks, while also resetting the fees paid by the next sweeper transaction to the minimum, so future double spends remain cheap. After upstream HTLC timelocks expire, all remaining HTLCs could be stolen.

An attack would look like this:

  1. The attacker opens a direct channel to the victim and routes ~40 HTLCs to themselves through the victim, using the minimum CLTV delta the victim allows (80 blocks by default). The attacker intends to steal the last HTLC, so they make that one as large as possible.
  2. The attacker holds the HTLCs until they expire and the victim force closes the channel to reclaim them. At this point, the 80-block countdown to the upstream deadline starts, and the attacker needs to stall the victim for that long to steal funds.
  3. Because all 40 of the attacker’s HTLCs have the same upstream deadline, the victim’s sweeper batches all 40 HTLC-Timeouts into a single transaction and broadcasts it.
  4. The attacker sees the batched transaction in their mempool and immediately replaces the transaction with a preimage spend for one of the 40 HTLCs.
  5. The double-spend confirms, and the victim is able to extract the HTLC preimage and settle the corresponding upstream HTLC, but the remaining 39 HTLC-Timeouts are not reaggregated until another block confirms (see the section “Broadcast Delays” above).
  6. Another block confirms, and the victim broadcasts a new transaction containing the remaining HTLC-Timeouts. The fees for this transaction are reset to the minimum value of the fee function. The attacker repeats the process from Step 4, double-spending a new HTLC each time until the upstream deadline has passed.
  7. The attacker steals the remaining HTLC(s) by claiming the preimage path downstream and the timeout path upstream.
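
The arithmetic behind the stall is tight but sufficient: each double spend costs the victim at least 2 blocks (one for the double spend to confirm, plus one for the reaggregation delay), so ~40 HTLCs are enough to outlast the 80-block upstream delta. A quick sanity check:

package main

import "fmt"

func main() {
  const (
    cltvDelta      = 80 // victim's minimum CLTV delta (LND default)
    blocksPerStall = 2  // 1 block to confirm + 1 block reaggregation delay
  )
  // Each double-spent HTLC stalls the sweeper for at least 2 blocks.
  fmt.Printf("HTLCs needed to stall %d blocks: %d\n", cltvDelta, cltvDelta/blocksPerStall)
}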

Attack Cost

In the worst case for the attacker, they must do ~40 replacements, each spending more total fees than the replaced batched transaction. We can calculate the fees of each batched HTLC-Timeout transaction as size * feerate, where size and feerate are estimated as follows:

  • size: num_htlcs * 166.5 vB
  • feerate: minimum value of LND’s fee function. By default, this is the value returned by bitcoind’s estimatesmartfee RPC.

Today, estimatesmartfee returns feerates between 0.7 sat/vB and 2.1 sat/vB depending on the confirmation target. To simplify calculations, we assume an average feerate of 1.4 sat/vB over the course of the attack. We also assume on average there are 20 HTLCs present on the batched transaction, since it starts with 40 HTLCs and decreases by 1 every 2 blocks until a single HTLC remains. With these simplifying assumptions, we get a rough cost as follows:

  • average cost per replacement: 20 HTLCs * 166.5 vB/HTLC * 1.4 sat/vB = 4,662 sat
  • total attack cost: 40 replacements * 4,662 sat/replacement = 186,480 sat

So for less than 200k sats, the attacker can steal essentially the entire channel capacity.
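
For reference, here is the same back-of-the-envelope calculation in code, using the simplifying assumptions above (setting numHTLCs to 30 approximates the optimized scenario in the next section):

package main

import "fmt"

func main() {
  const (
    htlcTimeoutSize = 166.5 // vB per batched HTLC-Timeout (estimate from above)
    avgFeerate      = 1.4   // sat/vB, midpoint of estimatesmartfee's range
    numHTLCs        = 40.0  // HTLCs routed at the start of the attack
  )
  avgBatchSize := numHTLCs / 2 // batch shrinks from 40 HTLCs to ~1
  costPerReplacement := avgBatchSize * htlcTimeoutSize * avgFeerate
  totalCost := numHTLCs * costPerReplacement // one replacement per HTLC
  fmt.Printf("avg cost per replacement: %.0f sat\n", costPerReplacement) // 4662
  fmt.Printf("total attack cost: %.0f sat\n", totalCost)                 // 186480
}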

Optimizations

In practice, the cost of the attack is even less, since the attacker’s double spends may not confirm in the first available block, which means fewer than 40 double spends actually need to confirm. The attacker can also intentionally reduce the probability of confirmation by inflating the size of their double-spend transactions to the maximum possible while still replacing the victim’s transactions.

Additionally, a smart attacker, knowing they need fewer double spends to confirm, can reduce the number of HTLCs they route at the start of the attack. As a result, the victim’s batched transactions become smaller and the attacker can save on replacement fees.

For example, suppose the attacker can stall for 80 blocks with only 30 double spends. Then the cost of the attack is reduced by over 40%:

  • average cost per replacement: 15 HTLCs * 166.5 vB/HTLC * 1.4 sat/vB = 3,497 sat
  • total attack cost: 30 replacements * 3,497 sat/replacement = 104,910 sat

Mitigation

Changes were made in LND 0.19.0 that eliminated the reaggregation delay and the fee function reset.

These changes, combined with the sweeper’s aggressive default fee function, ensure that any replacement stalling attack costs many times more than the amount that can be stolen.

Discovery

This attack vector was discovered during code review of LND’s sweeper rewrite in May 2024.

Timeline

  • 2024-05-09: Attack vector reported to the LND security mailing list.
  • 2025-01-16: No progress on a mitigation. Reported the fee reset weakness publicly and followed up on the security mailing list.
  • 2025-02-21: Mitigation merged.
  • 2025-05-22: LND 0.19.0 released containing the fix.
  • 2025-10-31: Agreement to disclose publicly after LND 0.20.0 was released.
  • 2025-12-04: Public disclosure.

Prevention

This vulnerability was introduced during LND’s sweeper rewrite in May 2024, and I reported it before the release of LND 0.18.0, the first version containing the vulnerability. In my report, I suggested that the new sweeper be released in 0.18.0 and this vulnerability be fixed in 0.18.1, since a mitigation would require some work and the new sweeper already fixed several other vulnerabilities. Unfortunately that didn’t happen, and this vulnerability went unaddressed until I followed up again in 2025.

In hindsight, I should have done a better job of holding the LND team accountable. I could have reported the vulnerability publicly, thereby forcing the issue to be addressed before the 0.18.0 release. The downside is that this would have delayed other important security fixes to the sweeper subsystem.

Alternatively, I could have reported the vulnerability privately (as I did) but given the LND team a deadline (say, 6 months) after which I would disclose the vulnerability publicly regardless of whether they mitigated it. This may have applied enough pressure to get the issue fixed in 0.18.1 as I originally intended.

Takeaways

  • Set disclosure deadlines to improve security outcomes.
  • Users should keep their node software updated.

LND: Infinite Inbox DoS

LND 0.18.5 and below are vulnerable to a denial-of-service (DoS) attack that causes LND to run out of memory (OOM) and crash or hang. Users should upgrade to at least LND 0.19.0 to protect their nodes.

The Infinite Inbox Vulnerability

When LND receives a message from one of its peers, a dedicated dispatcher thread queues the message for processing by the appropriate subsystem. For two such subsystems (the gossiper and the channel link), up to 1,000 messages could be queued per peer. Since Lightning protocol messages can be up to 64 KB in size, and since LND allowed as many peers as there were available file descriptors, memory could be exhausted quickly.
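
Some rough worst-case arithmetic, treating each queued message as a full 64 KiB and assuming each of the two queues holds 1,000 messages per peer:

package main

import "fmt"

func main() {
  const (
    queuesPerPeer = 2        // gossiper + channel link
    queueSize     = 1000     // messages per queue, pre-0.19.0
    maxMsgSize    = 64 << 10 // ~64 KiB per Lightning message
  )
  perPeer := queuesPerPeer * queueSize * maxMsgSize // bytes
  fmt.Printf("worst case per peer: %d MB\n", perPeer>>20) // 125 MB
  // At ~125 MB per peer, roughly 65 malicious connections are enough
  // to exhaust 8 GB of RAM.
  fmt.Printf("peers needed to fill 8 GB: %d\n", (int64(8)<<30)/int64(perPeer))
}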

The DoS Attack

A simple, free way to exploit the vulnerability was to open multiple connections to the victim and spam query_short_channel_ids messages of size 64 KB, keeping the connections open until LND ran out of memory.

In my experiments against an LND node with 8 GB of RAM, I was able to cause an OOM in under 5 minutes.

The Mitigation

The vulnerability was mitigated by reducing queue sizes and introducing a new “peer access manager” to limit peer connections. Starting in LND 0.19.0, queue sizes are reduced to 50 messages and no more than 100 connections are allowed from peers without open channels.
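
Running the same worst-case arithmetic with the new limits shows why the mitigation works (note the 100-connection cap applies only to peers without open channels):

package main

import "fmt"

func main() {
  const (
    queuesPerPeer  = 2
    queueSize      = 50       // post-0.19.0 queue size
    maxMsgSize     = 64 << 10 // ~64 KiB per Lightning message
    maxNoChanPeers = 100      // connection cap for peers without channels
  )
  perPeer := queuesPerPeer * queueSize * maxMsgSize
  fmt.Printf("worst case per peer: %.2f MB\n", float64(perPeer)/(1<<20))    // 6.25 MB
  fmt.Printf("100 channel-less peers: %d MB\n", maxNoChanPeers*perPeer>>20) // 625 MB
}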

Discovery

This vulnerability was discovered while examining how LND handles various peer messages.

Timeline

  • 2023-09-15: Vulnerability reported to the LND security mailing list.
  • 2025-03-12: Mitigation merged.
  • 2025-05-22: LND 0.19.0 released containing the fix.
  • 2025-10-31: Agreement on public disclosure after LND 0.20.0 is released.
  • 2025-12-04: Public disclosure.

Takeaways

  • More investment in Lightning security is needed.
  • Users should keep their node software updated.

LND: Excessive Failback Exploit #2

A variant of the excessive failback exploit disclosed earlier this year affects LND versions 0.18.5 and below, allowing attackers to steal node funds. Users should immediately upgrade to LND 0.19.0 or later to protect their funds.

The Excessive Failback Bug Revisited

As described in the previous disclosure, the original excessive failback bug existed in LND versions 0.17.5 and earlier. Essentially, when one of LND’s channel peers force closed the channel, LND would mark any HTLCs missing from the confirmed commitment as “failed” in the database, even if the HTLC had actually succeeded with the downstream peer. If LND then restarted before the corresponding upstream HTLC was resolved, LND would incorrectly fail that HTLC with the upstream peer. Both the upstream and downstream peers would be able to claim the HTLC, and LND would be left with a loss.
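
In essence, the buggy logic failed HTLCs back based only on their absence from the confirmed commitment, without checking whether they had already been settled downstream. A toy model of the flaw (illustrative only, not LND’s actual code):

package main

import "fmt"

// htlc is a toy model of a forwarded HTLC.
type htlc struct {
  id                int
  inConfirmedCommit bool // present in the confirmed commitment tx?
  settledDownstream bool // did we already receive the preimage downstream?
}

// markFailbacks models the buggy logic: any HTLC missing from the
// confirmed commitment is marked failed, even if it already succeeded
// with the downstream peer.
func markFailbacks(htlcs []htlc) []int {
  var failed []int
  for _, h := range htlcs {
    if !h.inConfirmedCommit { // BUG: should also require !h.settledDownstream
      failed = append(failed, h.id)
    }
  }
  return failed
}

func main() {
  htlcs := []htlc{{id: 1, inConfirmedCommit: false, settledDownstream: true}}
  fmt.Println("incorrectly failed upstream:", markFailbacks(htlcs)) // [1]
}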

The Variant Bug

While a fix for the original excessive failback bug was included in LND 0.18.0, a minor variant of the bug remained when the channel was force closed using LND’s commitment instead of the attacker’s. In other words, the exact same attack was still possible if the attacker got the victim to force close the channel themselves. Unfortunately this is very easy to do; the attacker could simply send the victim an error message.

The Fix

The excessive failback bug variant was quietly fixed in the same way as the original bug, and the fix was included in the LND 0.19.0 release.

Discovery

This variant was discovered shortly after the original disclosure, while I was updating BOLT 5 to prevent future excessive failback vulnerabilities. I realized there were actually two cases that needed to be updated in BOLT 5, but only one of the cases had been patched in LND.

Timeline

  • 2025-03-04: Public disclosure of the original excessive failback vulnerability.
  • 2025-03-04: BOLT 5 update drafted; variant discovered.
  • 2025-03-05: Variant reported to the LND security mailing list.
  • 2025-03-20: Fix merged.
  • 2025-05-22: LND 0.19.0 released containing the fix.
  • 2025-10-31: Agreement to disclose publicly after LND 0.20.0 was released.
  • 2025-12-04: Public disclosure.

Prevention

In the previous disclosure post, I suggested that the excessive failback bug could have been prevented if the BOLT 5 specification had been clearer about how to handle HTLCs missing from confirmed commitment transactions. At the time, some Lightning maintainers were skeptical that a clearer specification would have helped.

But this variant of the bug was only discovered when I actually went and clarified BOLT 5 myself! I think this is strong evidence that a clearer specification could have prevented both variants of the bug.

A Note on Collaboration

As I noted in the previous excessive failback disclosure, it seems that at some point every Lightning implementation independently discovered and fixed bugs similar to the excessive failback bug in LND. Yet no one (including LND) thought to update the specification to help others avoid such bugs in the future.

When I finally did update the specification, good things happened. This variant of the excessive failback bug was discovered and fixed in LND. But I also noticed that Eclair might have been vulnerable to this variant and reached out to Bastien Teinturier. While it turned out that Eclair was not vulnerable, the discussion with Bastien led to the accidental discovery of a different serious vulnerability in Eclair.

This all happened from just a tiny bit of collaboration: a specification update for the common good and a short conversation with Bastien. In many ways, it is quite unfortunate that Lightning engineering talent is spread out over so many implementations. Everyone focuses on their own code first, and collaboration is secondary. Efforts are duplicated and lessons are learned multiple times. Imagine what we could accomplish with a little more cooperation.

Takeaways

  • Clear specifications benefit all Lightning implementations.
  • We should do more cross-implementation collaboration.
  • Users should keep their node software updated.

Eclair: Preimage Extraction Exploit

A critical vulnerability in Eclair versions 0.11.0 and below allows attackers to steal node funds. Users should immediately upgrade to Eclair 0.12.0 or later to protect their funds.

Background

In the Lightning Network, nodes forward payments using contracts called HTLCs (Hash Time-Locked Contracts). To settle a payment, the final recipient reveals a secret piece of data called a preimage. This preimage is passed backward along the payment route, allowing each node to claim their funds from the previous node.
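
The preimage relationship is plain SHA-256: an HTLC is locked to a payment hash, and whoever reveals a preimage hashing to that value can claim the funds. For example (with a made-up preimage; real preimages are 32 random bytes):

package main

import (
  "crypto/sha256"
  "fmt"
)

func main() {
  preimage := []byte("an illustrative preimage value!!") // 32 bytes, made up
  paymentHash := sha256.Sum256(preimage)
  // An HTLC locked to paymentHash can be claimed by revealing preimage.
  fmt.Printf("payment_hash: %x\n", paymentHash)
}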

If a channel is forced to close, these settlements can happen on the Bitcoin blockchain. Nodes must watch the blockchain to spot these preimages so they can claim their own funds.

The Preimage Extraction Vulnerability

The vulnerability in Eclair existed in how it monitored the blockchain for preimages during a force close. Eclair would only check for HTLCs that existed in its local commitment transaction — its own current version of the channel’s state. The code incorrectly assumed this local state would always contain a complete list of all possible HTLCs.

However, a malicious channel partner could broadcast an older, but still valid, commitment transaction. This older state could contain an HTLC that the victim’s node had already removed from its own local state. When the attacker claimed this HTLC on-chain with a preimage, the victim’s Eclair node would ignore it because the HTLC wasn’t in its local records, causing the victim to lose the funds.

The original code snippet illustrates the issue:

def extractPreimages(localCommit: LocalCommit, tx: Transaction)(implicit log: LoggingAdapter): Set[(UpdateAddHtlc, ByteVector32)] = {
  // ... (code omitted that extracts htlcSuccess and claimHtlcSuccess preimages from tx)
  val paymentPreimages = (htlcSuccess ++ claimHtlcSuccess).toSet
  paymentPreimages.flatMap { paymentPreimage =>
    // we only consider htlcs in our local commitment, because we only care about outgoing htlcs, which disappear first in the remote commitment
    // if an outgoing htlc is in the remote commitment, then:
    // - either it is in the local commitment (it was never fulfilled)
    // - or we have already received the fulfill and forwarded it upstream
    localCommit.spec.htlcs.collect {
      case OutgoingHtlc(add) if add.paymentHash == sha256(paymentPreimage) => (add, paymentPreimage)
    }
  }
}

The misleading comment in the code suggests this approach is safe, hiding the bug from a casual review.

Stealing HTLCs

An attacker could exploit this bug to steal funds as follows:

  1. The attacker M opens a channel with the victim B, creating the following topology: A -- B -- M.
  2. The attacker routes a payment to themselves along the path A->B->M.
  3. M fails the payment by sending update_fail_htlc followed by commitment_signed. B updates their local commitment and revokes their previous one by sending revoke_and_ack followed by commitment_signed.
    • At this point, M has two valid commitments: one with the HTLC present and one with it removed.
    • Also at this point, B only has one valid commitment with the HTLC already removed.
  4. M force-closes the channel by broadcasting their older commitment transaction where the HTLC still exists.
  5. M claims the HTLC on the blockchain using the payment preimage.
  6. B sees the on-chain transaction but fails to extract the preimage because the corresponding HTLC is missing from its local commitment.
  7. Because B never learned the preimage, it cannot claim the payment from A.

When the time limit expires, A gets a refund, and the victim is left with the loss. The attacker keeps both the original funds and the payment they claimed on-chain.

The Fix

The solution was to update extractPreimages to check for HTLCs across all relevant commitment transactions, including the remote and next-remote commitments, not just the local one.

def extractPreimages(commitment: FullCommitment, tx: Transaction)(implicit log: LoggingAdapter): Set[(UpdateAddHtlc, ByteVector32)] = {
  // ... (code omitted that extracts htlcSuccess and claimHtlcSuccess preimages from tx)
  val paymentPreimages = (htlcSuccess ++ claimHtlcSuccess).toSet
  paymentPreimages.flatMap { paymentPreimage =>
    val paymentHash = sha256(paymentPreimage)
    // We only care about outgoing HTLCs when we're trying to learn a preimage to relay upstream.
    // Note that we may have already relayed the fulfill upstream if we already saw the preimage.
    val fromLocal = commitment.localCommit.spec.htlcs.collect {
      case OutgoingHtlc(add) if add.paymentHash == paymentHash => (add, paymentPreimage)
    }
    // From the remote point of view, those are incoming HTLCs.
    val fromRemote = commitment.remoteCommit.spec.htlcs.collect {
      case IncomingHtlc(add) if add.paymentHash == paymentHash => (add, paymentPreimage)
    }
    val fromNextRemote = commitment.nextRemoteCommit_opt.map(_.commit.spec.htlcs).getOrElse(Set.empty).collect {
      case IncomingHtlc(add) if add.paymentHash == paymentHash => (add, paymentPreimage)
    }
    fromLocal ++ fromRemote ++ fromNextRemote
  }
}

This change ensures that Eclair will correctly identify the HTLC and extract the necessary preimage, even if a malicious partner broadcasts an old channel state. The fix was discreetly included in a larger pull request for splicing and released in Eclair 0.12.0.

Discovery

The vulnerability was discovered accidentally during a discussion with Bastien Teinturier, who asked for a second look at the logic in the extractPreimages function. Upon review, the attack scenario was identified and reported.

Timeline

  • 2025-03-05: Vulnerability reported to Bastien.
  • 2025-03-11: Fix merged and Eclair 0.12.0 released.
  • 2025-03-21: Agreement on public disclosure in six months.
  • 2025-09-23: Public disclosure.

Prevention

In response to the vulnerability report, Bastien sent the following:

This code seems to have been there from the very beginning of eclair, and has not been updated or challenged since then. This is bad, I’m noticing that we lack a lot of unit tests for this kind of scenario, this should have been audited… I’ll spend time next week to check that we have tests for every known type of malicious force-close… Thanks for reporting this, it’s high time we audited that.

As promised, Bastien added a force-close test suite a couple weeks later. Had these tests existed from the start, this vulnerability would have been prevented.

Takeaways

  • More robust testing and auditing of Lightning implementations is badly needed.
  • Users should keep their node software updated.

LND: gossip_timestamp_filter DoS

LND 0.18.2 and below are vulnerable to a denial-of-service (DoS) attack involving repeated gossip requests for the full Lightning Network graph. The attack is trivial to execute and can cause LND to run out of memory (OOM) and crash or hang. You can protect your node by updating to at least LND 0.18.3 or by setting ignore-historical-gossip-filters=true in your node configuration.

Background

To send payments successfully across the Lightning Network, a node generally needs to have an accurate view of the Lightning Network graph. Lightning nodes maintain a local copy of the network graph that they continuously update as they receive channel and node updates from their peers via a gossip protocol.

New nodes and nodes that have been offline for a while need a way to bootstrap their local copy of the network graph. A common way to do this is to send a gossip_timestamp_filter message to some of the node’s peers, requesting that they share all gossip messages they have that are newer than a certain timestamp. Nodes that cooperate with the request will load the matching gossip messages from their databases and send them to the requesting peer.
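
Per BOLT 7, gossip_timestamp_filter (message type 265) carries a chain hash, a first_timestamp, and a timestamp_range; setting first_timestamp to 0 with a maximal range requests all historical gossip. A minimal sketch of the payload encoding (ignoring the encrypted Lightning transport that actually carries it):

package main

import (
  "encoding/binary"
  "fmt"
)

// encodeGossipTimestampFilter builds a BOLT 7 gossip_timestamp_filter
// payload: type (265), chain_hash, first_timestamp, timestamp_range.
func encodeGossipTimestampFilter(chainHash [32]byte, firstTimestamp, timestampRange uint32) []byte {
  msg := make([]byte, 0, 2+32+4+4)
  msg = binary.BigEndian.AppendUint16(msg, 265) // message type
  msg = append(msg, chainHash[:]...)
  msg = binary.BigEndian.AppendUint32(msg, firstTimestamp)
  msg = binary.BigEndian.AppendUint32(msg, timestampRange)
  return msg
}

func main() {
  var chainHash [32]byte // fill in the real chain hash for mainnet
  // first_timestamp=0 with a maximal range requests all historical gossip.
  msg := encodeGossipTimestampFilter(chainHash, 0, ^uint32(0))
  fmt.Printf("%d-byte payload: %x\n", len(msg), msg)
}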

The Vulnerability

By default, LND cooperates with all gossip_timestamp_filter requests. Prior to v0.18.3, LND’s logic to respond to these requests looked roughly like this:

func RespondGossipFilter(filter *GossipTimestampFilter) {
  // Loads ALL matching gossip messages into memory at once.
  gossipMsgs := loadGossipFromDatabase(filter)

  go func() {
    for _, msg := range gossipMsgs {
      // Blocks until the peer acknowledges each message.
      sendToPeerSynchronously(msg)
    }
  }()
}

LND loads all requested messages into memory at the same time, and then sends them one by one to the peer, pausing after each send until the peer acknowledges receiving the message. The peer can specify any filter, including one that requests all historical gossip messages to be sent to them, and LND will happily comply with the request. As a result, LND can load potentially hundreds of thousands of messages into memory for each request. And since LND has no limit on the number of concurrent requests it will handle, memory usage can get out of hand quickly.

The DoS Attack

Exploiting this vulnerability to DoS attack a victim is easy. An attacker simply needs to:

  1. Send lots of gossip_timestamp_filter messages to the victim, setting the timestamp to 0 to request the full graph.
  2. Keep the connection with the victim open by periodically sending pings and slowly ACKing incoming messages.

This causes LND’s memory consumption to grow over time, until an OOM occurs.

Experiment

I carried out this DoS attack against an LND node with 8 GB of RAM and 2 GB of swap. After a few minutes, the node exhausted its RAM and started using swap, and LND’s performance slowed to a crawl. After about 2 hours, LND exhausted the swap as well and the operating system killed the LND process.

The Mitigation

LND 0.18.3 added a global semaphore to limit the number of concurrent gossip_timestamp_filter requests that LND will cooperate with. While this doesn’t fix LND’s excessive memory usage per request, it does limit the global impact on memory usage, which is enough to protect against this DoS attack.
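
The semaphore pattern itself is simple. Here is a sketch of the general technique using a buffered channel (illustrative only; LND’s actual limit and code differ):

package main

import (
  "fmt"
  "time"
)

// sem is a counting semaphore built from a buffered channel. It caps
// the number of gossip filter requests being served concurrently.
var sem = make(chan struct{}, 5)

func respondGossipFilterLimited(id int) {
  select {
  case sem <- struct{}{}: // acquire a slot if one is free
  default:
    fmt.Println("limit reached; ignoring request", id)
    return
  }
  go func() {
    defer func() { <-sem }()           // release the slot when done
    time.Sleep(100 * time.Millisecond) // simulate streaming gossip to the peer
    fmt.Println("finished serving request", id)
  }()
}

func main() {
  for i := 0; i < 8; i++ {
    respondGossipFilterLimited(i) // requests 5-7 are ignored
  }
  time.Sleep(200 * time.Millisecond)
}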

Discovery

This vulnerability was discovered while looking at how LND handles various peer messages.

Timeline

  • 2023-07-13: Vulnerability reported to the LND security mailing list.
  • 2023-12-11: Failed attempt at a stealth mitigation, which could be bypassed by using multiple node IDs when carrying out the attack.
  • 2023-12-11: Emailed the security mailing list again, explaining the problem with the attempted mitigation.
  • 2024-08-27: Proper mitigation merged.
  • 2024-09-12: LND 0.18.3 released containing the fix.
  • 2025-07-22: Gijs gives the OK to disclose publicly.
  • 2025-07-22: Public disclosure.

Prevention

This vulnerability has existed ever since gossip filtering was added to LND in 2018. The pull request that added the feature contained over 5k lines of new code and received only minor review feedback. It seems that no one was thinking adversarially about the new code at that time, and apparently no one has re-evaluated the code since then.

While it’s understandable that developers were more focused on building features and shipping quickly in the early days of the Lightning Network, I think it is long overdue that a shift is made to more careful development. Engineering with security in mind is slower and more difficult, but in the long run it pays dividends in the form of greater user trust and disasters avoided.

Takeaways

  • Update to at least LND 0.18.3 or set ignore-historical-gossip-filters=true to protect your node.
  • More investment in Lightning security is needed.