index TxOrphanage by wtxid, allow entries with same txid (p2p)

https://github.com/bitcoin/bitcoin/pull/30000

Host: glozow  -  PR author: glozow

Notes

  • An orphan transaction is a transaction with missing inputs. The p2p code uses a TxOrphanage to store orphan transactions, to be reconsidered later if/when its parent transaction(s) are submitted to mempool. There are two ways this can happen:

  • An orphan can be removed in a few different ways:

  • Different transactions can have the same txid but different witnesses, i.e. different wtxids. For example, a same-txid-different-witness transaction can have an invalid signature (and thus be invalid) or a larger witness (but same fee and thus lower feerate).

  • Prior to this PR, the TxOrphanage is indexed by Txid and, when considering a new transaction in AddTx, immediately fails if the new transaction’s txid matches that of an existing entry.

    • TxOrphanage::HaveTx takes a GenTxid to query the data structure by either txid or wtxid.

    • HaveTx is primarily called by AlreadyHaveTx which also accepts a GenTxid.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

  2. Why would we want to allow multiple transactions with the same txid to exist in the TxOrphanage at the same time? What kind of situation does this prevent?

  3. What are some examples of same-txid-different-witness orphans? (Bonus points if you can write a test case in the functional test for your example).

  4. Let’s consider the effects of only allowing 1 entry per txid. What happens if a malicious peer sends us a mutated version of the orphan transaction, where the parent is not low feerate? What needs to happen for us to end up accepting this child to mempool? (There are multiple answers).

  5. Let’s consider the effects if we have a 1p1c package (where the parent is low feerate and must be submitted with its child). What needs to happen for us to end up accepting the correct parent+child package to mempool?

  6. Instead of allowing multiple transactions with the same txid (where we are obviously wasting some space on a version we will not accept), should we allow a transaction to replace an existing entry in the TxOrphanage? What would be the requirements for replacement?

  7. Where in the code do we check whether the orphanage contains a transaction? Is the query done by wtxid, txid, or both? (Hint: there are at least 5).

  8. This PR removes the ability to query the orphanage by txid, since the TxOrphanage no longer has an index by txid. Is that okay, and why or why not?

Meeting Log

  117:01 <glozow> #startmeeting
  217:01 <dergoegge> hi
  317:01 <maxedw_> hi
  417:01 <glozow> Welcome to PR Review Club!
  517:01 <glozow> We're looking at #30000 today: https://bitcoincore.reviews/30000
  617:02 <glozow> I know we posted the notes very very late, but did anybody get a chance to look at the PR or the notes?
  717:02 <angusp> yep
  817:02 <maxedw_> yes
  917:02 <ion-> Very briefly
 1017:02 <glozow> angusp: maxedw_: ion-: ⭐ yay!
 1117:03 <glozow> Let's dive into the questions. Why would we want to allow multiple transactions with the same txid to exist in the TxOrphanage at the same time? What kind of situation does this prevent?
 1217:03 <maxedw_> When a parent comes in that a valid orphaned child can be combined with it to form a package. Prevents an attacker from preventing us getting our package in the mempool and confirmed
 1317:03 <stickies-v> hi
 1417:05 <stickies-v> mostly looked at the PR (well, mostly at TxOrphanage in general)
 1517:05 <angusp> I'm not super sure, my guess is malleability, if a messed-with tx was seen first and in the orphanage and you're indexing by txid, the honest one can't be included
 1617:05 <glozow> maxedw_: great! let's look at that a bit more closely. let's say an attacker has the parent tx and the child tx. what should they do / send to you?
 1717:05 <ion-> If an attacker constructs a malleable transaction to a valid one, and his version is received first?
 1817:05 <glozow> angusp: ion-: yep yep, malleated how?
 1917:06 <angusp> put `[b"garbage"]` in the witness ;) - or tweak the signature
 2017:06 <maxedw_> they should malleate the child and hold off sending the parent?
 2117:07 <angusp> you can't know it's an invalid tx if you've never seen the parent
 2217:07 <glozow> angusp: haha yes, was just about to link to the test as a hint. correct, they change the witness somehow
 2317:07 <ion-> the two transactions having different witness versions?
 2417:07 <ion-> as you say in the pr description?
 2517:08 <glozow> okay so the attacker sent you a malleated version of the child, and nobody has sent you the parent yet. what happens if an honest peer sends you the real child tx?
 2617:08 <angusp> in the current code?
 2717:08 <glozow> angusp: yes
 2817:08 <ion-> it will be rejected i guess
 2917:09 <maxedw_> before pr you wouldn't put it in the orphanage
 3017:09 <glozow> still walking through what exactly we're trying to fix in the PR
 3117:09 <angusp> the line `if (m_orphans.count(hash)) return false;` would prevent the honest/real child tx from being accepted
 3217:09 <glozow> maxedw_: angusp: yes exactly!
 3317:10 <angusp> so then the honest peer would have to rebroadcast the real child later when it's not an orphan? (Or is there a mechanism for my peer to re-request it later?)
 3417:10 <glozow> we'd call `AddTx` here https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4635-L4637 but it'd be dropped at the start of the function
 3517:10 <glozow> angusp: good question
 3617:11 <glozow> TLDR yes you would need to download it again from that peer or from somebody else
 3717:11 <glozow> however let's look at what the code does
 3817:12 <glozow> as you can see, after `AddTx`, we call `ForgetTxHash()` to forget about all the announcements we got for this tx: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4635-L4642
 3917:13 <glozow> which means that we won't download it again until somebody sends us the inv or tx again
 4017:13 <glozow> and if, at that point, we still have the fake orphan in the orphanage, the same thing happens
 4117:14 <glozow> Let's say we never receive the parent transaction from anybody. When will we finally delete the fake tx from orphanage?
 4217:14 <maxedw_> 20 minute timeout?
 4317:14 <angusp> after time / random when orphanage is full?
 4417:14 <maxedw_> or when we know it's invalid
 4517:14 <glozow> maxedw_: angusp: yep!
 4617:14 <ion-> expiration 20 min
 4717:14 <maxedw_> (if it's invalid)
 4817:15 <angusp> > or when we know it's invalid
 4917:15 <angusp> Can we ever know an orphan is invalid?
 5017:15 <glozow> maxedw_: yes, that's correct. but we wouldn't figure that out until we recieved the parent
 5117:15 <glozow> angusp: only if we reconsider it after getting the missing parent(s)
 5217:15 <glozow> let's explore this as well: while we have the malleated child in the orphanage, what happens if we receive the (real) parent?
 5317:16 <maxedw_> low fee or not?
 5417:16 <ion-> Accpet it and invalidate the child
 5517:16 <glozow> maxedw_: let's start with not low fee
 5617:16 <glozow> (well, the answer is the same but the codepaths are slightly different)
 5717:17 <maxedw_> parent accepted, child kicked out?
 5817:18 <maxedw_> was there an example where it was malleated so it was still valid but larger and so fee less?
 5917:18 <glozow> maxedw_: yep. When we accept the parent, we queue up the orphan for reconsideration in `ProcessOrphanTx`: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L3366
 6017:18 <glozow> maxedw_: yeah, so let's answer the other option: what happens if the parent is low feerate (i.e. needs bumping from the child)?
 6117:20 <maxedw_> If it's really low feerate it could get dropped out of the mempool and if it's just too low to get in a block it could be delayed?
 6217:20 <glozow> maxedw_: it's so low feerate it doesn't get accepted to mempool
 6317:21 <maxedw_> then I spose it gets discarded?
 6417:21 <glozow> ok so the parent gets rejected, and we try to submit it with the orphan transaction as a package here: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4669-L4673
 6517:21 <glozow> it's the same result - the child has an invalid signature, so it gets removed from the orphanage
 6617:22 <glozow> and, yes, we discard the parent after
 6717:22 <glozow> Ok! We covered the next question "What are some examples of same-txid-different-witness orphans?", we mentioned a bad signature and a really large witness for a lower feerate
 6817:23 <ion-> segwit v0 compared to v1 maybe?
 6917:24 <maxedw_> can an attacker make a witness larger knowing only the valid witness and transaction?
 7017:24 <ion-> Depending on the sighash used?
 7117:25 <glozow> maxedw_: certainly they can make the witness larger, though they might not make a valid tx
 7217:25 <maxedw_> with P2WSH there could be multiple different WitnessScripts that are possible, not sure if there is something that could be done there
 7317:25 <angusp> The code you linked to here https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4639-L4641 we forget both the txid and wtxid -- is that still OK in the attack case we've covered? You're 'forgetting' the txid which could be malleated
 7417:25 <glozow> in any case, the feerate would be checked before the script
 7517:28 <glozow> angusp: hm. the intention there is "ok we've already downloaded this tx so there's no point in continuing to try to download it"
 7617:28 <glozow> I suppose... anything with this txid is also going to be an orphan
 7717:29 <glozow> And we aren't going to add anything with the same txid to the orphanage anyway
 7817:30 <glozow> So I guess this is consistent with current behavior?
 7917:30 <angusp> yeah, makes sense
 8017:32 <glozow> Ok next question. Instead of allowing multiple transactions with the same txid (where we are obviously wasting some space on a version we will not accept), should we allow a transaction to replace an existing entry in the TxOrphanage?
 8117:32 <glozow> ion-: I don't think the version is in the witness?
 8217:33 <ion-> nope!
 8317:33 <glozow> so that is not a way to change the wtxid without changing the txid
 8417:34 <angusp> Also think no, because you know nothing about the parent tx, you can't really pick which will be valid or not (between an honest tx and a maleated one)
 8517:34 <maxedw_> the only reason I could think to replace would be if it was a higher fee but you would also have to know it was valid which you couldn't do without the parent
 8617:35 <angusp> maxedw_: Higher fee should change the txid because the amounts change?
 8717:36 <maxedw_> smaller witness could do it?
 8817:36 <glozow> angusp: maxedw_: nice, good answers. for "higher fee" you can go by size i suppose
 8917:36 <glozow> (going for feerate not fee)
 9017:37 <glozow> somebody gave a good suggestion to not have duplicate txids from the same peer
 9117:37 <angusp> do orphaned txs not get broadcast to other peers?
 9217:37 <glozow> (because presumably if a peer is going to send you duplicates, they're either replacing the previous one or sending garbage)
 9317:37 <glozow> angusp: no they don't. we only broadcast after we submit to mempool
 9417:38 <angusp> gotcha
 9517:38 <glozow> but other than that, I agree. there's not really a metric we can use to choose one tx over the other
 9617:39 <ion-> Could we use a sometimes change and sometimes not approach?
 9717:40 <glozow> ion-: you mean like flip a coin?
 9817:40 <ion-> yes, to make the attackers work more difficult
 9917:40 <glozow> doesn't that make it easier? an honest peer will only send 1 tx. an attacker can send many, and has a 1/2 chance of displacing the tx each time
10017:41 <glozow> Next question: Where in the code do we check whether the orphanage contains a transaction? Is the query done by wtxid, txid, or both? (Hint: there are at least 5).
10117:42 <maxedw_> TxOrphanage::AddTx - wtxid (txid only used for log messages)
10217:42 <maxedw_> TxOrphanage::EraseTxNoLock - wtxid
10317:42 <maxedw_> TxOrphanage::HaveTx - wtxid
10417:42 <maxedw_> TxOrphanage::GetTxToReconsider - wtxid
10517:42 <maxedw_> TxOrphanage::EraseForBlock - uses outpoint which is txid?
10617:43 <glozow> Yes nice, lots of queries within txorphanage. There are at least 4 more (hint: I'm looking for lines in net_processing.cpp)
10717:44 <angusp> Presumably it was done by both Txid and Wtxid as the existing code also had m_wtxid_to_orphan_it -- I'm not really familiar enough to know where other than doing a code search!
10817:44 <maxedw_> I did think I should have looked in net_processing too!
10917:44 <glozow> :) and to clarify the question, let's talk about queries in the code before this PR
11017:45 <glozow> Hint: you can see that `m_orphanage.HaveTx()` is called in `AlreadyHaveTx` here: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L2298
11117:45 <glozow> Where is `AlreadyHaveTx` called?
11217:46 <maxedw_> PeerManagerImpl::ProcessInvalidTx looks to be one
11317:46 <maxedw_> I checked with the PR..
11417:46 <angusp> 1) When receiving a new tx (/package?) from a peer 2) getting a new block
11517:47 <glozow> angusp: yes on the first one! that ones right here: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4531
11617:47 <glozow> (and we check only by wtxid)
11717:47 <glozow> maxedw_ mentioned the getting a new block one, that's `EraseForBlock`
11817:48 <angusp> 3) If removing rejecting a tx that's a parent of an orphan we have
11917:49 <glozow> Actually to note "uses outpoint which is txid" note that that's the txid of the parent, not the tx in the orphanage
12017:50 <glozow> angusp: aha, i assume you mean this line: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4632
12117:50 <maxedw_> glozow: that makes sense
12217:50 <glozow> angusp: for that (3), is that by txid, wtxid, or both?
12317:52 <glozow> hint: we're looking at parents using the prevouts of the tx
12417:52 <angusp> I think both - or rather, Wtxid if has witness, else txid
12517:52 <angusp> 50/50 haha
12617:53 <glozow> it's by txid, but I'll give you partial credit for "both" because txid == wtxid sometimes
12717:54 <glozow> There's 2 more `AlreadyHaveTx` callsites i'm looking for in net_processing.cpp. Can we find them?
12817:55 <angusp> Hrm, so how does that work when you switch the orphanage to being Wtxid indexed?
12917:55 <glozow> angusp: good question!!
13017:55 <glozow> (that's also Q8, which is the only question left)
13117:56 <glozow> "This PR removes the ability to query the orphanage by txid, since the TxOrphanage no longer has an index by txid. Is that okay, and why or why not?"
13217:56 <angusp> (I can find the other `AlreadyHaveTx` calls but not sure what the code around them is doing!)
13317:56 <glozow> angusp: ok no problem i'll just list them here
13417:57 <glozow> https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4226 is when we receive an `inv` message for a transaction
13517:57 <glozow> and https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L6288 is when we are sending a `getdata` (in response to an `inv`) and we want to make sure that we aren't about to request a tx we already have
13617:58 <glozow> these can be by txid or wtxid. however in the vast majority of cases, we should be getting invs by wtxid and sending getdata by wtxid
13717:59 <glozow> unless we are requesting parents of an orphan :P
13818:00 <glozow> Aho, we are out of time already! lmk your answer to the last question as a review comment on the PR
13918:00 <glozow> #endmeeting