Open p2p connections to nodes that listen on non-default ports (p2p)

https://github.com/bitcoin/bitcoin/pull/23542

Host: mzumsande  -  PR author: vasild

The PR branch HEAD was 36ee76d at the time of this review club meeting.

Notes

  • Bitcoin Core uses port 8333 as the default port on mainnet (18333 on testnet). This means that nodes will listen on the default port for incoming connections, unless another port is specified using the -port or -bind startup options.

  • However, nodes that listen on non-standard ports are unlikely to receive incoming connections, because the automatic connection logic disfavors these addresses heavily.

  • In preparation for this PR, PR #23306 changed the address manager behavior such that an addrman entry is now defined by both IP and port, so that multiple entries with different ports and the same IP can coexist.

  • This PR changes the logic for automatic outgoing connections by dropping the preferential treatment for the default port. It doesn’t treat all ports as equal though: A list of “bad ports” is introduced that are still disfavored for outgoing connections.

  • Later commits also adjust the address gossip relay logic to include the port of an address in a hash that is used to determine which peers to relay an address to.
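A minimal sketch of the outgoing-connection rule described above, combined with the detail from the meeting log that a "bad port" address is still tried once 50 attempts have found nothing else. This is a Python illustration under stated assumptions, not the code in net.cpp: the names pick_outbound, is_bad_port and BAD_PORTS are hypothetical, and the port set is a one-entry placeholder (port 22 is the example named in the meeting; the PR's list is longer).

```python
from typing import Iterable, Optional, Tuple

Address = Tuple[str, int]  # (ip, port)

# Placeholder subset (assumption): only port 22 (ssh), the example from the
# meeting. The list introduced by the PR contains many more ports.
BAD_PORTS = {22}

# Threshold mentioned in the meeting: after this many tries, bad ports are
# no longer skipped.
MAX_TRIES_BEFORE_BAD_PORTS = 50


def is_bad_port(port: int) -> bool:
    """Return True if the port is on the (placeholder) bad-port list."""
    return port in BAD_PORTS


def pick_outbound(candidates: Iterable[Address]) -> Optional[Address]:
    """Pick the first acceptable candidate for an automatic connection.

    The default port gets no special treatment. Addresses on bad ports are
    skipped during the first tries and accepted only as a last resort.
    """
    tries = 0
    for addr in candidates:
        tries += 1
        _, port = addr
        if tries < MAX_TRIES_BEFORE_BAD_PORTS and is_bad_port(port):
            continue  # disfavored, keep looking
        return addr
    return None
```

With this rule, an address on port 8444 is treated exactly like one on port 8333, while an address on port 22 is only chosen if roughly 50 candidates in a row offered nothing better.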

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What were the historical reasons for the preferential treatment of the default port?

  3. What are the benefits of removing this preferential treatment with this PR?

  4. Before this change, automatic connections to peers listening on non-default ports were discouraged, but not impossible. Under what circumstances would a node still connect to such a peer?

  5. After this PR, the default port still plays a role in Bitcoin Core. Where is it still used? Should it be a long-term goal to abandon the notion of a default port entirely?

  6. The PR introduces a list of “bad ports” that was taken from internet browsers. Do you agree with having a list like this in general? Are there any reasons to deviate from the list used by browsers?

  7. What is the reason for allowing callers to pass salts to CServiceHash and then initializing it with CServiceHash(0, 0) in commit d0abce9?

Meeting Log

17:00 #startmeeting

17:00 <svav> Hi
17:00 <lightlike> hi
17:00 <kouloumos> hi
17:00 <stickies-v> hi!
17:00 <glozow> hi
17:00 <willcl_ark> Hi
17:00 <ziggie> hi
17:00 <lightlike> Today's Review Club will be about PR 23542 ("Open p2p connections to nodes that listen on non-default ports")
17:01 <michaelfolkson> hi
17:01 <lightlike> See https://bitcoincore.reviews/23542 for the notes
17:01 <dergoegge> hi
17:01 <lightlike> Is anyone here for the first time?
17:01 <sipa> hi
17:01 <jnewbery> hi
17:02 <larryruane> hi
17:02 <schmidty> hi
17:02 <sipa> this meeting seems to be hi-ly attended
17:02 <lightlike> OK - who got the chance to review this week's PR (y/n)?
17:02 <bitplebpaul> y
17:02 <glozow> y
17:02 <svav> n but I read the notes and looked at the code
17:03 <stickies-v> y
17:03 <dergoegge> n
17:03 <willcl_ark> light y
17:03 <sipa> I read through it in an earlier iteration.
17:03 <ziggie> n
17:03 <kouloumos> n
17:03 <effexzi> Hi every1
17:04 <lightlike> and what's your impression? Concept ACK / NACK?
17:04 <emzy> hi
17:04 <emzy> n
17:05 <Kaizen_Kintsugi_> hello
17:05 <stickies-v> tACK 36ee76d - properly being able to use different ports seems a great idea to make the network more resilient
17:05 <svav> Concept ACK - it seems a good idea in terms of not being able to easily shut down the network
17:05 <sipa> concept ack
17:05 <Kaizen_Kintsugi_> y
17:05 <michaelfolkson> I think if you are Concept ACK of #23306 you have to be a Concept ACK of this PR. And #23306 is merged :)
17:06 <lightlike> michaelfolkson: yes, there's a point to that.
17:06 <sipa> Agreed (especially as I wrote 23306, :p)
17:06 <lightlike> ok, lots of concept ACKs - let's move to the first q:
17:06 <lightlike> What were the historical reasons for the preferential treatment of the default port?
17:06 <svav> to prevent the Bitcoin P2P network from being leveraged to perform a DoS attack on other services, if their IP/port would get rumoured.
17:07 <ziggie> how does addrman disfavour other ports right now, does disfavour mean no chance to get a connection to another port than 8333, or is there a way ?
17:08 <glozow> in the past, i imagine addrs were also gossiped more freely, i.e. with fewer rate limits?
17:08 <bitplebpaul92> ziggie i believe there are ports that won't ever be attempted, like port 22 (ssh)
17:08 <sipa> ziggie: addrman actually doesn't care about ports; it's the outgoing connection logic that favors standard ports
17:08 <lightlike> ziggie: addrman doesn't disfavor ports - the connection logic in net.h does.
17:08 <stickies-v> based on sipa 's answer I found somewhere, another reason could be to make it harder for an attacker to fill people's addrtable with many IP/port combinations of the same node, which could potentially be used for eclipse attack
17:08 <lightlike> sorry, net.cpp
17:08 <sipa> svav: That's the folklore reason, not actually the historical reason ;)
17:08 <svav> Doh!
17:08 <stickies-v> oh - I think mine is folklore too
17:09 <glozow> yeah, sipa's description on #23306
17:09 <bitplebpaul92> by folklore do you mean a sort of revisionist history?
17:09 <sipa> Though I don't know to what extent this is public. I recently saw some (alleged) leaked satoshi emails that justified this preference, and it only mentioned the concern about eclipse attacking (before that term existed).
17:09 <bitplebpaul92> I've never come across the term folklore in this context
17:10 <lightlike> maybe it was also about reputational concerns? Bad publicity if bitcoin nodes connect to you on various ports, even if this is not DOS-worthy?
17:10 <michaelfolkson> Ok so Satoshi's concern was eclipse attacking but he/she was wrong to be concerned about that
17:10 <sipa> The explanation of worrying about non-Bitcoin services being DoS'ed by Bitcoin... I don't know where it came from.
17:10 <sipa> michaelfolkson: I don't think he was! The pre-addrman IP address table was certainly vulnerable to that.
17:11 <sipa> (but to many other concerns too)
17:11 <michaelfolkson> Wrong to be concerned about that with regards to supporting different ports
17:11 <emzy> Maybe only have an allowed range of ports like >1024
17:12 <lightlike> emzy: in a way, that is what the blacklist does
17:12 <sipa> emzy: The PR does introduce a "bad ports" concept
17:12 <glozow> I was trying to figure out if this was a common concern - your software being used to DoS services and thus you ban certain ports that correspond to those services, and it seems like it's indeed a thing? https://jazzy.id.au/2012/08/23/why_does_chrome_consider_some_ports_unsafe.html
17:12 <sipa> lightlike: Yeah, the reputation aspect is a weak but real concern perhaps - that's also the reason why the PR has this bad ports concept.
17:12 <emzy> You see me unprepared :)
17:13 <michaelfolkson> It does seem simpler and easier to remember if everyone uses the same port. UX
17:13 <michaelfolkson> (Not that that is a strong enough rationale to demand everyone does)
17:13 <lightlike> michaelfolkson: yes, but there are also some advantages if not everyone does, which leads to the next question:
17:13 <lightlike> What are the benefits of removing this preferential treatment with this PR?
17:14 <svav> It’s not obvious that a Bitcoin node is running on an IP address.
17:14 <glozow> Hopefully over time we move towards a healthy balance of 8333 and non-8333 nodes to make Bitcoin connection traffic a bit less easily identifiable?
17:14 <stickies-v> it allows people that can't/don't want to listen on 8333 to still receive incoming connections, increasing the number of available nodes to connect to for the entire network
17:15 <svav> What is the answer to Q2 if it's not prevention of using Bitcoin for DoS attacks?
17:15 <glozow> stickies-v: ah indeed, if there are currently a bunch of under-utilized nodes listening on non-8333 ports
17:16 <glozow> are there? o.O
17:16 <lightlike> glozow: I think that incoming Bitcoin connection traffic would still be identifiable without too much effort. But blocking it is not as easy as just blocking a single port.
17:16 <stickies-v> glozow is the 8333/n-8333 a healthiness indicator for the network though? I think the network doesn't really care about the balance itself - it just allows more people to participate?
17:16 <bitplebpaul92> can ISPs ban specific ports?
17:16 <stickies-v> glozow I'm not sure about numbers, but I'd imagine there are / could be in the future?
17:16 <sipa> svav: The historical reason, as far as I know, was concerns about someone being able to listen on 1000s of ports on the same machine, rumouring all of those as separate addrs, and thereby sort of cheaply eclipse attacking the network.
17:17 <svav> sipa: ok thanks
17:17 <sipa> (and it doesn't apply anyone since addrman, which buckets based on source range of IP anyway; it doesn't treat multiple ports on the same IP any different anymore from multiple IPs in the same range)
17:17 <sipa> *anymore
17:17 <willcl_ark> With bitcoin traffic so easily identifiable on the wire I do wonder how much benefit it can bring to someone being censored at e.g. ISP level on port 8333 though... However if people have a simple local block on the port, I suppose it can help a little
17:17 <glozow> stickies-v: er, i probably shouldn't have used the word "healthy," just like... varied
17:17 <lightlike> not sure if ISPs are in the business of doing this, but local network administrators (e.g. in public networks) certainly can and do.
17:18 <glozow> lightlike: so theoretically an ISP or local network admin drops stuff that's going to a 8333 port?
17:18 <sipa> Also don't forget that ISPs aren't free from government intervention/regulation.
17:18 <willcl_ark> Yeah, much easier for a gov to say "block port 8333" than the vague "block all bitcoin traffic"
17:19 <willcl_ark> ...but perhaps not that much harder (without fully encrypted traffic)
17:19 <sipa> Costs matter.
17:19 <sipa> And BIP324 (v2 p2p transport with opportunistic encryption) will make it more expensive still.
17:20 <emzy> I can think of an easy eclipse attack with configurable ports. Run 10 bitcoind on the same random port and filter the internet connection of the victim to that port.
17:20 <willcl_ark> Was just trying to look up where that got to :)
17:20 <ziggie> how are tor/2pp/ip4/ip6 connection favoured for incoming connection, are they regarded with the same importance ?
17:20 <stickies-v> and perception matters. It's much easier to claim a network needs to close certain ports for security reasons (without specifically targeting use cases), than to specifically target bitcoin packets (which you have to be specific about)?
17:20 <ziggie> *i2p
17:21 <sipa> all of tor and all of i2p are treated as one or a few "network groups".
17:21 <ziggie> sipa thanks
17:21 <larryruane> basic question... doesn't ability to connect to alternate ports already exist because it's used by the functional tests (regtest)? Does this PR enable such for non-regtest? (seems like it's doing a lot more than that)
17:21 <sipa> ipv4 and ipv6 consist of many network groups; if you use asmap, those groups are the AS numbers of providers
17:21 <bitplebpaul92> Kazakhstan and the ISP & government world has been interesting re. the protests there
17:22 <glozow> larryruane: oh, interesting. but those are manual connections right?
17:22 <sipa> larryruane: It's not that functionality to connect to custom ports doesn't exist (it has always existed), and for manual connections you can do whatever you like. The change is that this PR stops the *automatic* outgoing connection selection mechanism from *disfavoring* non-8333.
17:22 <lightlike> larryruane: the ability was always there (and it is possible to connect to other ports via manual connections) it's just the automatic connections, where we wouldn't connect (although we technically could)
17:22 <svav> Someone explain this please - If you don't have a standard port for Bitcoin, isn't this going to make it difficult for the network to function, because no-one knows a standard port that will be used??
17:22 <stickies-v> larryruane ThreadOpenConnections allows you to specify manual addresses to connect to which comes before this non-default port logic: https://github.com/bitcoin/bitcoin/blob/1e8aa02ec5cc2819c67ef40a7573c4b23a4c11cc/src/net.cpp#L1877
17:23 <larryruane> thanks!
17:23 <sipa> svav: That's the bootstrap problem, and it's an annoying problem, but we do have some mechanisms for it. It isn't particularly made harder by not having a standard port though.
17:23 <jnewbery> sipa: I believe the tor/i2p network group is based on the first 4 bits of the address so each address is in one of 16 netgroups
17:24 <lightlike> svav: if you are on a non-standard port, you also advertise your own address with it in addr gossip relay, so others will know to connect to you on that port.
17:24 <sipa> jnewbery: that sounds right
17:24 <stickies-v> svav I think another way to look at it is that the IP address is as unknown as the port, so if you know one you should be able to know the other through the same communication?
17:25 <svav> ok I see
17:25 <sipa> In IPv4 it's kind of possible to literally try to connect to every IP address on a particular port (certain botnets have done that), which would be a... very naive way of bootstrapping that's technically made impossible by using random ports. On the other hand... don't do that.
17:26 <lightlike> but this means if you for some reason chose a new random port every second day, you'll likely not get many incoming connections - so that would not be advised
17:26 <sipa> stickies-v: One thing is that DNS seeds can only convey IP addresses, not ports. But there are alternatives (DNS seeds also can't relay torv3).
17:27 <lightlike> great, moving on to the next question:
17:27 <stickies-v> sipa oh right lightlike did comment that on the PR. Would a straightforward solution then not be to upgrade the seeders to relay ports too? Is there anything technically complicating that?
17:28 <lightlike> Before this change, automatic connections to peers listening on non-default ports were discouraged, but not impossible. Under what circumstances would a node still connect to such a peer?
17:28 <emzy> So the dns seeds would be only good for default port nodes. I think not many people would change the default port. So no problem.
17:28 <sipa> stickies-v: That would be very hard, actually, because the DNS system isn't designed for resolving ports, only IPs. But there are alternatives to using DNS in the first place.
17:28 <glozow> after 50 invalid addresses?
17:29 <stickies-v> ah right I didn't think of DNS limitations, thanks. interesting
17:29 <sipa> It's the Domain name system, not the Service name system.
17:30 <lightlike> glozow: correct! and this behavior is kept the same for the "bad port" list, so if nothing else works for 50 tries, we'll also try a "bad port"
17:30 <emzy> DNS seeds should be still good enough to get some good nodes.
17:30 <ziggie> can I somehow dump all my known IP addresses with their specific ports with bitcoin-cli ?
17:30 <bitplebpaul92> +1 ziggie
17:31 <glozow> so our treatment of "bad ports" is treated how we used to treat non-8333, and non-bad non-8333 and 8333 is treated the same as how we used to treat 8333
17:31 <lightlike> it would just be bad if DNS nodes listed IPs that are listening on non-default ports (so that other nodes would try to connect to them on the default port and fail). But I think this is not the case with the current seeder software.
17:31 <jnewbery> ziggie: getnodeaddresses 0
17:33 <lightlike> glozow: yes, that sounds right!
17:33 <lightlike> moving on: After this PR, the default port still plays a role in bitcoin core. Where is it still used?
17:33 <sipa> From my bitcoin-seeder software, in db.h:
17:33 <sipa> bool IsGood() const {
17:33 <sipa> if (ip.GetPort() != GetDefaultPort()) return false;
17:34 <willcl_ark> As our default listen port?
17:35 <glozow> Guess: if no port is provided, we connect using the default?
17:35 <stickies-v> lightlike I think it also defines the default port of the rpc?
17:35 <svav> Is it still used to help new nodes get onto the network somehow?
17:35 <lightlike> willcl_ark, glozow: yes! that is not changing with this PR.
17:36 <lightlike> stickies-v: the rpc default is different from the p2p default port.
17:36 * stickies-v is clearly not an RPC poweruser
17:36 <glozow> is that this? https://github.com/bitcoin/bitcoin/blob/1e8aa02ec5cc2819c67ef40a7573c4b23a4c11cc/src/net.cpp#L427-L428
17:38 <lightlike> glozow: I think that code just gives you the default p2p port, depending on what you connect to (a string, or an IP address)
17:39 <lightlike> but as mentioned before, the default port is also added to the DNS seeder results we get, to be able to connect to these addresses and save them to addrman
17:40 <lightlike> related q: Should it be a long-term goal to abandon the notion of a default port entirely?
17:41 <bitplebpaul92> i would think no
17:41 <glozow> mmmaybe not? We have different default ports for testnet vs mainnet, would it be bad if we didn't have those distinctions?
17:41 <emzy> lightlike: I think that will make DNS seeds not work anymore (ipv4/ipv6).
17:42 <lightlike> emzy: yes, I agree, at the very least we'd need an alternative to the DNS seeds before doing something like this.
17:42 <stickies-v> I'm not sure there's a need for that - it wouldn't really be user friendly to make everyone (including people who don't know what a port is) define which port they want to use?
17:42 <bitplebpaul92> if a node couldn't find peers, a default port would still be useful as a last-resort?
17:42 <sipa> @glozow Network magic will still make inter-network connections fail immediately anyway.
17:42 <glozow> sipa: aha, thanks
17:43 <stickies-v> hmm it could just be a random port instead of user defined port of course. Still not sure there's a clear benefit to that
17:44 <kouloumos> sipa mentioned that there are alternatives to using DNS, what those could be?
17:44 <lightlike> bitplebpaul92: a port alone won't help you find peers as a last resort, you'll also need an address from a peer.
17:44 <bitplebpaul92> right
17:45 <willcl_ark> We could switch to BBS :P
17:45 <sipa> or back to IRC seeding
17:45 <svav> Do we know a reason why this PR (and 23306) was felt necessary at this stage? Is it just to make Bitcoin more resilient? Is there any reason to feel default ports make it vulnerable?
17:47 <sipa> It's just a terrible gratuitous privacy leak today.
17:47 <sipa> Using port 8333 is yelling "bitcoin node here!!!"
17:47 <lightlike> svav: I think that one reason is that all further attempts to obfuscate bitcoin traffic are a bit moot if everything just goes over 8333
17:47 <sipa> And it's practically impossible to use any other port due to the discouragement rule.
17:48 <lightlike> next q: The PR introduces a list of “bad ports” that was taken from internet browsers. Do you agree with having a list like this in general? Are there any reasons to deviate from the list used by browsers?
17:48 <stickies-v> have we had any/significant amount of reports from people unable to use port 8333 or is that more of a preventative thing? difficult to measure of course, just wondering how big of a role that played in the prioritization
17:48 <sipa> And, after realizing how little of a change the previous PR was (the one permitting multiple ports per ip), there was little reason not to go for it.
17:48 <svav> OK I see
17:49 <bitplebpaul92> lightlike the rationale of avoiding ssh ports and other ports where attempted communications might result in a banned IP address makes sense to me
17:49 <sipa> @stickies-v Nobody even tries. It requires a custom config that is equivalent to "I only want scrapers/spy node connections".
17:49 <sipa> And it isn't that people necessarily actively want to run on a different port.
17:50 <sipa> It's us that should be working on reducing the friction for doing so.
17:50 <lightlike> I agree. there is issue https://github.com/bitcoin/bitcoin/issues/24284 with a suggestion to also include ports used by browsers (which are obviously not on the browser's lists) that may make sense
17:50 <svav> Re security leak, you can see 8333 means Bitcoin node here, but once you know that, are you then easily able to further compromise the node? I mean is it easy to start reading node traffic?
17:50 <michaelfolkson> Are there other protocols using particular ports who are going to be annoyed if a few Bitcoin users use those ports?
17:51 <sipa> @svav If you're under an authoritarian regime, you may not want people to know you're running a Bitcoin node in the first place
17:52 <sipa> That in itself is an issue already, even ignoring what's possible with that information.
17:52 <sipa> @svav And yes, reading traffic is trivial.
17:52 <michaelfolkson> Not sure how one tries to discourage other protocols from using "your" protocol's port. Other than loudly trying to claim it as your protocol's port
17:52 <sipa> (but even doing that at scale may be costly to attackers)
17:52 <baraclese> I use a socks5 proxy for my bitcoin node at home
17:52 <lightlike> michaelfolkson: there may be webadmins (e.g. in organisations) that monitor specific ports and may become annoyed.
17:53 <sipa> The thing with 80 and 443 (http and https) is that they are very commonly "public services" that always get connections from everywhere.
17:53 <sipa> That's not true for 22 (ssh) for example.
17:54 <sipa> Also, I think we want to keep the possibility of disguising Bitcoin P2P traffic as https traffic in the future.
17:54 <lightlike> so 80 and 443 may be particularly good choices to run a bitcoin node, because the traffic isn't looked into deeply anyway if everyone uses them?
17:54 <sipa> (after BIP324 encryption)
17:55 <sipa> Quite possibly.
17:55 <sipa> It'd be even better if the traffic actually can't be distinguished by third parties from actual https traffic.
17:56 <lightlike> moving on to the last question, which is about the second part of the PR (addr relay):
17:56 <lightlike> What is the reason for allowing callers to pass salts to CServiceHash and then initializing it with CServiceHash(0, 0) in commit d0abce9?
17:57 <stickies-v> we want the randomness to be deterministic, so by passing the same (0, 0) salts the same IP:port should lead to the same hash consistently
17:57 <glozow> We always use the same salt so that, if we get the same address again (within the 24hr time slot), we relay it to the same "random" peers, so there's no advantage to sending us the same address twice
17:58 <bitplebpaul92> after 24 hours what changes? a nonce?
17:59 <lightlike> stickies-v glozow : exactly! If we'd use a different salt, we'd send a given address to different peers in that 24h window, which is not what we want.
18:00 <glozow> bitplebpaul92: the hash changes, which means the peers we select also changes
18:00 <lightlike> alright, thanks for participating everyone!
18:00 <lightlike> #endmeeting
18:00 <willcl_ark> Thanks!
18:00 <bitplebpaul92> ah
18:00 <glozow> we're using the hash to select 1-2 random peers to forward the address
18:00 <glozow> thanks lightlike!
18:00 <bitplebpaul92> thanks lightlike and everyone
18:00 <emzy> Thanks lightlike and all!
18:00 <jnewbery> thanks lightlike! Great meeting!
18:00 <ziggie> Thanks lightlike for hosting
18:00 <sipa> thanks @lightlike!
18:00 <stickies-v> bitplebpaul92 we use integer division on https://github.com/vasild/bitcoin/blob/36ee76d1afbb278500fc8aa01606ec933b52c17d/src/net_processing.cpp#L1781 which causes the hashed message to change only every 24 hours
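
The address relay answer at the end of the log can be illustrated with a short Python sketch. It is not the code in net_processing.cpp, and it makes several assumptions: keyed-less BLAKE2b stands in for the SipHash-based hashers Bitcoin Core uses, the names service_hash, relay_targets and node_secret are hypothetical, and the 24-hour window is computed as a plain integer division of the current time. What it does preserve is the reasoning from the discussion: the address hash uses fixed salts and includes the port, and the time enters only through integer division, so the same IP:port maps to the same one or two peers for a whole window.

```python
import hashlib
import struct
from typing import List

DAY = 24 * 60 * 60  # seconds in the relay window


def _h64(*parts: bytes) -> int:
    """Hash length-prefixed parts to a 64-bit integer (stand-in for SipHash)."""
    h = hashlib.blake2b(digest_size=8)
    for p in parts:
        h.update(struct.pack("<I", len(p)))
        h.update(p)
    return int.from_bytes(h.digest(), "little")


def service_hash(ip: str, port: int) -> int:
    """Deterministic hash of IP and port, with fixed salts like CServiceHash(0, 0)."""
    salts = struct.pack("<QQ", 0, 0)
    return _h64(salts, ip.encode(), struct.pack("<H", port))


def relay_targets(ip: str, port: int, peer_ids: List[int], now: int,
                  n: int = 2,
                  node_secret: bytes = b"hypothetical-per-node-secret") -> List[int]:
    """Choose up to n peers to forward an address to.

    The choice depends on the address hash and on the current 24-hour
    window, so repeating the same address within a window selects the
    same peers again.
    """
    addr_hash = service_hash(ip, port)
    window = now // DAY  # integer division: changes once per 24 hours

    def score(peer_id: int) -> int:
        return _h64(node_secret,
                    struct.pack("<Q", addr_hash),
                    struct.pack("<Q", window),
                    struct.pack("<q", peer_id))

    # Highest scores win; ties are practically impossible with 64-bit hashes.
    return sorted(peer_ids, key=score, reverse=True)[:n]
```

Calling relay_targets twice for the same address within one window returns the same peers, so resending an address gains an attacker nothing; had the salts been random per call, each repeat could reach different peers.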