Topic: Scalability and transaction rate
I’m curious about the developers feelings on scalability. For example, could the system handle a million users, doing say 5 transactions each per day. 5 million transactions per day is roughly 35,000 transactions per 10 minute period?
Is there a bottle neck in propagating 35,000 transactions to a million nodes for block generation? Or has that issue been designed for?
Large transaction rates are a key future bottleneck of the current bitcoin system.
Is this a completely connected network? Or is there some sort of ordered propagation scheme? Didn’t read that section of code yet.
As far as I can tell, it’s a completely connected network. And it kind of has to be for the security to work – the main idea is that an attacker needs nearly 50% of the computing power of the entire network to inject false transaction information into (or delete honest transactions from) the network. If there was a propagation system, you’d only need to to have about 50%+ of whatever the lowest rung is, and then the rung would be duped into transmitting your false information, possibly beating out the other lowest rungs to dupe the next higher up rung, and so on.
The bottleneck was my first reaction when I read about bitcoin. For one thing, the way the blocks are currently set up, it takes linear time with respect to the total number of transactions (and I think more than that in memory) to verify one transaction. So after 10 years of bitcoin, you need to either go through all 10 years worth of transactions, or have a snapshot of all the account balances of every user ever and work from there. And since the blocks form a straight chain, all of those 35,000 transactions need to go into the same block (if block generation continues to average one per ten minutes). There’s currently no way to split that up and have the network work on 350 blocks with 100 transactions each simultaneously, because each block needs the hash of the previous block. That seems particularly silly with the way bitcoin transactions are verified. If everyone waits for one to three blocks to verify a transaction before spending their bitcoins, that means that none of the 35,000 transactions in one ten minute period can depend on each other, and most shouldn’t even depend on the previous block. And as far as I can tell, there’s no way for the network to speed up block creation in response to transaction overload – if the network ever hit the point where transactions were being made faster than they were being hashed into blocks, the problem would snowball until enough people quit bitcoin out of frustration that the number of transactions dropped below block creation speed. And if block creation were to speed up (via difficulty decrease) then the reward for hashing a new block would have to go down to maintain the planned rate of bitcoin introduction. The nodes would have to know that during the Great Bitstorm of August 2013 blocks only netted the creator 2.5 bitcoins – and they’d have to figure it out during the Great Bitstorm of August 2013.
Also, there’s a potential problem if block creation speed were to exceed the speed of information on the network. Chains could be built faster than they would be distributed to the rest of the network, which could mean that two different network segments could work on two different chains simultaneously.
And of course it’s worse than 35,000 transactions / ten minutes – we’d probably see something like 50-60k+ transactions during peak hours (depending on how geographically distributed the network is) and much less during the off hours. There would be a need to dynamically throttle proof-of-work difficulty and coin reward hour-by-hour. Since bitcoin is computer based, there’ll probably eventually be a fair number of automated systems buying and selling bitcoins – a sudden sharp fluctuation in stock markets or currency exchange rates could set those off all at once. (Although the verification system might help with this, since you can’t do microsecond scale trading with bitcoins.)
It’s not a completely connected network in the sense that every client isn’t connected to every other client. If by completely connected you mean there’s x degrees of separation between everyone, then yes, that’s how it is. Data gets “spread” throughout the network as time goes on, not instantaneously (though it is pretty fast).
This was mostly a thought experiment for satoshi, as I understand it. If the userbase continues to grow substantially, then obviously a great amount of work will go into optimizing and revising the algorithm to support more advanced and intelligent distribution.
Thank you both!
I understand that all transactions and all validated blocks need to get to every node. I was just curious about the implementation of the broadcast mechanism.
At worst case a million nodes would each have a million minus one open connections (bad). A better case would have each node broadcast to log(n) nodes who relayed to log(n) node, and to complete the broadcast in log(n) time.
Do you have any details of the thought experiment quantumplication?
Thanks for your analysis Hepatizon! It was about what I was expecting.
However, I don’t think this is quite correct.
It appears that each new transaction has direct links to each prior transaction being used as an input. To validate a transaction you only need to validate the immediate dependancies. You don’t need to validate those dependancies’ dependancies as they were previously validated as part of the block chain. That means closer to constant time rather than linear.
I’m new though so I could be reading it wrong. Please correct me if I did.
The rest of your critique seems well analyzed and a bit frightening. Thanks again!
You’re right – I keep forgetting that the actual coins are tracked rather that just the account balance. So yeah, you only need to go back find whoever last owned the coins in question and verify that they’re the same person sending them. Which is technically linear time in that if the first coin ever was hoarded for ten years you really would need to go through the entire ten year block chain (unless you indexed it, which would be easy as all the coins are numbered) to see who had it – but in practice it would be closer to constant time because most people won’t hoard bitcoins. (Unless they figure out that deflation is all but guaranteed…) For some reason I keep thinking that you would need to verify the account balance of the sender, and that would require verifying the account balance of everyone who has sent the sender money, and everyone who sent them money, and so forth.
Er, by “thought experiment”, I meant “proof of concept”. Satoshi seems to be pretty intelligent, and I don’t think he was naive enough to think he would come up with the final specifications for the algorithm in the first go. In ANY project, there’s always new things that crop up, and a well designed system is not one that accounts for every possibility, it’s one that can adapt and change cleanly.
I didn’t mean anything other than how does the network communicate now?
When I run the network I see 20ish connections. I’m assuming that is not every node running. How does the algorithm decide who talks to who?
There are several know distributed hash table topologies. Was wonder if it was using one of those.
Granted, it could be everyone talks to everyone at the moment, and to be optimized in the future. I’d find that a reasonable answer too.
AFAIK, it chooses some random connections to connect to from the IRC bootstrapper, but I’m not sure.
Actually communicating through federated irc servers would probably work as well as anything. 🙂
Based upon my understanding of economics there is only one way to resolve the scalability problem: the pricing system. Internet bandwidth, CPU time, hard disk space, etc are all valuable resources and expecting individuals to “donate time” or make a profit in “bit coin mining” is to arbitrary and leaves little room for innovation.
Effectively, if you want to post a transaction there must be a fee awarded to the individual who gets it posted in a block. There should also be some kind of fee for propagating the transaction to all of the nodes to begin with. Thus propagation isn’t automatic it comes at a price. Each node that passes along your transaction to other block creators should get a cut of the transaction fee. In this way the network can load balance based upon economic incentive and the “most valuable” transactions get highest priority at a price.
I used to have similar thoughts, but I decided I was wrong. It turns out that BitCoin is peer to peer in the sense that tier-1 internet providers are peer to peer. It is not P2P in the same sense that bittorrent is P2P.
Satoshi points out that he sees the NEED for at most 100,000 nodes verifying transactions and logging blocks. Others would use those nodes in a more client/service fashion. If I was guessing that number I would put it much lower (3<X<100) independent interested parties. Anyone can choose to be one of those X parties, or can choose to validate some or all of the work of those X parties, but only a small number are really REQUIRED.
Because in effect, the entire block list is mostly a distributed notary service. And though that requires rigor, it is not a particularly hard job.
So who would choose to do that job for free?
Knightmb for one. I’m have no doubt that many others like NewLibertyStandard will as well. Why Knightmb specifically? Well he has lots of coins and his business model and personal wealth relies on a well functioning honest system. Without that, his wealth and business simply evaporate. It is altruism and cooperation though self interest.
If you want a real world example consider the internet itself. At its amorphous core are about 7 tier-1 service providers. They are called tier-1 only because each has a mutually self interested “peering agreement” with the other six providers. That means these huge services providers move phenomenal amounts of date back and forth with zero accounting and zero payments among each other. They make all their money by selling their service to NON-tier-1 providers who make their money from selling it to you.
There is nothing to stop anyone from becoming a tier-1 provider. All you have to do is convince the others that it is in their self interest to peer with you. So for a example, Google is thought to pay zero in bandwidth to transmit youtube videos. There is enough traffic both ways, that it is in everyones self-interest to peer with them.
I expect that the same will happen with BitCoin. All the internal accounting and transfers will be free. (with the exclusion of transaction fees meant to avoid spam and mischief). Value will be created at the edges where people exchange bitcoins for something else of value.
Was the original question ever answered? Can the clients handle 35000 transactions per block? What about 100000? What is the bytesize of a transaction in the block? How much bandwidth is needed for the network to push these around in 10 minutes?
I am convinced that bandwidth, disk space, and computation time necessary to distribute and “finalize” a transaction will be prohibitively expensive for micro-payments. Consider for a second that the current banking industry is unable to provide a reasonable micropayment solution that does not involve depositing a reasonable sum and only allowing a withdraw after a reasonable sum has been accumulated.
Besides, 10 minutes is too long to verify that payment is good. It needs to be as fast as swiping a credit card is today.
Thus we need bit-banks that allow instant transfers among members and peer banks. Anyone can open a bit-bank but the system would, by necessity operate on some level of trust. Transfers in and out of the banks and peer-to-peer would still be possible but will be more costly. Thus, a bit bank could make money by enabling transfers cheaper and faster than the swarm with the added risk of trusting the bank. A bank has to maintain trust to make money.
The current system where every user is a network node is not the intended configuration for large scale. That would be like every Usenet user runs their own NNTP server. The design supports letting users just be users. The more burden it is to run a node, the fewer nodes there will be. Those few nodes will be big server farms. The rest will be client nodes that only do transactions and don’t generate.
Quote from: bytemaster on July 28, 2010, 08:59:42 PM
Besides, 10 minutes is too long to verify that payment is good. It needs to be as fast as swiping a credit card is today.
See the snack machine thread, I outline how a payment processor could verify payments well enough, actually really well (much lower fraud rate than credit cards), in something like 10 seconds or less. If you don’t believe me or don’t get it, I don’t have time to try to convince you, sorry.
I fully believe you and came to conclusion you did. I read the snack machine post after posting by earlier comment.