# But how does bitcoin actually work?

## What does it mean to have bitcoin?

Many people have now heard of Bitcoin: that it is a fully digital currency, with no government to issue it and no banks needed to manage accounts and verify transactions. Yet many people, including some Bitcoin holders, can't fully answer this question. So we are going to talk through it step by step until you have a clear picture of how it works, describing the process as if you were thinking of it for the first time, like creating your own cryptocurrency.

Warning: from now on it can get a bit technical.

## Digital Ledger

To start off, imagine you and your friends are using an Excel file to keep track of every time someone gives money to, or receives money from, someone else. Your file may look something like this:

| Ledger | | | |
|---|---|---|---|
| Gera | pays | Juan | $20 |
| Juan | pays | Dami | $40 |

Because you trust your friends, there is no need for a central authority or any third party to validate the transactions; the trust within the group does that job. But if you want to scale your ledger, and you trust its members less and less as it grows, you need a way to solve the trust issue. This is where cryptography enters the picture.

Ledger - Trust + Cryptography = Cryptocurrency

A cryptocurrency, like any other digital payment system, allows you to send and receive value. The difference between a cryptocurrency and more conventional systems lies in how transactions are validated. With cryptocurrencies there is no bank and no central authority validating transactions; instead, there is decentralized, trustless verification based on cryptography.

If you keep evolving your ledger and make it publicly available, for example on a website, anyone would be able to go and add new lines. This is a big problem: if anyone can add an entry to the ledger, what prevents someone, let’s say Gera, from adding a line saying `Juan pays Gera $100` without Juan actually knowing about it or validating it?

This is where the first bit of cryptography comes in: Digital Signatures.

## Digital Signatures

Like a handwritten signature, the idea here is that Juan should be able to add something next to a transaction that proves he has seen it and approved it. It should also be “impossible” for anyone else to forge his signature. For a digital system, a handwritten signature would be not only impractical but also very insecure, as anyone could simply copy the image and use it to sign other transactions. So how do we prevent forgeries?

The idea is that anyone who wants to use our ledger needs to generate what’s called a public key (pk) / private key (sk, or secret key) pair, each of which is an array of bits, lots of them.

In the real world, your signature looks the same no matter what document you are signing. A digital signature, on the other hand, is much stronger because it changes for different documents/messages. The signature is also an array of bits, commonly 256 bits, and altering the message even slightly will completely change the signature on that message.

Formally: `sign(message, sk) = signature`

It is important to mention here that, since your sk (private key) is what you sign documents with, it should never be shared. It’s a secret, and it must remain one.

In addition to the `sign` function, we need a way to check whether a given signature is correct. For that we use a verification function, which takes the message, the signature, and the public key, and accepts or rejects the signature.

Formally: `verify(message, signature, pk) = T/F`

We won’t focus on the implementation details of these functions. There are algorithms and libraries that solve this problem and make it completely infeasible to find a valid signature if you don’t know the secret key. The only way you could possibly do it would be to brute-force the signature, and good luck with that! The signature is 256 bits long, so there are 2^256 combinations… You can do the math…
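To make the `sign` / `verify` interface a bit more concrete, here is a toy sketch using only Python's standard library. Note the swap: Bitcoin itself uses a different scheme (ECDSA), while this sketch uses Lamport one-time signatures, chosen only because they can be built from SHA-256 alone. The helper names (`generate_keypair`, `message_bits`) are made up for illustration, and a Lamport key should only ever sign a single message, so treat this as a demonstration of the idea rather than something to use in practice.

```python
import hashlib
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Toy Lamport key pair: the secret key is 256 pairs of random values,
    the public key is the SHA-256 hash of each of those values."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(sha256(zero), sha256(one)) for zero, one in sk]
    return sk, pk


def message_bits(message: bytes):
    """The 256 bits of the message's SHA-256 digest."""
    digest = int.from_bytes(sha256(message), "big")
    return [(digest >> i) & 1 for i in range(256)]


def sign(message: bytes, sk):
    """Reveal one secret value per digest bit; which one depends on the bit."""
    return [sk[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(message: bytes, signature, pk) -> bool:
    """Accept only if every revealed value hashes to the matching public value."""
    if len(signature) != 256:
        return False
    return all(
        sha256(signature[i]) == pk[i][bit]
        for i, bit in enumerate(message_bits(message))
    )


sk, pk = generate_keypair()
signature = sign(b"Juan pays Gera $100", sk)

print(verify(b"Juan pays Gera $100", signature, pk))   # genuine message
print(verify(b"Juan pays Gera $1000", signature, pk))  # altered message
```

The first check accepts and the second rejects: changing the message changes its digest bits, so the revealed values no longer line up with the public key.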

Let’s see how our Ledger looks now:

| Ledger | | | | |
|---|---|---|---|---|
| Gera | pays | Juan | $20 | 001100001…. |
| Gera | pays | Juan | $20 | 001100001…. |

If you look closely at our new ledger, you may notice a problem. Even though nobody can work out the signature for a new, different transaction, anyone can copy the exact same line, signature included, and it would still be valid: all the information in the transaction is the same, so the signature doesn’t change. So how do we prevent this? Easy… we record each transaction with a unique identifier that is part of the signed message, so that even otherwise identical transactions need different signatures. Something like this:

| Ledger | | | | | |
|---|---|---|---|---|---|
| 1 | Gera | pays | Juan | $20 | 001100001…. |
| 2 | Gera | pays | Juan | $20 | 100100101…. |

## Proof of Work

Let’s go a bit off-topic now to introduce a few new things. First, let’s define a hash function. A hash function is a function that takes any kind of message or file and outputs a string of bits with a fixed length, like 256 bits. The interesting thing is that this function always produces the exact same result for the same content, but any slight alteration in the content results in a completely different output. In fact, if you use a cryptographic hash function like SHA256, you not only get the above, but you can also be pretty much certain that no one will be able to invert the function and recover the input from the output. Interestingly enough, there is no mathematical proof that the function is hard to compute in the reverse direction, though no one has been able to do so.
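Both properties are easy to see for yourself. Here is a small sketch using `hashlib` from Python's standard library; the helper name `sha256_hex` and the sample messages are just for illustration.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a text message as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


a = sha256_hex("Gera pays Juan $20")
b = sha256_hex("Gera pays Juan $21")  # a single character changed

print(a)
print(b)
print(a == sha256_hex("Gera pays Juan $20"))  # True: same input, same output
print(len(a) * 4)  # 256: each hex character encodes 4 bits
```

The two digests share no visible pattern even though the inputs differ by one character.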

Now that we have clarified that, let’s look into how we can prove that a particular list of transactions is associated with a large amount of computational effort.

Imagine someone shows you a list of transactions, and they say “I found a special number so that when you put this number at the end of the list of transactions and apply SHA256 to the entire thing, the first 30 bits of the output are zeros”.

How hard do you think it was for them to find that number? For a random message, the probability that the hash happens to start with 30 successive zeros is 1 / 2^30, or about 1 in a billion. And because SHA256 is a cryptographic hash function, the only way to find the special number is by guessing and checking.

The special number is what is called a “Proof of Work”: finding it takes a lot of computation, while checking it takes a single hash. Importantly, all this work is intrinsically tied to that list of transactions. Remember that any small change to the input data completely changes the output of the hash function, so the whole search for the special number would have to be done again.
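The guess-and-check search can be sketched in a few lines of Python. This is a minimal illustration, not how Bitcoin encodes its blocks: the function names, the way the number is appended to the text, and the difficulty of 16 zero bits (instead of 30, so the search finishes quickly) are all assumptions made for the example.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def check_nonce(transactions: str, nonce: int, difficulty_bits: int) -> bool:
    """Verifying a proof of work costs a single hash."""
    digest = hashlib.sha256(f"{transactions}{nonce}".encode("utf-8")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def find_nonce(transactions: str, difficulty_bits: int) -> int:
    """Finding one means trying numbers until the hash starts with enough zeros."""
    nonce = 0
    while not check_nonce(transactions, nonce, difficulty_bits):
        nonce += 1
    return nonce


ledger = "1 Gera pays Juan $20\n2 Gera pays Juan $20\n"
nonce = find_nonce(ledger, difficulty_bits=16)

print(nonce)
print(check_nonce(ledger, nonce, difficulty_bits=16))  # True
```

Each extra required zero bit doubles the expected number of guesses, which is why 30 bits already means around a billion attempts.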

## Distributed Ledger

So far we have worked under the assumption that the ledger lives in some public place, like a website or a publicly available spreadsheet. But this approach has many problems, including: who owns the website? Who controls the rules for adding new lines?

In order to avoid falling into a central authority we will let everyone keep their own copy of the ledger. But how would this work?

As before, when someone wants to make a transaction, like “Gera pays Juan $100”, they don’t go to a website to make the change. Instead, the transaction is broadcast to the world for people to hear and record in their own copies of the ledger.

This concept is beautiful, but it creates a lot of problems. How can we make sure everyone has the same copy of the ledger? What happens if someone misses one of the transactions? How can we make sure we maintain the order of the transactions? This is not an easy problem, and it’s the one addressed in the original Bitcoin paper.

At a very high level, Bitcoin proposes to trust whichever ledger has the most computational work put into it, and this idea is now at the heart not only of Bitcoin but of many other cryptocurrencies. The way we define computational work is through the Proof of Work we explained before.

Let’s look in detail at how this works, going back to our distributed ledger example. Everyone is broadcasting transactions, and we want everyone to agree on what the correct ledger really is. For that, we organize the ledger into a series of small chunks called blocks, where each block consists of a list of transactions together with a proof of work (our special number).

In the same way that a transaction is only valid if it’s signed by the sender, a block is only valid if it has a proof of work. In addition, so that the blocks stay in the right order, each block contains the hash of the previous block. That way, if you change a block or move it, its hash changes, which changes the next block… and so on, and so on. You would then have to redo the proof of work for each of those blocks, which requires a tremendous amount of computational power.

Because the blocks are chained together like this, instead of calling it a ledger, it is commonly called a “Blockchain”.
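Here is a minimal sketch of that chaining in Python. It is only an illustration under several assumptions: blocks are plain dictionaries serialized with `json`, the difficulty is a tiny 12 zero bits so mining is instant, and the names (`block_hash`, `mine_block`, `is_valid_chain`) are invented for the example. Bitcoin's real block format is different.

```python
import hashlib
import json

DIFFICULTY_BITS = 12  # assumption: tiny difficulty so the example runs fast


def block_hash(block: dict) -> str:
    """Hash the whole block: transactions, previous hash, and nonce."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def has_proof_of_work(block: dict) -> bool:
    """True if the block's hash starts with DIFFICULTY_BITS zero bits."""
    return int(block_hash(block), 16) < 2 ** (256 - DIFFICULTY_BITS)


def mine_block(transactions: list, previous_hash: str) -> dict:
    """Try nonces until the block has a valid proof of work."""
    block = {"transactions": transactions, "previous_hash": previous_hash, "nonce": 0}
    while not has_proof_of_work(block):
        block["nonce"] += 1
    return block


def is_valid_chain(chain: list) -> bool:
    """Every block needs a proof of work and must point at the block before it."""
    for index, block in enumerate(chain):
        if not has_proof_of_work(block):
            return False
        if index > 0 and block["previous_hash"] != block_hash(chain[index - 1]):
            return False
    return True


first = mine_block(["Gera pays Juan $20"], previous_hash="0" * 64)
second = mine_block(["Juan pays Dami $40"], previous_hash=block_hash(first))
chain = [first, second]
print(is_valid_chain(chain))  # True

# Rewriting history changes the first block's hash, so the link from the
# second block no longer matches (and the old nonce almost surely fails too).
first["transactions"][0] = "Gera pays Juan $2000"
print(is_valid_chain(chain))  # False
```

To make the tampered chain valid again, the attacker would have to re-mine the first block and then every block after it.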

As part of the design of our Blockchain, we will allow anyone in the world to be a “block creator” or “miner”. A miner is someone who listens for transactions being broadcast, collects them into a block, and then does the work of finding the special number that gives the block its proof of work. After they find it, they broadcast the new block. Every time a block is created, the miner is allowed to include a special transaction at the top that gives them X amount of cryptocurrency out of thin air. This is called the block reward, and it’s how new cryptocurrency enters the ledger. One special note about this transaction is that it has no sender, and thus does not have to be signed.

Now that we have introduced blocks, using our blockchain gets simpler. Instead of listening for individual transactions, someone using our system can just listen for new blocks being broadcast by miners and update their own copy of the blockchain.

This already reduces the complexity of putting the chain together. However, we could face a scenario where we hear two distinct blockchains with conflicting transaction histories. In this case, you always keep the longest chain, the one with the most work put into it, and ignore the shorter one. If there is a tie, just wait until you hear of an additional block.
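The selection rule itself is tiny. A minimal sketch, assuming that both chains have already been validated and that every block required the same amount of work, so that the number of blocks stands in for total work (the function name `choose_chain` and the block labels are placeholders):

```python
def choose_chain(current: list, candidate: list) -> list:
    """Keep the chain with more blocks; on a tie, stick with what we have
    and wait for the next block to break it."""
    if len(candidate) > len(current):
        return candidate
    return current


my_chain = ["block 1", "block 2", "block 3a"]
heard_chain = ["block 1", "block 2", "block 3b", "block 4b"]

my_chain = choose_chain(my_chain, heard_chain)
print(my_chain[-1])  # block 4b
```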

When everyone follows these rules, we have a trustworthy system for decentralized consensus. But is it safe?

Think of the scenario where someone wants to introduce a fraudulent block. Let’s say Gera creates a block containing the transaction “Gera pays Juan $40” and adds it to his copy of the blockchain, but instead of broadcasting it to everyone, he sends it only to Juan. Everyone except Gera and Juan still thinks that Gera owns those $40.

If Gera wants to keep this going, he will have to work harder than the rest of the mining network combined, generating new blocks fast enough that Juan never discards his history in favor of another, longer chain. This is practically impossible unless he controls more than half of the network’s computing power.

Notice this means that you shouldn’t necessarily trust a new block that you hear immediately, instead, you should wait for several new blocks to be added on top of it. If you still haven’t heard of any longer blockchains, then you can go ahead and trust that block.

## Particularities of Bitcoin

So far we have covered the main ideas of a distributed ledger and of cryptocurrencies in general, which is more or less how the Bitcoin protocol works. Bitcoin has a few interesting particularities, though. For example, it periodically changes the number of zeros required for the proof of work, raising or lowering the difficulty so that it takes, on average, around 10 minutes to find a block. Bitcoin also changes the reward each miner gets for creating a block, cutting it in half every 210,000 blocks, which works out to roughly every four years. This is the event everyone was talking about recently: the halving.

Another interesting point is that since rewards keep shrinking, there is a maximum number of bitcoins that can ever exist: 21 million. So will miners stop earning money? Not necessarily. Each transaction can also include a transaction fee, which goes to the miner who includes it in a block.
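The 21 million cap falls out of simple arithmetic. The sketch below adds up all the block rewards, using three facts about Bitcoin that this post has not spelled out: the reward started at 50 bitcoins, it halves every 210,000 blocks, and amounts are tracked in whole units of one hundred-millionth of a bitcoin (a "satoshi"), so each halving rounds down.

```python
SATOSHIS_PER_BITCOIN = 100_000_000
BLOCKS_PER_HALVING = 210_000

reward = 50 * SATOSHIS_PER_BITCOIN  # initial block reward, in satoshis
total = 0

# Keep halving (rounding down) until the reward reaches zero.
while reward > 0:
    total += BLOCKS_PER_HALVING * reward
    reward //= 2

print(total / SATOSHIS_PER_BITCOIN)  # just under 21 million
```

Without the rounding, this is the geometric series 210,000 × 50 × (1 + 1/2 + 1/4 + …) = 210,000 × 100 = 21,000,000; the rounding down is why the real total lands slightly below it.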

## Next…

I hope I was able to clarify how blockchain and Bitcoin work under the hood, and that you have learned as much as I did when researching the topic. In the next series of posts related to crypto, we will design and implement our own blockchain in Python, starting with the very basics and then adding cryptography and a client application, until we have a fully functional demo.

Thanks so much for reading!