**Why blockchain?**

Before we can look into how blockchain works, we must first understand what problem itâ€™s trying to solve.

Blockchain is a distributed ledger that ensures trust between a number of concurrent users who donâ€™t trust each other, or a third party ledger.

**How does a blockchain work?**

There are several different variations of blockchain protocols, for this Iâ€™ll be using a very simple protocol; way to simple to be deployed, but useful for education.

A blockchain data structure is essentially a linked list of blocks.

Each block contains a number of transactions, the hash of the previous block as well as a proof of work.

About the proof of work.

The proof of work is what is ensuring the integrity of the blockchain. In order to append a block to the chain, you must have found an integer $p^*$ such that the proof of the previous block $p$ concatenated with $p^*$ has a number of leading zeros.

Exactly how many zeros depends on the difficulty scale of the blockchain. On production level blockchains, the difficulty automatically adjusts depending on the number of users who are trying to find a suitable proof, or â€˜mineâ€™ blocks.

Okay, thatâ€™s fine, but what happens when two nodes conflict on what blocks should go into the chain?

If thereâ€™s a conflict about which blocks should go in the chain, and the nodes end up with different chains, the node with the longest chain is the authoritative one.

And thatâ€™s all thereâ€™s to it.

Implementing a blockchain

What we will be implementing:

**Core data structures**

Proof of work Simple networking Consensus algorithm As thereâ€™s about 300 lines of code, I wonâ€™t be showing everything here; just the important bits. You can see all the code on Github.

Core data structures

The main object weâ€™re going to interact with is the Blockchain class which is going to contain the chain itself as well as the uncommitted transactions which will be baked into the next block.

```
class Blockchain():
def __init__(self):
self.current_transactions = []
self.chain = []
# genesis block
self.new_block(previous_hash=1, proof=100)
```

We also have a genesis block. The reason we have that is so that when we add our first real block, we can reference the hash of the previous block which would be the genesis block.

The attributes of the genesis block arenâ€™t important. The only thing that matters is that itâ€™s a valid block.

Creating a new block.

```
class Blockchain():
...
def new_block(self, proof, previous_hash=None):
block = {
'transactions': self.current_transactions,
'proof': proof,
'previous_hash': previous_hash or self.hash(self.chain[-1]),
}
# Reset the current list of transactions
self.current_transactions = []
self.chain.append(block)
```

Adding a new block is relatively simple, but there are a few things we need for it to work properly.

Firstly, we need to embed the transactions which is what we actually care about.

Secondly, we need the proof that this block is valid; that is the integer p^*pâˆ— that makes the hash of the previous proof concatenated with the current proof have a certain number of leading zeros.

Thirdly, we need to include the previous hash. This is so we can check if the chain has been corrupted later on.

As a bit of cleanup, we need to empty the transactions cache so that we donâ€™t commit the same transactions in multiple blocks.

Adding transactions to future blocks is also really easy:

```
class Blockchain():
...
def new_transaction(self, payload):
self.current_transactions.append(payload)
```

Where the payload is whatever content you want to store in your blockchain; whether it is actual financial transactions, posts on a social network, or pledges to sell oneâ€™s soul; it doesnâ€™t matter.

**Proof of work**

Proof of work is what ensures that nobody tampers with the blockchain; atleast without having access to more than 50% of the resources on the network.

```
class Blockchain():
...
def proof_of_work(self, last_proof):
# use random guessing to enable concurrent operation
proof = random.randint(1, 2**32)
while self._valid_proof(last_proof, proof) is False:
proof += 1
return proof
def _valid_proof(self, last_proof, proof, leading_zeros=4):
guess = f'{last_proof}{proof}'.encode()
guess_hash = hashlib.sha256(guess).hexdigest()
return guess_hash[:leading_zeros] == "0"*leading_zeros
```

As stated previously, the goal of proof of work is to find an integer, proof, $p^*$ such that $\text{hash}(p\Vert p^*)$ has a certain number of leading zeros. Where $p$ is the proof of the previous block.

Since weâ€™re using a very strong hashing algorithm; sha256, itâ€™s difficult to calculate a $p^*$ such that the hash starts with a number of zeros.

In fact, a viable approach is just to try a bunch of different numbers and hope you will get one right eventually.

We can do this very simply by incrementing $p*$ due to avalanche property of hash algorithms which causes the output to be wildly different even when thereâ€™s just a small change in the input such as incrementing a number.

When we eventually will have multiple nodes working on finding the same $p*$ itâ€™d be silly to have them all start at zero and doing redundant work by checking solutions which are already known not to work.

To combat this, we start the process at a random number between $1$ and $2^{32}$ which is the upper limit for a 32 bit unsigned integer.

**Networking**

Itâ€™d not be very interesting to have a blockchain if thereâ€™s just one node contributing to it.

You have to be a special kind of paranoid if you donâ€™t even trust yourself.

In order to make the blockchain work over the network with multiple nodes, we need a data structure to keep track of all the nodes on the network.

```
class Blockchain():
def __init__(self):
...
self.nodes = set()
```

For this, we simply use a set which solves the problem of duplicate nodes.

We donâ€™t need anything fancy to manage nodes.

```
class Blockchain():
...
def register_node(self, address):
parsed_url = urlparse(address)
self.nodes.add(parsed_url.netloc)
...
@property
def get_nodes(self):
return self.nodes
```

As I said in the beginning, this is very simple networking.

In the Github repository, thereâ€™s also included a lightweight Flask wrapper which exposes an api, so the blockchain is actually (barely) usable.

Consensus algorithm

The consensus algorithm ensures that the nodes agree on which blocks are committed to the chain.

We use a relatively simple algorithm where the longest chain is the authoritative one.

```
class Blockchain():
...
def resolve_conflicts(self):
new_chain = None
# only look for chains longer than current chain
current_length = len(self.chain)
for node in self.nodes:
# uses the Flask API available in the repository
response = requests.get(f'http://{node}/chain')
content = response.json()
if response.status_code == 200:
length = content['length']
chain = content['chain']
# check if other chain is longer than our chain
if length > current_length:
current_length = length
new_chain = chain
# replace chain if other chain is longer
if new_chain:
self.chain = new_chain
return True
return False
```

So the algorithm for resolving any conflicts is very simple; it just checks the chains of the other nodes on the network, and if one of their chains is longer than our own, we replace our own with the longer version.

And thatâ€™s really all thereâ€™s to it.

You now have the core of a functioning blockchain.

It may seem a bit daunting, but once youâ€™ve gotten some time to digest it all, you will likely find that blockchains arenâ€™t that complicated after all.

For the full implementation, check out the Github repository.