fulmanski.pl: tutorials

Chapter 2

The tools

Initial version: 2024-02-10
Last update: 2024-03-06

Necessity is the mother of invention, but every invention needs tools that allow ideas to emerge and last in the form of something usable and practical.

In this chapter you will see briefly what may be helpful and what is really needed to implement a real blockchain.

Table of contents

Need for dispersion
Details of ownership
- What does it mean to be an owner?
- Concepts you need to prove your rights to something
Blockchain
Summary

Need for dispersion

To make a blockchain operate independently of politics, governments, institutions or a dictator's will, and to make it insensitive to ad hoc interference by those who want to promote their own ideas and views, the key is to get rid of any single point of control. You should spread your system across as many nodes as you can. None of them should control the others. In simple words, you should implement it in a way that lets you use the potential hidden in the wisdom of the crowd discussed in the previous chapter.

Among the advantages of a multicomponent (multi-node) system over a single-node one you may list:

higher computing power,
ability to scale (increase processing power) naturally,
potentially higher reliability,
possible cost reduction.

To be fair, you shouldn't forget that a multi-node system is much more difficult to design and implement. There are many traps and issues, related mostly to communication between nodes. Or rather, I should say: to failures in communication. All the problems begin when communication fails or becomes unreliable. In short, your system should be designed not on the assumption that problems may occur, but on the assumption that they are an inherent element of it, in accordance with the well-known Murphy's law: "Anything that can go wrong will go wrong, and at the worst possible time."

As an example of such a system, with a high degree of distribution, you may want to consider the Internet. Personally I'm not sure if this is a correct analogy. Undoubtedly the Internet consists of many independent nodes and, considered as one system, offers tremendous, scalable computing power with high reliability. On the other hand, it is not focused on one particular goal. Rather, it is a union of multiple "services" with limited range. In a blockchain you assume the existence of many nodes communicating with each other. On the Internet each node communicates with just a few other nodes which offer the services it needs. So from the perspective of one node communication is very limited and does not require any "synchronization": the data you exchange with one social media service do not depend on the data you exchange with another. In a blockchain you have to communicate with many nodes in order to agree on one common version of the truth.

(De)centralized systems

You can design a multi-node system in many different ways, taking various factors into account and preferring some features or functionality over others. One of the crucial choices concerns the way in which the components of the system are organized and related to one another. Roughly speaking, you have a choice between a purely distributed (decentralized) and a centralized architecture.

A decentralized system is a system whose components form a network without any central element of coordination or control. This type of architecture is very often contrasted with centralized systems where, on the contrary, all the components are located around and connected to one central component. The latter, however, is something you want to avoid for the reasons already mentioned.

The major advantages of decentralized systems over centralized systems are:

The functioning of the whole system does not depend on the condition of a central element, so there is no single point of failure. Of course, in a centralized system you can reduce the dependence on a single point by introducing a secondary main node that acts as a backup for the central node and is ready to take over all control responsibilities immediately. However, this introduces more problems than it solves: now all nodes must be synchronized with the main node, and the main node in turn must be constantly synchronized with its backup.
The functioning of the whole system does not strictly depend on the condition of the network connecting its components. Very often the nodes of a decentralized system can work to some limited extent even when some communication failures occur.

The disadvantages of decentralized systems compared to centralized systems are:

High communication and coordination overhead. Because there is no single, well-known reference point acting as a source of truth, a lot of messages must flow between nodes to establish one common version of events, one mutually acceptable form of data.
Higher complexity in terms of algorithms, protocols and coding, which may affect overall system security.

False-decentralized systems

As you can see, a decentralized system is much more desirable, but at the same time much more difficult to design, develop and maintain. This is why you can sometimes find systems that only pretend to be decentralized.

Very often it is said that there is no central node in such systems. But right after that, in small print, there is a note that any node of the system can be chosen as the central node in some kind of voting procedure and then control all the rest. When it fails, another round of democratic elections begins to select the node which will act as the main node. The creators of such systems call them decentralized because no node is dedicated to being the central node: every node can be selected for that role, and nobody knows in advance which one will perform it. However, once a node is selected, the system acts as a purely centralized system.

Worse, in case of a disaster (the very situation for which you favour decentralization over centralization, because it promises that your business will survive) false-decentralized systems will behave like centralized ones and thus may destroy your business. If you ever have any doubts whether a system is decentralized, ask yourself whether you can find a single part of it whose malfunction will disrupt the whole system, at least for the time needed to select a replacement for the unreachable current main node. If the answer is yes, then the system is not decentralized.
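The question above can be turned into a small mechanical check. What follows is only a sketch under my own assumptions: the system is modelled as a set of two-way links between nodes, "disruption" is simplified to "the remaining nodes can no longer all reach each other", and the names is_connected, single_points_of_failure, star and ring are made up for this illustration.

```python
from collections import deque


def is_connected(links, nodes):
    """Return True if every node in `nodes` can reach every other one."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in links.get(current, ()):
            # Only follow links to nodes that are still alive.
            if neighbour in nodes and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == nodes


def single_points_of_failure(links):
    """List the nodes whose failure splits the remaining nodes apart."""
    all_nodes = set(links)
    return sorted(
        node for node in all_nodes
        if not is_connected(links, all_nodes - {node})
    )


# Hypothetical star: every node talks only to "hub".
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
# Hypothetical ring: every node talks to its two neighbours.
ring = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}

print(single_points_of_failure(star))  # ['hub']
print(single_points_of_failure(ring))  # []
```

Keep in mind that passing this check is necessary rather than sufficient: a system with an elected main node may look well connected and still be centralized in the way it makes decisions.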

Peer-to-peer systems

According to the Cambridge Dictionary, the noun peer means a person who is the same age or has the same social position or the same abilities as other people in a group [CD_peer]. By analogy, peer-to-peer systems consist of individual, independent components (nodes) with equal rights and roles regardless of their real processing power, storage capacity, network bandwidth, geographical location or any other physical or logical properties. Although nodes may differ with respect to the resources they contribute, all of them have the same functional capabilities and responsibilities, and all of them may be at the same time both suppliers and consumers of the resources they share.

If all nodes are equal, then any interactions between them are carried out directly: there is no need, and to tell the truth no way, to designate a middleman node. As a consequence you can expect shorter processing times and lower costs of operation. However, you should keep in mind that removing the middleman (central node) is considered by many a serious threat, because they lose control over the transferred data.

The peer-to-peer model is important to you because it does not favour the rich over the poor. In particular, no one can take control of the system only because they have a powerful (however this term is understood) node. The only option (if possible at all) is to control more than half of all nodes.

To conclude this section, you can state that one of the tools you need is a distributed, decentralized, peer-to-peer system.

You need a distributed, decentralized, peer-to-peer system.

You can find more about distributed systems in [tan2007].

Details of ownership

You probably understand intuitively what ownership is. Computers don't have intuition, so you have to instruct them precisely, algorithmically, what to do in order to confirm and manage ownership.

What does it mean to be an owner?

Ask yourself the above question. What is the answer? I would say that for me to be an owner means that I can prove my identity and show proof that a uniquely identified object is assigned to me. So I need to be able to:

confirm the owner's identity,
confirm the object's identity,
confirm the mapping of the owner to the object.

In a broader sense, to decide about ownership you should think about the whole spectrum of activities: not only confirming but also managing, assigning, revoking, etc.

Concepts you need to prove your rights to something

To be able to complete the steps listed above you need a set of tools that let you perform what, at the level of the system designer or programmer, is known under the following names:

identification,
authentication,
and authorization.

These terms denote three related but very different security concepts, or components of secure systems. I don't expect you to be a security expert, but you should distinguish them to understand their roles, especially since their common usage differs from their precise technical meaning and thus distorts their importance for system security.

Identification means claiming to be someone or something. I can say I am Piotr Fulmański (which is true) or I might as well say I am Elton John (which is not true). I can say that the watch I have on my hand is a Garmin Fenix 5 (which is true) or I might as well say it is an iWatch Ultra 2 (which is not true). A claim plays only an informative role and does not prove anything; it may be true as well as false. It simply narrows the area of search (consideration) to a specific object so that you can focus on it in further processing.

Authentication means proving that an object really is what it is claimed to be. In real life this process is complex and object-dependent. To authenticate a person you use different tools than to authenticate a watch.

In the digital world to authenticate means to provide, or to show, some secret data that only you may know: passwords, PINs, one-time verification codes. It sounds simple but generates a new set of problems. If you present a secret, it is not a secret any more. So how can you show that you know a secret without revealing it? You do this with the help of cryptography.
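One simple way to show that you know a secret without sending it is a challenge-response exchange. The sketch below is my illustration, not a description of how any particular blockchain does it: it assumes both sides already share the secret, and the function names and the sample secret are hypothetical. The verifier sends a fresh random challenge, and the prover answers with a keyed hash (HMAC) of that challenge, so the secret itself never travels over the network.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Verifier side: a fresh random value, used only once."""
    return secrets.token_bytes(32)


def respond(secret, challenge):
    """Prover side: combine the secret with the challenge via HMAC-SHA256."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def verify(secret, challenge, response):
    """Verifier side: recompute the expected answer, compare in constant time."""
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


shared_secret = b"correct horse battery staple"  # hypothetical secret
challenge = make_challenge()
answer = respond(shared_secret, challenge)

print(verify(shared_secret, challenge, answer))                        # True
print(verify(shared_secret, challenge, respond(b"guess", challenge)))  # False
```

The weak point of this sketch is that the verifier must know the secret too. Public-key digital signatures remove that requirement, because the verifier needs only a public key, but they are not part of the Python standard library, so I leave them out here.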

Authorization means getting access to something on the basis of a previously authenticated identity. This process seems simple, as it involves checking whether the identified and authenticated person (object) has been assigned rights to the identified object it is trying to use.

Technically it can be done as a simple table with the owner in the first column, the assigned object in the second column and the right to this object in the third column.

The three concepts above are required to prove your rights to something. To prove your ownership you need to identify yourself and the object, as well as authenticate yourself and the object, so that others know which objects are involved in the ownership relation. Exercising ownership furthermore requires your authorization, to ensure that you perform only legal actions, that is, actions assigned to you and possible to execute on the previously positively identified object.
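A minimal sketch of such a table could look as follows. The owners, the object identifier and the names of the rights are hypothetical, and I assume that identification and authentication have already happened before the table is consulted.

```python
# Each row of the table: (owner, object, right). All values are hypothetical.
RIGHTS = {
    ("alice", "watch-001", "transfer"),
    ("alice", "watch-001", "read"),
    ("bob", "watch-001", "read"),
}


def is_authorized(owner, obj, right):
    """Return True if the table holds a row granting `right` on `obj` to `owner`."""
    return (owner, obj, right) in RIGHTS


print(is_authorized("alice", "watch-001", "transfer"))  # True
print(is_authorized("bob", "watch-001", "transfer"))    # False
```

Anything that is not explicitly listed in the table is refused, which is the safer default for this kind of check.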

Blockchain

Applying wisdom of the crowd

The problem of having a perfect ledger as the single source of truth can probably be solved in many different ways. One of them, which is the topic of our discussion, imitates the procedure used in court hearings. Applying it to the use of a ledger for maintaining ownership (proving, checking, changing, etc.) results in the following strategy:

A single ledger, even a well-secured one, can be forged. As in court, even a trustworthy witness can always be bribed. Because it is much more difficult to bribe many witnesses, it should be much more difficult to forge many ledgers. For this reason you should use a purely distributed peer-to-peer system of many ledgers, to minimize the chances of forging "it" when some of its nodes are compromised. In such a case the question is what "it" means: which register among the many possible ones is correct? Well, as in real life, the (single) truth is the one on which the majority of nodes agree. Because there are many parallel registers, we can agree that only the part which is exactly the same on most of the nodes is true. If most of the nodes certify the correctness of some block of data, you treat them as witnesses of the truth, as in a real-life court; the more witnesses you have, the more credible the version of events they present.
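The "majority of witnesses" idea can be sketched in a few lines. This is only an illustration under my own assumptions: every node holds its ledger as a list of text entries, all nodes are reachable and counted equally, and the names fingerprint, majority_ledger and the sample entries are hypothetical. It is not a real consensus algorithm.

```python
import hashlib
from collections import Counter


def fingerprint(ledger):
    """Hash a ledger (a list of text entries) into one comparable value."""
    digest = hashlib.sha256()
    for entry in ledger:
        data = entry.encode("utf-8")
        # The length prefix keeps the boundaries between entries unambiguous.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def majority_ledger(ledgers_by_node):
    """Return the ledger held by more than half of the nodes, or None."""
    if not ledgers_by_node:
        return None
    votes = Counter(fingerprint(ledger) for ledger in ledgers_by_node.values())
    winner, count = votes.most_common(1)[0]
    if count * 2 <= len(ledgers_by_node):
        return None  # no version is backed by a strict majority
    for ledger in ledgers_by_node.values():
        if fingerprint(ledger) == winner:
            return ledger


honest = ["alice owns watch", "alice gives watch to bob"]
forged = ["alice owns watch", "alice gives watch to mallory"]
nodes = {"n1": honest, "n2": honest, "n3": forged, "n4": honest}

print(majority_ledger(nodes))  # ['alice owns watch', 'alice gives watch to bob']
```

A single forged copy on n3 changes nothing here. To change the verdict, a forger would have to control more than half of the nodes.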

Problem

The main problem with a distributed peer-to-peer system of ledgers is how to maintain integrity and achieve trust when the number of participants (nodes) is not known and changes dynamically, and their reliability and trustworthiness are unknown.

Blockchain is just a method to achieve and maintain integrity in a purely distributed peer-to-peer system of ledgers. It's a tool whose existence is justified by a real need. If humanity finds something better, blockchain will probably disappear.

You should understand integrity as the ability to make true statements about ownership, both to prove it and, if you are the lawful owner, to transfer it to others.

Conclusion

At this moment you know enough to put all the information together and neatly conclude what blockchain is, regardless of the actual implementation or particular technical solutions.

Blockchain is a distributed, peer-to-peer, register-like data structure used to maintain ownership (particularly to make true statements about ownership) or, in a broader sense, to preserve the order and sequence of events.
The set of nodes constantly evolves. Nodes can appear and disappear at random times and their number varies over time.
Each node of the system stores the whole register and has full access to all its data.
The register is open for reading to everyone.
Writing is possible through cryptographic functions. Although it is not possible to identify the entity involved in a transaction, you can always prove that you took part in a transaction (if you really did).
You need a special blockchain algorithm to get the final verdict on what is true, based on the various parallel versions of the register, each stored on an individual node.
Consistency of blockchain data refers to the ability (possibility) to find a part of the data that is identical on most of the nodes.
The mechanism of finding a consistent block of data is known as consensus and will be discussed in detail in one of the subsequent chapters.
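To make the notion of consistency from the list above a bit more tangible, here is a rough sketch that looks for the longest leading part of the register that is identical on most of the nodes. It is only my illustration of the definition, not the consensus mechanism itself: I assume each register is a plain list of entries, and the name agreed_prefix and the sample data are made up.

```python
from collections import Counter


def agreed_prefix(registers):
    """Return the longest run of leading entries shared by most registers."""
    agreed = []
    candidates = list(registers)  # registers still matching the agreed prefix
    position = 0
    while True:
        entries = [r[position] for r in candidates if len(r) > position]
        if not entries:
            return agreed
        entry, count = Counter(entries).most_common(1)[0]
        if count * 2 <= len(registers):
            return agreed  # no strict majority agrees from this position on
        agreed.append(entry)
        candidates = [
            r for r in candidates
            if len(r) > position and r[position] == entry
        ]
        position += 1


registers = [
    ["tx1", "tx2", "tx3"],
    ["tx1", "tx2", "tx3", "tx4"],
    ["tx1", "tx2", "txX"],
    ["tx1", "tx2"],
]
print(agreed_prefix(registers))  # ['tx1', 'tx2']
```

In this sample only two of the four nodes hold tx3, which is not more than half, so the consistent part ends after tx2.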

Summary

In this chapter I have tried to explain what the most indispensable "tools" you need to implement blockchain are, and why they are so important. I have discussed distribution as a tool that gives you robustness and independence from any centralization attempts. Another tool discussed here is cryptography, which lets you preserve the integrity of data and their invariance: blockchain stores what you intended to store, and when you read your data back you get what you wrote before.

Finally, you got a concise list of blockchain features that sets the general frame in which you will operate in the subsequent chapters. This frame tells you what you can (and cannot) do with blockchain.