Bitcoin blockchain. Search. Concealment. Financial analysis.
Yury Marchenko
Full version on: https://btcunpack.com/
At first glance, a blockchain is an impassable swamp: it seems impossible to understand it, let alone to track anyone in this frenzied stream of figures that mean nothing to an ordinary person. Most people who discuss blockchains have, to put it mildly, a limited notion of them. The interests of serious people in a blockchain fall into three categories: search, concealment and financial analysis. To demonstrate, we will use one of the largest and most popular blockchains, the Bitcoin blockchain.
In order to be able to proceed with any of the tasks described above, we need to bring our Blockchain into a clear and convenient representation. This description is not intended to explain the basic mechanisms of the Bitcoin blockchain system: you can read about them in specialized literature and in the description of the system itself.
The first technical task we face is to transform the entire blockchain into a form that our programs can conveniently work with. As a result of this task, we obtain ordered data in which transactions reference one another. We will call the result the BLOCK frame.
Figure 1. Viewing a block recorded in the blockchain. The example uses the block № 400123.
Columns in the table:
The sequence number of the transaction in the block, the name of the transaction, the time of the transaction, the number of incoming property items (transaction inputs), the number of outgoing property items (transaction outputs), the amount of BTC entering the transaction, the amount of BTC leaving the transaction, and the commission paid to the miner who recorded the block.
The table allows for sorting. It is also possible to switch to the previous and next blocks, or to enter the number of the block of interest in the Block Number field at the top of the page.
Figure 2. View of transaction № 15 in block 400123.
The 3rd column in the table of incoming property items shows the previous operation with this property item (where it appeared).
The 5th column in the table of outgoing property items indicates the next operation, where the property item appears among the incoming items of a later transaction. If the time is not specified, the property item has not been used yet.
Additional information: It is worth noting that out of about 700 million transactions recorded in the Bitcoin blockchain, I was unable to identify the incoming property items for about 1000 transactions. If you come across such a transaction in the visualizer, the system will inform you that data is missing. It is very strange that such collisions have arisen in the blockchain, but identifying their causes is not my goal.
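The mutual addressing between transactions can be sketched in a few lines. This is a minimal illustration and not the system's actual code: the classes `Tx` and `Output`, the field `spent_by` and the function `link_outputs` are hypothetical names chosen for the example. Each input refers to an earlier output by transaction name and position; resolving that reference fills in the "next operation" of the output, and a reference that cannot be resolved is reported as missing data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Output:
    address: str
    satoshi: int
    # (txid, input position) of the transaction that later spends this item;
    # None means the property item has not been used yet.
    spent_by: Optional[Tuple[str, int]] = None


@dataclass
class Tx:
    txid: str
    # Each input points at an earlier output: (previous txid, output position).
    inputs: List[Tuple[str, int]] = field(default_factory=list)
    outputs: List[Output] = field(default_factory=list)


def link_outputs(txs: List[Tx]) -> List[Tuple[str, int]]:
    """Fill in `spent_by` for every output and return the unresolved inputs."""
    by_id = {tx.txid: tx for tx in txs}
    missing = []
    for tx in txs:
        for position, (prev_txid, prev_index) in enumerate(tx.inputs):
            prev = by_id.get(prev_txid)
            if prev is None or prev_index >= len(prev.outputs):
                missing.append((tx.txid, position))  # no data for this input
                continue
            prev.outputs[prev_index].spent_by = (tx.txid, position)
    return missing


# Toy data: a mining transaction, a spend of it, and a spend with a lost source.
mined = Tx("t0", outputs=[Output("addr_a", 5_000_000_000)])
spend = Tx("t1", inputs=[("t0", 0)], outputs=[Output("addr_b", 4_999_990_000)])
orphan = Tx("t2", inputs=[("unknown", 0)], outputs=[Output("addr_c", 1_000)])
unresolved = link_outputs([mined, spend, orphan])
```

Here `mined.outputs[0].spent_by` points at `t1`, the output of `t1` stays unspent, and `unresolved` lists the one input whose source could not be found.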
Let us break the operations down by address, sort them by block number and by transaction number within the block, and we get the data for any address. Let us also add a rating of addresses that can be sorted by the number of remaining BTC, by the number of incoming or outgoing transactions, by the total volume that passed through the address, or by whatever else your heart desires. In the end we have an analogue of the sites that let you view transactions from the Bitcoin blockchain. There are many of them.
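A rating of this kind amounts to a per-address aggregation followed by a sort. The sketch below assumes the operations have already been reduced to signed amounts per address; the function name `address_rating` and the column names are illustrative, not taken from the system.

```python
from collections import defaultdict


def address_rating(operations, sort_by="balance"):
    """Aggregate signed operations per address and sort them, largest first.

    `operations` is an iterable of (address, satoshi) pairs: positive amounts
    are incoming, negative amounts are outgoing.
    """
    stats = defaultdict(lambda: {"balance": 0, "n_in": 0, "n_out": 0, "volume": 0})
    for address, satoshi in operations:
        row = stats[address]
        row["balance"] += satoshi
        row["volume"] += abs(satoshi)
        row["n_in" if satoshi > 0 else "n_out"] += 1
    return sorted(stats.items(), key=lambda item: item[1][sort_by], reverse=True)


ops = [
    ("addr_a", 500), ("addr_a", -200),
    ("addr_b", 1_000), ("addr_b", -1_000),
    ("addr_c", 50),
]
by_balance = [address for address, _ in address_rating(ops)]
by_volume = [address for address, _ in address_rating(ops, sort_by="volume")]
```

Sorting by balance puts `addr_a` first, while sorting by volume puts `addr_b` first even though its balance is zero.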
Figure 3. The rating of addresses
Sorting by many columns is available. The details of operations at the address can be viewed by clicking on the address itself in the 10th column.
The rating is opened by clicking on the Rating link at the top of the interface. The system has no separate address view within the BLOCK frame: the address view has been improved and made more informative. An example of this visualization is given below.
There is one little problem: with any of these sites, as with the resulting BLOCK frame, we are practically unable to achieve any of the goals we have set. Search is possible only manually, which is ineffective. The reliability of concealment cannot be verified. Financial analysis stays at the level of monitoring changes in large wallets, and it is impossible to tell whether a change is real or merely “dust in the eyes”. In addition, a representation by blocks is thoroughly unnatural to the human eye.
To remove the last obstacle, we need to assemble the BLOCK frame differently, sorting it not by block number but by time. We decided to do it. We look at our data from the blockchain and see that transactions carry no time stamps, only the time at which the block was recorded. This affects search and the assessment of concealment to a lesser extent, but for financial analytics it is simply a disaster. Where can we find transaction times? They are not recorded in the blockchain. Some sites show time stamps, but either the accuracy of the data is doubtful or the source does not want to share it voluntarily, and in any case no such service has time stamps going back to 2009.
Figure 4. TIME-frame. Viewing of hour 2018-10-13-01 UTC
The first column contains a link to a full view of the transaction, taking into account the TIME frame and the other elements of the system, which will be described below.
The second column indicates the block number and the transaction number in the BLOCK frame, and viewing the transaction is similar to the example, as in Figure 2.
The table allows for different types of sorting and fast navigation in the TIME frame.
What should we do? We should immediately start recording, on our own blockchain node, the time at which each new transaction enters the mempool. This is not a perfect solution: it tells us nothing about the past, only the times of transactions seen since we started recording, but it is at least something. The transformed data model gets the name TIME frame. Having obtained transaction times, we also need second-by-second ticks of the Bitcoin exchange rate, which financial analysis so desires. This problem can be solved with the help of exchanges, which will be happy to share such data through their APIs.
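The two ingredients of the TIME frame can be sketched as follows. This is only an illustration under stated assumptions: `FirstSeenLog` and `price_at` are hypothetical names, the way transaction names are obtained from the node is left out, and the tick list stands in for whatever an exchange API returns. A transaction that was never observed falls back to the time of its block.

```python
import bisect


class FirstSeenLog:
    """Remembers the first moment each transaction was observed."""

    def __init__(self):
        self._seen = {}

    def observe(self, txid, unix_time):
        # Keep the first observation; later sightings do not overwrite it.
        self._seen.setdefault(txid, unix_time)

    def time_of(self, txid, block_time):
        # Transactions from before the recording started only have block time.
        return self._seen.get(txid, block_time)


def price_at(ticks, unix_time):
    """Return the last price at or before `unix_time`, or None if there is none.

    `ticks` is a list of (unix_time, price) pairs sorted by time.
    """
    index = bisect.bisect_right(ticks, (unix_time, float("inf")))
    return ticks[index - 1][1] if index else None


log = FirstSeenLog()
log.observe("tx1", 1539392400)
log.observe("tx1", 1539392460)  # seen again a minute later, ignored
ticks = [(1539392399, 6250.0), (1539392400, 6251.5), (1539392405, 6249.0)]
tx1_price = price_at(ticks, log.time_of("tx1", block_time=1539393000))
```

A binary search over sorted ticks keeps each lookup cheap even with second-by-second data covering several years.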
By linking the time of transactions with the exchange rate, it seems to us that we get the opportunity to understand at what price the bitcoins were bought and sold. We can even build wonderful graphs and feast our eyes upon them a little. After this we begin to understand slowly: this is not at all what is needed for financial analysis, because not every transaction is really a sale or a purchase. And if we remember the fact that most addresses in a blockchain have a lifetime of only 2 transactions, then consideration at the address level does not give us anything. Don't get upset, let's move on.
An additional comment: Everyone has repeatedly seen on tabloid sites information about the number of bitcoins participating in a block. Now we understand that it is useless. If we take accurate measurements of the time at which transactions appeared in the blockchain, this may be a completely different matter. At least, a fairer one.
The direction of development is prompted by the blockchain itself. Most readers have probably seen huge transactions in the blockchain that consist of a large number of different incoming addresses. Such transactions are created by stores, large exchanges, and notorious mixers. Mixers use them precisely to confuse the trail of money: they are a minefield for those involved in search, and they give hope of security to those who want to hide the source of their bitcoins or any involvement with them. Consequently, such transactions can be good, in the case of shops, exchanges and casinos, or bad, when they are used by mixers.
Another important property of these transactions is that they are large, so they cannot possibly be made manually. This means that they are formed by some software product. It follows that any two addresses seen among the inputs of one transaction are under the control of one program. Any combination of such addresses forms the ecosystem of addresses of that program. We will call such an association "the basic definition of an ecosystem" (BECO). From then on, addresses that have received a BECO are no longer independent units, and all interaction, including balance accounting, is carried out exclusively at the level of the BECO.
Figure 5. An example of the transaction forming BECO. Transaction № 3809 from the hour 2018-03-13-01 UTC.
Viewing ecosystems is possible only when viewing transactions from a TIME frame. They are not accessible from the BLOCK frame.
In the present example different addresses receive a new BECO which did not exist previously – 2018-03-13-01/3809. For convenience, BECO name by default coincides with the transaction of its occurrence.
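The rule that addresses appearing together among the inputs of one transaction belong to one ecosystem can be sketched with a disjoint-set (union-find) structure. This is a common way to implement such grouping and is an assumption here, not a description of the system's internals; the class name `AddressClusters` and its methods are hypothetical.

```python
class AddressClusters:
    """Groups addresses that were ever used together as transaction inputs."""

    def __init__(self):
        self._parent = {}

    def find(self, address):
        """Return the representative address of the cluster."""
        self._parent.setdefault(address, address)
        root = address
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited address straight at the root.
        while self._parent[address] != root:
            self._parent[address], address = root, self._parent[address]
        return root

    def add_transaction(self, input_addresses):
        """Join all input addresses of one transaction into one cluster."""
        addresses = list(input_addresses)
        for other in addresses[1:]:
            root_a, root_b = self.find(addresses[0]), self.find(other)
            if root_a != root_b:
                self._parent[root_b] = root_a

    def same_cluster(self, a, b):
        return self.find(a) == self.find(b)


clusters = AddressClusters()
clusters.add_transaction(["A", "B"])   # A and B spend together
clusters.add_transaction(["B", "C"])   # C joins through B
clusters.add_transaction(["D"])        # a single input links nothing
```

After these three transactions, `A`, `B` and `C` form one ecosystem although `A` and `C` never appeared in the same transaction, while `D` remains a free address.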
Figure 6. Viewing the address. By way of example, we took one quite big address in the blockchain. 12ib7dApVFvg82TXKycWBNpN8kFyiAN1dr
Clicking on the link in the TX column opens the transaction of interest. Transactions can also be viewed in a compressed form, inside this table, by clicking on the cell of column 2-5, 6...13. An example of a compressed view is shown in Figure 6A.
Figure 6A. A lot of transactions can be opened in a compressed view.
Figure 6B. Graphical representation of the address status and its "effect".
Charts are available for all addresses that have more than 2 transactions.
Ecosystems can be different: some have just two addresses, others have millions. By combining all the addresses in the blockchain into their ecosystems, we get not 700 million address entities but several times fewer: the ecosystems themselves plus the free addresses that have not been combined into any ecosystem. If we are engaged in search or analysis, this is good, but those who want to hide bitcoins are in a bad mood: they have seen a vulnerability and are now trying to determine how good the mixer they used really is.
The sincere belief that using addresses whose life lasts only two transactions increases security has been shaken, because this principle fails against the ecosystem technology. A chain of addresses with a lifetime of two transactions belongs to the category of overflow operations. There are more complex types of overflow in the blockchain, when overflow occurs through large operations, but this does not change the essence of the matter: overflow remains overflow.
Figure 7. Overflow.
You can view any overflow that occurs in the blockchain by clicking on the aECO link in the inputs or outputs. At the output the system will show overflow from this particular property item. At the input the system will independently find the starting point of this overflow and show the entire sequence of this overflow, from the first to the last transaction, show all the addresses involved in the overflow, all the counterparties of the overflow, show in which transactions and in which BECOs the fragments of this property item from this overflow were combined, show the remainder at the overflow addresses and how much commission was paid to the miners. Various types of sorting are possible.
We will upset those who want to hide their bitcoins even more. If two different BECOs are involved in the inputs of one transaction, they are combined. For convenience, the younger BECO becomes part of the older BECO. Where we could not determine who owns a BECO, or where it is impossible to determine, the figures in this article show the ecosystem in a human-readable form encoded according to the TIME frame, namely YYYY-MM-DD-HH/NN, where the first part is the date and hour at which the BECO arose, and NN is the running number of the transaction in the TIME frame.
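The naming scheme and the merge rule can be illustrated as follows. The helper names `beco_name`, `parse_beco_name` and `merge_becos` are hypothetical, and the sketch assumes that "older" means an earlier hour, with the running number breaking ties within the same hour.

```python
from datetime import datetime, timezone


def beco_name(unix_time, running_number):
    """Build a name such as 2018-03-13-01/3809 from a UTC time and a number."""
    hour = datetime.fromtimestamp(unix_time, tz=timezone.utc)
    return f"{hour:%Y-%m-%d-%H}/{running_number}"


def parse_beco_name(name):
    """Split a name into a sortable (hour, running number) pair."""
    hour, number = name.split("/")
    return hour, int(number)


def merge_becos(name_a, name_b):
    """Return (survivor, absorbed): the younger BECO joins the older one."""
    older, younger = sorted([name_a, name_b], key=parse_beco_name)
    return older, younger


moment = datetime(2018, 3, 13, 1, 30, tzinfo=timezone.utc).timestamp()
name = beco_name(moment, 3809)
survivor, absorbed = merge_becos("2013-10-16-13/469", "2013-09-25-23/347")
```

The running number is compared as an integer because it is not zero-padded, so a plain string comparison would place transaction 1000 before transaction 999.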
Now let's look at the addresses that don't have a BECO yet. For everything to run smoothly, we need to assign some ECO to these addresses as well. The fact is that any new address appears in the blockchain only when it receives funds. Even if it existed before, this was not reflected in the blockchain. Even the first Nakamoto address, in block zero, appeared in the same way. If a new address arises from some BECO, it gets an expanded ecosystem (EECO). This does not mean that the BECO of the address will become the same as its EECO, but it is necessary in order to control overflows. Addresses that first appeared in the blockchain through mining operations receive the EECO “mining”. Addresses in the mining EECO may, of course, belong to different miners. You can try to divide them among the miners, but this makes sense only if you track mining pools and are interested in that.
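A minimal sketch of the EECO assignment is given below. It assumes that the EECO label of a new address is simply the name of the ecosystem the funds came from, and that an address keeps the EECO of its first appearance; the function `assign_eeco` and its parameters are hypothetical.

```python
MINING = "mining"


def assign_eeco(eeco_of, output_addresses, source_beco=None, is_mining=False):
    """Give every address that is new to the blockchain an EECO label.

    `eeco_of` maps known addresses to their EECO and is updated in place.
    Addresses seen before keep the EECO of their first appearance.
    Returns the list of addresses that were new.
    """
    label = MINING if is_mining else source_beco
    new_addresses = []
    for address in output_addresses:
        if address not in eeco_of:
            eeco_of[address] = label
            new_addresses.append(address)
    return new_addresses


eeco = {}
assign_eeco(eeco, ["miner_1"], is_mining=True)
created = assign_eeco(eeco, ["miner_1", "shop_1"], source_beco="2013-10-16-13/469")
```

In the second call only `shop_1` is new, so `miner_1` keeps its mining origin.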
Figure 8. The first address in the Bitcoin blockchain. Satoshi Nakamoto’s address receives EECO of mining. 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
Figure 8A. Address 1A1zPDPmCoM1aR6uUqMBsm7wvKYWJwzLot received EECO 2013-10-16-13/469, and later was joined to BECO 2013-09-25-23/347
So, with the help of EECO technology, we can now transfer any quality parameters between addresses outside BECO. This is also important if we have an anonymous overflow exclusively to new addresses. Until the address is combined with some kind of BECO, we do not know exactly what it is – a spending or a person's / program's surplus. The combined structure of all BECOs and EECOs was named ECO-frame.
Figure 9. Viewing the ECO-frame. All the operations of a particular BECO.
You can view all BECO operations by clicking on the name of BECO on the transaction page or its digital number. On the BECO view page, you will see all its interactions, what the age of money in incoming transactions was and how it changed the age of property items in the BECO itself.
We now continue to lower the degree of anonymity of the blockchain. Now that we have an ECO-frame, we are already looking at the whole system from a different, better angle than it was at the stage of the BLOCK frame or even the TIME frame. It's time to figure out what is garbage in the blockchain, and what is the real movement of bitcoins. We set ourselves the task of finally spoiling the mood of those who love anonymity. It is necessary to determine what is a mixer (not a useful program), and what is a useful one. We are also hindered by the fact that mixers are written by different people and have different principles of operation. The only thing we can say for sure so far is that any mixer will inevitably fall under the concept of BECO. It physically cannot do anything differently, otherwise it can be distinguished by simply surfing transactions even at the BLOCK-frame level.
To combat these malicious programs, we need to drag something incredibly reliable through the entire blockchain, something that will give us exceptionally accurate information about any address, any BECO and any EECO. We cannot add anything to the blockchain, which means that our only agent can be the property items themselves, i.e. bitcoins. But how do we distinguish one satoshi from another? And we must not forget about efficiency, speed and, most importantly, resource consumption, because to achieve our goal we have to perform trillions of sequential mathematical operations.
For a more simplified understanding of the method, which will be described below, I propose to consider the probability of the following situation. Let's say we took a bundle of fresh fiat money from our Central Bank. Inside the bundle on the bills are all numbers in order. 1,2,3...100. All the other money of this type that is in circulation among the population has numbers other than 1…100. Let's take bills 51 and 52 and put them in a wallet. Then we'll go and spend the first one in one store, and the second one - in a store in the other city. What is the probability that after, say, a month, these 2 bills with numbers 51 and 52 will end up in our wallet, and, moreover, simultaneously? The answer is simple: the probability of such a case is zero.
How can we mark/stain our bitcoins? You can't even touch them, to say nothing of the fact that there are millions of them. Actually, it can be done in a very simple manner. The fact is that there are still operations in the blockchain, information about which has a much greater accuracy by default than all the others, and we do not need any additional gadgets. This is the time of recording the next block. In each block there is a time stamp of the block recording, which we use to convert from a BLOCK frame to a TIME frame of those transactions whose time we do not know. This is plus or minus 2 days, and sometimes more, but we still have no choice.
Any mining operation is separated from the first Nakamoto mining operation by some interval of time. It is different for each block and is counted in seconds from the beginning of the Epoch. It would be possible to use the block numbers themselves, but with time the picture is clearer. Any subsequent operation with a property item transfers the time stamp to the new property item, while keeping the number of bitcoins specified in the blockchain. If one property item is divided into two new ones, the sum of the new ones must equal the amount entering the transaction (minus the commission), and the time parameter of both new items is identical to that of the original. If two different property items merge, the new one receives the quantity known from the blockchain, and its time characteristic is calculated as a simple proportion, that is, an average of the merged items' time stamps weighted by their amounts. The commission paid to miners for inclusion in a block is re-stamped with the time of the block in which it is recorded. One could also leave the date of the commission unchanged; this is a philosophical question, and reassigning is simply more convenient. In other words, for our purposes we use the following life cycle of a satoshi: birth in a mining operation, a path along the blockchain, and return to the mining point.
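These propagation rules can be sketched as follows. The sketch assumes amounts in satoshi and time stamps in seconds since the Epoch; the function names `merge_items` and `apply_transaction` are hypothetical. Mining transactions have no inputs and are not covered here: their outputs are simply born with the time of their block.

```python
def merge_items(items):
    """Merge (satoshi, time stamp) pairs into one item.

    The amount is the plain sum; the time stamp is the amount-weighted mean.
    """
    total = sum(satoshi for satoshi, _ in items)
    stamp = sum(satoshi * t for satoshi, t in items) / total
    return total, stamp


def apply_transaction(inputs, output_amounts, block_time):
    """Propagate time stamps through one transaction.

    Returns the stamped outputs and the commission item, which is re-stamped
    with the time of the block that records the transaction.
    """
    total_in, stamp = merge_items(inputs)
    fee = total_in - sum(output_amounts)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    # Every output inherits the same merged time stamp.
    outputs = [(satoshi, stamp) for satoshi in output_amounts]
    return outputs, (fee, block_time)


# Two items of different age merge and split into two outputs plus a fee.
outputs, fee_item = apply_transaction(
    inputs=[(100, 1000), (300, 2000)],
    output_amounts=[250, 140],
    block_time=5000,
)
```

The merged stamp is (100 * 1000 + 300 * 2000) / 400 = 1750, both outputs carry it, and the 10 satoshi commission starts a new life at block time 5000.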
Now let's take a closer look at mixers. A person who wants anonymity owns Q0 bitcoins with a time indicator T0. He sends them to a mixer whose BECO holds Q1 bitcoins with a time indicator T1. The new time value after the mixer receives the money is (T0*Q0 + T1*Q1) / (Q0 + Q1). Everything is simple. Now the mixer spits out bitcoins with a unique trace into different EECOs, does something with them there, combines them into some other BECO, does something else there, but at the output we still have a clearly distinguishable trace based on a unique time. Wherever the mixer spits out the equivalent of the original property item, detectives now have a real opportunity to track it. Yes, you'll have to dig a little, use your eyes a little, maybe write a few small programs, but it's no longer impossible.
Another feature of mixers, even the most complex ones, is that no matter how hard they try, when bitcoins are assembled into one or more property items at the input, they will have almost the same time characteristics. A BECO that is saturated with the same money (of the same age) is nothing but a mixer's ECO. This understanding automatically interests financial analysts, because the blockchain is the underlying asset for a bunch of derivatives. In addition, it will now be possible to see the real turnover in the blockchain, determine the speculative component of the price, monitor the dumping of old bitcoins into exchange-listed BECOs and do many other things.
Figure 10. A transaction with a similar state of money. The difference is less than 1%. This is not possible under normal conditions.
In this example there are additional factors indicating that the transaction is not a useful one. The age of the incoming money coincides with the age of the money stored in the BECO of the transaction, and the EECO of the incoming addresses coincides with the BECO of the transaction. Also, all the numerous outputs of this transaction go to new addresses. In a normal situation, this is unlikely.
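The "same age" signal can be turned into a simple check. How the 1% difference in Figure 10 is measured is not stated, so the sketch below assumes one plausible definition: the spread between the oldest and the youngest input, relative to the oldest. The function names and the default threshold parameter are illustrative.

```python
def relative_age_spread(stamps, now):
    """Spread of input ages relative to the oldest input (0.0 means identical)."""
    ages = [now - t for t in stamps]
    oldest = max(ages)
    if oldest <= 0:
        return 0.0
    return (oldest - min(ages)) / oldest


def has_uniform_age(stamps, now, threshold=0.01):
    """True when two or more inputs differ in age by less than `threshold`."""
    return len(stamps) >= 2 and relative_age_spread(stamps, now) < threshold


now = 1_000_000
suspicious = has_uniform_age([500_000, 501_000, 502_000], now)  # spread 0.4%
ordinary = has_uniform_age([100_000, 900_000], now)             # spread about 89%
```

A positive result is only a hint: as with the other factors, it should be combined with the EECO of the inputs and the novelty of the output addresses before drawing conclusions.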
Figure 10A. The money in the BECO of the transaction coincides with the money flowing into it. EECO of addresses coincides with the BECO of the transaction, and at the output all the addresses are new.
Figure 10B. Normal operation. All property items and EECO of addresses are different.
Another factor that shows that something is wrong with the transaction of interest is the same source of money (the same EECO). This is not necessarily a mixer; it can also be the addresses of an ordinary person who is merging his property items into one.
Figure 11. Merging property items from the same EECOs.
Owners of small property items are lucky: no one will pay extra attention to them. Besides, they can send their bitcoins to an exchange at some financial loss, and their trace will be lost forever. It is only necessary to accept the risk of depositing bitcoins on an exchange. But for large participants in whom state detectives are interested, this cannot be a magic pill, because an exchange or a store that launders their bitcoins may well share its client's private information, especially given that the client used it without its knowledge and did not pay for such a service.
I'll give a couple of tips to fans of anonymity to cheer them up a little. You should use mixers that pay out bitcoins which are not affiliated with the ECO that accepted yours. Perhaps such mixers do exist, but this can only be determined by checking with the method described above. The second option is to ask some mining pool to whitewash your bitcoins. You should remember that a high commission or a large sum will help detectives in their work, and, moreover, there is still a risk of losing everything if a miner with whom you made no agreement turns out to be more successful.
In determining the real turnover in the blockchain, we must mention transactions in which several small property items are combined with, or flow out of, an incredibly huge property item. Even though such an operation is recorded in a block, the huge property item cannot be counted as real turnover.
Figure 12. The sum of 0.1 bitcoin flows from 48500 BTC.
Additional information: The ecosystem technology can also help to exclude from the turnover those bitcoins that are sent within the same BECO.
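Excluding movements inside one ecosystem from the turnover can be sketched like this. It assumes that transfers are already reduced to (source address, destination address, amount) triples and that free addresses count as their own one-address ecosystem; `real_turnover` and `beco_of` are hypothetical names.

```python
def real_turnover(transfers, beco_of):
    """Sum only the amounts that cross from one ecosystem into another.

    `transfers` holds (source address, destination address, satoshi) triples.
    `beco_of` maps addresses to their ecosystem; an address without an entry
    is treated as an ecosystem of its own.
    """
    total = 0
    for source, destination, satoshi in transfers:
        if beco_of.get(source, source) != beco_of.get(destination, destination):
            total += satoshi
    return total


beco_of = {"a1": "E1", "a2": "E1", "b1": "E2"}
transfers = [
    ("a1", "a2", 48_500),  # change returning to the same ecosystem
    ("a1", "b1", 10),      # a real payment into another ecosystem
    ("c1", "b1", 5),       # a free address paying into E2
]
turnover = real_turnover(transfers, beco_of)
```

The large movement between `a1` and `a2` stays inside E1 and is ignored, so only the two payments that cross an ecosystem boundary are counted.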
For example, suppose the flow of bitcoins through the blockchain over the last week equals TLW. And then, let's say, 200,000 bitcoins from different addresses and different BECOs/EECOs, with a time indicator T1000D that lies 1000 days before TLW, crash into the ecosystems of the exchanges. Obviously, the holders of these 200,000 bitcoins did not buy them at 60,000 USD per BTC. What will we do with these beautiful bitcoins next, and how will this affect the exchange rate of the world's first cryptocurrency? Good question, right?
The age of specific bitcoins moving in the blockchain is, of course, not a very convenient parameter for financial analysis. That is why we can denominate bitcoins in mining operations not only by age, but also by their exchange value at the time of mining. However, it is worth remembering that a bitcoin mined five years ago at an exchange rate of $10,000 has a completely different potential than a bitcoin mined a year ago at the same exchange rate. The resulting graph will, of course, be very interesting, but slightly less accurate than the one calculated from the age of bitcoins.
I will be grateful for your feedback. I am open to any suggestions. Thank you for taking the time to read this article, I tried to present all the material in a manner understandable to as wide a range of readers as possible.
Yury Marchenko