Deep interpretation of Optimism: Basic architecture, Gas mechanics, and challenges | CatcherVC Research
“While creating a vision with an idealized narrative, how does Optimism go about decentralization? The ideas of a “fraud proof” mechanism and multi-Sequencer rotation are yet to be tested.”
Author: Web3er Liu, CatcherVC
Highlights of this article:
- For security and decentralization reasons, the Gas limit and the block time of Ethereum blocks cannot be changed significantly;
- The essence of Layer2 scaling is to build a chain with higher TPS and publish that chain’s information on Ethereum;
- Optimism has considerable room for market appreciation: its TPS ceiling can reach about 1600, but actual throughput utilization is less than one-thousandth of that, indicating great potential for future development;
- Because the peer node network is not yet open, it can take up to an hour after a block is generated to verify the correctness of what Optimism’s Sequencer produced locally, which is a long delay;
- At present, the block-producing nodes of Optimism and Arbitrum are run by the project teams themselves. This is a serious centralization problem: security rests more on the “credit” of the project than on “procedural justice” itself;
- Optimism’s “fraud proof” mechanism has not worked since the EVM equivalence upgrade, and the team says it will address the issue in the future;
- True decentralization and security are more valuable than efficiency. Without timely user participation in network maintenance, the so-called Layer2 will be indistinguishable from traditional financial platforms.
Layer2 and Rollup are becoming prominent in the blockchain industry as the Ethereum Merge officially gets under way. At its root, Layer2 aims to increase the number of transactions per second (TPS) the system can process and to reduce the Gas fee. The former is the core of Layer2 scaling, while the latter is the key to improving the Layer2 user experience.
By definition, TPS = number of transactions processed in a period of time / length of that period. Applied to blockchains, and ignoring cases such as forks or block reorganizations, this is roughly TPS = average number of transactions per block / block time. To improve TPS, an ordinary public chain therefore has to increase block capacity or shorten block time. The actual TPS also depends on the Gas mechanism the chain uses; this holds for Ethereum, BNB Chain, and Polygon alike.
However, raising the block Gas limit or shortening the block time undermines security. The root cause of Ethereum’s scaling problem is the “impossible triangle”: how to improve efficiency while preserving security and decentralization remains an unresolved question, still debated largely on paper.
Against this background, the rapid rise of Layer2 projects such as Optimism and Arbitrum, under the banner of high efficiency and low Gas, is remarkable. They have attracted capital from all quarters with a compelling narrative and won a large number of users with ultra-low Gas fees, but their inherent centralization problem is becoming clearer and is drawing more and more attention and doubt.
This article examines in detail the dilemmas Layer1 faces when scaling while preserving decentralization, as well as the major problems faced by typical high-efficiency Layer2 projects.
Ethereum’s Gas Mechanism
One of the key factors that determine the efficiency of Ethereum is its Gas mechanism. In Ethereum, Gas is a unit of measurement that reflects the complexity of different operations. Just as a car consumes gasoline to run, transactions on Ethereum consume Gas. The simplest ETH transfer consumes 21,000 units of Gas. Other operations, such as an ordinary ERC-20 token transfer or more complex contract interactions, can consume tens or even hundreds of thousands of units.
Each Ethereum block has a Gas cap that limits the total Gas that can be consumed by all transactions within the block, much like a refrigerator that cannot take anything more once it is full. On the eve of EIP-1559 last year, the Gas limit of a single block was about 15 million, which could accommodate at most about 714 simple ETH transfers. Putting the average block interval of 13 seconds into the TPS formula gives a theoretical TPS limit of about 55 for Ethereum before EIP-1559.
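The arithmetic behind these two figures can be reproduced in a few lines. This is only a sketch that plugs the numbers quoted above into the TPS formula; the function and variable names are illustrative.

```python
# Figures quoted in the paragraph above.
BLOCK_GAS_LIMIT = 15_000_000   # per-block Gas cap just before EIP-1559
GAS_PER_TRANSFER = 21_000      # Gas used by the simplest ETH transfer
BLOCK_INTERVAL_S = 13          # average block interval in seconds


def theoretical_tps(block_gas_limit: int, gas_per_tx: int, block_interval_s: float) -> float:
    """TPS = transactions per block / block time."""
    txs_per_block = block_gas_limit // gas_per_tx  # only whole transactions fit
    return txs_per_block / block_interval_s


max_transfers = BLOCK_GAS_LIMIT // GAS_PER_TRANSFER
tps = theoretical_tps(BLOCK_GAS_LIMIT, GAS_PER_TRANSFER, BLOCK_INTERVAL_S)

print(max_transfers)  # 714
print(round(tps))     # 55
```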
In reality, however, many transactions are contract interactions with high Gas consumption, which take up much of a block’s Gas capacity. The actual average TPS of Ethereum drops to about 20, which pushes a lot of potential transaction demand off the chain. The fee for a single transaction = Gas Used × Gas Price. Since Gas Used is determined by the system for a given operation, it can be regarded as a constant, so a user who wants the system to respond first has to offer a higher Gas Price than others. Ultimately, the imbalance between supply and demand built into the system produces high fees that affect countless people.
In the final analysis, Ethereum is by nature an auction platform for the right to transact. Gas Price is the bid, and the bidding mechanism decides who gets to transact. While this fits the free-market principles of blockchain, it also sows the seeds of cut-throat competition.
Throughout Ethereum’s history, whenever a hot event such as CryptoKitties or “5.19” stimulated transaction demand, an intense Gas War broke out on chain: whoever paid the higher Gas Price ranked higher in the transaction order. Fierce price wars inflated the Gas Price and shut out users who could not afford the fees, turning Ethereum into a veritable “noble chain” and causing numerous disputes. EIP-1559 was therefore once seen as the lifesaver.
Although it received a lot of attention last year, EIP-1559 does not directly push down the Gas Price or abolish Gas bidding. Its core effect is simply to make the fluctuation range of the Gas Price more controllable and to reduce inflation and selling pressure on ETH.
While the proposal raises the Gas limit for Ethereum blocks to 30 million, whenever the actual Gas consumption of a new block exceeds 15 million, the system raises the Gas Price for the next block by one step. This can continue for multiple blocks until the Gas Price is so high that it deters the vast majority of users, causing the number of transactions entering new blocks to drop sharply and Gas consumption to fall back to 15 million.
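The step-by-step adjustment can be sketched as follows. The update rule mirrors the base fee formula in the EIP-1559 specification, where the regulated part of the Gas Price is called the base fee and moves by at most 1/8 per block. The starting fee of 100 gwei and the run of ten full blocks are hypothetical inputs chosen for illustration.

```python
GAS_TARGET = 15_000_000      # per-block target from the text
GAS_LIMIT = 30_000_000       # per-block hard cap from the text
MAX_CHANGE_DENOMINATOR = 8   # EIP-1559 constant: at most 1/8 (12.5%) change per block


def next_base_fee(base_fee: int, gas_used: int) -> int:
    """Base fee (in wei) for the next block, given the current block's Gas usage."""
    if gas_used == GAS_TARGET:
        return base_fee
    delta = base_fee * abs(gas_used - GAS_TARGET) // GAS_TARGET // MAX_CHANGE_DENOMINATOR
    if gas_used > GAS_TARGET:
        return base_fee + max(delta, 1)  # always rises by at least 1 wei
    return base_fee - delta


# Hypothetical scenario: ten consecutive completely full blocks.
start_fee = 100 * 10**9  # 100 gwei, expressed in wei
fee = start_fee
for _ in range(10):
    fee = next_base_fee(fee, GAS_LIMIT)

print(fee / start_fee)  # a bit more than a threefold increase
```

Ten full blocks take roughly two minutes at a 13-second interval, which is why sustained demand prices most users out quickly.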
The data show that, across the 6 months before and after the implementation of EIP-1559, the daily Gas consumption of Ethereum increased by less than 10%. Given that the block interval was stable at 13~13.5 seconds during this period, Ethereum was producing 6500~6650 blocks per day, and the Gas consumed per block stayed around 15 million with no significant change.
Ethereum’s TPS has not improved, fees remain high, and large numbers of potential users remain outside the Ethereum system.
According to relevant data, Ethereum currently has nearly 200 million unique addresses and processes just over 1 million transactions per day. In contrast, BNB Chain, which has lower Gas fees, processes more than 5 million transactions per day with fewer than 150 million unique addresses. By rough estimation, the Ethereum network meets only about 15% of its transaction demand.
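The article does not spell out how the 15% figure is derived. One reading that reproduces it, offered here as an assumption, is to treat BNB Chain’s transactions per address as a proxy for unconstrained demand and apply that rate to Ethereum’s address count. The inputs are the rounded figures quoted above.

```python
# Rounded figures from the paragraph above.
eth_addresses = 200_000_000
eth_daily_txs = 1_000_000
bnb_addresses = 150_000_000
bnb_daily_txs = 5_000_000

# Assumption: with low fees, each address transacts at BNB Chain's rate.
bnb_txs_per_address = bnb_daily_txs / bnb_addresses
eth_implied_demand = bnb_txs_per_address * eth_addresses  # about 6.7 million per day

share_met = eth_daily_txs / eth_implied_demand
print(f"{share_met:.0%}")  # 15%
```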
The Block Interval
From another perspective, since TPS = number of transactions in each block ÷ block time, the block interval is also key to TPS. At the same time, the phases within a block interval map onto different components of Ethereum’s business logic, which is the key to understanding Layer2 scaling.
It should be emphasized that Ethereum is a system consisting of a large number of server nodes, and its business logic consists of execution, consensus, and multi-party storage. Among them,
- [Execution] generally refers to processing transactions and other instructions to obtain results;
- [Consensus] means that all nodes agree on the result of the execution;
- [Multi-party storage] means that multiple nodes store the same content and make it accessible to the outside world.
Some materials refer to [consensus] as [settlement] and to [multi-party storage] as [data availability]; these names are essentially interchangeable.
A block interval consists of the following steps:
• First, a winner is selected among the mining pool nodes through [proof of work]. The winner gets to [execute] transactions and create a new block;
• [Proof of work] requires a brute-force search over random numbers, which consumes a lot of computing power. This work is done by the mining machines in the mining pool and takes a long time;
• The winning mining pool node picks a batch of pending transactions, ordered by Gas Price, [executes] them, and then writes the transaction information and results into the new block;
• After that, the new block is propagated to all Ethereum nodes and its content is checked. Specifically, each node examining the block reads its contents and executes the transactions in it again to see whether the data submitted by the mining pool is correct. This is the [consensus];
• Finally, if the new block passes the check, the nodes record the new block, completing [multi-party storage].
As a result, a new block is copied more than 2,000 times and stored on Ethereum nodes across the network: every mining pool node and every full node stores one copy. In this way, “consistency” is achieved between Ethereum nodes.
To sum up, a complete Ethereum block interval consists of four stages: [proof of work] + [execution] + [consensus] + [multi-party storage]. Of these, [proof of work] and [consensus] take the longest. Since Ethereum has more than 2,000 mining pool nodes and full nodes combined, these nodes spend a lot of time communicating to reach [consensus]. [Proof of work] acts as a flexible time filler designed to keep the block interval stable at about 15 seconds (the block interval is currently about 13 seconds).
Why is the block interval held at about 13 seconds? It is the preferred trade-off for security and decentralization. Because Ethereum nodes are numerous and physically scattered, producing blocks too quickly widens the information gap between nodes and undermines [consensus]. For example, if the block interval were reduced to 0.1 second while information took 1 second to propagate between nodes in the US and Europe, those nodes could be 10 blocks apart, which goes against the design philosophy of blockchain.
If the block capacity is forcibly scaled up, it will also exacerbate the information difference between different nodes. For example, if the Gas capacity of ETH blocks is increased by 10 times, the number of transactions contained in each block will increase by 10 times and the information difference that can be generated between different nodes will also increase by 10 times.
According to relevant data, Ethereum’s block interval will stay at about 13 seconds until the transition to PoS is complete. After the transition, the block interval will be shortened by only 1 second and held at 12 seconds. By that measure, the PoS transition could raise Ethereum’s TPS by less than 10% (13/12 ≈ 1.08), which is a drop in the ocean.
At present, under the premise of ensuring security and decentralization, the ETH block’s Gas capacity and block generation time have basically reached the theoretical limit, leaving little room for optimization.
The Scaling Solution: OP Rollup
As mentioned above, Ethereum’s block capacity and block interval cannot be changed much under comprehensive consideration, and its TPS is basically below 20, which has not improved much in the last two years.
In response, scaling solutions outside Ethereum have taken a different path. Public chains independent of Ethereum, such as BNB Chain and Polygon, have changed the block parameters. Take BNB Chain as an example: its block Gas limit is currently 80 million, about 2.7 times that of Ethereum. BNB Chain also cuts the number of nodes involved in consensus to about 20, only 1% of Ethereum’s, which greatly reduces the time nodes need to reach consensus and shortens the block interval to 3 seconds. Although this raises the TPS ceiling to more than 10 times that of Ethereum, BNB Chain is completely cut off from the battle-hardened security of the Ethereum network, and its degree of decentralization is much lower.
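A quick check of the “more than 10 times” claim, comparing Gas capacity per second on the two chains. This is a sketch that assumes the same transaction mix on both chains and uses the block parameters quoted in this article (a 30 million Gas limit and 13-second interval for Ethereum).

```python
def gas_per_second(block_gas_limit: int, block_interval_s: float) -> float:
    """Gas capacity the chain can process per second."""
    return block_gas_limit / block_interval_s


eth_capacity = gas_per_second(30_000_000, 13)  # Ethereum after EIP-1559
bnb_capacity = gas_per_second(80_000_000, 3)   # BNB Chain

limit_ratio = 80_000_000 / 30_000_000
capacity_ratio = bnb_capacity / eth_capacity

print(round(limit_ratio, 1))     # 2.7
print(round(capacity_ratio, 1))  # 11.6
```

The larger block accounts for a factor of about 2.7, and the shorter block interval supplies the rest.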
Layer2, represented by Rollup, takes a different approach. While it is essentially a public chain outside of Ethereum, it is still largely tied to Ethereum’s security. For example, OP Rollup (Optimistic Rollup) will compress and store a copy of the blockchain of Layer2 on the Ethereum mainnet while:
• Layer2’s local block interval keeps only the [execution] phase;
• [Proof of work] is dropped;
• [Multi-party storage] is moved to the Ethereum network;
• [Consensus] is handled by Layer2’s Verifier nodes, but is not part of Layer2’s local block interval.
The principle of Optimism
Take Optimism, the most typical OP Rollup solution, as an example. Its four most important modules are the Sequencer, the Verifier, the CTC (Canonical Transaction Chain), and the SCC (State Commitment Chain). Sequencer and Verifier are Layer2 nodes running on physical hardware, and together they make up the Layer2 node network; CTC and SCC are contracts deployed on Ethereum. These four modules form the core architecture of Optimism.
The Sequencer is a centralized block-producing node, comparable to a mining pool node, that is responsible for generating blocks on Layer2. Optimism eliminates [proof of work] by having a single Sequencer act as the miner, and it does not immediately let other nodes verify the result to reach [consensus], which saves a lot of time. The current Sequencer produces a block immediately after executing a transaction, and the local block time can be as short as 1 second, improving TPS at the root.
However, the Sequencer is highly centralized, and it effectively creates a side chain independent of Ethereum, which is inherently insecure without the [consensus] and [multi-party storage] processes. To address this, Optimism stated in its early documentation that the Sequencer must stake a certain amount of assets and:
• Every few minutes, the Sequencer node stores a compressed version of its local blocks on the Ethereum mainnet. This includes a summary of the transaction data as well as the state root after the transactions have been executed. This process is the Rollup;
• The summary of the transaction data is stored in the CTC contract on Ethereum, and the corresponding state root is stored in the SCC contract. This generates two Ethereum transactions, in which the Ethereum system is only responsible for [multi-party storage] and does not check correctness;
• Layer2’s Verifier automatically reads and reviews the content the Sequencer has stored on Ethereum, similar to [consensus] on Ethereum;
• Currently, both Optimism and Arbitrum have serious centralization problems, since their official organizations run the Sequencer nodes.
CTC and SCC are contracts officially deployed on Ethereum by Optimism. They record, respectively, the summary of Layer2 transaction data in a Batch structure and the root hash of the Layer2 state tree after each transaction is executed. On the surface, CTC and SCC look like two ledgers.
(Note: The state tree is a database that records information about the addresses on the chain. By obtaining the state tree root and the transaction data summary, the Layer2 local block content can be pieced together. Generally speaking, the Layer2 state root stored in the SCC contract is more important. After obtaining the state root and calculating it with the transaction data, you can know whether Sequencer has rewritten the user address balance without permission.)
Layer2’s Verifier automatically reads the records in both the CTC and SCC contracts, trying to piece together the contents of Sequencer’s local blocks and verify them.
• If the Verifier finds problems with the data submitted by the Sequencer, it can raise a challenge and submit the version it believes to be correct. A successful challenge rewrites the wrong data in CTC and SCC and earns a token reward;
• If the Sequencer is successfully challenged and confirmed to have behaved dishonestly, it is penalized and part of its staked assets is deducted. If the stake balance falls below the threshold, the Sequencer is forcibly removed from the list and is no longer eligible to generate blocks;
• The above is the “fraud proof” mechanism, which means that the Verifier can expose the Sequencer’s fraudulent behavior;
• The [consensus] reached between Verifier and Sequencer lags seriously. A transaction is executed by the Sequencer immediately after it is submitted, but it can take up to an hour before the Verifier obtains the state root and completes the final verification of the results;
• Optimism’s EVM equivalence upgrade took place in November 2021, with the Sequencer and Verifier clients replacing the old OVM virtual machine. The “fraud proof” program based on the old OVM no longer works, and the new “fraud proof” program has not yet been released.
According to earlier technical documentation, Optimism set the challenge window to 7 days. If there is no challenge from a Verifier within 7 days, the content published by the Sequencer is finalized and cannot be rewritten.
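The interplay between Sequencer, CTC, SCC, and Verifier described above can be illustrated with a toy model. This is a conceptual sketch, not Optimism’s implementation: the class and function names are hypothetical, the “state root” is a plain hash of the balances rather than a Merkle root, and staking, the challenge window, and on-chain dispute resolution are left out.

```python
import hashlib
import json


def apply_tx(state: dict, tx: dict) -> dict:
    """Toy state transition: move `amount` from `sender` to `to` (no balance checks)."""
    new_state = dict(state)
    new_state[tx["sender"]] = new_state.get(tx["sender"], 0) - tx["amount"]
    new_state[tx["to"]] = new_state.get(tx["to"], 0) + tx["amount"]
    return new_state


def execute_batch(state: dict, batch: list) -> dict:
    """[Execution]: apply every transaction in the batch in order."""
    for tx in batch:
        state = apply_tx(state, tx)
    return state


def state_root(state: dict) -> str:
    """Stand-in for the state tree root: a hash of the sorted balances."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


class L1Contracts:
    """Stand-in for the two Ethereum contracts; they store data but check nothing."""

    def __init__(self) -> None:
        self.ctc = []  # transaction batches (Canonical Transaction Chain)
        self.scc = []  # claimed state roots (State Commitment Chain)

    def append(self, batch: list, claimed_root: str) -> None:
        self.ctc.append(batch)
        self.scc.append(claimed_root)


def verify(l1: L1Contracts, genesis: dict):
    """Verifier: re-execute each batch and return the index of the first bad root, if any."""
    state = dict(genesis)
    for index, (batch, claimed_root) in enumerate(zip(l1.ctc, l1.scc)):
        state = execute_batch(state, batch)
        if state_root(state) != claimed_root:
            return index  # grounds for a fraud-proof challenge
    return None


genesis = {"alice": 100, "bob": 0}
l1 = L1Contracts()

# Batch 0: the Sequencer posts an honest state root.
batch0 = [{"sender": "alice", "to": "bob", "amount": 30}]
state0 = execute_batch(genesis, batch0)
l1.append(batch0, state_root(state0))

# Batch 1: the Sequencer posts a root for a state in which it credited itself.
batch1 = [{"sender": "bob", "to": "alice", "amount": 5}]
forged = dict(execute_batch(state0, batch1), sequencer=1_000)
l1.append(batch1, state_root(forged))

print(verify(l1, genesis))  # 1
```

The point of the sketch is that Layer1 only stores the batches and roots; any party that re-executes the published transactions can detect a forged root without trusting the Sequencer.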
Essentially, Optimism is a cross-domain system composed of hardware and software components on Layer1 and Layer2, whose distinctive business logic is to construct a mapped version of Layer2 blocks on Ethereum. Since information needs to be transmitted across domains, Sequencer and Verifier need to run L2geth, a modified version of Geth, through which the Sequencer interacts across Layer2 and Layer1.
Gas mechanism of Optimism
In terms of fees, because data has to be stored on Ethereum, the Gas fee per transaction = Gas fee on Layer1 + Gas fee on Layer2. The same is true for other OP Rollup schemes such as Arbitrum and Metis.
The Layer2 part mainly covers the cost of executing the transaction on the Sequencer node. Because the Sequencer has a high TPS ceiling and Optimism currently has few users, the local Gas Price is extremely low. The formula is L2 Gas fee = L2 Gas Used × L2 Gas Price.
• According to Optimism’s official disclosure, only 0.4% of the cost of a transaction comes from Layer2, while the remaining 99.6% comes from Layer1.
• In simple terms: total fee = execution cost (about 0.4%) + storage cost (about 99.6%).
It is not hard to see that the cost of executing a transaction has been drastically reduced.
Therefore, the more complex the transaction (options trading, for example), the more Optimism saves. For example, an option operation that costs $100 on Ethereum costs only about $1.50 on Optimism, roughly 1/67 of that. A transfer that normally costs $3 on Ethereum might cost $0.30 on Optimism, which is 1/10.
For the Layer1 part of the Gas fee, the formula is L1 Gas fee = scale factor × (fixed cost + storage cost). The fixed cost comes from packaging the data and transmitting it across domains. The storage cost is the Gas consumed by storing the data on Ethereum. The scale factor is set by the Optimism team, mainly to reserve a buffer in case the Gas Price on the Ethereum mainnet surges and the data cannot be stored on chain.
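The two fee formulas can be combined into a small calculator. This is a sketch under stated assumptions: the fixed and storage costs are treated as Gas amounts that are multiplied by the L1 Gas Price to obtain a fee, the fixed overhead of 2100 Gas is the per-transaction figure this article quotes, and the scale factor, Gas Prices, and storage Gas in the example are hypothetical inputs.

```python
FIXED_OVERHEAD_GAS = 2100  # per-transaction fixed overhead quoted in this article


def l2_fee(l2_gas_used: int, l2_gas_price: float) -> float:
    """L2 Gas fee = L2 Gas Used x L2 Gas Price."""
    return l2_gas_used * l2_gas_price


def l1_fee(storage_gas: int, l1_gas_price: float, scale_factor: float,
           fixed_overhead_gas: int = FIXED_OVERHEAD_GAS) -> float:
    """L1 Gas fee = scale factor x (fixed cost + storage cost), priced at the L1 Gas Price."""
    return scale_factor * (fixed_overhead_gas + storage_gas) * l1_gas_price


# Hypothetical inputs, all prices in gwei.
execution_part = l2_fee(l2_gas_used=21_000, l2_gas_price=0.001)
storage_part = l1_fee(storage_gas=1_600, l1_gas_price=50, scale_factor=1.5)
total = execution_part + storage_part

print(storage_part)            # 277500.0
print(execution_part / total)  # the Layer2 share of the total fee is tiny
```

With any realistic gap between L1 and L2 Gas Prices, the storage term dominates, which is the pattern the 0.4% / 99.6% split describes.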
For more insight, look at the Rollup and store steps:
• The Sequencer collects a Batch of transactions, compresses the data, and then transmits the Batch to Ethereum nodes for storage.
• Each Batch can contain hundreds of transactions, much like a block. The interval between Batches is adjusted dynamically by the Sequencer and is currently about 3 to 10 minutes.
Packaging and transmitting a Batch inevitably takes work and consumes computing resources, and the fixed cost covers this. Currently, the fixed overhead per transaction on Optimism is 2100 Gas. The Optimism team says that the fixed overhead will be adjusted downward as user numbers grow and the number of transactions per Batch increases.
When a Batch is stored on Layer1, the Sequencer sends the Batch information to the CTC contract as Calldata. Generally speaking, Calldata is only stored and is not operated on by the contract. This saves a significant amount of Gas compared with normal contract calls.
The Sequencer sends a transaction Batch to the CTC every few minutes, in effect building a linked list of transaction Batches on Ethereum. Afterwards, the Sequencer stores the State Batch corresponding to the transaction Batch in the SCC contract, in a similar process.
The above process consumes Gas, depending on how much content is stored. Different types of transactions generate different amounts of data and different storage costs.
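How the storage cost depends on the amount of data can be sketched with Ethereum’s Calldata pricing. The per-byte prices below are the ones set by EIP-2028 (16 Gas per non-zero byte, 4 Gas per zero byte); the example payload is hypothetical and does not represent Optimism’s actual Batch encoding.

```python
GAS_PER_ZERO_BYTE = 4      # Calldata price for a zero byte (EIP-2028)
GAS_PER_NONZERO_BYTE = 16  # Calldata price for a non-zero byte (EIP-2028)


def calldata_gas(data: bytes) -> int:
    """Gas charged for carrying `data` as transaction Calldata."""
    zero_bytes = data.count(0)
    nonzero_bytes = len(data) - zero_bytes
    return GAS_PER_ZERO_BYTE * zero_bytes + GAS_PER_NONZERO_BYTE * nonzero_bytes


# Hypothetical payload: 12 zero bytes of padding followed by 20 non-zero bytes.
payload = bytes.fromhex("00" * 12 + "ab" * 20)
print(calldata_gas(payload))  # 368
```

Larger or less compressible transactions produce more bytes and therefore a higher storage cost, which is why compression before submission matters.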
What is the theoretical TPS ceiling of Optimism?
To explore Optimism’s theoretical TPS ceiling, consider a limiting case:
• The Sequencer produces local blocks much faster than the Ethereum mainnet, so there is always an information difference △ between the native Layer2 content and its Layer1 copy. As Layer2 users increase, the actual TPS rises rapidly and △ grows;
• As Optimism approaches its theoretical TPS ceiling, the information difference △ accumulated per second between Layer2 and Layer1 becomes very large. In this case, Optimism must submit data to the Ethereum mainnet as quickly as possible and synchronize at all costs;
• In the end, the transactions initiated by the Sequencer take up all the Gas of each Ethereum block, that is, all available resources on Ethereum, and every Ethereum block is filled with data submitted by the Sequencer;
• Taking the post-EIP-1559 Gas ceiling of 30 million per Ethereum block, if only the simplest transfers are executed locally and submitted to Ethereum, the TPS ceiling of Optimism is about 1600.
To sum up, the TPS ceiling of Optimism is at least 16 times that of Ethereum. Considering that Optimism currently has such a small number of users that its actual TPS is even less than 3% of Ethereum’s, its maximum potential for development could be up to 500 times its current size.
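The article does not show how the 1600 figure is derived. As a back-of-the-envelope check, the sketch below works backwards from it to the Layer1 Gas each Layer2 transfer would have to consume, assuming the 13-second Ethereum block interval used earlier and that every block is filled entirely by the Sequencer.

```python
BLOCK_GAS_LIMIT = 30_000_000  # post-EIP-1559 Gas ceiling per Ethereum block
BLOCK_INTERVAL_S = 13         # assumed Ethereum block interval in seconds
CLAIMED_TPS = 1600            # ceiling stated in the article


def tps_ceiling(l1_gas_per_tx: float) -> float:
    """Layer2 TPS if every Ethereum block is filled with Layer2 transaction data."""
    l1_gas_per_second = BLOCK_GAS_LIMIT / BLOCK_INTERVAL_S
    return l1_gas_per_second / l1_gas_per_tx


# Layer1 Gas per Layer2 transfer implied by the claimed ceiling.
implied_gas_per_tx = BLOCK_GAS_LIMIT / BLOCK_INTERVAL_S / CLAIMED_TPS
print(round(implied_gas_per_tx))  # 1442
```

Under these assumptions, the ceiling corresponds to each compressed transfer costing about 1,440 Gas on Layer1, compared with 21,000 Gas for a native ETH transfer.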
After sorting out the above contents and combining with the actual investigation, it can be concluded as follows:
• The Sequencer nodes themselves create a blockchain with extremely high TPS, which is the source of scaling. Although extremely efficient, the Sequencer can act maliciously or go down because of its high degree of centralization;
• To enhance security, Optimism requires the Sequencer to stake assets. The Sequencer is also required to publish key information about Layer2 blocks on the Ethereum mainnet, which the Verifier reads automatically to check its accuracy.
• Because the transaction data stored on Ethereum is compressed and Ethereum nodes are not responsible for executing and verifying the correctness of the data, this can result in significant Gas savings. Currently, the Gas fee for executing complex options on Optimism can be as low as 1% of Ethereum.
• The [consensus] reached between Verifier and Sequencer lags seriously. A transaction is executed by the Sequencer immediately after it is submitted, but it can take up to 1 hour before the Verifier obtains the state root and performs the final verification of the result. This long latency leaves room for multiple classes of attack scenarios, which is a potential threat to the security of Optimism.
• Verifier incentive = token rewards for successful challenges - node operating costs. Issuing a “fraud proof” and winning a challenge is an unpredictable, low-probability event, so Verifiers are not strongly incentivized. The number of such nodes is hard to grow, and consensus and security are still weaker than on Ethereum.
As mentioned earlier, the most effective ways to expand the Verifier set are to increase the incentive or to open the peer node network. Optimism has not yet issued tokens or opened peer nodes, so it cannot incentivize verifiers with its own token as Metis does. Optimism therefore currently faces considerable challenges in expanding the number of verification nodes and improving the timeliness of verification.
• It is worth noting that the Sequencer nodes of OP Rollups such as Optimism and Arbitrum are all run by the project teams, so whether the Sequencer punishment mechanism is effective remains controversial. At present, the security of Optimism and Arbitrum comes more from the “credit” of the project side than from “procedural justice” itself.
• Optimism’s EVM equivalence upgrade took place in November 2021, with the Sequencer and Verifier clients replacing the old OVM virtual machine. The “fraud proof” program based on the old OVM no longer works, and the new “fraud proof” program has not yet been released. Currently, the challenge mechanism does not work.
Although today’s Optimism is very popular, showing great prospects for development and appreciation, it still faces the problem of over-centralization as mentioned above. Gavin Wood once said, “True decentralization and security are more valuable than efficiency.” Without timely user participation in network maintenance, the so-called Layer2 will be indistinguishable from traditional financial platforms.
While the idealized narrative creates a vision of expansion, it remains to be seen how Optimism, which is already “too big to fail,” will move towards decentralization and deliver on its vision of “proof of fraud” mechanisms and multi-Sequencer rotation. But what is certain is that, in the long run, only true decentralization can stand in history and endure forever.