Simple Summary
Introduce account versioning for smart contracts so upgrading the VM or introducing new VMs can be easier.
Abstract
This defines a method of hard forking while maintaining the exact functionality of existing account by allowing multiple versions of the virtual machines to execute in the same block. This is also useful to define future account state structures when we introduce the on-chain WebAssembly virtual machine.
Motivation
By allowing account versioning, we can execute different virtual machine for contracts created at different times. This allows breaking features to be implemented while making sure existing contracts work as expected.
Note that this specification might not apply to all hard forks. We have emergency hard forks in the past due to network attacks. Whether they should maintain existing account compatibility should be evaluated in individual basis. If the attack can only be executed once against some particular contracts, then the scheme defined here might still be applicable. Otherwise, having a plain emergency hard fork might still be a good idea.
Specification
Account State
Re-define account state stored in the world state trie to have 5 items: nonce
, balance
, storageRoot
, codeHash
, and version
. The newly added field version
is a 256-bit scalar. We use the definition of "scalar" from Yellow Paper. Note that this is the same type as nonce
and balance
, and it is equivalent to a RLP variable-sized byte array with no leading zero, of maximum length 32.
When version
is zero, the account is RLP-encoded with the first 4 items. When version
is not zero, the account is RLP-encoded with 5 items.
Account versions can also optionally define additional account state RLP fields, whose meaning are specified through its version
field. In those cases, the parsing strategy is defined in "Additional Fields in Account State RLP" section.
Contract Execution
When fetching an account code from state, we always fetch the associated version field together. We refer to this as the code's version below. The code of the account is always executed in the code's version.
In particular, this means that for DELEGATECALL
and CALLCODE
, the version of the execution call frame is the same as delegating/receiving contract's version.
Contract Deployment
In Ethereum, a contract has a deployment method, either by a contract creation transaction, or by another contract. If we regard this deployment method a contract's parent, then we find them forming a family of contracts, with the root being a contract creation transaction.
We let a family of contracts to always have the same version
. That is, CREATE
and CREATE2
will always deploy contract that has the same version
as the code's version.
In other words, CREATE
and CREATE2
will execute the init code using the current code's version, and deploy the contract of the current code's version. This holds even if the to-be-deployed code is empty.
Validation
A new phrase, validation is added to contract deployment (by CREATE
/ CREATE2
opcodes, or by contract creation transaction). When version
is 0
, the phrase does nothing and always succeeds. Future VM versions can define additional validation that has to be passed.
If the validation phrase fails, deployment does not proceed and return out-of-gas.
Contract Creation Transaction
Define LATEST_VERSION
in a hard fork to be the latest supported VM version. A contract creation transaction is always executed in LATEST_VERSION
(which means the code's version is LATEST_VERSION
), and deploys contracts of LATEST_VERSION
. Before a contract creation transaction is executed, run validation on the contract creation code. If it does not pass, return out-of-gas.
Precompiled Contract and Externally-owned Address
Precompiled contracts and externally-owned addresses do not have version
. If a message-call transaction or CALL
/ CALLCODE
/ STATICCALL
/ DELEGATECALL
touches a new externally-owned address or a non-existing precompiled contract address, it is always created with version
field being 0
.
Additional Fields in Account State RLP
In the future we may need to associate more information into an account, and we already have some EIPs that define new additional fields in the account state RLP. In this section, we define the parsing strategy when additional fields are added.
- Check the RLP list length, if it is 4, then set account version to
0
, and do not parse any additional fields. - If the RLP list length more than 4, set the account version to the scalar at position
4
(counting from0
).- Check version specification for the number of additional fields defined
N
, if the RLP list length is not equal to5 + N
, return parse error. - Parse RLP position
5
to4 + N
as the meaning specified in additional fields.
- Check version specification for the number of additional fields defined
Extensions
In relation to the above "Specification" section, we have defined the base account versioning layer. The base account versioning layer is already useful by itself and can handle most EVM improvements. Below we define two specifications that can be deployed separately, which improves functionality of base layer account versioning.
Note that this section is provided only for documentation purpose. When "enabling EIP-1702", those extensions should not be enabled unless the extension specification is also included.
- 44-VERTXN: Account Versioning Extension for Contract Creation Transaction
- 45-VEROP: Account Versioning Extension for CREATE and CREATE2
Usage Template
This section defines how other EIPs might use this account versioning specification. Note that currently we only define the usage template for base layer.
Account versioning is usually applied directly to a hard fork meta. EIPs in the hard fork are grouped by the virtual machine type, for example, EVM and eWASM. For each of them, we define:
- Version: a non-zero scalar less than
2^256
that uniquely identifies this version. Note that it does not need to be sequential. - Parent version: the base that all new features derived from. With parent version of
0
we define the base to be legacy VM. Note that once a version other than0
is defined, the legacy VM's feature set must be frozen. When defining an entirely new VM (such as eWASM), parent version does not apply. - Features: all additional features that are enabled upon this version.
If a meta EIP includes EIPs that provide additional account state RLP fields, we also define:
- Account fields: all account fields up to the end of this meta EIP, excluding the basic 5 fields (
nonce
,balance
,storageRoot
,codeHash
andversion
). If EIPs included that are specific to modifying account fields do not modify VM execution logic, it is recommended that we specify an additional version whose execution logic is the same as previous version, but only the account fields are changed.
Rationale
This introduces account versioning via a new RLP item in account state. The design above gets account versioning by making the contract family always have the same version. In this way, versions are only needed to be provided by contract creation transaction, and there is no restrictions on formats of code for any version. If we want to support multiple newest VMs (for example, EVM and WebAssembly running together), then this will requires extensions such as 44-VERTXN and 45-VEROP.
Alternatively, account versioning can also be done through:
- 26-VER and 40-UNUSED: This makes an account's versioning soly dependent on its code header prefix. If with only 26-VER, it is not possible to certify any code is valid, because current VM allows treating code as data. This can be fixed by 40-UNUSED, but the drawback is that it's potentially backward incompatible.
- EIP-1891: Instead of writing version field into account RLP state, we write it in a separate contract. This can accomplish the same thing as this EIP and potentially reduces code complexity, but the drawback is that every code execution will require an additional trie traversal, which impacts performance.
Backwards Compatibility
Account versioning is fully backwards compatible, and it does not change how current contracts are executed.
Discussions
Performance
Currently nearly all full node implementations uses config parameters to decide which virtual machine version to use. Switching virtual machine version is simply an operation that changes a pointer using a different set of config parameters. As a result, this scheme has nearly zero impact to performance.
WebAssembly
This scheme can also be helpful when we deploy on-chain WebAssembly virtual machine. In that case, WASM contracts and EVM contracts can co-exist and the execution boundary and interaction model are clearly defined as above.
Test Cases and Implementations
To be added.
References
The source of this specification can be found at 43-VER.
Copyright
Copyright and related rights waived via CC0.