Simple Summary
Changes the memory model for the EVM to use pagination.
Abstract
Presently, the EVM charges for memory as a linear array starting at address 0 and extending to the highest address that has been read from or written to. This suffices for simple uses, but means that compilers have to generate programs that use memory compactly, which leads to wasted gas with reallocation of memory elements, and makes some memory models such as separate heap and stack areas impractical. This EIP proposes changing to a page-based billing model, which adds minimal complexity to implementations, while providing for much more versatility in EVM programs.
Motivation
Most modern computers implement "virtual memory" for userspace programs, where programs have access to a large address space, with pages of RAM that are allocated as needed by the OS. This allows them to distribute data throughout memory in ways that minimises the amount of reallocation and copying that needs to go on, and permits flexible use of memory for data with different lifetimes. Implementing a simple version of paged memory inside the EVM will provide the same flexibility to compilers targeting the EVM.
Specification
Parameters
Constant | Value |
---|---|
FORK_BLOCK | TBD |
PAGE_BITS | 10 |
PAGE_BASE_COST | 96 |
For blocks where block.number >= FORK_BLOCK
, the following changes apply.
Changes to memory allocation in EVM implementations
Memory is now allocated in pages of 2**PAGE_BITS
bytes each. The most significant 256 - PAGE_BITS
bits of each memory address denote the page number, while the least significant PAGE_BITS
bits of the memory address denote the location in the page. Pages are initialized to contain all zero bytes and allocated when the first byte from a page is read or written.
EVM implementations are encouraged to store the pagetable as an associative array (eg, hashtable or dict) mapping from page number to an array of bytes for the page.
Changes to memory expansion gas cost
Presently, the total cost to extend the memory to a
words long is Cmem(a) = 3 * a + floor(a ** 2 / 512)
. If the memory is already b
words long, the incremental cost is Cmem(a) - Cmem(b)
. a
is the number of words required to cover the range from memory address 0 to the last word that has been read or written by the EVM.
Under this EIP, we define a new memory cost function, based on the number of allocated pages. This function is Cmem'(p) = max(PAGE_BASE_COST * (p - 1) + floor(2 * (p - 1) ** 2), 0)
. As above, if the memory already contains q
pages, the incremental cost is Cmem'(p) - Cmem'(q)
.
Changes to MLOAD
and MSTORE
Loading a word from memory or storing a word to memory requires instantiating any pages that it touches that do not already exist, with the resulting gas cost as described above. If the word being loaded or stored resides in a single page, the gas cost remains unchanged at 3 gas. If the word being loaded spans two pages, the cost is 6 gas.
Changes to other memory-touching opcodes
CALLDATACOPY
, CODECOPY
, EXTCODECOPY
, CALL
, CALLCODE
, DELEGATECALL
, STATICCALL
, CREATE
, MSTORE8
and any other opcodes that read or write memory are modified as follows:
- Any page they touch for reading or writing is instantiated if it is not already.
- Memory expansion gas is charged as described above.
Rationale
Memory expansion gas cost
The new gas cost follows the same curve as the previous one, while ensuring that the new gas cost is always less than or equal to the previous cost. This prevents existing programs that make assumptions about memory allocation gas costs from resulting in errors, without unduly discounting memory below what it costs today. Intuitively, a program that uses up to a page boundary pays for one page less than they would under the old model, while a program that uses one word more than a page boundary pays for one word less than they would under the old model.
We believe that this incremental reduction will not have a significant impact on the effective gas limit, as it gets proportionally smaller as programs use more RAM.
Additional cost for MLOADs and MSTOREs spanning two pages
Loading or storing data spanning two memory pages requires more work from the EVM implementation, which must split the word at the page boundary and update the two (possibly disjoint) pages. Since we cannot guarantee loads and stores in existing EVM programs are page-aligned, we cannot prohibit this behaviour for efficiency. Instead, we propose treating each as two loads or stores for gas accounting purposes. This discourages the use of this functionality, and accounts for the additional execution cost, without prohibiting it outright.
This will result in additional gas costs for any programs that perform these operations. We believe this to be minimal, and hope to do future analysis to confirm this.
Backwards Compatibility
The new function for memory expansion gas cost is designed specifically to avoid backwards compatibility issues by always charging less than or equal to the amount the current EVM would charge. Under some circumstances existing programs will be charged more for MLOADs and MSTOREs that span page boundaries as described above; we believe these changes will affect a minimum of programs and have only a small impact on their gas consumption.
Test Cases
TBD
Security Considerations
Potential CPU DoS issues arising from additional work done under the new model are alleviated by charging more for non-page-aligned reads and writes. Charges for memory expansion asymptotically approach those currently in force, so this change will not permit programs to allocate substantially more memory than they can today.
Copyright
Copyright and related rights waived via CC0.