Ethereum Virtual Machine

A practical guide for Solidity developers

Published in

Coinmonks

4 min read · Oct 9, 2022

What is EVM?

You might have heard of the Java Virtual Machine (JVM): developers write code in Java (or another JVM language), it compiles down to Java bytecode, and the JVM is the runtime engine that executes that bytecode. The Ethereum Virtual Machine (EVM) is similar at a high level: developers write code in Solidity (or another language such as Vyper), it compiles down to EVM bytecode, and the EVM is the runtime engine that executes that bytecode.

High-level overview of the EVM (credit: ethereum.org)

How does it work?

The EVM is a transaction-based state machine: if a transaction fails, the state is not updated, except for the sender’s nonce and the ETH deducted from the sender’s balance to pay for gas.

EVM code: a developer writes the smart contract, compiles it into bytecode, then signs and sends a transaction on-chain to create the smart contract in the EVM.
When a user or another smart contract interacts with a smart contract, the EVM loads that contract's bytecode (a sequence of opcodes and their operands).
The EVM then executes the opcodes in order. As execution proceeds, the program counter (the offset of the next instruction) advances, and the available gas decreases by an amount that depends on each opcode; if the gas runs out, execution is terminated. Any operation that writes to storage persists in the account's storage (only smart contract accounts have storage).
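The execution loop described above can be sketched in a few lines of Python. This is a toy model, not a real EVM: it supports only four opcodes, and the gas prices in `GAS_COST` are simplified placeholders chosen for illustration. The names `run`, `GAS_COST` and `OutOfGas` are hypothetical helpers, not part of any library.

```python
# Toy model of the EVM execution loop; not a real EVM implementation.
# Opcode byte values match the EVM, gas prices are simplified placeholders.
STOP, ADD, SSTORE, PUSH1 = 0x00, 0x01, 0x55, 0x60
GAS_COST = {STOP: 0, ADD: 3, PUSH1: 3, SSTORE: 20000}  # illustrative only


class OutOfGas(Exception):
    """Raised when the remaining gas cannot pay for the next opcode."""


def run(code, gas, storage):
    """Execute `code`; return the updated storage or raise OutOfGas."""
    pc = 0                   # program counter: offset of the next opcode
    stack = []
    pending = dict(storage)  # work on a copy so a failure changes nothing
    while pc < len(code):
        op = code[pc]
        if op not in GAS_COST:
            raise ValueError(f"unsupported opcode 0x{op:02x} at pc={pc}")
        if GAS_COST[op] > gas:
            raise OutOfGas(f"need {GAS_COST[op]} gas at pc={pc}, have {gas}")
        gas -= GAS_COST[op]
        if op == STOP:
            break
        elif op == PUSH1:
            stack.append(code[pc + 1])  # the byte after PUSH1 is its operand
            pc += 1                     # skip over that operand
        elif op == ADD:
            stack.append((stack.pop() + stack.pop()) % 2**256)
        elif op == SSTORE:
            key, value = stack.pop(), stack.pop()  # key is on top of the stack
            pending[key] = value
        pc += 1
    return pending


# PUSH1 2, PUSH1 3, ADD, PUSH1 0, SSTORE, STOP: stores 2 + 3 in slot 0
program = bytes.fromhex("600260030160005500")
print(run(program, gas=30000, storage={}))  # {0: 5}
```

Calling `run(program, gas=100, storage={})` raises `OutOfGas` when it reaches SSTORE, and the storage passed in is left untouched, which mirrors the "failed transaction does not update state" rule.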

What is opcode?

At the time of writing there are 141 EVM opcodes, which fall into a few categories of tasks such as the ones below (not exhaustive). The full list, with the gas cost of each opcode, can be found at: https://www.evm.codes/

Load from and save to memory/storage ( mload | mstore | sload | sstore )
Control flow ( jump | jumpi )
Arithmetic ( add | mul | sub | div etc. )
Comparison ( eq | iszero )
EVM environment ( gas | caller )

Example: Solidity -> ByteCode -> OpCode

In this section, we’ll use a simple Solidity example and walk through:

What a contract looks like in bytecode and opcodes
How the EVM knows which opcodes to execute for a function call

What a contract looks like in bytecode and opcodes

We will use a simple contract, Demo.sol, that stores a single uint256 value and exposes a setVal(uint256) function to update it.

Compile it to bytecode with solcjs:
> solcjs -o BytecodeDir --bin src/Demo.sol

608060405234801561001057600080fd5b50610133806100206000396000f3fe6080604052348015600f57600080fd5b506004361060325760003560e01c80633c6bb4361460375780633d4197f0146051575b600080fd5b603d6069565b604051604891906090565b60405180910390f35b606760048036038101906063919060d5565b606f565b005b60005481565b8060008190555050565b6000819050919050565b608a816079565b82525050565b600060208201905060a360008301846083565b92915050565b600080fd5b60b5816079565b811460bf57600080fd5b50565b60008135905060cf8160ae565b92915050565b60006020828403121560e85760e760a9565b5b600060f48482850160c2565b9150509291505056fea26469706673582212202008ce9f466108e302b16a0a7d864882e3925c470c9a7371c7c03dfa5083b56f64736f6c63430008110033

You can then use Etherscan’s opcode tool to convert the bytecode to opcodes (the solc command can do this as well):

[0] PUSH1 0x80
[2] PUSH1 0x40
[4] MSTORE
[5] CALLVALUE
...
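To see how bytes map to opcodes without an online tool, here is a minimal disassembler sketch. It is an illustration under stated assumptions: the `OPCODES` table covers only a handful of opcodes (anything else is printed as UNKNOWN), and `disassemble` is a hypothetical helper, not the tool Etherscan or solc uses. The number on each line is the byte offset of the opcode.

```python
# Partial opcode table: just enough for the start of the bytecode above.
OPCODES = {
    0x00: "STOP", 0x14: "EQ", 0x15: "ISZERO", 0x34: "CALLVALUE",
    0x35: "CALLDATALOAD", 0x50: "POP", 0x52: "MSTORE", 0x56: "JUMP",
    0x57: "JUMPI", 0x5B: "JUMPDEST", 0xFD: "REVERT",
}


def disassemble(hex_code):
    """Return a list of (byte offset, mnemonic) pairs for a hex string."""
    code = bytes.fromhex(hex_code)
    result = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:            # PUSH1..PUSH32 carry inline data
            size = op - 0x5F              # number of data bytes that follow
            data = code[pc + 1:pc + 1 + size]
            result.append((pc, f"PUSH{size} 0x{data.hex()}"))
            pc += 1 + size
            continue
        if 0x80 <= op <= 0x8F:
            name = f"DUP{op - 0x7F}"
        elif 0x90 <= op <= 0x9F:
            name = f"SWAP{op - 0x8F}"
        else:
            name = OPCODES.get(op, f"UNKNOWN_0x{op:02x}")
        result.append((pc, name))
        pc += 1
    return result


# First 11 bytes of the compiled bytecode shown above
for offset, name in disassemble("6080604052348015600f57"):
    print(f"[{offset}] {name}")
```

Note that the byte 0x80 means DUP1 when it is read as an opcode, but here it is the operand of the first PUSH1, which is why a disassembler has to skip over push data instead of decoding every byte.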

When you call a function, e.g. setVal(uint256), your input data contains both the function selector (the first 4 bytes of the Keccak-256 hash of the function signature) and the input value. For example, when you call setVal(10), the first 4 bytes are 3d4197f0 and the next 32 bytes contain your input value of 10:

> abi.encodeWithSignature("setVal(uint256)", 10)
0x3d4197f0000000000000000000000000000000000000000000000000000000000000000a
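The layout of that input data can be reproduced by hand. The sketch below does not compute the selector (Python's standard library has no Keccak-256, and `hashlib.sha3_256` is a different function); it takes 3d4197f0 from this post as a given and only shows how the 32-byte argument is appended. `encode_set_val` is a hypothetical helper name.

```python
# Selector for setVal(uint256), copied from the article rather than computed.
SET_VAL_SELECTOR = bytes.fromhex("3d4197f0")


def encode_set_val(value):
    """Build calldata: 4-byte selector + value as a 32-byte big-endian word."""
    if not 0 <= value < 2**256:
        raise ValueError("value does not fit in a uint256")
    return SET_VAL_SELECTOR + value.to_bytes(32, "big")


calldata = encode_set_val(10)
print("0x" + calldata.hex())
print(len(calldata), "bytes")  # 36 bytes
```

The total is 36 bytes: 4 for the selector and 32 for the single uint256 argument, left-padded with zeros.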

And you can try searching for 3d4197f0 — it's present in the bytecode! This is the first hint on how EVM knows which opcode to execute for a method call 😀
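A plain text search works, but you can also pull the selectors out programmatically. The sketch below walks a slice of the bytecode above and looks for the pattern PUSH4 selector, EQ, PUSH1 destination, JUMPI. That pattern is an assumption about what this particular compiler output looks like, not a rule of the EVM, and `find_selectors` is a hypothetical helper.

```python
def find_selectors(hex_code):
    """Map each 4-byte selector to the jump destination that follows it.

    Looks for: PUSH4 <selector> EQ PUSH1 <dest> JUMPI
    """
    code = bytes.fromhex(hex_code)
    table = {}
    pc = 0
    while pc < len(code):
        op = code[pc]
        if (op == 0x63                              # PUSH4
                and code[pc + 5:pc + 6] == b"\x14"  # EQ
                and code[pc + 6:pc + 7] == b"\x60"  # PUSH1
                and code[pc + 8:pc + 9] == b"\x57"):  # JUMPI
            table[code[pc + 1:pc + 5].hex()] = code[pc + 7]
        if 0x60 <= op <= 0x7F:
            pc += op - 0x5F   # skip the data bytes of PUSH1..PUSH32
        pc += 1
    return table


# Slice of the bytecode above that contains the selector comparisons
fragment = "60003560e01c80633c6bb4361460375780633d4197f0146051575b600080fd"
for selector, dest in find_selectors(fragment).items():
    print(selector, "->", hex(dest))
```

For this fragment the result contains two selectors, and 3d4197f0 maps to the destination 0x51.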

How the EVM knows which opcodes to execute for a function call

Step 1:
Copy the above Demo.sol into Remix, then compile and deploy it. Call setVal(10) and press the “Debug” icon, which should bring you to the debugger.

The debugger opens directly at the opcodes below, which are in the function body. What we are interested in, though, is how the EVM got to these opcodes, not the opcodes that execute the function.

112 DUP1 
113 PUSH1 00
115 DUP2

Step 2:
Press "step back" until you see the EQ instruction highlighted.

At this point, you can see the function selector on the stack. EQ (takes 2 items off the stack and pushes 1 if they are equal, otherwise 0) is the next opcode to be executed. Since both values are equal, EQ pushes 1. Stack now:

0: 0x..01
1: 0x..3d4197f0

Followed by PUSH1 51, which pushes 0x51 onto the stack. Stack now:

0: 0x..51
1: 0x..01
2: 0x..3d4197f0

Followed by the key part, JUMPI (pops 2 values off the stack and jumps to the program counter given at index 0 if the value at index 1 is non-zero; otherwise execution continues with the next instruction).

In our example, since the stack’s index 1 is non-zero, the program counter moves to 0x51, which is the entry point of our function. This is how the EVM matches the function selector and proceeds to the correct opcodes.

What if the values are not equal when JUMPI runs? The program does not jump, and execution eventually reaches the REVERT opcode at 054.
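The steps above can be replayed on a plain Python list to make the stack movements concrete. This sketch models only the comparison against 3d4197f0 (the earlier comparison against the other selector is left out), and `dispatch` is a hypothetical helper. The end of the list plays the role of index 0 in the debugger's stack view.

```python
SET_VAL = 0x3D4197F0   # selector from the article
FUNCTION_ENTRY = 0x51  # jump destination pushed before JUMPI


def dispatch(calldata):
    """Return 0x51 if the selector matches, or None if execution falls through."""
    selector = int.from_bytes(calldata[:4], "big")
    stack = [selector]                 # left on the stack by earlier opcodes

    stack.append(stack[-1])            # DUP1: copy the selector
    stack.append(SET_VAL)              # PUSH4 3d4197f0
    a, b = stack.pop(), stack.pop()    # EQ pops two items...
    stack.append(1 if a == b else 0)   # ...and pushes 1 if equal, else 0
    stack.append(FUNCTION_ENTRY)       # PUSH1 51

    dest, condition = stack.pop(), stack.pop()  # JUMPI pops both
    if condition != 0:
        return dest                    # jump to the function's entry point
    return None                        # no jump: execution continues to REVERT


set_val_call = bytes.fromhex("3d4197f0") + (10).to_bytes(32, "big")
print(dispatch(set_val_call))               # 81, i.e. 0x51
print(dispatch(bytes.fromhex("deadbeef")))  # None
```

In both cases one copy of the selector is still on the stack after JUMPI, which matches the last entry in the debugger's stack view above.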

Ending note

Understanding how the EVM works helps you become a better Solidity dev, just as Java devs benefit from understanding the JVM. For example, compare the gas cost of the mstore and sstore opcodes and you will see the difference between storing data in memory and storing it in storage.

I hope this post helps! In the next post, we will look at gas optimization for smart contracts, where this opcode knowledge should come in handy!

Reference

These articles have helped me tremendously to understand EVM, credit to them!
