Jan 30, 2023

Creating NFT Contract: The Hard Way – Part 1

Solidity has an inline assembly language, “Yul”, for low-level Ethereum Virtual Machine (EVM) operations. Yul can also be used on its own: the generated bytecode can be executed on the EVM like any regular smart contract. To learn in more depth how smart contracts are created and executed, I decided to code an entire ERC-721 compliant smart contract in Yul.

This blog post covers the challenges I ran into during development, but it’s not a tutorial or a detailed post about how to create ERC-721 in Yul, or about Yul in general. It can be used as a reference for the low-level details of contract development. I’ve listed some links at the bottom if you’re interested in Yul or in advanced smart contract development in general.

The complete contract source can be found here on GitHub.

How smart contracts are created

If you deploy a smart contract you get a smart contract address. But how does this happen? Several other questions may arise, such as:

How are the contracts stored?
How is the constructor executed?
How are the addresses unique?
How do we call methods?

To find the answers, let’s explore what smart contracts are in their final form.

Bytecode

When we compile a Solidity smart contract, it’s converted into a sequence of bytes, which encodes a sequence of opcodes and their operands.

The following bytecode is for a simple contract that can accept ERC-721 tokens, i.e. it implements the ERC-721 receiver interface.

// ERC-721 Receiver Contract Bytecode

602b600d600039602b6000f3fe60003560e01c63150b7a028114601457600080fd5b606435806024013563150b7a028060005260206000f3

This bytecode is sent in the “data” field of a new transaction to create a smart contract on chain. The created smart contract is stored on-chain and its address is returned in the transaction receipt. But where do we send this transaction to?

The Null Recipient

To create/deploy a new contract, its bytecode must be in the data field of the transaction and the “to” field (i.e. the recipient address) must be omitted (thus the RLP empty byte sequence – 0x80). The contract is created using the provided bytecode, and if the execution is successful its new address is returned in the transaction receipt.

Following is a demonstration of deploying the above ERC-721 receiver contract in a local environment using the cethacea tool (https://github.com/elek/cethacea).

null recipient

As we can see in the transaction receipt, “to” is set to <nil> and data is set to our contract bytecode. We get the new contract address in the “contract” field. We can use this address to call methods of the deployed contract.

Why this contract address?

At first I thought the contract address would be randomly generated, but I had ignored consensus in the network! A randomly generated value can’t be part of blockchain consensus; it must be deterministic so all the chain nodes can agree. And that’s how Ethereum does it.

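For a regular deployment transaction, the contract address is the last 20 bytes of the keccak256 hash of the RLP-encoded sender address and transaction nonce. The CREATE2 opcode, used when a contract deploys another contract, instead hashes the deployer address, a salt value, and the bytecode hash. Here’s a short CREATE2 snippet from solidity-by-example.org.

function getAddress(
        bytes memory bytecode,
        uint _salt
    ) public view returns (address) {
        bytes32 hash = keccak256(
            abi.encodePacked(bytes1(0xff),
            address(this), _salt,
            keccak256(bytecode))
        );

        return address(uint160(uint(hash)));
    }

“bytecode” is our contract bytecode (with constructor params, more on that later) and “_salt” can be any number chosen by the deployer. With a plain deployment transaction there is no salt; the sender’s nonce plays that role, as it never repeats for a given sender.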

Now that we know how the contract address is generated, we can pre-determine our contract deployment address even before deploying it!
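To make the plain deployment case concrete, here is a minimal Python sketch of the bytes that get hashed: the RLP encoding of the list [sender, nonce]. The helper names are hypothetical, the sender is simply the address used later in this post, and the RLP helpers only handle the short inputs needed here. The final step (keccak256 of these bytes, keeping the last 20 bytes) is left out because Python's standard library has no Keccak-256; hashlib.sha3_256 uses different padding and gives a different result.

```python
# Hypothetical sender: the address that appears later in this post.
SENDER = bytes.fromhex("C09d65464237a40d7BF44e424Fe1d90cAfC0E402")


def rlp_encode_bytes(value: bytes) -> bytes:
    """RLP-encode a byte string shorter than 56 bytes."""
    if len(value) == 1 and value[0] < 0x80:
        return value  # a single small byte is its own encoding
    if len(value) >= 56:
        raise ValueError("long strings are not handled in this sketch")
    return bytes([0x80 + len(value)]) + value


def rlp_encode_list(items: list) -> bytes:
    """RLP-encode a list whose encoded payload is shorter than 56 bytes."""
    payload = b"".join(rlp_encode_bytes(item) for item in items)
    if len(payload) >= 56:
        raise ValueError("long lists are not handled in this sketch")
    return bytes([0xC0 + len(payload)]) + payload


def create_preimage(sender: bytes, nonce: int) -> bytes:
    """Bytes that are hashed to derive the address of a deployed contract."""
    # Integers are big-endian with no leading zeros, so nonce 0 is b"".
    nonce_bytes = nonce.to_bytes((nonce.bit_length() + 7) // 8, "big")
    return rlp_encode_list([sender, nonce_bytes])


print(create_preimage(SENDER, 0).hex())
print(create_preimage(SENDER, 1).hex())
```

Only the last byte differs between the two printed values, which is why each deployment from the same account lands at a different address.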

How constructor is executed

Most of us, when coding a smart contract, save some state variables to storage in the constructor.

address owner;
uint deployedBlock;
constructor(address _owner) {
	owner = _owner;
	deployedBlock = block.number; // only known once the transaction is mined
}

But we can’t pass the block number in the transaction when deploying the contract, as we don’t know ourselves which block our transaction will end up in.

Deployment Code

Here’s the interesting catch: the code which creates our actual contract is also part of the bytecode we send in the transaction data field. We can call this code “deployment code” because its purpose is to execute the constructor and return the bytecode of the actual contract to the EVM.

What?

When we send a transaction with a data field to the null recipient, the EVM treats the data field as code (the deployment code) and executes it. The bytecode returned by the deployment code is what actually gets stored on the blockchain; it is known as the runtime code.

Deployment code is also responsible for storing the state variables we set in the constructor, either by appending them to the bytecode or by writing them to storage.

Here’s an example from the ERC-721 contract in Yul.

object "NFT" {
    code{ // Deployment Code
        sstore(0x0,caller()) // store sender in storage
        datacopy(0,dataoffset("runtime"),datasize("runtime"))
        return(0x0,datasize("runtime")) // return the actual contract
    }
    object "runtime" { // Actual Contract
        code {
            /* .. */
        }
    }
}
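The same split can be seen in the ERC-721 receiver bytecode quoted earlier. The following Python sketch is a toy interpreter, not a real EVM: it understands only the three opcodes that deployment code uses (PUSH1, CODECOPY, RETURN) and shows which bytes the deployment code hands back as the runtime. The function name is an assumption made for illustration.

```python
# Bytecode of the ERC-721 receiver contract quoted earlier in the post.
BYTECODE = bytes.fromhex(
    "602b600d600039602b6000f3fe"
    "60003560e01c63150b7a028114601457600080fd5b"
    "606435806024013563150b7a028060005260206000f3"
)

PUSH1, CODECOPY, RETURN = 0x60, 0x39, 0xF3


def run_deployment_code(code: bytes) -> bytes:
    """Interpret just enough opcodes to see what the deployment code returns."""
    stack, memory, pc = [], bytearray(), 0
    while True:
        op = code[pc]
        if op == PUSH1:
            # Push the single byte that follows the opcode.
            stack.append(code[pc + 1])
            pc += 2
        elif op == CODECOPY:
            # Copy `size` bytes of the code itself into memory at `dest`.
            dest, offset, size = stack.pop(), stack.pop(), stack.pop()
            if len(memory) < dest + size:
                memory.extend(bytes(dest + size - len(memory)))
            memory[dest:dest + size] = code[offset:offset + size]
            pc += 1
        elif op == RETURN:
            # Hand a slice of memory back to the EVM: this is the runtime code.
            offset, size = stack.pop(), stack.pop()
            return bytes(memory[offset:offset + size])
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")


runtime = run_deployment_code(BYTECODE)
print(len(runtime), runtime.hex())
```

The first 13 bytes (up to and including fe) are the deployment code; everything after them is copied to memory and returned, which mirrors the datacopy and return calls in the Yul object.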

How methods are invoked

OK, so we have the contract address and the runtime code stored on the blockchain, but how do we invoke our methods?

The runtime code is executed much like a Python script: execution starts from the top every time we call any method on the contract. It’s the responsibility of the runtime code to execute the specific code we asked for in the data field of the transaction.

Here we’re calling “mint(address,uint256)” method of our ERC-721 contract.

calling mint method

Notice we’re passing the token owner and the token id, which is 5, as params in the command. In the transaction info we also have the data field, which is the actual data (calldata) being sent to our contract address. Our runtime code receives this sequence of bytes, and it’s up to the runtime code to decide which code gets executed.

How this data is computed – ABI

We can pass any data to contracts, and each contract can interpret the call data in its own way. But for consistency there’s a standard way to call contract methods, the Application Binary Interface (ABI); all ABI-compliant contract methods can be invoked in the same standard way as the call we made before. The call data is encoded from the signature of the method we want to execute along with the parameters we want to pass.

ABI Encoding of methods

Method signature

In ABI encoding the first four bytes of call data are always the first four bytes of the keccak256 hash of the method signature. The mint function of our contract has the following signature:

mint(address,uint256)

And if we compute its keccak256 we get the following 64-character hex string (32 bytes):

computing keccak256 of mint method

Notice the first four bytes, which are the first 8 characters of our hex output. They’re exactly the same as the start of the “data” field in our invoke transaction above!
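For readers who want to reproduce the method id without extra tooling, here is a self-contained Python sketch. Python's hashlib does not ship Keccak-256 (sha3_256 pads differently), so the sketch carries a compact, unoptimized Keccak permutation of its own; in practice a maintained library would be the better choice. The function names are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1
RATE = 136  # bytes absorbed per permutation for a 256-bit output


def rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def keccak_f1600(lanes):
    """Apply the 24-round Keccak permutation to a 5x5 grid of 64-bit lanes."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column parity into its neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: fold in the round constant, generated by a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 as used by the EVM (domain byte 0x01, not SHA3's 0x06)."""
    padded = bytearray(data) + bytes(RATE - len(data) % RATE)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for start in range(0, len(padded), RATE):
        for i in range(RATE // 8):
            chunk = padded[start + 8 * i:start + 8 * i + 8]
            lanes[i % 5][i // 5] ^= int.from_bytes(chunk, "little")
        lanes = keccak_f1600(lanes)
    return b"".join(lanes[x][0].to_bytes(8, "little") for x in range(4))


def selector(signature: str) -> str:
    """First four bytes of the signature hash, as hex."""
    return keccak256(signature.encode("ascii"))[:4].hex()


print(selector("mint(address,uint256)"))
```

The signature must be written exactly as the ABI expects: no spaces, no parameter names, and canonical type names such as uint256 rather than uint.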

So now we’ve passed our contract the unique id of the function it should invoke in call data. But how do we pass the parameters necessary for the execution?

Parameters encoding

The parameters are passed after the 4 bytes of the method id. We know from our function signature that the first parameter must be an address, which is 20 bytes long, and the second parameter is the token id, which must be 32 bytes.

The Address Type

In the EVM there’s just one native type, uint256 (32 bytes in size). EVM storage can only store a sequence of uint256 values. The ABI keeps that in mind and designs its encoding around this type.

Even though the address parameter is only 20 bytes long, it is left-padded with zeros to 32 bytes when passed in call data. So in our above call, the address parameter we passed was

C09d65464237a40d7BF44e424Fe1d90cAfC0E402

But in ABI encoding it’s padded to 32 bytes in size becoming

000000000000000000000000c09d65464237a40d7bf44e424fe1d90cafc0e402

and it’s exactly the sequence we have in our call data above.

Uint256 type

Since the uint256 type is native to the EVM, no extra encoding is needed; it’s passed as is. We passed 5 as the token id, and we can see the value 5 as a 32-byte padded uint256.

Following is the encoding we get

mint call encoding highlight
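The whole encoding can be reproduced in a few lines of Python. This is a minimal sketch for static types only; the helper names are hypothetical, and the method id 40c10f19 is taken as given from the post rather than computed.

```python
MINT_SELECTOR = bytes.fromhex("40c10f19")  # method id of mint(address,uint256)


def encode_address(address_hex: str) -> bytes:
    """Left-pad a 20-byte address with zeros to a full 32-byte word."""
    raw = bytes.fromhex(address_hex)
    if len(raw) != 20:
        raise ValueError("an address must be exactly 20 bytes")
    return raw.rjust(32, b"\x00")


def encode_uint256(value: int) -> bytes:
    """Encode an unsigned integer as a 32-byte big-endian word."""
    return value.to_bytes(32, "big")


calldata = (
    MINT_SELECTOR
    + encode_address("C09d65464237a40d7BF44e424Fe1d90cAfC0E402")
    + encode_uint256(5)
)

# Print the method id, then each 32-byte parameter word on its own line.
print(calldata[:4].hex())
for start in range(4, len(calldata), 32):
    print(calldata[start:start + 32].hex())
```

The result is 68 bytes long: 4 bytes of method id followed by two 32-byte words.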

Note on Dynamic Types

Solidity has dynamic types such as bytes and arrays. These dynamic types are encoded differently because their size isn’t fixed; the encoding depends on their length. Refer to the ABI specification for detailed info.

We still need to know how runtime code parses this ABI encoded call data and executes the function.

How runtime code parses calldata

We know function signature id is always in the first four bytes of call data so we can extract the first four bytes, and then compare this id with existing function signature ids the contract has.

object "runtime"{
    code {
        // extract first four bytes by shifting 224 (0xe0) bits to right
        switch shr(0xe0,calldataload(0x0))
        case 0x70a08231 {
            /// function balanceOf(address _owner) external view returns (uint256);
            returnUint(balanceOf(calldataload(0x4)))
        }
        case 0x6352211e {
            /// function ownerOf(uint256 _tokenId) external view returns (address);

            returnUint(ownerOf(calldataload(0x4)))
        }
        case 0x40c10f19 {
            /// Non-EIP
            /// function mint(address to,uint256 tokenId) external returns (uint256 tokenId)

            mint(calldataload(0x4),calldataload(0x24))
            return (0x0,0x0)
        }
        /*... */

        default {
            revertABI()
        }
    }
}

Since the code is always executed from the top, we always start by extracting the first four bytes, i.e. the id of the function to execute, and comparing them with the existing function signature ids.

Notice the third case: for our mint function, this case matches the function id in the call data. We call our custom mint function and pass it the parameters extracted from call data.

If none of the cases match, it means we don’t have the function the sender wants to invoke; in this case our default body executes and reverts the transaction.
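The dispatch logic can be modelled outside the EVM as well. The Python sketch below mimics calldataload, the 224-bit right shift, and the switch over method ids from the Yul runtime. It only returns the matched name and raw arguments instead of executing anything, and the names calldataload, dispatch and HANDLERS are assumptions made for illustration.

```python
# Method ids from the switch statement, with the number of 32-byte arguments.
HANDLERS = {
    0x70A08231: ("balanceOf", 1),
    0x6352211E: ("ownerOf", 1),
    0x40C10F19: ("mint", 2),
}


def calldataload(calldata: bytes, offset: int) -> int:
    """Read a 32-byte word at `offset`, zero-padded past the end like the EVM."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(word, "big")


def dispatch(calldata: bytes):
    """Return (method name, raw arguments), or raise to model a revert."""
    # Shifting the first word right by 224 bits leaves only the top four bytes.
    method_id = calldataload(calldata, 0x0) >> 0xE0
    if method_id not in HANDLERS:
        raise ValueError(f"revert: unknown method id 0x{method_id:08x}")
    name, arg_count = HANDLERS[method_id]
    args = [calldataload(calldata, 0x4 + 0x20 * i) for i in range(arg_count)]
    return name, args


mint_call = bytes.fromhex(
    "40c10f19"
    "000000000000000000000000c09d65464237a40d7bf44e424fe1d90cafc0e402"
    "0000000000000000000000000000000000000000000000000000000000000005"
)
name, args = dispatch(mint_call)
print(name, hex(args[0]), args[1])
```

Arguments start at offset 0x4 and advance by 0x20, which is why the Yul code reads the second mint parameter at 0x24.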

We can see how EVM is comparing the functions in the following image.

debug output of transaction

We push the first 32 bytes (256 bits) of call data onto the stack (CALLDATALOAD) and right-shift (SHR) them by 224 bits. We then duplicate (DUP2) the method id so we can compare it with each case (EQ). On lines 12, 23, 34 and 45 we can see the EQ opcode being used to compare against the cases we wrote in the switch statement. Once a case matches, we jump to that function.

Conclusion

We’ve covered the contract creation process, the application binary interface, and the contract execution flow, and that’s it for this part. But there’s still a lot more to cover, including:

Layout of storage.
Emitting events (logs).
Reverting with ABI compliant errors.
Calling other contracts’ methods from our contract.
Debugging our contract.

See you in the second part.