Builtins and implicit arguments¶
Introduction¶
Builtins are predefined optimized low-level execution units which are added to the Cairo CPU board to perform predefined computations which are expensive to perform in vanilla Cairo (e.g., range-checks, Pedersen hash, ECDSA, …).
The communication between the CPU and the builtins is done through memory: each builtin is assigned a continuous area in the memory and applies some constraints (depending on the builtin definition) on the memory cells in that area. For example, the Pedersen builtin will enforce that:
[p + 2] = hash([p + 0], [p + 1])
[p + 5] = hash([p + 3], [p + 4])
[p + 8] = hash([p + 6], [p + 7])
...
Cairo code may read/write from those memory cells to “invoke” the builtin.
The following code verifies that hash(x, y) == z
:
// Write the value of x to [p + 0].
x = [p];
// Write the value of y to [p + 1].
y = [p + 1];
// The builtin enforces that [p + 2] == hash([p + 0], [p + 1]).
z = [p + 2];
Once we use the addresses [p + 0]
, [p + 1]
, [p + 2]
in order to compute the first hash
we cannot use them again to compute a different hash (since Cairo memory is immutable). Instead,
we should use [p + 3]
, [p + 4]
, [p + 5]
, and so on.
This means that we have to keep track of a pointer to the next unused builtin instance.
The convention is that functions which use the builtin should get that pointer as an argument
and return an updated pointer to the next unused instance.
A more complete version of the example above would look like this:
func hash2(hash_ptr: felt*, x, y) -> (hash_ptr: felt*, z: felt) {
// Invoke the hash function.
x = [hash_ptr];
y = [hash_ptr + 1];
// Return the updated pointer (increased by 3 memory cells)
// and the result of the hash.
return (hash_ptr=hash_ptr + 3, z=[hash_ptr + 2]);
}
We can use Typed references with the type HashBuiltin
from
starkware.cairo.common.cairo_builtins
as follows:
from starkware.cairo.common.cairo_builtins import HashBuiltin
func hash2(hash_ptr: HashBuiltin*, x, y) -> (
hash_ptr: HashBuiltin*, z: felt
) {
let hash = hash_ptr;
// Invoke the hash function.
hash.x = x;
hash.y = y;
// Return the updated pointer (increased by 3 memory cells)
// and the result of the hash.
return (hash_ptr=hash_ptr + HashBuiltin.SIZE, z=hash.result);
}
Implicit arguments¶
If a function foo()
calls hash2()
, foo()
must also get and return the
builtin pointer (hash_ptr
) and so does every function calling foo()
.
Since this pattern is so common, Cairo has syntactic sugar for it, called “Implicit arguments”.
Take a look at the following implementation of hash2
(note the function declaration in particular):
from starkware.cairo.common.cairo_builtins import HashBuiltin
func hash2{hash_ptr: HashBuiltin*}(x, y) -> (z: felt) {
// Create a copy of the reference and advance hash_ptr.
let hash = hash_ptr;
let hash_ptr = hash_ptr + HashBuiltin.SIZE;
// Invoke the hash function.
hash.x = x;
hash.y = y;
// Return the result of the hash.
// The updated pointer is returned automatically.
return (z=hash.result);
}
The curly braces declare hash_ptr
as an implicit argument. This automatically adds
an argument and a return value to the function.
If you’re using the high-level return
statement, you don’t have to explicitly return
hash_ptr
. The Cairo compiler just returns the current binding of the hash_ptr
reference. Since hash2
has to return the pointer to the next available instance,
we added the reference rebinding:
let hash_ptr = hash_ptr + HashBuiltin.SIZE;
.
Note that its only effect is on the return statement (implicitly).
Calling a function that gets implicit arguments¶
Cairo’s standard library includes hash2
in the module starkware.cairo.common.hash
.
You can call hash2()
in a few ways:
Explicitly, using
{x=y}
, wherex
is the name of the implicit argument andy
is the name of the reference to bind to it. We use the word “bind” sincey
is not merely passed tofoo
– after the call,y
will be rebound to the value returned byfoo
for the implicit argumentx
.from starkware.cairo.common.cairo_builtins import HashBuiltin from starkware.cairo.common.hash import hash2 func foo{hash_ptr0: HashBuiltin*}() -> (z: felt) { let (z) = hash2{hash_ptr=hash_ptr0}(1, 2); // The previous statement rebinds the value of hash_ptr0. // If hash_ptr0 were used here, it would've referred to // the updated value, rather than foo's argument. return (z=z); }
Note that you must use named arguments with implicit arguments.
Implicitly, if the calling function also has an implicit argument named
hash_ptr
:from starkware.cairo.common.cairo_builtins import HashBuiltin from starkware.cairo.common.hash import hash2 func foo{hash_ptr: HashBuiltin*}() -> (z: felt) { let (z) = hash2(1, 2); // The previous statement rebinds the value of hash_ptr. // If hash_ptr were used here, it would've referred to // the updated value, rather than foo's argument. return (z=z); }
Trying to use
hash2(1, 2)
if there’s is no reference namedhash_ptr
, or this reference is not an implicit argument (or marked using awith
statement as you’ll see below) will fail.Implicitly, inside a
with
statement on a reference namedhash_ptr
:from starkware.cairo.common.cairo_builtins import HashBuiltin from starkware.cairo.common.hash import hash2 func foo(hash_ptr: HashBuiltin*) -> ( hash_ptr: HashBuiltin*, z: felt ) { // Use a with-statement, since 'hash_ptr' is not an // implicit argument. with hash_ptr { let (z) = hash2(1, 2); } return (hash_ptr=hash_ptr, z=z); }
The purpose of the
with
statement is to make the code more readable: The call tohash2
changes (rebinds) the referencehash_ptr
, even thoughhash_ptr
is not mentioned in that line. While it is extremely convenient to program this way, it makes it difficult to understand which function call changes what variable. Therefore, the only references that may be implicitly changed are implicit arguments and references mentioned in awith
statement.
Using the implicit argument mechanism, and helper functions, such as hash2
,
you don’t have to worry about the builtin pointer – all you have to do is add hash_ptr
as an implicit argument and then you can call hash2
without explicitly passing
the pointer.
Revoked implicit arguments¶
Try to compile the following code:
from starkware.cairo.common.cairo_builtins import HashBuiltin
from starkware.cairo.common.hash import hash2
func foo(n) {
if (n == 0) {
return ();
}
foo(n=n - 1);
return ();
}
func bar{hash_ptr: HashBuiltin*}() {
hash2(1, 2);
foo(3);
hash2(3, 4);
return ();
}
You should get the following error:
test.cairo:15:5: While trying to retrieve the implicit argument 'hash_ptr' in:
hash2(3, 4);
^*********^
hash.cairo:13:12: Reference 'hash_ptr' was revoked.
func hash2{hash_ptr: HashBuiltin*}(x, y) -> (result: felt) {
^********************^
Reference was defined here:
test.cairo:13:5
hash2(1, 2);
^*********^
To understand why you got this error, you should note that implicit arguments are implemented as references and as such they can be revoked.
In this case, the line hash2(1, 2)
rebinds hash_ptr
to the value returned
by hash2
(due to the implicit argument of hash2
).
This reference is relative to the ap
register.
The call to foo()
revokes this reference since the compiler cannot track the expected change
to the ap
register. On the other hand, the line hash2(3, 4)
requires this reference,
which is the reason we got the error.
To solve it, you can add the line local hash_ptr: HashBuiltin* = hash_ptr;
which copies the value of the reference
to a local variable (and rebinds the reference accordingly) just after the call to hash2(1, 2)
(where the revoked reference was defined).
In fact, it suffices to add alloc_locals;
to the function, and the Cairo compiler will
automatically add this instruction for you.
from starkware.cairo.common.cairo_builtins import HashBuiltin
from starkware.cairo.common.hash import hash2
func foo(n) {
if (n == 0) {
return ();
}
foo(n=n - 1);
return ();
}
func bar{hash_ptr: HashBuiltin*}() {
alloc_locals;
hash2(1, 2);
// You can skip the line below, and the compiler will add
// it automatically, because of the alloc_locals keyword.
local hash_ptr: HashBuiltin* = hash_ptr;
foo(3);
hash2(3, 4);
return ();
}
After the line local hash_ptr: HashBuiltin* = hash_ptr;
the reference hash_ptr
is relative to
fp
(rather than ap
) so it’s not revoked by the call to foo()
.
The compiler is not always able to add such instructions automatically, for example where if-blocks and jumps are involved. In such cases you will have to add them manually. Consider the following example:
from starkware.cairo.common.cairo_builtins import HashBuiltin
from starkware.cairo.common.hash import hash2
func bar{hash_ptr: HashBuiltin*}(x) {
if (x == 0) {
hash2(1, 2);
}
hash2(3, 4);
return ();
}
In this case, the hash_ptr
reference is revoked because its binding depends on
whether the branch x == 0
was taken or not.
If x == 0
, the reference points to the value returned from hash2(1, 2)
and otherwise
it points to the implicit argument of bar
.
A possible solution is to rebind hash_ptr
at the end of both branches
(this necessitates adding an explicit else
block) using a temporary variable:
from starkware.cairo.common.cairo_builtins import HashBuiltin
from starkware.cairo.common.hash import hash2
func bar{hash_ptr: HashBuiltin*}(x: felt) {
if (x == 0) {
hash2(1, 2);
tempvar hash_ptr = hash_ptr;
} else {
tempvar hash_ptr = hash_ptr;
}
hash2(3, 4);
return ();
}
The fact that the temporary variable is defined at the end of both branches implies
that after the if
statement is completed,
the hash_ptr
reference is at the same location with respect to ap
whether x == 0
or not
(in our case the reference is going to be [ap - 1]
).
Layouts¶
Cairo supports a few possible layouts.
Each layout specifies which of the different builtins exist
and how many instances of that builtin can be used.
This is measured as the ratio between the number of instructions and
the number of available builtin instances. For example, if this ratio of a hash builtin is 16,
it means that the number of hash invocations can be at most n_steps / 16
where
n_steps
is the number of Cairo steps.
If your program needs more hash invocations, you can either increase the number of steps
(using the --steps
flag) or choose a layout with a smaller ratio.
The plain
layout, which is the default layout, has no builtins.
Thus, if your program needs to write output, compute the Pedersen hash or use another builtin,
you will need to call cairo-run
with another layout, which is specified using the
--layout
flag.
The small
layout¶
The small
layout (--layout=small
) includes the following builtins:
Builtin name Ratio
---------------------
Output -
Pedersen 8
Range check 8
ECDSA 512
Note: Since the number of ECDSA
instances is n_steps / 512
and
it must be an integer, it implies that the number of steps must be divisible by 512
when the small
layout is used.
The %builtins
directive¶
The %builtins
directive specifies which builtins are used by the program.
Each builtin adds an argument to main()
and requires a return value.
Those can be replaced by adding implicit arguments to main
.
For example,
%builtins output pedersen
from starkware.cairo.common.cairo_builtins import HashBuiltin
from starkware.cairo.common.hash import hash2
// Implicit arguments: addresses of the output and pedersen
// builtins.
func main{output_ptr, pedersen_ptr: HashBuiltin*}() {
// The following line implicitly updates the pedersen_ptr
// reference to pedersen_ptr + 3.
let (res) = hash2{hash_ptr=pedersen_ptr}(1, 2);
assert [output_ptr] = res;
// Manually update the output builtin pointer.
let output_ptr = output_ptr + 1;
// output_ptr and pedersen_ptr will be implicitly returned.
return ();
}
Exercise¶
Write a function that gets a pointer to a hash function builtin and computes the hash of three values as \(H(H(x, y), z)\) (recall that it should return the updated pointer).
Use the builtin directly without
hash2
. Don’t use implicit arguments.Rewrite your function so that it gets the builtin pointer as an implicit argument and uses the standard library function
hash2
.
Write a main function calling your function.
Write a function that gets a pointer to an array and computes its hash chain:
\[H(\cdots H(H(x_0,x_1),x_2), \ldots, x_n)\]
Range-checks¶
The range-check builtin is used to check that a field element is within the range \([0, 2^{128})\). Namely, it forces that
0 <= [p + 0] < 2^128
0 <= [p + 1] < 2^128
0 <= [p + 2] < 2^128
...
where p
is the beginning address of the builtin.
Checking that a value, x
, is in a smaller range \([0, \text{BOUND}]\)
(where \(\text{BOUND} < 2^{128}\)) can be done using two range-check instances:
Use one instance to verify that \(0 \leq x < 2^{128}\).
Use another instance to verify that \(0 \leq \text{BOUND} - x < 2^{128}\).
Note: Talking about \(x \geq 0\) (without an upper bound, such as \(x < 2^{128}\)) is not well defined – it depends on the interpretation of the field elements as integers (for example, one could interpret the field elements in the range \([0, p)\) which will imply that all the elements are nonnegative, or in the range \([-\lfloor p/2 \rfloor, \lfloor p/2 \rfloor]\) in which half of the elements are nonnegative). On the other hand, once we bound \(x\) from both sides (\(0 \leq x < 2^{128}\)), the range becomes well defined.
Exercise¶
Write a function
foo(x)
that verifies that \(0 \leq x \leq 1000\).Why isn’t checking that \(0 \leq 1000 - x < 2^{128}\) enough?
Write a function
foo(x, y, z, w)
that verifies that \(0 \leq x \leq y \leq z \leq w < 2^{128}\) using as few instances of the bulitin as you can.How can you check that \(0 \leq x < 2^{200}\)? (hint: you will need more than one instance of the builtin)
Hint
Any number \(0 \leq x < 2^{200}\) can be expressed as \(x = a \cdot 2^{128} + b\), where \(0 \leq a < 2^{200 - 128}\) and \(0 \leq b < 2^{128}\).
Divisibility testing¶
Divisibility is a question of whether an integer x
is divisible by y
without remainder
(namely, is there an integer z
such that \(x = y \cdot z\)).
A special case is testing whether x
is even (divisible by 2) or odd.
The question of (integer) divisibility is not well-defined in finite fields:
\(P - 1\) is an even integer, but it is also used to represent -1, which is clearly odd.
One way to overcome this is to force a range. For example, the question “Is the integer
\(0 \leq x < 2^{128}\) divisible by 3?” is well defined.
Exercise¶
Write a function that verifies that x
is within the range \([0, 2^{128})\) and is divisible by
3.
Hint
Check that x and y (for a nondeterministic y) are within the range \([0, 2^{128})\) and that \(x = 3 \cdot y\) (the range-checks will guarantee that there is no overflow).
Integer division¶
We can use the range-check builtin in order to compute integer division with remainder. The goal is to compute \(q = \lfloor x / y \rfloor\) and \(r = x \text{ mod } y\). We can rewrite it as \(x = q \cdot y + r\) (as integers) where \(0 \leq r < y\). When we test \(x = q \cdot y + r\) we need to be careful – we need to make sure the computation will not overflow. For simplicity we will assume here that \(0 \leq x, y < 2^{64}\) (if this is not the case, you can modify the code according to your constraints).
The following code computes \(q\) and \(r\) (and validates \(0 \leq x, y < 2^{64}\)) assuming that \(|\mathbb{F}| > 2^{128}\):
func div{range_check_ptr}(x, y) -> (q: felt, r: felt) {
alloc_locals;
local q;
local r;
%{ ids.q, ids.r = ids.x // ids.y, ids.x % ids.y %}
// Check that 0 <= x < 2**64.
[range_check_ptr] = x;
assert [range_check_ptr + 1] = 2 ** 64 - 1 - x;
// Check that 0 <= y < 2**64.
[range_check_ptr + 2] = y;
assert [range_check_ptr + 3] = 2 ** 64 - 1 - y;
// Check that 0 <= q < 2**64.
[range_check_ptr + 4] = q;
assert [range_check_ptr + 5] = 2 ** 64 - 1 - q;
// Check that 0 <= r < y.
[range_check_ptr + 6] = r;
assert [range_check_ptr + 7] = y - 1 - r;
// Verify that x = q * y + r.
assert x = q * y + r;
let range_check_ptr = range_check_ptr + 8;
return (q=q, r=r);
}
Exercise¶
Convince yourself that the code is correct:
Completeness – if
x
andy
are in range, all the range-checks will pass.Soundness – if all the range-checks pass, then the result is correct (assume a malicious prover which may ignore the hint, and run any hint it wants instead). Why is the assumption \(|\mathbb{F}| > 2^{128}\) required? (recall that the equation
x = q * y + r
is checked modulo the field size).