Chapter 3: Lowering the AST to IR
In Chapter 2 we hand-built IR for a specific Fibonacci function. This chapter shows how to do that systematically for any Kaleidoscope program by lowering the AST produced in Chapter 1 into the dialect ops defined in Chapter 2.
The implementation for this chapter lives in examples/kaleidoscope/from_ast.rs
Design
Every Kaleidoscope value is a 64-bit signless integer (i64). The key
design decisions are:
- All variables are memory-backed. Every local variable and every function
parameter is backed by a
DeclOpslot (analogous to LLVM’salloca). Reads becomeLoadOps; writes becomeStoreOps. This avoids having to worry about SSA form. - Control flow uses regions.
IfOpowns two regions (then / else) andWhileOpowns one region (the loop body). Each region gets its own newBasicBlock. WhileOpholds a condition pointer. Rather than threading a boolean value as an argument, the condition is stored in a dedicatedDeclOpslot in the outer block and updated at the end of each loop iteration.
Entry point
The public API is a single function. It takes an AST Function node and
produces a FuncOp inside the provided Context:
#![allow(unused)]
fn main() {
pub fn lower_function(ctx: &mut Context, func: &Function) -> Result<FuncOp> {
let i64_ty = IntegerType::get(ctx, 64, Signedness::Signless);
// Build the function type: all params are i64, single i64 return.
let param_tys: Vec<_> = func.params.iter().map(|_| i64_ty.into()).collect();
let func_ty = FunctionType::get(ctx, param_tys, vec![i64_ty.into()]);
let func_op = FuncOp::new(
ctx,
func.name
.as_str()
.try_into()
.expect("invalid function name"),
func_ty,
);
let entry = func_op.get_entry_block(ctx);
let mut ins = OpInserter::new_at_block_end(entry);
let mut var_map = VarMap::default();
// Spill each function parameter (block argument) into a mutable DeclOp slot.
for (idx, param_name) in func.params.iter().enumerate() {
let param_val = entry.deref(ctx).get_argument(idx);
let slot = DeclOp::new(ctx, i64_ty.into());
let slot_val = slot.get_result(ctx);
ins.append_op(ctx, &slot);
let store = StoreOp::new(ctx, slot_val, param_val);
ins.append_op(ctx, &store);
var_map.insert(param_name.clone(), slot_val);
}
lower_stmts(ctx, &mut ins, &mut var_map, &func.body)?;
// If no terminator was emitted (e.g., function ends with an `if` where both
// branches return), add a fallback return of 0 to satisfy the verifier.
if entry.deref(ctx).get_terminator(ctx).is_none() {
let zero = ConstantOp::new_i64(ctx, 0);
let zero_val = zero.get_result(ctx);
ins.append_op(ctx, &zero);
let ret = ReturnOp::new(ctx, zero_val);
ins.append_op(ctx, &ret);
}
Ok(func_op)
}
}
Two type aliases keep the rest of the code concise:
#![allow(unused)]
fn main() {
/// Inserter type used throughout this module.
type OpInserter = IRInserter<DummyListener>;
/// Maps variable names to the slot-pointer [`Value`] produced by their [`DeclOp`].
type VarMap = FxHashMap<String, Value>;
}
OpInserter is an IRInserter with a no-op listener. It tracks a cursor
position inside a block and appends ops there via append_op. VarMap maps
source-level variable names to the Value (result of a DeclOp) that holds
the variable’s storage slot.
Creating the FuncOp
lower_function first creates an empty FuncOp with the right type, then
obtains the entry block and initialises the inserter at its end:
#![allow(unused)]
fn main() {
pub fn lower_function(ctx: &mut Context, func: &Function) -> Result<FuncOp> {
let i64_ty = IntegerType::get(ctx, 64, Signedness::Signless);
// Build the function type: all params are i64, single i64 return.
let param_tys: Vec<_> = func.params.iter().map(|_| i64_ty.into()).collect();
let func_ty = FunctionType::get(ctx, param_tys, vec![i64_ty.into()]);
let func_op = FuncOp::new(
ctx,
func.name
.as_str()
.try_into()
.expect("invalid function name"),
func_ty,
);
let entry = func_op.get_entry_block(ctx);
let mut ins = OpInserter::new_at_block_end(entry);
let mut var_map = VarMap::default();
// Spill each function parameter (block argument) into a mutable DeclOp slot.
for (idx, param_name) in func.params.iter().enumerate() {
let param_val = entry.deref(ctx).get_argument(idx);
let slot = DeclOp::new(ctx, i64_ty.into());
let slot_val = slot.get_result(ctx);
ins.append_op(ctx, &slot);
let store = StoreOp::new(ctx, slot_val, param_val);
ins.append_op(ctx, &store);
var_map.insert(param_name.clone(), slot_val);
}
lower_stmts(ctx, &mut ins, &mut var_map, &func.body)?;
// If no terminator was emitted (e.g., function ends with an `if` where both
// branches return), add a fallback return of 0 to satisfy the verifier.
if entry.deref(ctx).get_terminator(ctx).is_none() {
let zero = ConstantOp::new_i64(ctx, 0);
let zero_val = zero.get_result(ctx);
ins.append_op(ctx, &zero);
let ret = ReturnOp::new(ctx, zero_val);
ins.append_op(ctx, &ret);
}
Ok(func_op)
}
}
(Only the setup and fallback parts are discussed in detail below.)
Spilling parameters
Every function parameter arrives as a basic-block argument. To allow
mutation later, each argument is immediately spilled into a DeclOp slot:
#![allow(unused)]
fn main() {
// Spill each function parameter (block argument) into a mutable DeclOp slot.
for (idx, param_name) in func.params.iter().enumerate() {
let param_val = entry.deref(ctx).get_argument(idx);
let slot = DeclOp::new(ctx, i64_ty.into());
let slot_val = slot.get_result(ctx);
ins.append_op(ctx, &slot);
let store = StoreOp::new(ctx, slot_val, param_val);
ins.append_op(ctx, &store);
var_map.insert(param_name.clone(), slot_val);
}
}
entry.deref(ctx).get_argument(idx) returns the Value of the idx-th block
argument. DeclOp::new_with_type allocates a pointer-typed slot; StoreOp
writes the initial value into it.
Fallback terminator
After lowering the body, the entry block might still lack a terminator. This
can happen when the source function ends with an if statement where both
branches contain a return. In that case a fallback return 0 is appended:
#![allow(unused)]
fn main() {
// If no terminator was emitted (e.g., function ends with an `if` where both
// branches return), add a fallback return of 0 to satisfy the verifier.
if entry.deref(ctx).get_terminator(ctx).is_none() {
let zero = ConstantOp::new_i64(ctx, 0);
let zero_val = zero.get_result(ctx);
ins.append_op(ctx, &zero);
let ret = ReturnOp::new(ctx, zero_val);
ins.append_op(ctx, &ret);
}
}
Lowering statements
Statements are lowered by two helpers. lower_stmts iterates a slice and
delegates each item to lower_stmt, short-circuiting on the first error:
#![allow(unused)]
fn main() {
fn lower_stmts(
ctx: &mut Context,
ins: &mut OpInserter,
var_map: &mut VarMap,
stmts: &[Stmt],
) -> Result<bool> {
let mut terminated = false;
for stmt in stmts {
terminated = lower_stmt(ctx, ins, var_map, stmt)?;
}
Ok(terminated)
}
}
Both functions return Result<bool> where true means the last op emitted
was a block terminator (ReturnOp). This signal is used to suppress
redundant YieldOp terminators in if / while regions.
Variable declaration
var name; or var name = expr; allocates a slot, registers it in the
VarMap, and optionally stores an initial value:
#![allow(unused)]
fn main() {
Stmt::VarDecl { name, init } => {
let slot = DeclOp::new(ctx, i64_ty.into());
let slot_val = slot.get_result(ctx);
ins.append_op(ctx, &slot);
var_map.insert(name.clone(), slot_val);
if let Some(init_expr) = init {
let val = lower_expr(ctx, ins, var_map, init_expr)?;
let store = StoreOp::new(ctx, slot_val, val);
ins.append_op(ctx, &store);
}
Ok(false)
}
}
Assignment
name = expr; looks up the existing slot in VarMap and emits a StoreOp.
If the variable was never declared, input_error! returns a structured error
instead of panicking:
#![allow(unused)]
fn main() {
Stmt::Assign { name, value } => {
let val = lower_expr(ctx, ins, var_map, value)?;
let slot = *var_map.get(name.as_str()).ok_or_else(|| {
input_error!(
Location::Unknown,
"assignment to undeclared variable: {name}"
)
})?;
let store = StoreOp::new(ctx, slot, val);
ins.append_op(ctx, &store);
Ok(false)
}
}
Return
return expr; lowers the expression, emits ReturnOp, and signals that the
block is now terminated:
#![allow(unused)]
fn main() {
Stmt::Return(expr) => {
let val = lower_expr(ctx, ins, var_map, expr)?;
let ret = ReturnOp::new(ctx, val);
ins.append_op(ctx, &ret);
Ok(true) // ReturnOp is a block terminator
}
}
If statement
if cond { then } else { else } emits an IfOp, then builds two new blocks —
one for each region. Each branch is lowered recursively. A YieldOp is
appended only when the branch did not end with a return:
#![allow(unused)]
fn main() {
Stmt::If {
cond,
then_body,
else_body,
} => {
let cond_val = lower_expr(ctx, ins, var_map, cond)?;
let if_op = IfOp::new(ctx, cond_val);
ins.append_op(ctx, &if_op);
// Then region: add YieldOp only if the branch didn't terminate.
let then_block = BasicBlock::new(ctx, None, vec![]);
then_block.insert_at_front(if_op.then_region(ctx), ctx);
let mut then_ins = OpInserter::new_at_block_end(then_block);
let mut then_vars = var_map.clone();
let then_terminated = lower_stmts(ctx, &mut then_ins, &mut then_vars, then_body)?;
if !then_terminated {
let then_yield = YieldOp::new(ctx);
then_ins.append_op(ctx, &then_yield);
}
// Else region: add YieldOp only if the branch didn't terminate.
let else_block = BasicBlock::new(ctx, None, vec![]);
else_block.insert_at_front(if_op.else_region(ctx), ctx);
let mut else_ins = OpInserter::new_at_block_end(else_block);
let mut else_vars = var_map.clone();
let else_terminated = lower_stmts(ctx, &mut else_ins, &mut else_vars, else_body)?;
if !else_terminated {
let else_yield = YieldOp::new(ctx);
else_ins.append_op(ctx, &else_yield);
}
Ok(false) // IfOp itself is not a terminator in the outer block
}
}
Note that IfOp itself is not a block terminator in the outer block, so
lower_stmt_if returns Ok(false).
While loop
while cond { body } uses the memory-backed condition pattern described in
the design section above. Before entering the loop, the condition is evaluated
and stored in a fresh DeclOp slot. At the end of each iteration, the
condition is re-evaluated and the slot is updated. A YieldOp closes the
body region (if it doesn’t already end with a return):
#![allow(unused)]
fn main() {
Stmt::While { cond, body } => {
// Allocate the condition slot in the outer block.
let cond_slot = DeclOp::new(ctx, i64_ty.into());
let cond_slot_val = cond_slot.get_result(ctx);
ins.append_op(ctx, &cond_slot);
// Compute the initial condition and store it.
let init_cond = lower_expr(ctx, ins, var_map, cond)?;
let init_store = StoreOp::new(ctx, cond_slot_val, init_cond);
ins.append_op(ctx, &init_store);
let while_op = WhileOp::new(ctx, cond_slot_val);
ins.append_op(ctx, &while_op);
// Build the loop body.
let body_block = BasicBlock::new(ctx, None, vec![]);
body_block.insert_at_front(while_op.body_region(ctx), ctx);
let mut body_ins = OpInserter::new_at_block_end(body_block);
let mut body_vars = var_map.clone();
let body_terminated = lower_stmts(ctx, &mut body_ins, &mut body_vars, body)?;
if !body_terminated {
// Re-evaluate the condition at the end of the body and update the slot.
let next_cond = lower_expr(ctx, &mut body_ins, &body_vars, cond)?;
let next_store = StoreOp::new(ctx, cond_slot_val, next_cond);
body_ins.append_op(ctx, &next_store);
let body_yield = YieldOp::new(ctx);
body_ins.append_op(ctx, &body_yield);
}
Ok(false) // WhileOp itself is not a terminator in the outer block
}
}
Expression statement
A bare expr; lowers the expression for its side effects and discards the
resulting value:
#![allow(unused)]
fn main() {
Stmt::Expr(expr) => {
lower_expr(ctx, ins, var_map, expr)?;
Ok(false)
}
Lowering expressions
lower_expr matches on the Expr variant and returns the Value that holds
the computed result:
#![allow(unused)]
fn main() {
fn lower_expr(
ctx: &mut Context,
ins: &mut OpInserter,
var_map: &VarMap,
expr: &Expr,
) -> Result<Value> {
let i64_ty = IntegerType::get(ctx, 64, Signedness::Signless);
match expr {
// ── integer literal ───────────────────────────────────────────────
Expr::Integer(n) => {
let op = ConstantOp::new_i64(ctx, *n);
let val = op.get_result(ctx);
ins.append_op(ctx, &op);
Ok(val)
}
// ── variable reference ────────────────────────────────────────────
Expr::Variable(name) => {
let slot = *var_map.get(name.as_str()).ok_or_else(|| {
input_error!(
Location::Unknown,
"reference to undeclared variable: {name}"
)
})?;
let load = LoadOp::new(ctx, slot, i64_ty.into());
let val = load.get_result(ctx);
ins.append_op(ctx, &load);
Ok(val)
}
// ── binary operation ──────────────────────────────────────────────
Expr::BinOp { op, lhs, rhs } => {
let lhs_val = lower_expr(ctx, ins, var_map, lhs)?;
let rhs_val = lower_expr(ctx, ins, var_map, rhs)?;
let kind = ast_binop_to_kind(op);
let bin_op = BinOp::new(ctx, kind, lhs_val, rhs_val);
let val = bin_op.get_result(ctx);
ins.append_op(ctx, &bin_op);
Ok(val)
}
// ── function call ─────────────────────────────────────────────────
Expr::Call { callee, args } => {
let mut arg_vals = Vec::with_capacity(args.len());
for a in args {
arg_vals.push(lower_expr(ctx, ins, var_map, a)?);
}
let callee_id = callee.as_str().try_into().expect("valid callee identifier");
let call_op = CallOp::new(ctx, IdentifierAttr::new(callee_id), arg_vals, i64_ty.into());
let val = call_op.get_operation().deref(ctx).get_result(0);
let val = { val }; // reborrow to release ctx ref before append
ins.append_op(ctx, &call_op);
Ok(val)
}
}
}
Integer literal
A literal is wrapped in a ConstantOp. The result value is returned:
#![allow(unused)]
fn main() {
Expr::Integer(n) => {
let op = ConstantOp::new_i64(ctx, *n);
let val = op.get_result(ctx);
ins.append_op(ctx, &op);
Ok(val)
}
}
Variable reference
A variable reference looks up the slot in VarMap and emits a LoadOp to
read its current value. An undeclared variable is reported as an
InvalidInput error using the input_error! macro:
#![allow(unused)]
fn main() {
Expr::Variable(name) => {
let slot = *var_map.get(name.as_str()).ok_or_else(|| {
input_error!(
Location::Unknown,
"reference to undeclared variable: {name}"
)
})?;
let load = LoadOp::new(ctx, slot, i64_ty.into());
let val = load.get_result(ctx);
ins.append_op(ctx, &load);
Ok(val)
}
}
Binary operation
Both operands are lowered recursively, then a BinOp is created with the
appropriate BinOpKind attribute:
#![allow(unused)]
fn main() {
Expr::BinOp { op, lhs, rhs } => {
let lhs_val = lower_expr(ctx, ins, var_map, lhs)?;
let rhs_val = lower_expr(ctx, ins, var_map, rhs)?;
let kind = ast_binop_to_kind(op);
let bin_op = BinOp::new(ctx, kind, lhs_val, rhs_val);
let val = bin_op.get_result(ctx);
ins.append_op(ctx, &bin_op);
Ok(val)
}
}
Function call
Arguments are lowered in order and collected into a Vec. A CallOp is then
created with the callee name as an IdentifierAttr:
#![allow(unused)]
fn main() {
Expr::Call { callee, args } => {
let mut arg_vals = Vec::with_capacity(args.len());
for a in args {
arg_vals.push(lower_expr(ctx, ins, var_map, a)?);
}
let callee_id = callee.as_str().try_into().expect("valid callee identifier");
let call_op = CallOp::new(ctx, IdentifierAttr::new(callee_id), arg_vals, i64_ty.into());
let val = call_op.get_operation().deref(ctx).get_result(0);
let val = { val }; // reborrow to release ctx ref before append
ins.append_op(ctx, &call_op);
Ok(val)
}
Note the reborrow trick: call_op.get_operation().deref(ctx) borrows ctx
immutably, so the result val must be captured before passing call_op to
append_op, which borrows ctx mutably.
Note: The general guideline around calling deref or deref_mut on Ptr<T> is to
keep the borrowed value (Ref<T> or RefMut<T>) around for as little time as possible,
and to avoid passing it to any function that might call back into the context.
Error handling
All lowering functions return pliron::result::Result<T>. Errors are
propagated with ?. The input_error! macro creates a
pliron::result::Error with kind ErrorKind::InvalidInput and an optional
source location. Because the Kaleidoscope AST does not carry source positions,
Location::Unknown is used throughout.
Tests
The test module defines a helper that parses a source string, lowers every function into a fresh module, verifies the IR, and prints it:
#![allow(unused)]
fn main() {
fn lower_program(src: &str) -> String {
let funcs = parse_program(src).expect("parse error");
let ctx = &mut Context::new();
let module = ModuleOp::new(ctx, "test".try_into().expect("valid module name"));
for func in &funcs {
let func_op = lower_function(ctx, func).expect("lowering failed");
module.append_operation(ctx, func_op.get_operation(), 0);
}
verify_op(&module, ctx).expect("IR verification failed");
format!("{}", module.get_operation().disp(ctx))
}
}
Try it out
cargo test --example kaleidoscope -- fibonacci_from_ast --show-output
cargo test --example kaleidoscope -- factorial_from_ast --show-output
cargo test --example kaleidoscope -- inline_fibonacci_from_ast --show-output
Each test prints the lowered IR so you can inspect how the AST maps to ops.
Next step
Chapter 4 lowers the Kaleidoscope dialect into a lower-level LLVM dialect.