Implementing first class functions in a bytecode interpreter is trivial.

But how do compilers that generate machine code (or lower to C, or SSA) implement higher order functions? Back in 2021, I found an answer when contributing closures to the Pallene compiler.

Today I was researching something loosely related, and found yet another neat trick called defunctionalization in this paper.

Defunctionalization is a transform – a way to re-write the original program without using higher order functions such that it can be trivially compiled to a flat IR in subsequent compiler passes.

The paper uses an OCaml example, and I’ll be porting the same program to Haskell. Our implementation will assume that the language being compiled supports GADTs, though it’s certainly possible to defunctionalize without them.

  • ChubakPDP11+TakeWithGrainOfSalt
    link
    1
    edit-2
    1 month ago

    I think by ‘GADTs’ you mean an AST (could be mistaken). In that case, it would not be a bytecode interpreter, it would be a tree walker. In most languages, an AST is translated down to bytecode, and passed to a VM execution engine. How this engine deals with closures is highly compliant on the architecture of the AST.

    First, keep this in mind: A closure is just a closure at the higher-level. Job of the AST is to translate it down to ta more abstract form, thus, a closure would most probably be inlined. Otherwise, it will be a global subroutine defined somewhere on the stack. It never makes sense not to inline closures, unless that closure is a ‘hotzone’, e.g. frequently used, in whcih case you should most obviously define as a subroutine, and if possible, JIT.

    A VM lke NekoVM has a higher-order, terse IR with which you can handle closures like you do in JS.

    Don’t conflate higher-order constructs with intermediate representations. If you are interested to see a VM development in progress, star my RuppVM. Still early though. But i tend to work on it actively.

    Thanks.