You can use ASDL to describe the tree grammar of your language. After doing that, translate the description to C by running it through my program. Now you have a C file that contains a bunch of constructors, typedefs, macros, etc., that describe the AST of your language. Two examples, a basic one and one for regex, have been provided. There’s also a simple one for m4 in the man page.
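To give a feel for the idea, here is a minimal sketch of a tiny ASDL description (in the comment) and the kind of C it could map to. Everything in it is an assumption made for illustration: the names `expr`, `expr_kind`, `make_num`, `make_add` and `free_expr` are hypothetical, and this is not the actual output of the program.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical ASDL input (illustrative only):
 *
 *   expr = Num(int value)
 *        | Add(expr left, expr right)
 */

/* One tag per ASDL constructor. */
typedef enum { EXPR_NUM, EXPR_ADD } expr_kind;

typedef struct expr expr;
struct expr {
    expr_kind kind;
    union {
        struct { int value; } num;          /* fields of Num */
        struct { expr *left, *right; } add; /* fields of Add */
    } u;
};

/* Constructor for Num(int value). */
expr *make_num(int value)
{
    expr *e = malloc(sizeof *e);
    assert(e != NULL); /* sketch only: real code should handle the failure */
    e->kind = EXPR_NUM;
    e->u.num.value = value;
    return e;
}

/* Constructor for Add(expr left, expr right). */
expr *make_add(expr *left, expr *right)
{
    expr *e = malloc(sizeof *e);
    assert(e != NULL);
    e->kind = EXPR_ADD;
    e->u.add.left = left;
    e->u.add.right = right;
    return e;
}

/* Frees a whole tree built with the constructors above. */
void free_expr(expr *e)
{
    if (e == NULL)
        return;
    if (e->kind == EXPR_ADD) {
        free_expr(e->u.add.left);
        free_expr(e->u.add.right);
    }
    free(e);
}
```

With something like this, `make_add(make_num(1), make_num(2))` builds the tree for `1 + 2`. The tagged union is one common way to map a sum type to C; other layouts are possible.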

Some people may need further explanation, so let’s go ahead and do it:

When you pass a program to a compiler or an interpreter, or another application does that for you, it first gets ‘lexed’; that is, every token is scanned and categorized. Then these tokens are used to ‘parse’ the program. That is, the chunks of tokens are used to work out the structure of your program, based on a predefined grammar. For example, in my ASDL implementation, you can view the grammar for ASDL itself in both Yacc format and EBNF format (the latter in companions/GRAMMAR.ebnf).
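The ‘lexing’ step can be sketched roughly like this. It is a hand-written toy scanner for a made-up expression language, not the lexer of my ASDL implementation; the names `token`, `tok_kind` and `next_token` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token categories for a made-up expression language. */
typedef enum {
    TOK_NUM, TOK_IDENT, TOK_PLUS, TOK_STAR, TOK_EOF, TOK_ERROR
} tok_kind;

typedef struct {
    tok_kind kind;
    const char *start; /* first character of the token in the input */
    size_t len;        /* number of characters in the token */
} token;

/* Scans one token starting at *src and advances *src past it. */
token next_token(const char **src)
{
    assert(src != NULL && *src != NULL);

    const char *p = *src;
    token t;

    while (isspace((unsigned char)*p)) /* skip leading whitespace */
        p++;
    t.start = p;

    if (*p == '\0') {
        t.kind = TOK_EOF;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        t.kind = TOK_NUM;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        t.kind = TOK_IDENT;
    } else if (*p == '+') {
        p++;
        t.kind = TOK_PLUS;
    } else if (*p == '*') {
        p++;
        t.kind = TOK_STAR;
    } else {
        p++;
        t.kind = TOK_ERROR; /* a character the language does not know */
    }

    t.len = (size_t)(p - t.start);
    *src = p;
    return t;
}
```

Calling `next_token` repeatedly on `"12 + x1 * 3"` yields a number, a plus, an identifier, a star, a number, and then the end-of-input token. A parser consumes this stream instead of raw characters.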

Now, the parser takes ‘semantic’ actions whenever it successfully parses a chunk of code. In the early days of computing, people just printed Assembly code from these actions! But now, with optimizing compilers and the like, there is a need to represent the program in the form of a tree. The parse tree was used first. But parse trees are dense, since they record every detail of the grammar, so an ‘abstract’ form of the tree came into use, and that is what this program generates. You can see the abstract structures that represent ASDL itself in absyn.c.
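Here is a rough sketch of semantic actions building an abstract tree. To keep it short it uses a hand-written recursive-descent parser over single digits rather than Yacc, and all names (`node`, `new_node`, `parse`, `free_tree`) are hypothetical. It is not how absyn.c is laid out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

typedef struct node node;
struct node {
    char op;            /* '+', '*', or 'n' for a number leaf */
    int value;          /* used only when op == 'n' */
    node *left, *right; /* used only for '+' and '*' */
};

static node *new_node(char op, int value, node *left, node *right)
{
    node *n = malloc(sizeof *n);
    assert(n != NULL); /* sketch only: real code should report the failure */
    n->op = op;
    n->value = value;
    n->left = left;
    n->right = right;
    return n;
}

void free_tree(node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}

/* factor := DIGIT */
static node *parse_factor(const char **p)
{
    if (!isdigit((unsigned char)**p))
        return NULL; /* syntax error */
    int digit = **p - '0';
    (*p)++;
    return new_node('n', digit, NULL, NULL); /* semantic action */
}

/* term := factor ('*' factor)* */
static node *parse_term(const char **p)
{
    node *left = parse_factor(p);
    while (left != NULL && **p == '*') {
        (*p)++;
        node *right = parse_factor(p);
        if (right == NULL) {
            free_tree(left);
            return NULL;
        }
        left = new_node('*', 0, left, right); /* semantic action */
    }
    return left;
}

/* expr := term ('+' term)* */
static node *parse_expr(const char **p)
{
    node *left = parse_term(p);
    while (left != NULL && **p == '+') {
        (*p)++;
        node *right = parse_term(p);
        if (right == NULL) {
            free_tree(left);
            return NULL;
        }
        left = new_node('+', 0, left, right); /* semantic action */
    }
    return left;
}

/* Parses a whole string; returns NULL on any syntax error. */
node *parse(const char *src)
{
    node *tree = parse_expr(&src);
    if (tree != NULL && *src != '\0') { /* trailing junk */
        free_tree(tree);
        return NULL;
    }
    return tree;
}
```

For `"1+2*3"` the result is a `'+'` node whose right child is a `'*'` node. Note what is missing: there are no nodes for `expr`, `term` or `factor` themselves. A full parse tree would keep those; the abstract tree keeps only what later stages need.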

After you translate your code to an AST, you can translate it to DAGs, or Directed Acyclic Graphs, to get the flow of the program, and then to a control flow graph. If your language is interpreted, you can translate it to your VM’s bytecode.
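As an illustration of the last option, here is a minimal sketch that walks an expression AST and emits bytecode for a toy stack machine. The node layout, the opcodes (`OP_PUSH`, `OP_ADD`, `OP_MUL`) and the functions `compile` and `run` are all made up for this example; a real VM would have its own instruction set.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node node;
struct node {
    char op;                  /* '+', '*', or 'n' for a number leaf */
    int value;                /* used only when op == 'n' */
    const node *left, *right; /* used only for '+' and '*' */
};

typedef enum { OP_PUSH, OP_ADD, OP_MUL } opcode;

typedef struct {
    opcode op;
    int arg; /* operand of OP_PUSH, ignored otherwise */
} instr;

#define MAX_CODE 64

typedef struct {
    instr code[MAX_CODE];
    size_t len;
} program;

static void emit(program *prog, opcode op, int arg)
{
    assert(prog->len < MAX_CODE); /* sketch only: fixed-size buffer */
    prog->code[prog->len].op = op;
    prog->code[prog->len].arg = arg;
    prog->len++;
}

/* Post-order walk: emit both operands first, then the operator. */
void compile(const node *n, program *prog)
{
    if (n->op == 'n') {
        emit(prog, OP_PUSH, n->value);
        return;
    }
    compile(n->left, prog);
    compile(n->right, prog);
    emit(prog, n->op == '+' ? OP_ADD : OP_MUL, 0);
}

/* Runs the bytecode on a small stack and returns the final value. */
int run(const program *prog)
{
    int stack[MAX_CODE];
    size_t sp = 0;

    for (size_t i = 0; i < prog->len; i++) {
        const instr *in = &prog->code[i];
        switch (in->op) {
        case OP_PUSH:
            stack[sp++] = in->arg;
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        }
    }
    assert(sp == 1); /* a well-formed program leaves exactly one value */
    return stack[0];
}
```

For the tree of `1 + 2 * 3` this emits push 1, push 2, push 3, mul, add. The post-order walk is what turns the tree shape into the right evaluation order for a stack machine.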

So this is what ASDL is: a tool for constructing compilers and interpreters, and perhaps DSLs too.

I explained why I created this. Basically, the original implementation was useless.

Thanks.