Tree-sitter: parser generator tool and incremental parsing library.

lysdexic · 7 months ago

Tree-sitter: parser generator tool and incremental parsing library.

refalo · 7 months ago

You might also be interested in https://github.com/alexpovel/srgn. You can use it to easily do things like context-sensitive search/replace and a lot more.

Uli@sopuli.xyz · 7 months ago

I read through the README, and it’s definitely a good tool to know about. It doesn’t fit the needs of my current problem, but I’m certain I’ll use it in the future for context-sensitive searching, since grep/awk/sed/tr have definitely fallen flat for me in the past. I might also be able to study how they used the tree-sitter CLI when I explore my own implementation.

For my purposes, I want to take a group of similar-yet-different YAML file sets (though file type should be arbitrary), and feed them through a tool that will spit out a YAML template containing everything that is shared between multiple sets.

Then, I want it to create a file for each YAML file that defines which parts to pull from the template and lists the variables to insert into the holes in the template. Basically, it would create a Mad Libs-style form that can recreate any file in the original group given the right list of variables to insert.

For example, if I have a hundred YAML files that are mostly similar but contain different project names, have different server types provisioned, and are pulling different product versions, I would want this script to parse all hundred files and spit out a template that could be used as the basis to build any of the hundred files. The template would be combined with a hundred variable trees that would insert each unique part of each file into the right place.

In effect, I could have a small variables file that gives only the unique portions of the equivalent YAML - in this case, it would contain only the project name, the server type, and the product version. Then, these small files could be combined with the universal template to recreate the original hundred YAML files. But unlike using a simple override mechanism, I would be able to change elements of the template YAML, including broad structural changes, and after some processing, the change would affect all one hundred output YAML files.
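A minimal sketch of that template/variables split, assuming the YAML files have already been parsed into plain nested dicts (for example by a YAML library) and that only mappings are recursed into. The names `Hole`, `extract`, and `render` are hypothetical, made up for illustration, and lists or subtrees whose keys differ are simply treated as one whole variable.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Hole:
    """Marks a spot in the template that each file fills with its own value."""
    name: str  # dotted path, e.g. "server.type" (assumes keys contain no dots)


def extract(docs: list[Any], path: str = "") -> tuple[Any, list[dict[str, Any]]]:
    """Split parsed documents into one shared template and per-document variables."""
    first = docs[0]
    # Identical in every document: it belongs in the template as-is.
    if all(d == first for d in docs):
        return first, [{} for _ in docs]
    # All mappings with the same keys: recurse key by key.
    if all(isinstance(d, dict) for d in docs) and all(
        d.keys() == first.keys() for d in docs
    ):
        template: dict[Any, Any] = {}
        variables: list[dict[str, Any]] = [{} for _ in docs]
        for key in first:
            child_path = f"{path}.{key}" if path else str(key)
            sub_template, sub_vars = extract([d[key] for d in docs], child_path)
            template[key] = sub_template
            for acc, found in zip(variables, sub_vars):
                acc.update(found)
        return template, variables
    # Anything else that differs becomes a hole filled from each variables file.
    return Hole(path), [{path: d} for d in docs]


def render(template: Any, variables: dict[str, Any]) -> Any:
    """Rebuild one document by filling the template's holes."""
    if isinstance(template, Hole):
        return variables[template.name]
    if isinstance(template, dict):
        return {key: render(value, variables) for key, value in template.items()}
    return template


docs = [
    {"project": "alpha", "server": {"type": "small", "region": "eu"}, "version": "1.2"},
    {"project": "beta", "server": {"type": "large", "region": "eu"}, "version": "1.3"},
]
template, variables = extract(docs)
rebuilt = [render(template, v) for v in variables]
```

Here `rebuilt` equals `docs`, and editing `template` (say, changing the shared region) changes every rendered output. A real tool would also need to handle lists, optional keys, and comments, which this sketch ignores.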

One could track things like environment variables that are specific to a certain product version and require that a particular environment variable be inserted into the output YAML whenever the product version has a particular value. Or a centralized file could be made specifying which product versions correspond to which projects, allowing the engineer to change all product versions for a given set of projects in one go. Or one could create a universal IaC template that’s applicable to a broad swath of use cases and quickly build out a full set of YAML manifests and Terraform files using a small file that specifies what components will be needed and where to authenticate to the server.
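As a rough illustration of the first two ideas, the sketch below keeps a central project-to-version table and a table of version-specific environment variables, then derives each project's variables from them. Everything here is hypothetical: the names `PROJECT_VERSIONS`, `ENV_RULES`, and `resolve`, and the example values, are assumptions for illustration only.

```python
from typing import Any

# Hypothetical central file: which product version each project uses.
PROJECT_VERSIONS: dict[str, str] = {"alpha": "1.2", "beta": "1.3"}

# Hypothetical rules: environment variables required by a given product version.
ENV_RULES: dict[str, dict[str, str]] = {"1.3": {"ENABLE_NEW_API": "true"}}


def resolve(project: str) -> dict[str, Any]:
    """Build the variables for one project from the central tables."""
    version = PROJECT_VERSIONS[project]
    # Copy so callers cannot mutate the shared rule table by accident.
    env = dict(ENV_RULES.get(version, {}))
    return {"project": project, "version": version, "env": env}


beta_vars = resolve("beta")
```

Bumping a version in `PROJECT_VERSIONS` would then change both the version and the derived environment variables for every affected project the next time the outputs are rendered.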

I’m not aware of any tool that does this, but I think tree-sitter gets me much of the way there. If I can use it to parse any given file into a context-aware tree, I would then need to write a script that combines the shared features of many such trees and splits the unique features out into small variable files. Then I would need a script to merge them back together as needed, and something to manage file system structure, such as whether to parse every file individually or to strategically merge some sets so that one variable file produces multiple output YAML files.

Sorry, I’m brainstorming at you; I’m just trying to figure out if the tool I’m envisioning is even feasible. It seems like it is, but I’ll have to figure out how to use the tree-sitter CLI before I begin.
