Compiling a Tree Sitter Grammar to WASM

If you want to compile a Tree Sitter grammar to WASM, you might find this Docker Build/BuildKit example helpful. I didn’t want to install all of the required tools directly on my machine, so this Dockerfile isolates the tools and the build in one (long) step. When it finishes, it saves the generated WASM file in the directory from which you launched the build (you can change the location by adjusting the --output argument).

The Dockerfile needs two values, described below. You can supply them as build arguments (--build-arg) or edit the ARG defaults in the file directly.

Replace the TREE_SITTER_GRAMMAR_GIT_URL with the git URL where the grammar is located. For example, it might be:

ARG TREE_SITTER_GRAMMAR_GIT_URL=https://github.com/dhcmrlchtdj/tree-sitter-sqlite

Replace the TREE_SITTER_NAME with the name of the directory that the clone creates, which is normally the repository name. For the repository above, it would be tree-sitter-sqlite. The same name is used for the resulting WASM file.
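If you go the build-argument route, the directory name can be derived from the URL rather than typed twice. The following is only a sketch: the shell variable names GRAMMAR_URL and GRAMMAR_NAME and the ./out destination are illustrative assumptions, and the command is printed rather than run so you can review it first. The --build-arg and --output type=local,dest=... options are standard docker buildx build options.

```shell
#!/bin/sh
# Hypothetical helper variables; only the two ARG names come from the Dockerfile.
GRAMMAR_URL="https://github.com/dhcmrlchtdj/tree-sitter-sqlite"

# git clones into the last path segment of the URL, minus any ".git" suffix.
GRAMMAR_NAME="$(basename "$GRAMMAR_URL" .git)"

# Print the build command instead of running it, so it can be checked first.
echo docker buildx build \
    --build-arg "TREE_SITTER_GRAMMAR_GIT_URL=$GRAMMAR_URL" \
    --build-arg "TREE_SITTER_NAME=$GRAMMAR_NAME" \
    --output type=local,dest=./out .
```

Remove the leading echo to actually run the build. This assumes the repository was cloned under its default directory name.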

Then, from the folder where you’ve saved the Dockerfile:

cmd
docker buildx build -t tree_sitter_sqlite_wasm --output . .

After a lengthy download and build process, you’ll end up with a file called:

${TREE_SITTER_NAME}.wasm

For example, it might be:

tree-sitter-sqlite.wasm
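A quick sanity check on the result is to confirm that the file starts with the WebAssembly magic bytes (00 61 73 6d, that is "\0asm"). This is a minimal sketch; is_wasm is a hypothetical helper name, and the file name is the example one from above.

```shell
#!/bin/sh
# is_wasm FILE: succeed if FILE starts with the WebAssembly magic bytes 00 61 73 6d.
is_wasm() {
    # Dump the first four bytes as hex and strip the whitespace od adds.
    magic="$(od -An -tx1 -N4 "$1" | tr -d ' \n')"
    [ "$magic" = "0061736d" ]
}

# Hypothetical usage against the example output file name.
if [ -f tree-sitter-sqlite.wasm ]; then
    if is_wasm tree-sitter-sqlite.wasm; then
        echo "looks like a WASM module"
    else
        echo "not a WASM module"
    fi
fi
```

The check only looks at the header, so it confirms that a WASM binary was produced, not that the grammar itself is correct.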

Hope this helps!

docker
# docker buildx build -t tree_sitter_sqlite_wasm --output . .
FROM rust:latest AS tree-sitter

ARG NODE_VERSION=20
ARG TREE_SITTER_GRAMMAR_GIT_URL=https://github.com/🤓.git
ARG TREE_SITTER_NAME=🤓

WORKDIR /tree-sitter
# Remove imagemagick due to https://security-tracker.debian.org/tracker/CVE-2019-10131
RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
    && apt-get purge -y imagemagick imagemagick-6-common \
    && apt-get install -y git curl python3 cmake

RUN curl -fsSL https://deb.nodesource.com/setup_${NODE_VERSION}.x -o nodesource_setup.sh \
    && bash nodesource_setup.sh \
    && apt-get install -y nodejs

# Rust and Cargo are already installed in the rust base image
RUN cargo install tree-sitter-cli

WORKDIR /em
RUN git clone https://github.com/emscripten-core/emsdk.git
RUN /em/emsdk/emsdk install latest
RUN /em/emsdk/emsdk activate latest
WORKDIR /em/emsdk
RUN chmod +x /em/emsdk/emsdk_env.sh \
    && . ./emsdk_env.sh

# Setting the path via the emsdk_env.sh call doesn't persist, so we need to set it here
ENV PATH="/em/emsdk:/em/emsdk/upstream/emscripten:${PATH}"

WORKDIR /tree-sitter
RUN git clone ${TREE_SITTER_GRAMMAR_GIT_URL}
WORKDIR /tree-sitter/${TREE_SITTER_NAME}
RUN tree-sitter generate
RUN tree-sitter build --wasm

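# Without this intermediate stage, the BuildKit copy to the host doesn't work
FROM tree-sitter AS built
# This is a hack to get the wasm file out of the tree-sitter stage. Presumably it
# works because a stage based on tree-sitter still sees the ARG values above,
# which an unrelated FROM scratch stage would not.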
WORKDIR /built
COPY --from=tree-sitter /tree-sitter/${TREE_SITTER_NAME}/${TREE_SITTER_NAME}.wasm .

FROM scratch
COPY --from=built /built/. .