An opinionated tree-sitter + tree-sitter-highlight + grammars bundle

bearcove/arborium

arborium

Batteries-included tree-sitter grammar collection with HTML rendering and WASM support.

Features

  • 69 language grammars included out of the box
  • 67 permissively licensed (MIT/Apache-2.0/CC0/Unlicense) grammars enabled by default
  • WASM support with custom allocator fix
  • Feature flags for fine-grained control over included languages

Usage

[dependencies]
arborium = "0.1"

By default, all permissively-licensed grammars are included. To select specific languages:

[dependencies]
arborium = { version = "0.1", default-features = false, features = ["lang-rust", "lang-javascript"] }

Browser Usage

Arborium can be used in the browser in three ways:

Option 1: Drop-in Script (Easiest)

Add a single script tag and arborium auto-highlights all code blocks:

<script src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"></script>

That's it! Arborium will:

  • Auto-detect languages from class="language-*" or data-lang="*" attributes
  • Load grammar WASM plugins on-demand from jsDelivr CDN
  • Inject the default theme CSS
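The auto-detection rule in the list above can be sketched as a small function. This is an illustration only: `detectLanguage` is a hypothetical name, not part of the arborium API, and the choice to check `data-lang` before the `language-*` class is an assumption.

```javascript
// Hypothetical helper for illustration; not part of the arborium API.
// Returns the declared language of an element-like object, or null if none is found.
function detectLanguage(el) {
  // data-lang="..." is exposed as el.dataset.lang in the DOM.
  // Checking it first is an assumption about precedence.
  const dataLang = el.dataset && el.dataset.lang;
  if (dataLang) return dataLang.toLowerCase();

  // Otherwise look for a class of the form "language-xyz".
  const classes = (el.className || '').split(/\s+/);
  for (const cls of classes) {
    const match = /^language-(.+)$/.exec(cls);
    if (match) return match[1].toLowerCase();
  }
  return null;
}
```

In a browser this could be called with a real element, for example `detectLanguage(document.querySelector('pre code'))`.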

Configuration via data attributes:

<script
  src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"
  data-theme="mocha"
  data-selector="pre code"
  data-manual
></script>

Configuration via JavaScript:

<script>
  window.Arborium = {
    theme: 'tokyo-night',
    selector: 'pre code, .highlight',
    cdn: 'jsdelivr',  // or 'unpkg' or a custom URL
    version: '0.1.3', // or 'latest'
  };
</script>
<script src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"></script>
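The script URLs above are unversioned. Both jsDelivr and unpkg accept a `package@version` path segment, so a page that wants reproducible output could pin the bundle URL. A minimal sketch, assuming a hypothetical helper named `bundleUrl` (not part of arborium) and assuming that a custom CDN mirrors the npm package layout; how the bundle itself uses the `cdn` and `version` settings internally is not specified here.

```javascript
// Hypothetical helper for illustration; not part of the arborium API.
const CDN_BASES = {
  jsdelivr: 'https://cdn.jsdelivr.net/npm',
  unpkg: 'https://unpkg.com',
};

function bundleUrl({ cdn = 'jsdelivr', version = 'latest' } = {}) {
  // A known CDN name maps to its base URL; anything else is treated as a custom base URL.
  const known = Object.prototype.hasOwnProperty.call(CDN_BASES, cdn);
  const base = (known ? CDN_BASES[cdn] : cdn).replace(/\/+$/, '');
  // Assumption: custom CDNs serve the same dist/ layout as the npm package.
  return `${base}/@arborium/arborium@${version}/dist/arborium.iife.js`;
}
```

Pinning an exact version such as `0.1.3` avoids picking up a new release without notice; `latest` trades that stability for automatic updates.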

Manual highlighting:

<script src="..." data-manual></script>
<script>
  // Highlight all code blocks
  arborium.highlightAll();

  // Highlight a specific element
  arborium.highlightElement(document.querySelector('code'), 'rust');
</script>

Option 2: ESM Module (Programmatic)

For bundlers (Vite, webpack, etc.) or ESM-native environments:

npm install @arborium/arborium

import { loadGrammar, highlight } from '@arborium/arborium';

const code = 'fn main() { println!("Hello!"); }';

// Load a grammar (fetched from CDN on first use)
const grammar = await loadGrammar('rust');

// Highlight code
const html = grammar.highlight(code);

// Or use the convenience function
const html2 = await highlight('rust', code);
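Grammars are fetched on first use, so an application that highlights many snippets may want to make sure concurrent requests for the same language share one download. The library may already do this internally; the sketch below only illustrates the pattern. `makeCachedLoader` and `fetchGrammar` are hypothetical names, and `fetchGrammar` stands in for whatever function actually loads a grammar (for example `loadGrammar`).

```javascript
// Sketch only: wraps any grammar-loading function with a per-language promise cache.
function makeCachedLoader(fetchGrammar) {
  const cache = new Map(); // language name -> Promise of a loaded grammar

  return function load(language) {
    if (!cache.has(language)) {
      const pending = Promise.resolve(fetchGrammar(language)).catch((err) => {
        // Drop failed loads so a later call can retry.
        cache.delete(language);
        throw err;
      });
      cache.set(language, pending);
    }
    // Callers asking for the same language share one promise.
    return cache.get(language);
  };
}
```

Usage would look like `const load = makeCachedLoader(loadGrammar);` followed by `const grammar = await load('rust');`.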

Option 3: Compile Rust to WASM (Maximum Control)

For complete control and offline-first apps, compile the Rust crate directly to WASM:

[dependencies]
arborium = { version = "0.1", default-features = false, features = ["lang-rust", "lang-javascript"] }

# Requires LLVM with WASM support (see FAQ below)
cargo build --target wasm32-unknown-unknown

This embeds selected grammars directly in your WASM binary - no CDN required at runtime.

Themes

Available themes (dark): mocha, macchiato, frappe, tokyo-night, dracula, monokai, one-dark, nord, gruvbox-dark, github-dark

Available themes (light): latte, gruvbox-light, github-light, alabaster, dayfox

Import theme CSS:

<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/themes/tokyo-night.css">

Or let the IIFE bundle auto-inject it via the data-theme attribute.
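For pages that build the stylesheet link themselves, the theme lists above can be turned into a lookup. A minimal sketch: `themeCssUrl` and `pickTheme` are hypothetical helpers, and the code assumes every listed theme ships a CSS file at the same path pattern as the tokyo-night example.

```javascript
// Theme names copied from the lists above.
const DARK_THEMES = ['mocha', 'macchiato', 'frappe', 'tokyo-night', 'dracula',
  'monokai', 'one-dark', 'nord', 'gruvbox-dark', 'github-dark'];
const LIGHT_THEMES = ['latte', 'gruvbox-light', 'github-light', 'alabaster', 'dayfox'];

// Hypothetical helper: builds the stylesheet URL for a known theme name.
function themeCssUrl(name) {
  if (!DARK_THEMES.includes(name) && !LIGHT_THEMES.includes(name)) {
    throw new Error(`Unknown theme: ${name}`);
  }
  // Assumption: all themes follow the dist/themes/<name>.css layout.
  return `https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/themes/${name}.css`;
}

// Hypothetical helper: picks a theme from a dark-mode preference flag.
// The default pairing (mocha/latte) is an arbitrary choice.
function pickTheme(prefersDark, dark = 'mocha', light = 'latte') {
  return prefersDark ? dark : light;
}
```

In a browser, `prefersDark` could come from `window.matchMedia('(prefers-color-scheme: dark)').matches`.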

Feature Flags

Grammar Collections

Feature        Description
mit-grammars   All permissively licensed grammars (MIT, Apache-2.0, CC0, Unlicense) - default
gpl-grammars   GPL-licensed grammars (copyleft - may affect your project's license)
all-grammars   All grammars including GPL

Permissive Grammars (67)

These grammars use permissive licenses (MIT, Apache-2.0, CC0, Unlicense) and are included by default.

Feature          Language         License            Source
lang-asm         Assembly         MIT                tree-sitter-asm
lang-awk         AWK              MIT                tree-sitter-awk
lang-bash        Bash             MIT                tree-sitter-bash
lang-batch       Batch            MIT                tree-sitter-batch
lang-c           C                MIT                tree-sitter-c
lang-c-sharp     C#               MIT                tree-sitter-c-sharp
lang-caddy       Caddyfile        MIT                tree-sitter-caddy
lang-capnp       Cap'n Proto      MIT                tree-sitter-capnp
lang-clojure     Clojure          Unlicense          tree-sitter-clojure
lang-cpp         C++              MIT                tree-sitter-cpp
lang-css         CSS              MIT                tree-sitter-css
lang-dart        Dart             MIT                tree-sitter-dart
lang-devicetree  Device Tree      MIT                tree-sitter-devicetree
lang-diff        Diff             MIT                tree-sitter-diff
lang-dockerfile  Dockerfile       MIT                tree-sitter-dockerfile
lang-elixir      Elixir           Apache-2.0         tree-sitter-elixir
lang-elm         Elm              MIT                tree-sitter-elm
lang-fsharp      F#               MIT                tree-sitter-fsharp
lang-gleam       Gleam            Apache-2.0         tree-sitter-gleam
lang-glsl        GLSL             MIT                tree-sitter-glsl
lang-go          Go               MIT                tree-sitter-go
lang-haskell     Haskell          MIT                tree-sitter-haskell
lang-hcl         HCL (Terraform)  Apache-2.0         tree-sitter-hcl
lang-hlsl        HLSL             MIT                tree-sitter-hlsl
lang-html        HTML             MIT                tree-sitter-html
lang-idris       Idris            MIT                tree-sitter-idris
lang-ini         INI              Apache-2.0         tree-sitter-ini
lang-java        Java             MIT                tree-sitter-java
lang-javascript  JavaScript       MIT                tree-sitter-javascript
lang-jq          jq               MIT                tree-sitter-jq
lang-json        JSON             MIT                tree-sitter-json
lang-lean        Lean             MIT                tree-sitter-lean
lang-lua         Lua              MIT                tree-sitter-lua
lang-markdown    Markdown         MIT                tree-sitter-markdown
lang-meson       Meson            MIT                tree-sitter-meson
lang-nix         Nix              MIT                tree-sitter-nix
lang-objc        Objective-C      MIT                tree-sitter-objc
lang-perl        Perl             MIT                tree-sitter-perl
lang-php         PHP              MIT                tree-sitter-php
lang-postscript  PostScript       MIT                tree-sitter-postscript
lang-powershell  PowerShell       MIT                tree-sitter-powershell
lang-prolog      Prolog           MIT                tree-sitter-prolog
lang-python      Python           MIT                tree-sitter-python
lang-r           R                MIT                tree-sitter-r
lang-rescript    ReScript         MIT                tree-sitter-rescript
lang-ron         RON              MIT OR Apache-2.0  tree-sitter-ron
lang-rust        Rust             MIT                tree-sitter-rust
lang-scala       Scala            MIT                tree-sitter-scala
lang-scss        SCSS             MIT                tree-sitter-scss
lang-sql         SQL              MIT                tree-sitter-sql
lang-starlark    Starlark         MIT                tree-sitter-starlark
lang-svelte      Svelte           MIT                tree-sitter-svelte
lang-thrift      Thrift           MIT                tree-sitter-thrift
lang-tlaplus     TLA+             MIT                tree-sitter-tlaplus
lang-toml        TOML             MIT                tree-sitter-toml
lang-typescript  TypeScript       MIT                tree-sitter-typescript
lang-vb          Visual Basic     MIT                tree-sitter-vb
lang-verilog     Verilog          MIT                tree-sitter-verilog
lang-vhdl        VHDL             MIT                tree-sitter-vhdl
lang-vim         Vimscript        MIT                tree-sitter-vim
lang-vue         Vue              MIT                tree-sitter-vue
lang-wasm        WebAssembly      Apache-2.0         tree-sitter-wasm
lang-x86asm      x86 Assembly     MIT                local
lang-xml         XML              MIT                tree-sitter-xml
lang-yaml        YAML             MIT                tree-sitter-yaml
lang-zig         Zig              MIT                tree-sitter-zig
lang-zsh         Zsh              MIT                tree-sitter-zsh

GPL-Licensed Grammars (2)

These grammars are not included by default due to their copyleft license. Enabling them may have implications for your project's licensing.

Feature       Language   License   Source
lang-jinja2   Jinja2     GPL-3.0   tree-sitter-jinja2
lang-nginx    nginx      GPL-3.0   tree-sitter-nginx
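Because enabling a copyleft grammar is easy to do by accident, a project could lint its requested feature list before building. A minimal sketch, assuming a hypothetical `findCopyleftFeatures` helper (not provided by arborium); the feature names come from the tables above.

```javascript
// Features that pull in GPL-licensed grammars, per the tables above.
const COPYLEFT_FEATURES = new Set([
  'lang-jinja2',
  'lang-nginx',
  'gpl-grammars',
  'all-grammars',
]);

// Hypothetical helper: returns the subset of requested features that enable GPL grammars.
function findCopyleftFeatures(features) {
  return features.filter((feature) => COPYLEFT_FEATURES.has(feature));
}
```

A CI step could fail the build when the returned array is non-empty, leaving the licensing decision to a human.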

Sponsors

Thanks to all individual sponsors:

GitHub Sponsors Patreon

...along with corporate sponsors:

Zed Depot

License

This project is dual-licensed under MIT OR Apache-2.0.

The bundled grammar sources retain their original licenses - see LICENSES.md for details.

WASM Support

Arborium supports building for wasm32-unknown-unknown. This requires compiling C code (tree-sitter core and grammar parsers) to WebAssembly.

macOS

On macOS, the built-in Apple clang does not support the wasm32-unknown-unknown target. You need to install LLVM via Homebrew:

brew install llvm

Then ensure the Homebrew LLVM is in your PATH when building:

export PATH="$(brew --prefix llvm)/bin:$PATH"
cargo build --target wasm32-unknown-unknown

FAQ

Build fails with "No available targets are compatible with triple wasm32-unknown-unknown"

Error message:

error: unable to create target: 'No available targets are compatible with triple "wasm32-unknown-unknown"'

Cause: You're using Apple's built-in clang, which doesn't include the WebAssembly backend.

Solution: Install LLVM via Homebrew and use it instead:

brew install llvm
export PATH="$(brew --prefix llvm)/bin:$PATH"
cargo build --target wasm32-unknown-unknown

You may want to add the PATH export to your shell profile (.zshrc, .bashrc, etc.) or use a tool like direnv to set it per-project.

Development

This project uses cargo xtask for development tasks. Run cargo xtask help to see available commands.

Repository Structure

arborium/
├── crates/
│   └── arborium-{lang}/         # Per-language grammar crates
│       ├── arborium.kdl         ← SOURCE OF TRUTH (committed)
│       ├── grammar/
│       │   ├── grammar.js       ← tree-sitter grammar (committed)
│       │   ├── scanner.c        ← custom scanner if any (committed)
│       │   └── src/             ← GENERATED (gitignored)
│       ├── queries/
│       │   └── highlights.scm   ← highlight queries (committed)
│       ├── samples/             ← test samples (committed)
│       ├── Cargo.toml           ← GENERATED (gitignored)
│       ├── build.rs             ← GENERATED (gitignored)
│       └── src/lib.rs           ← GENERATED (gitignored)
├── demo/                        # WASM demo website
├── xtask/                       # Build tooling
└── Cargo.toml                   # Workspace root

What's in Git vs Generated

Location             In Git   Notes
arborium.kdl         Yes      Source of truth for grammar config
grammar/grammar.js   Yes      Tree-sitter grammar definition
grammar/scanner.c    Yes      Custom scanner (if any)
queries/*.scm        Yes      Highlight/injection queries
samples/*            Yes      Test samples
Cargo.toml           No       Generated by xtask gen
build.rs             No       Generated by xtask gen
src/lib.rs           No       Generated by xtask gen
grammar/src/*        No       Generated by xtask gen (tree-sitter)
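The committed-versus-generated split can also be expressed as a path check, for instance to warn when a generated file is about to be edited by hand. A minimal sketch: `isGenerated` is a hypothetical helper, and paths are assumed to be relative to a single `crates/arborium-{lang}/` directory.

```javascript
// Paths (relative to one grammar crate) that xtask gen produces, per the table above.
const GENERATED_PATTERNS = [
  /^Cargo\.toml$/,
  /^build\.rs$/,
  /^src\/lib\.rs$/,
  /^grammar\/src\//,
];

// Hypothetical helper: true if the path is generated rather than committed.
function isGenerated(relativePath) {
  return GENERATED_PATTERNS.some((pattern) => pattern.test(relativePath));
}
```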

Key Commands

# Regenerate all grammar crates (local dev)
cargo xtask gen

# Regenerate with specific version (for releases)
cargo xtask gen --version 0.3.0

# Regenerate specific grammar only
cargo xtask gen rust

# Build and serve WASM demo
cargo xtask serve --dev

# Build WASM plugins
cargo xtask plugins build

Local Development Workflow

# 1. Edit grammar source files
#    - arborium.kdl (config, license, metadata)
#    - grammar/grammar.js (tree-sitter grammar)
#    - queries/highlights.scm (syntax highlighting)

# 2. Regenerate crate files
cargo xtask gen

# 3. Build and test
cargo build
cargo test -p arborium-rust

Version Management

Versions don't matter locally - path dependencies ignore version numbers.

For releases, CI parses the version from the git tag and runs:

cargo xtask gen --version $VERSION

This updates all Cargo.toml files with the correct version before publishing.
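The tag-to-version step can be sketched as follows. The exact tag format used by CI is not stated here, so this assumes tags of the form `v0.3.0` or `0.3.0` with an optional pre-release suffix; `versionFromTag` is a hypothetical name.

```javascript
// Hypothetical helper: extracts a version string from a git tag.
// Assumption: tags look like "v0.3.0" or "0.3.0", optionally with a "-suffix".
function versionFromTag(tag) {
  const match = /^v?(\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?)$/.exec(tag.trim());
  if (!match) {
    throw new Error(`Not a version tag: ${tag}`);
  }
  return match[1];
}
```

The result is what would be passed as `$VERSION` to `cargo xtask gen --version`.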

See PUBLISH.md for full release workflow details.

arborium.kdl Format

Each grammar crate has an arborium.kdl file as its source of truth:

repo "https://github.com/tree-sitter/tree-sitter-rust"
commit "abc123..."
license "MIT"

grammar {
    id "rust"
    name "Rust"
    tag "code"
    tier 1
    icon "devicon-plain:rust"
    aliases "rs"
    has-scanner #true
    generate-component #true

    sample {
        path "samples/example.rs"
        description "Example code"
        license "MIT"
    }
}

Key fields:

  • license - SPDX license for the grammar (used in generated Cargo.toml)
  • generate-component #true - Include in WASM plugin builds
  • has-scanner #true - Grammar has external scanner (scanner.c)
  • tier - 1-5, affects default feature inclusion
