heitorflavio/toon-php

Composer install command:

composer require heitorflavio/toon-php

Package description

TOON (Token-Oriented Object Notation) encoder/decoder for PHP - a compact, human-readable serialization format designed for LLM prompts

README

TOON (Token-Oriented Object Notation) is a compact, human-readable serialization of the JSON data model, designed to shrink the token footprint of structured data in LLM prompts. It combines YAML-like indentation for nested objects with a CSV-style tabular layout for uniform arrays: each array declares its length ([N]) and its fields ({a,b,c}) exactly once, then streams one row per line. On uniform arrays of objects this typically yields 30–60% fewer tokens than pretty-printed JSON while remaining a lossless, drop-in representation — use JSON programmatically, encode it as TOON for LLM input. The explicit length and field declarations also act as guardrails that help models parse and validate the data.

This library implements TOON SPEC v3.3 for PHP and passes the official language-agnostic conformance fixture suite. See also the reference TypeScript implementation.

JSON vs TOON

The same document, side by side:

JSON (pretty-printed):

{
  "context": {
    "task": "Summarize sales",
    "region": "EMEA"
  },
  "tags": ["q3", "priority"],
  "orders": [
    { "sku": "A1", "qty": 2, "price": 9.99 },
    { "sku": "B2", "qty": 1, "price": 14.5 },
    { "sku": "C7", "qty": 5, "price": 3.75 }
  ]
}

TOON:

context:
  task: Summarize sales
  region: EMEA
tags[2]: q3,priority
orders[3]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
  C7,5,3.75

Braces, brackets, repeated field names, and most quotes are gone. The orders array declares [3] rows and {sku,qty,price} fields once, then emits pure data.
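
To make the tabular layout concrete, here is a minimal sketch of how a header-plus-rows table can be produced in plain PHP. The helper sketchTable() is hypothetical and is not part of toon-php; it assumes every row has the same keys and that no value needs quoting.

```php
<?php
// Hypothetical helper, not the library's encoder: renders a uniform list of
// associative rows as a TOON-style table. Assumes values need no quoting.
function sketchTable(string $name, array $rows): string
{
    $fields = array_keys($rows[0] ?? []);
    // Header: table name, declared row count, and the field list, written once.
    $lines = [sprintf('%s[%d]{%s}:', $name, count($rows), implode(',', $fields))];
    foreach ($rows as $row) {
        // Each row carries values only, emitted in header order.
        $cells = array_map(fn (string $field): string => (string) $row[$field], $fields);
        $lines[] = '  ' . implode(',', $cells);
    }
    return implode("\n", $lines);
}

$orders = [
    ['sku' => 'A1', 'qty' => 2, 'price' => 9.99],
    ['sku' => 'B2', 'qty' => 1, 'price' => 14.5],
    ['sku' => 'C7', 'qty' => 5, 'price' => 3.75],
];

echo sketchTable('orders', $orders), "\n";
```

For real data, use Toon::encode(), which also handles quoting, nesting, and non-uniform arrays.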

Installation

composer require heitorflavio/toon-php

Requires PHP >= 8.2 with ext-mbstring. No other runtime dependencies.

Quick start

Encoding

use Toon\Toon;

echo Toon::encode([
    'users' => [
        ['id' => 1, 'name' => 'Alice', 'role' => 'admin'],
        ['id' => 2, 'name' => 'Bob', 'role' => 'user'],
    ],
]);

Output:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Decoding

use Toon\Toon;
use Toon\DecodeOptions;

$toon = <<<TOON
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
TOON;

// Objects decode to stdClass by default (like json_decode):
$data = Toon::decode($toon);
echo $data->users[0]->name; // Alice

// Or to associative arrays (like json_decode($s, true)):
$data = Toon::decode($toon, new DecodeOptions(associative: true));
echo $data['users'][1]['role']; // user

Round-trips are lossless on the JSON data model:

$original = ['a' => [1, 2], 'b' => ['c' => null, 'd' => true]];
$back = Toon::decode(Toon::encode($original), new DecodeOptions(associative: true));
var_export($back === $original); // true

Laravel & Blade

Building LLM prompts in Blade views? The package ships a ServiceProvider that is auto-discovered on install — no registration needed — and adds two directives. Their output is intentionally not HTML-escaped, since prompts are plain text, not page markup. Eloquent models and Illuminate\Support\Collection encode natively (both implement JsonSerializable).

@toon — encode a whole structure

This is the directive you want most of the time. TOON iterates uniform arrays internally, so there's no manual loop: pass it the data, and the array key becomes the table name.

@toon(['orders' => $orders])

Output:

orders[3]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
  C7,5,3.75

Pass a bare collection (@toon($orders)) to emit a root-level table without a name.

@tooneach — loop-style projection

Reach for the block form only when you need per-row control — selecting a subset of fields, computing a derived value, or skipping items with @if. Each @toonrow pushes one associative row; the block collects them into a uniform table. The optional leading string literal names the table (recommended — it gives the model context and acts as a parsing guardrail).

@tooneach('orders', $orders as $o)
    @toonrow(['sku' => $o->sku, 'qty' => $o->qty, 'total' => $o->qty * $o->price])
@endtooneach

Output:

orders[3]{sku,qty,total}:
  A1,2,19.98
  B2,1,14.5
  C7,5,18.75

Whitespace between the directives is discarded — only the encoded table is emitted. Omit the name (@tooneach($orders as $o)) for a root-level table. Don't nest @tooneach blocks (they share an internal accumulator); shape nested data in PHP and pass it to @toon instead.
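
When the shaping is done in PHP rather than in the view, the same per-row control (filtering, field selection, derived values) can be sketched with array functions. The sample data and the cancelled flag below are made up for illustration; in a Laravel app the source would more likely be a Collection.

```php
<?php
// Hypothetical sample data; the 'cancelled' flag is invented for this example.
$orders = [
    ['sku' => 'A1', 'qty' => 2, 'price' => 9.99, 'cancelled' => false],
    ['sku' => 'B2', 'qty' => 1, 'price' => 14.5, 'cancelled' => true],
    ['sku' => 'C7', 'qty' => 5, 'price' => 3.75, 'cancelled' => false],
];

// Drop unwanted rows, then project each remaining row onto the same key set
// so the result stays a uniform list of rows.
$rows = array_values(array_map(
    fn (array $o): array => [
        'sku' => $o['sku'],
        'qty' => $o['qty'],
        'total' => round($o['qty'] * $o['price'], 2),
    ],
    array_filter($orders, fn (array $o): bool => !$o['cancelled']),
));

// $rows can now be handed to the view, e.g. @toon(['orders' => $rows]).
```

array_values() matters here: array_filter() preserves the original keys, and a list with gaps in its keys is no longer a list in the array_is_list() sense.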

API

Toon::encode(mixed $value, ?EncodeOptions $options = null): string
Toon::decode(string $toon, ?DecodeOptions $options = null): mixed

Toon::encode() throws Toon\Exception\EncodeException (e.g. for malformed UTF-8 strings). Toon::decode() throws Toon\Exception\DecodeException on syntax or strict-mode violations. Both extend Toon\Exception\ToonException.
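
Since malformed UTF-8 makes encoding fail, it can help to locate the offending strings before calling the encoder. The following is a sketch using only core PHP; sketchFindInvalidUtf8() is a hypothetical helper, not a library function, and it inspects array values only (not keys or objects).

```php
<?php
// Hypothetical pre-flight check, not part of toon-php: returns the paths of
// all string values that are not valid UTF-8.
function sketchFindInvalidUtf8(mixed $value, string $path = '$'): array
{
    if (is_string($value)) {
        // With the /u modifier, preg_match() returns false on invalid UTF-8.
        return preg_match('//u', $value) === 1 ? [] : [$path];
    }
    if (is_array($value)) {
        $bad = [];
        foreach ($value as $key => $item) {
            array_push($bad, ...sketchFindInvalidUtf8($item, $path . '.' . $key));
        }
        return $bad;
    }
    return [];
}

$payload = [
    'ok' => "h\u{e9}llo",
    'bad' => "\xC3\x28",            // truncated two-byte sequence
    'nested' => ['x' => "\xFF"],    // byte that never occurs in UTF-8
];

print_r(sketchFindInvalidUtf8($payload));
```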

EncodeOptions

new EncodeOptions(indent: 2, delimiter: Delimiter::COMMA, keyFolding: KeyFolding::OFF, flattenDepth: null)

Option        Type    Default           Meaning
indent        int     2                 Number of spaces per indentation level (must be >= 1).
delimiter     string  Delimiter::COMMA  Delimiter for inline array values and tabular rows: Delimiter::COMMA (,), Delimiter::TAB ("\t") or Delimiter::PIPE (|). See Delimiters.
keyFolding    string  KeyFolding::OFF   KeyFolding::SAFE collapses chains of single-key objects into dotted paths (a.b.c: 1). See Key folding & path expansion.
flattenDepth  ?int    null              Maximum number of segments to fold when keyFolding is safe. null means unbounded.

DecodeOptions

new DecodeOptions(indent: 2, strict: true, expandPaths: ExpandPaths::OFF, associative: false)

Option       Type    Default           Meaning
indent       int     2                 Expected number of spaces per indentation level (must be >= 1).
strict       bool    true              Enforce strict-mode validation (SPEC §14). See Strict mode.
expandPaths  string  ExpandPaths::OFF  ExpandPaths::SAFE expands dotted unquoted keys into nested objects (inverse of key folding).
associative  bool    false             When true, decoded objects become associative arrays. When false (default), they become stdClass.

Why associative defaults to false

PHP arrays cannot distinguish an empty object from an empty array — [] is both. Decoding objects to stdClass (like json_decode()'s default) preserves the {} vs [] distinction, so encode(decode($toon)) is lossless:

echo Toon::encode(['config' => new stdClass(), 'list' => []]);
// config:
// list: []

Caveat with associative: true: an empty TOON object decodes as [] (an empty PHP array), exactly like json_decode($s, true). Re-encoding that value produces an empty array (list: []), not an empty object — the distinction is lost. Use the default stdClass mode whenever empty objects must survive a round-trip.
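
The same trade-off can be observed with PHP's built-in JSON functions, which the default mirrors. This sketch uses only json_decode() and json_encode(), not the TOON API:

```php
<?php
$json = '{"config":{},"list":[]}';

$asObjects = json_decode($json);        // {} becomes stdClass, [] becomes array
$asArrays  = json_decode($json, true);  // both {} and [] become an empty array

echo json_encode($asObjects), "\n"; // {"config":{},"list":[]}
echo json_encode($asArrays), "\n";  // {"config":[],"list":[]}
```

In associative mode the empty object comes back as an empty array, so the re-encoded document differs from the input.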

Number decoding

Integer-looking tokens decode to PHP int when they fit in PHP_INT_MIN..PHP_INT_MAX, otherwise to float (like json_decode). Tokens with a fractional part or exponent decode to float:

var_dump(Toon::decode('n: 9223372036854775807')->n); // int(9223372036854775807)
var_dump(Toon::decode('n: 9223372036854775808')->n); // float(9.223372036854776E+18)
var_dump(Toon::decode('n: 1.5')->n);                 // float(1.5)
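
The int-or-float rule can be sketched in a few lines of core PHP. sketchDecodeNumber() is a hypothetical helper that assumes the token has already been recognized as a number; it is not how the library is necessarily implemented.

```php
<?php
// Hypothetical helper: integer-looking tokens that fit in a PHP int stay int,
// everything else (overflow, fraction, exponent) becomes float.
function sketchDecodeNumber(string $token): int|float
{
    // FILTER_VALIDATE_INT returns false for non-integers and for values
    // outside PHP_INT_MIN..PHP_INT_MAX.
    $asInt = filter_var($token, FILTER_VALIDATE_INT);
    return $asInt !== false ? $asInt : (float) $token;
}

var_dump(sketchDecodeNumber((string) PHP_INT_MAX));   // int
var_dump(sketchDecodeNumber('9223372036854775808'));  // float
var_dump(sketchDecodeNumber('1.5'));                  // float(1.5)
```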

Host-type normalization (PHP → TOON)

Before encoding, PHP values are normalized to the JSON data model (SPEC §3):

PHP value                            Encodes as
null, bool, int, float, string       The corresponding TOON primitive.
Array where array_is_list() is true  Array. An empty PHP array [] is an empty array (same as json_encode).
Associative array                    Object (keys cast to string).
stdClass                             Object, property order preserved. An empty stdClass is an empty object.
JsonSerializable                     Result of jsonSerialize(), normalized recursively (takes precedence over the rules below).
DateTimeInterface                    ISO 8601 / RFC 3339 string, e.g. "2026-06-11T10:30:00+00:00".
Backed enum (\BackedEnum)            Its value.
Pure enum (\UnitEnum)                Its name.
NAN, INF, -INF                       null. (Float -0.0 encodes as 0.)
Other objects                        Object built from public properties (get_object_vars()).
Closures, resources                  null.
Strings with invalid UTF-8           Toon\Exception\EncodeException (output must be valid UTF-8 text).

For example:

enum Status: string { case Active = 'active'; }
enum Color { case Red; }

echo Toon::encode([
    'when' => new DateTimeImmutable('2026-06-11T10:30:00+00:00'),
    'status' => Status::Active,
    'color' => Color::Red,
    'ratio' => NAN,
    'callback' => fn () => 1,
]);

Output:

when: "2026-06-11T10:30:00+00:00"
status: active
color: Red
ratio: null
callback: null
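
The normalization rules can be read as an ordered chain of type checks. The sketch below expresses most of them in plain PHP to show the precedence; sketchNormalize() and the two enums are hypothetical, and the code is not taken from the library (it skips the -0.0 and invalid UTF-8 rules).

```php
<?php
enum SketchStatus: string { case Active = 'active'; }
enum SketchColor { case Red; }

// Hypothetical normalizer: maps PHP values onto the JSON data model.
function sketchNormalize(mixed $value): mixed
{
    if ($value instanceof JsonSerializable) {
        return sketchNormalize($value->jsonSerialize()); // checked first
    }
    if ($value instanceof DateTimeInterface) {
        return $value->format(DATE_ATOM);                // RFC 3339 style string
    }
    if ($value instanceof BackedEnum) {
        return $value->value;                            // before UnitEnum: BackedEnum extends it
    }
    if ($value instanceof UnitEnum) {
        return $value->name;
    }
    if ($value instanceof Closure || is_resource($value)) {
        return null;
    }
    if (is_float($value) && !is_finite($value)) {
        return null;                                     // NAN, INF, -INF
    }
    if (is_object($value)) {
        // stdClass and other objects: public properties, kept as an object.
        return (object) array_map('sketchNormalize', get_object_vars($value));
    }
    if (is_array($value)) {
        return array_map('sketchNormalize', $value);     // keys are preserved
    }
    return $value;
}

$normalized = sketchNormalize([
    'when' => new DateTimeImmutable('2026-06-11T10:30:00+00:00'),
    'status' => SketchStatus::Active,
    'color' => SketchColor::Red,
    'ratio' => NAN,
    'callback' => fn () => 1,
]);

var_export($normalized);
```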

Number output follows the spec's canonical form (matching JavaScript's String(n)): no trailing .0, plain decimals inside the canonical range, lowercase-e scientific notation outside it:

echo Toon::encode(['pi' => 3.14, 'big' => 1e21, 'tiny' => 0.0000001, 'whole' => 5.0]);
// pi: 3.14
// big: 1e+21
// tiny: 1e-7
// whole: 5

Delimiters

Inline array values and tabular rows can use comma (default), tab, or pipe. The chosen delimiter is declared inside the bracket header ([2 ] for tab, [2|] for pipe; no symbol means comma), so documents are self-describing and decoding needs no option:

use Toon\Delimiter;
use Toon\EncodeOptions;

$data = ['items' => [
    ['sku' => 'A1', 'name' => 'Anvil, small', 'qty' => 2],
    ['sku' => 'B2', 'name' => 'Rope (10 m)', 'qty' => 1],
]];

echo Toon::encode($data); // comma (default)

Output:

items[2]{sku,name,qty}:
  A1,"Anvil, small",2
  B2,Rope (10 m),1

echo Toon::encode($data, new EncodeOptions(delimiter: Delimiter::TAB));

Output:

items[2	]{sku	name	qty}:
  A1	Anvil, small	2
  B2	Rope (10 m)	1

echo Toon::encode($data, new EncodeOptions(delimiter: Delimiter::PIPE));

Output:

items[2|]{sku|name|qty}:
  A1|Anvil, small|2
  B2|Rope (10 m)|1

When this matters for tokens: values are only quoted when they contain the active delimiter, so switching delimiters changes how much quoting you pay for. "Anvil, small" needs quotes with the comma delimiter but not with tab or pipe. If your data is full of commas (prose, addresses, formatted numbers), tab or pipe usually saves tokens; tab also tends to tokenize cheaply in modern LLM tokenizers.
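
One rough way to choose a delimiter is to count how many values would need quotes under each candidate. The helpers below are hypothetical and use a simplified quoting test (empty string, outer whitespace, boolean/null/number look-alikes, the active delimiter, a few structural characters); the library's exact rules come from the spec and may differ.

```php
<?php
// Hypothetical, simplified quoting test; not the library's implementation.
function sketchNeedsQuotes(string $value, string $delimiter): bool
{
    if ($value === '' || $value !== trim($value)) {
        return true;                                   // empty or outer whitespace
    }
    if (in_array($value, ['true', 'false', 'null'], true) || is_numeric($value)) {
        return true;                                   // would be read back as a non-string
    }
    return str_contains($value, $delimiter)
        || strpbrk($value, ":\"\\[]{}\n") !== false;   // assumed structural characters
}

// Counts the values that would be quoted with the given delimiter.
function sketchQuotedCount(array $values, string $delimiter): int
{
    return count(array_filter(
        $values,
        fn (string $v): bool => sketchNeedsQuotes($v, $delimiter),
    ));
}

$cells = ['A1', 'Anvil, small', 'B2', 'Rope (10 m)'];

foreach (['comma' => ',', 'tab' => "\t", 'pipe' => '|'] as $name => $delimiter) {
    printf("%s: %d quoted value(s)\n", $name, sketchQuotedCount($cells, $delimiter));
}
```

For the sample cells only the comma delimiter forces a quoted value, which matches the three encodings shown above.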

Key folding & path expansion

Key folding (encoder)

With keyFolding: safe, chains of single-key objects fold into a dotted path, saving indentation and lines:

use Toon\EncodeOptions;
use Toon\KeyFolding;

$cfg = ['data' => ['metadata' => ['items' => [1, 2, 3]]]];

echo Toon::encode($cfg);
// data:
//   metadata:
//     items[3]: 1,2,3

echo Toon::encode($cfg, new EncodeOptions(keyFolding: KeyFolding::SAFE));
// data.metadata.items[3]: 1,2,3

echo Toon::encode($cfg, new EncodeOptions(keyFolding: KeyFolding::SAFE, flattenDepth: 2));
// data.metadata:
//   items[3]: 1,2,3

Folding is safe by construction: it only applies when every segment is an identifier ([A-Za-z_][A-Za-z0-9_]*, no dots) and the folded key would not collide with an existing sibling key:

echo Toon::encode(['a' => ['b' => 1], 'a.b' => 2], new EncodeOptions(keyFolding: KeyFolding::SAFE));
// a:
//   b: 1
// a.b: 2
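
The folding rule can be sketched on plain associative arrays. sketchFold() is a hypothetical helper that returns an array with dotted keys instead of TOON text; it ignores flattenDepth and is only meant to illustrate the two safety conditions (identifier segments, no sibling collision), not to reproduce the library's code.

```php
<?php
const SKETCH_IDENTIFIER = '/^[A-Za-z_][A-Za-z0-9_]*$/';

// Hypothetical helper: collapses chains of single-key objects into dotted keys.
function sketchFold(array $object): array
{
    $out = [];
    foreach ($object as $key => $value) {
        $key = (string) $key;
        $segments = [$key];
        $node = $value;
        // Walk down while the value is an object (non-list array) with one key.
        while (is_array($node) && !array_is_list($node) && count($node) === 1) {
            $segments[] = (string) array_key_first($node);
            $node = reset($node);
        }
        $folded = implode('.', $segments);
        $identifiers = array_filter(
            $segments,
            fn (string $s): bool => preg_match(SKETCH_IDENTIFIER, $s) === 1,
        );
        $safe = count($identifiers) === count($segments)
            && !array_key_exists($folded, $object);   // no collision with a sibling
        if (count($segments) > 1 && $safe) {
            $out[$folded] = is_array($node) && !array_is_list($node) ? sketchFold($node) : $node;
        } else {
            $out[$key] = is_array($value) && !array_is_list($value) ? sketchFold($value) : $value;
        }
    }
    return $out;
}

var_export(sketchFold(['data' => ['metadata' => ['items' => [1, 2, 3]]]]));
var_export(sketchFold(['a' => ['b' => 1], 'a.b' => 2]));
```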

Path expansion (decoder)

expandPaths: safe is the inverse: unquoted dotted keys whose segments are all identifiers expand back into nested objects (deep-merged in encounter order):

use Toon\DecodeOptions;
use Toon\ExpandPaths;

$opts = new DecodeOptions(expandPaths: ExpandPaths::SAFE, associative: true);

var_export(Toon::decode('data.metadata.items[3]: 1,2,3', $opts));
// array ('data' => array ('metadata' => array ('items' => array (0 => 1, 1 => 2, 2 => 3))))

// Without expandPaths, the dotted key stays literal:
var_export(Toon::decode('data.metadata.items[3]: 1,2,3', new DecodeOptions(associative: true)));
// array ('data.metadata.items' => array (0 => 1, 1 => 2, 2 => 3))

// Quoted keys are never expanded:
var_export(Toon::decode('"a.b": 1', $opts));
// array ('a.b' => 1)

In strict mode, expansion conflicts (two paths writing the same leaf) throw a DecodeException; in non-strict mode the last write wins.
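
Expansion with last-write-wins semantics can be sketched on an already-decoded flat array. sketchExpand() is hypothetical; because it works on PHP arrays it cannot tell quoted keys from unquoted ones, and it does not detect conflicts the way strict mode does.

```php
<?php
const SKETCH_DOTTED_PATH = '/^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)+$/';

// Hypothetical helper: expands dotted identifier keys into nested arrays,
// merging in encounter order (a later write replaces an earlier one).
function sketchExpand(array $flat): array
{
    $out = [];
    foreach ($flat as $key => $value) {
        $key = (string) $key;
        $segments = preg_match(SKETCH_DOTTED_PATH, $key) === 1 ? explode('.', $key) : [$key];
        $last = count($segments) - 1;
        $cursor = &$out;
        foreach ($segments as $i => $segment) {
            if ($i === $last) {
                $cursor[$segment] = $value;
                break;
            }
            if (!isset($cursor[$segment]) || !is_array($cursor[$segment])) {
                $cursor[$segment] = [];   // create or overwrite the intermediate object
            }
            $cursor = &$cursor[$segment];
        }
        unset($cursor);                   // drop the reference before the next key
    }
    return $out;
}

var_export(sketchExpand(['a.b' => 1, 'a.c' => 2, 'd' => 3]));
```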

Strict mode

Strict mode is on by default and makes the decoder a validator (SPEC §14). It rejects, among others:

  • Count mismatches — declared [N] vs actual inline values, list items, or tabular rows; row width vs field count:

    Toon::decode("items[3]: 1,2");
    // DecodeException: Expected 3 inline value(s), got 2 (line 1)
    
    Toon::decode("users[2]{id,name}:\n  1,Alice\n  2");
    // DecodeException: Tabular row has 1 value(s), expected 2 (line 3)
  • Indentation problems — not a multiple of indent, or tabs used for indentation:

    Toon::decode("items[2]:\n  - 1\n   - 2");
    // DecodeException: Indentation of 3 space(s) is not a multiple of 2 (line 3)
    
    Toon::decode("a:\n\tb: 1");
    // DecodeException: Tabs are not allowed in indentation (line 2)
  • Malformed syntax — missing colons, invalid escapes, malformed bracket lengths ([03], [-1], [bar]), unterminated strings:

    Toon::decode("a: 1\njust some text");
    // DecodeException: Missing colon in line "just some text" (line 2)
    
    Toon::decode('s: "a\x"');
    // DecodeException: Invalid escape sequence "\x" (line 1)
  • Duplicate sibling keys and path-expansion conflicts:

    Toon::decode("a: 1\na: 2");
    // DecodeException: Duplicate key "a" (line 2)
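
As an illustration of the count check, the sketch below parses an inline array header and compares the declared length with the number of values. sketchInlineCountError() is a hypothetical helper with a deliberately small regular expression; it does not handle quoted values, and the message wording simply mirrors the example above.

```php
<?php
// Hypothetical validator for lines such as "items[3]: 1,2" or "tags[2|]: a|b".
// Returns an error message, or null when the declared count matches.
function sketchInlineCountError(string $line): ?string
{
    // Groups: 1 = declared length, 2 = optional delimiter marker, 3 = values.
    if (preg_match('/^[^\[\]:]*\[(\d+)([\t|]?)\]: ?(.*)$/', $line, $m) !== 1) {
        return 'Not an inline array line';
    }
    $declared = (int) $m[1];
    $delimiter = $m[2] === '' ? ',' : $m[2];   // no marker means comma
    $values = $m[3] === '' ? [] : explode($delimiter, $m[3]);

    return count($values) === $declared
        ? null
        : sprintf('Expected %d inline value(s), got %d', $declared, count($values));
}

var_dump(sketchInlineCountError('items[3]: 1,2'));
var_dump(sketchInlineCountError('tags[2|]: a|b'));
```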

Toon\Exception\DecodeException carries the offending line when known:

use Toon\Exception\DecodeException;

try {
    Toon::decode("items[3]:\n  - 1\n  - 2");
} catch (DecodeException $e) {
    echo $e->getMessage();  // Expected 3 list item(s), got 2 (line 1)
    echo $e->lineNumber;    // 1 (?int, null when no line applies)
}

With strict: false, the decoder is lenient: depth is derived as floor(spaces / indent), blank lines inside arrays are ignored, and duplicate keys / expansion conflicts resolve silently as last-write-wins:

var_export(Toon::decode("a: 1\na: 2", new DecodeOptions(strict: false, associative: true)));
// array ('a' => 2)
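
The difference between strict and lenient depth handling can be sketched as follows. sketchDepth() is a hypothetical helper; in lenient mode it simply ignores tabs, which is an assumption of this sketch rather than documented library behavior.

```php
<?php
// Hypothetical helper: computes the nesting depth of a line from its indentation.
function sketchDepth(string $line, int $indent = 2, bool $strict = true): int
{
    $leading = substr($line, 0, strspn($line, " \t"));
    if ($strict && str_contains($leading, "\t")) {
        throw new InvalidArgumentException('Tabs are not allowed in indentation');
    }
    $spaces = substr_count($leading, ' ');
    if ($strict && $spaces % $indent !== 0) {
        throw new InvalidArgumentException(
            "Indentation of {$spaces} space(s) is not a multiple of {$indent}"
        );
    }
    // Lenient mode: floor(spaces / indent).
    return intdiv($spaces, $indent);
}

var_dump(sketchDepth('    b: 1'));          // strict, 4 spaces
var_dump(sketchDepth('   - 2', 2, false));  // lenient, 3 spaces

try {
    sketchDepth('   - 2');                  // strict, 3 spaces
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```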

Format overview

A 60-second tour (see the spec for the normative rules):

  • Objects use indentation instead of braces: key: value, nested objects indent one level.

  • Primitive arrays are inline with a declared length: tags[2]: q3,priority.

  • Uniform arrays of objects (same key set, primitive values) become tables — fields declared once, one row per line:

    users[2]{id,name,role}:
      1,Alice,admin
      2,Bob,user
    
  • Mixed or non-uniform arrays fall back to a hyphenated list:

    echo Toon::encode(['items' => [42, ['name' => 'nested'], 'text']]);

    Output:

    items[3]:
      - 42
      - name: nested
      - text
    
  • Strings are quoted only when necessary (empty, leading/trailing whitespace, looks like a boolean/number, contains the active delimiter or structural characters):

    echo Toon::encode(['note' => 'hello, world', 'empty' => '', 'looksBool' => 'true', 'looksNum' => '42']);

    Output:

    note: "hello, world"
    empty: ""
    looksBool: "true"
    looksNum: "42"
    
  • Any JSON root works: objects, arrays ([3]: 1,2,3 at root), or a single scalar line (hello world). An empty document is an empty object.

  • Output is UTF-8 with LF line endings, no trailing whitespace, and deterministic canonical formatting.

Conformance

This implementation targets TOON SPEC v3.3 and passes the official language-agnostic conformance fixtures shipped under tests/fixtures/ — 153 encoder and 236 decoder fixture cases. The full test suite (conformance plus PHP-specific unit tests) currently runs 444 tests with 9,817 assertions, all passing:

php vendor/phpunit/phpunit/phpunit --no-progress
# OK (444 tests, 9817 assertions)

CLI

The package ships a toon binary (installed to vendor/bin/toon) for quick conversions between JSON and TOON:

toon encode [file|-] [options]    Encode JSON input to TOON
toon decode [file|-] [options]    Decode TOON input to JSON

Input is read from the file argument, or from STDIN when the argument is - or omitted. Output goes to STDOUT unless -o/--output FILE is given.

# Encode JSON to TOON
vendor/bin/toon encode data.json -o data.toon

# Pipe from stdin
echo '{"users":[{"id":1,"name":"Alice","role":"admin"},{"id":2,"name":"Bob","role":"user"}]}' | vendor/bin/toon encode
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user

# Decode TOON back to (pretty-printed) JSON; --compact minifies
vendor/bin/toon decode data.toon --compact

Option               Applies to  Meaning
-o, --output FILE    both        Write output to FILE instead of STDOUT.
--indent=N           both        Spaces per indentation level (default: 2).
--delimiter=D        encode      comma, tab, pipe, or a literal , / | (default: comma).
--key-folding=MODE   encode      off or safe (default: off).
--flatten-depth=N    encode      Max folded key segments with --key-folding=safe (default: unlimited).
--no-strict          decode      Disable strict-mode validation.
--expand-paths=MODE  decode      off or safe (default: off).
--compact            decode      Emit compact JSON instead of pretty-printed.
-h, --help           -           Show help text.
-V, --version        -           Show version information (toon 0.1.0 (TOON spec v3.3) ...).

Exit codes: 0 success, 1 usage error, 2 encode/decode/IO error.

Development

git clone https://github.com/heitorflavio/toon-php.git
cd toon-php
composer install
composer test

Useful filters while working on one side:

php vendor/phpunit/phpunit/phpunit --filter EncodeConformanceTest --no-progress
php vendor/phpunit/phpunit/phpunit --filter DecodeConformanceTest --no-progress

License

MIT © 2026 Thiago Castagnazzi.

TOON format by Johann Schopplich — see the spec and reference implementation.

Other information

  • License: MIT
  • Last updated: 2026-06-11
