edgaras/llm-json-cleaner
最新稳定版本:v1.1.1
Composer 安装命令:
composer require edgaras/llm-json-cleaner
包简介
A PHP library that ensures strict JSON extraction and schema validation from LLM API responses, preventing malformed or unexpected output.
README 文档
README
PHP library for sanitizing JSON responses from LLM APIs and validating them against a specified JSON schema.
Features
- JSON Response Cleaning: Remove unwanted artifacts.
- Schema Validation: Validate and enforce JSON schema constraints.
Installation
Install the package via Composer:
composer require edgaras/llm-json-cleaner
Usage
Extracting JSON from LLM Responses
require_once 'vendor/autoload.php'; use Edgaras\LLMJsonCleaner\JsonCleaner; $llmResponse = "Hi there! Please find the details below:\n\n{ \"task\": \"generate_report\", \"parameters\": { \"date\": \"2025-02-17\", \"format\": \"pdf\" } }\n\nLet me know if you need further assistance."; // Return JSON only $extractJson = JsonCleaner::extract($llmResponse, false); echo $extractJson; // {"task":"generate_report","parameters":{"date":"2025-02-17","format":"pdf"}} // Return JSON only as an array $extractJsonAsArray = JsonCleaner::extract($llmResponse, true); print_r($extractJsonAsArray); // ( // [task] => generate_report // [parameters] => Array // ( // [date] => 2025-02-17 // [format] => pdf // ) // )
Cleaning Malformed JSON
JsonCleaner::extract() automatically fixes common LLM output issues:
// Trailing commas $json = JsonCleaner::extract('{"name": "Alice", "age": 30,}', true); // ['name' => 'Alice', 'age' => 30] // Single-quoted JSON $json = JsonCleaner::extract("{'name': 'Alice'}", true); // ['name' => 'Alice'] // Unquoted keys $json = JsonCleaner::extract('{name: "Alice", active: true}', true); // ['name' => 'Alice', 'active' => true] // Multiline string values (literal newlines inside strings) $json = JsonCleaner::extract('{"text": "line one line two"}', true); // ['text' => "line one\nline two"] // Top-level arrays $json = JsonCleaner::extract('Here: [{"id":1},{"id":2}] done.', true); // [['id' => 1], ['id' => 2]]
Validating JSON Against a Schema
require_once 'vendor/autoload.php'; use Edgaras\LLMJsonCleaner\JsonValidator; $json = '{ "order_id": 401, "customer": "Alice", "payment_methods": [ { "method_id": "p1", "type": "Credit Card" }, { "method_id": "p2", "type": "PayPal" } ] }'; $schema = [ 'order_id' => ['required', 'integer', 'min:1'], 'customer' => ['required', 'string'], 'payment_methods' => ['required', 'array', 'min:1'], 'payment_methods.*.method_id' => ['required', 'string'], 'payment_methods.*.type' => ['required', 'string'], ]; $validator = JsonValidator::validateSchema(json_decode($json, 1), $schema); var_dump($validator); // bool(true) $schemaNotFull = [ 'order_id' => ['required', 'integer', 'min:1'], 'customer' => ['required', 'string'], ]; $validator2 = JsonValidator::validateSchema(json_decode($json, 1), $schemaNotFull); print_r($validator2); // Array // ( // [0] => Array // ( // [payment_methods] => Array // ( // [0] => Unexpected field: payment_methods // ) // ) // )
Validation Rules
| Rule | Description |
|---|---|
required |
Field must be present and non-empty |
nullable |
Field may be null (skips other rules when null) |
string |
Must be a string |
integer |
Must be an integer |
float |
Must be a float or integer |
numeric |
Must be numeric (int, float, or numeric string) |
boolean |
Must be a boolean |
array |
Must be an array |
email |
Must be a valid email address |
url |
Must be a valid URL |
date |
Must be a valid date string |
min:N |
Minimum value (numbers) or minimum item count (arrays) |
max:N |
Maximum value (numbers) or maximum item count (arrays) |
in:a,b,c |
Value must be one of the listed options |
Fields without required are optional - missing fields won't trigger errors, but present fields are still validated.
Nested array items are validated using wildcard dot notation: items.*.field_name
Unexpected fields (not defined in schema) are detected at all nesting levels.
Changelog
v1.1.0
JsonCleaner
- Added automatic trailing comma removal (
{"a": 1,}) - Added single-quote to double-quote conversion (
{'key': 'value'}) - Added unquoted key detection and quoting (
{key: "value"}) - Added multiline string value sanitization (literal newlines inside JSON strings)
- Added top-level JSON array extraction (
[...]embedded in text) - All sanitization is string-aware - content inside quoted values is never corrupted
JsonValidator
- Added
nullablerule - allowsnullvalues and skips further validation - Added
floatrule - accepts floats and integers - Added
numericrule - accepts any numeric value including numeric strings - Added
email,url,datevalidation rules - Added optional field support - fields without
requireddon't error when missing - Added unexpected nested field detection inside wildcard arrays (not just top-level)
- Added sequential array check for wildcard (
.*) paths - associative arrays are rejected - Fixed
requiredrule to allowfalseboolean values
General
- Tested and compatible with PHP 8.4 and PHP 8.5
统计信息
- 总下载量: 480
- 月度下载量: 0
- 日度下载量: 0
- 收藏数: 5
- 点击次数: 0
- 依赖项目数: 0
- 推荐数: 0
其他信息
- 授权协议: MIT
- 更新时间: 2025-02-17