edgaras/llm-json-cleaner

Latest stable version: v1.1.1

Composer install command:

composer require edgaras/llm-json-cleaner

Package description

A PHP library that ensures strict JSON extraction and schema validation from LLM API responses, preventing malformed or unexpected output.

README

PHP library for sanitizing JSON responses from LLM APIs and validating them against a specified JSON schema.

Features

  • JSON Response Cleaning: Remove unwanted artifacts.
  • Schema Validation: Validate and enforce JSON schema constraints.

Installation

Install the package via Composer:

composer require edgaras/llm-json-cleaner

Usage

Extracting JSON from LLM Responses

require_once 'vendor/autoload.php';

use Edgaras\LLMJsonCleaner\JsonCleaner;

$llmResponse = "Hi there! Please find the details below:\n\n{
    \"task\": \"generate_report\",
    \"parameters\": {
        \"date\": \"2025-02-17\",
        \"format\": \"pdf\"
    }
}\n\nLet me know if you need further assistance.";

// Return JSON only
$extractJson = JsonCleaner::extract($llmResponse, false);
echo $extractJson;
// {"task":"generate_report","parameters":{"date":"2025-02-17","format":"pdf"}}

// Return JSON only as an array
$extractJsonAsArray = JsonCleaner::extract($llmResponse, true);
print_r($extractJsonAsArray);
// Array
// (
//  [task] => generate_report
//  [parameters] => Array
//      (
//          [date] => 2025-02-17
//          [format] => pdf
//      )
// )
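
To make the extraction step more concrete, here is a minimal sketch of one way to locate the first balanced JSON object inside a chatty reply. It is an illustration only, not the library's actual implementation: `findFirstJsonObject()` is a hypothetical helper, and it assumes double-quoted strings and a top-level object rather than an array.

```php
<?php
// Hypothetical helper for illustration; not part of edgaras/llm-json-cleaner.
// Returns the first balanced {...} span in $text, or null if none is found.
function findFirstJsonObject(string $text): ?string
{
    $start = strpos($text, '{');
    if ($start === false) {
        return null;
    }

    $depth = 0;
    $inString = false;
    $escaped = false;
    $length = strlen($text);

    for ($i = $start; $i < $length; $i++) {
        $ch = $text[$i];

        if ($inString) {
            // Inside a string, braces are plain characters; only track escapes and the closing quote.
            if ($escaped) {
                $escaped = false;
            } elseif ($ch === '\\') {
                $escaped = true;
            } elseif ($ch === '"') {
                $inString = false;
            }
            continue;
        }

        if ($ch === '"') {
            $inString = true;
        } elseif ($ch === '{') {
            $depth++;
        } elseif ($ch === '}') {
            $depth--;
            if ($depth === 0) {
                return substr($text, $start, $i - $start + 1);
            }
        }
    }

    return null; // braces never balanced
}

$reply = 'Sure! {"task": "generate_report", "note": "uses } inside"} Anything else?';
$candidate = findFirstJsonObject($reply);
$data = $candidate === null ? null : json_decode($candidate, true);
```

Tracking string state matters here: a naive search for the first `}` would cut the object short at the brace inside the `note` value.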

Cleaning Malformed JSON

JsonCleaner::extract() automatically fixes common LLM output issues:

// Trailing commas
$json = JsonCleaner::extract('{"name": "Alice", "age": 30,}', true);
// ['name' => 'Alice', 'age' => 30]

// Single-quoted JSON
$json = JsonCleaner::extract("{'name': 'Alice'}", true);
// ['name' => 'Alice']

// Unquoted keys
$json = JsonCleaner::extract('{name: "Alice", active: true}', true);
// ['name' => 'Alice', 'active' => true]

// Multiline string values (literal newlines inside strings)
$json = JsonCleaner::extract('{"text": "line one
line two"}', true);
// ['text' => "line one\nline two"]

// Top-level arrays
$json = JsonCleaner::extract('Here: [{"id":1},{"id":2}] done.', true);
// [['id' => 1], ['id' => 2]]
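
As a rough illustration of what a string-aware fix could look like, the sketch below removes trailing commas while leaving commas inside quoted values alone. `stripTrailingCommas()` is a hypothetical function written for this example; the library's own cleaning logic may work differently.

```php
<?php
// Hypothetical helper for illustration; not part of edgaras/llm-json-cleaner.
// Drops a comma when the next non-whitespace character closes an object or array.
function stripTrailingCommas(string $json): string
{
    $out = '';
    $inString = false;
    $escaped = false;
    $length = strlen($json);

    for ($i = 0; $i < $length; $i++) {
        $ch = $json[$i];

        if ($inString) {
            // Copy string content verbatim so quoted commas are never touched.
            $out .= $ch;
            if ($escaped) {
                $escaped = false;
            } elseif ($ch === '\\') {
                $escaped = true;
            } elseif ($ch === '"') {
                $inString = false;
            }
            continue;
        }

        if ($ch === '"') {
            $inString = true;
            $out .= $ch;
            continue;
        }

        if ($ch === ',') {
            // Look ahead past whitespace to see what follows the comma.
            $j = $i + 1;
            while ($j < $length && strpos(" \t\r\n", $json[$j]) !== false) {
                $j++;
            }
            if ($j < $length && ($json[$j] === '}' || $json[$j] === ']')) {
                continue; // trailing comma: skip it
            }
        }

        $out .= $ch;
    }

    return $out;
}

$fixed = stripTrailingCommas('{"name": "Alice", "tags": ["a,]", "b",], "age": 30,}');
$decoded = json_decode($fixed, true);
```

The `"a,]"` value is the interesting case: a regex such as `/,\s*[}\]]/` would corrupt it, which is why the sketch walks the input character by character.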

Validating JSON Against a Schema

require_once 'vendor/autoload.php';

use Edgaras\LLMJsonCleaner\JsonValidator;


$json = '{
  "order_id": 401,
  "customer": "Alice",
  "payment_methods": [
    {
      "method_id": "p1",
      "type": "Credit Card"
    },
    {
      "method_id": "p2",
      "type": "PayPal"
    }
  ]
}';

$schema = [
  'order_id' => ['required', 'integer', 'min:1'],
  'customer' => ['required', 'string'],
  'payment_methods' => ['required', 'array', 'min:1'],
  'payment_methods.*.method_id' => ['required', 'string'],
  'payment_methods.*.type' => ['required', 'string'],
];

$validator = JsonValidator::validateSchema(json_decode($json, true), $schema);
var_dump($validator);
// bool(true)


$schemaNotFull = [
  'order_id' => ['required', 'integer', 'min:1'],
  'customer' => ['required', 'string'], 
];

$validator2 = JsonValidator::validateSchema(json_decode($json, true), $schemaNotFull);
print_r($validator2);
// Array
// (
//    [0] => Array
//        (
//            [payment_methods] => Array
//                (
//                    [0] => Unexpected field: payment_methods
//                )
//        )
// )

Validation Rules

Rule        Description
required    Field must be present and non-empty
nullable    Field may be null (skips other rules when null)
string      Must be a string
integer     Must be an integer
float       Must be a float or integer
numeric     Must be numeric (int, float, or numeric string)
boolean     Must be a boolean
array       Must be an array
email       Must be a valid email address
url         Must be a valid URL
date        Must be a valid date string
min:N       Minimum value (numbers) or minimum item count (arrays)
max:N       Maximum value (numbers) or maximum item count (arrays)
in:a,b,c    Value must be one of the listed options
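
Rules with an argument, such as min:N and in:a,b,c, follow a "name:argument" shape. The sketch below shows one plausible way to split and evaluate them. All three functions are hypothetical and written only for this example; in particular, failing min for non-numeric, non-array values and comparing in options as strings are assumptions, not documented library behavior.

```php
<?php
// Hypothetical helpers for illustration; not the library's internals.

/** Split "min:1" into ['min', '1'] and "required" into ['required', null]. */
function parseRule(string $rule): array
{
    return array_pad(explode(':', $rule, 2), 2, null);
}

/** min:N compares numbers by value and arrays by item count. */
function passesMin(mixed $value, float $min): bool
{
    if (is_array($value)) {
        return count($value) >= $min;
    }
    if (is_int($value) || is_float($value)) {
        return $value >= $min;
    }
    return false; // assumption: other types fail the rule
}

/** in:a,b,c accepts only the listed options (compared as strings here, an assumption). */
function passesIn(mixed $value, string $options): bool
{
    return is_scalar($value)
        && in_array((string) $value, explode(',', $options), true);
}

[$name, $argument] = parseRule('min:1');
$okCount = passesMin([['method_id' => 'p1']], (float) $argument); // one item in the array
$okValue = passesMin(0, (float) $argument);                       // 0 is below the minimum
$okOption = passesIn('pdf', 'pdf,csv');
```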

Fields without required are optional - missing fields won't trigger errors, but present fields are still validated.

Nested array items are validated using wildcard dot notation: items.*.field_name
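
To show how a wildcard path such as payment_methods.*.type maps onto decoded data, here is a small sketch that collects every value a path points at. `resolvePath()` is a hypothetical helper for this example and makes no claim about how the validator walks paths internally.

```php
<?php
// Hypothetical helper for illustration; not part of edgaras/llm-json-cleaner.
// Returns every value reachable through $path, where "*" means "each item of the array".
function resolvePath(mixed $data, string $path): array
{
    $current = [$data];

    foreach (explode('.', $path) as $segment) {
        $next = [];
        foreach ($current as $node) {
            if (!is_array($node)) {
                continue; // cannot descend into scalars
            }
            if ($segment === '*') {
                foreach ($node as $item) {
                    $next[] = $item;
                }
            } elseif (array_key_exists($segment, $node)) {
                $next[] = $node[$segment];
            }
        }
        $current = $next;
    }

    return $current;
}

$order = [
    'order_id' => 401,
    'payment_methods' => [
        ['method_id' => 'p1', 'type' => 'Credit Card'],
        ['method_id' => 'p2', 'type' => 'PayPal'],
    ],
];

$types = resolvePath($order, 'payment_methods.*.type');
```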

Unexpected fields (not defined in schema) are detected at all nesting levels.
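
One way to picture unexpected-field detection is to flatten the decoded data into wildcard paths and compare them with the schema keys. The sketch below does that with a hypothetical `collectPaths()` helper. It is a simplification under the assumption that integer keys always denote list items, and it is not the library's own algorithm.

```php
<?php
// Hypothetical helper for illustration; not part of edgaras/llm-json-cleaner.
// Flattens nested data into dot paths, replacing list indexes with "*".
function collectPaths(array $data, string $prefix = ''): array
{
    $paths = [];

    foreach ($data as $key => $value) {
        $segment = is_int($key) ? '*' : (string) $key;
        $path = $prefix === '' ? $segment : $prefix . '.' . $segment;

        // A bare "items.*" node is a container, not a field, so it is not recorded.
        if ($segment !== '*') {
            $paths[$path] = true;
        }
        if (is_array($value)) {
            $paths += collectPaths($value, $path);
        }
    }

    return array_keys($paths);
}

$data = [
    'order_id' => 401,
    'payment_methods' => [
        ['method_id' => 'p1', 'note' => 'not in schema'],
        ['method_id' => 'p2'],
    ],
];

$schemaKeys = ['order_id', 'payment_methods', 'payment_methods.*.method_id'];

// Any path present in the data but absent from the schema is unexpected.
$unexpected = array_values(array_diff(collectPaths($data), $schemaKeys));
```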

Changelog

v1.1.0

JsonCleaner

  • Added automatic trailing comma removal ({"a": 1,})
  • Added single-quote to double-quote conversion ({'key': 'value'})
  • Added unquoted key detection and quoting ({key: "value"})
  • Added multiline string value sanitization (literal newlines inside JSON strings)
  • Added top-level JSON array extraction ([...] embedded in text)
  • All sanitization is string-aware - content inside quoted values is never corrupted

JsonValidator

  • Added nullable rule - allows null values and skips further validation
  • Added float rule - accepts floats and integers
  • Added numeric rule - accepts any numeric value including numeric strings
  • Added email, url, date validation rules
  • Added optional field support - fields without required don't error when missing
  • Added unexpected nested field detection inside wildcard arrays (not just top-level)
  • Added sequential array check for wildcard (.*) paths - associative arrays are rejected
  • Fixed required rule to allow false boolean values
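
Two of the validator changes above can be sketched in a few lines: a presence check that keeps false as a valid value, and a sequential-array check for wildcard paths. Both functions are hypothetical. `array_is_list()` is a built-in available from PHP 8.1, and the exact notion of "non-empty" used here (null, empty string, and empty array count as missing) is an assumption rather than confirmed library behavior.

```php
<?php
// Hypothetical helpers for illustration; not the library's internals.

/** Presence check in which false, 0 and 0.0 still count as provided values. */
function isPresent(mixed $value): bool
{
    if ($value === null) {
        return false;
    }
    if (is_string($value)) {
        return trim($value) !== '';
    }
    if (is_array($value)) {
        return $value !== [];
    }
    return true; // booleans and numbers are always "present"
}

/** Wildcard (.*) paths only make sense over sequential lists, not associative arrays. */
function isSequentialList(mixed $value): bool
{
    return is_array($value) && array_is_list($value);
}

$falseIsPresent = isPresent(false);
$listOk = isSequentialList([['id' => 1], ['id' => 2]]);
$mapRejected = !isSequentialList(['first' => ['id' => 1]]);
```

A loose check such as `empty($value)` would treat false as missing, which is the kind of bug the "Fixed required rule" entry describes.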

General

  • Tested and compatible with PHP 8.4 and PHP 8.5

Statistics

  • Total downloads: 480
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 5
  • Clicks: 0
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 5
  • Watchers: 2
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2025-02-17
