iserter/php-ai-text-sanitizer

Composer install command:

composer require iserter/php-ai-text-sanitizer

Package description

A dependency-free PHP library that detects and removes invisible watermarks and normalizes tell-tale typography from AI-generated text.

README

A small, dependency-free PHP library that detects and removes the invisible watermarks, and other tell-tale characters, that AI providers leave in generated text.

Large language models routinely emit characters that render as nothing but survive copy/paste: zero-width spaces and joiners, Unicode tag characters (U+E0000–E007F) and variation selectors used for steganographic watermarking, bidirectional controls, and unusual spaces such as the narrow no-break space (U+202F). This library strips them, and can optionally normalize the visible typography (smart quotes, dashes, ellipses) that also signals machine authorship.

  • Zero dependencies. Pure PHP 8.1+, PCRE with the /u modifier, and mbstring.
  • Framework-agnostic. Drop it into any PSR-4 project and adjust the namespace.
  • Detect and clean. Use it as a filter, or as a "was this AI-watermarked?" check.
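
To make the invisible characters described above concrete, here is a minimal standalone sketch that strips a subset of them with PCRE and the /u modifier. The function name `stripInvisibleSketch` is hypothetical and is not part of this library's API. The sketch also ignores emoji sequences, which the library handles separately through its `keep_emoji` option.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only; not the library's implementation.
function stripInvisibleSketch(string $text): string
{
    $invisible = '/['
        . '\x{200B}-\x{200D}\x{2060}\x{FEFF}'                    // zero-width chars, word joiner, BOM
        . '\x{200E}\x{200F}\x{202A}-\x{202E}\x{2066}-\x{2069}\x{061C}' // bidirectional controls
        . '\x{E0000}-\x{E007F}'                                  // Unicode Tags block
        . '\x{FE00}-\x{FE0F}\x{E0100}-\x{E01EF}'                 // variation selectors
        . ']/u';

    $stripped = preg_replace($invisible, '', $text);
    if ($stripped === null) {
        // With /u, preg_replace() returns null when the subject is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    // Fold NBSP (U+00A0) and the narrow no-break space (U+202F) to a plain space.
    return preg_replace('/[\x{00A0}\x{202F}]/u', ' ', $stripped) ?? $stripped;
}

$watermarked = "Hello\u{200B} wor\u{2060}ld\u{202F}!";
$clean = stripInvisibleSketch($watermarked);

echo mb_strlen($watermarked), ' -> ', mb_strlen($clean), PHP_EOL; // prints "15 -> 13"
```

Both strings look almost identical on screen, but the code point count drops from 15 to 13 once the zero-width space and word joiner are removed.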

Install

composer require iserter/php-ai-text-sanitizer

Usage

One-liner

use iSerter\AiTextSanitizer\AITextSanitizer;

$clean = AITextSanitizer::clean_text($aiOutput);

Configured instance

$sanitizer = new AITextSanitizer([
    'normalize_smart_quotes' => true,
    'normalize_dashes'       => true,
]);

$clean = $sanitizer->sanitize($aiOutput);

Clean with a report of what changed

$result = $sanitizer->clean($aiOutput);

if ($result->wasModified()) {
    error_log($result->getReport()->summary());
}

echo $result->getText();

Detect only (don't modify)

$report = $sanitizer->detect($aiOutput);

if ($report->hasWatermarks()) {
    foreach ($report->getFindings() as $f) {
        printf("%s %s [%s] x%d\n", $f->notation(), $f->name, $f->category, $f->count);
    }
}
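
For readers curious what a detection pass might involve, the following standalone sketch counts invisible code points and reports them in U+XXXX notation. It is an assumption about the general technique, not the library's code: `detectInvisibleSketch` is a hypothetical name, and the character class covers only a small subset of what the library checks.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a detect-only pass.
 * Returns a map of "U+XXXX" notation => number of occurrences, in first-seen order.
 *
 * @return array<string, int>
 */
function detectInvisibleSketch(string $text): array
{
    // Illustrative subset: zero-width chars, word joiner, BOM, Unicode Tags block.
    $pattern = '/[\x{200B}-\x{200D}\x{2060}\x{FEFF}\x{E0000}-\x{E007F}]/u';

    if (preg_match_all($pattern, $text, $matches) === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    $found = [];
    foreach ($matches[0] as $char) {
        $notation = sprintf('U+%04X', mb_ord($char, 'UTF-8'));
        $found[$notation] = ($found[$notation] ?? 0) + 1;
    }

    return $found;
}

$report = detectInvisibleSketch("a\u{200B}b\u{200B}c\u{FEFF}");

foreach ($report as $notation => $count) {
    printf("%s x%d\n", $notation, $count);
}
```

Because nothing is modified, a pass like this can serve as a cheap "was this text watermarked?" check before deciding whether to clean.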

Options

| Option | Default | Effect |
| --- | --- | --- |
| remove_zero_width | true | Remove ZWSP, ZWNJ, ZWJ, word joiner, BOM (U+200B–200D, U+2060, U+FEFF). |
| remove_bidi | true | Remove bidirectional marks/controls (U+200E/200F, U+202A–202E, U+2066–2069, U+061C). |
| remove_invisible_math | true | Remove invisible math operators (U+2061–2064). |
| remove_variation_selectors | true | Remove variation selectors (U+FE00–FE0F, U+E0100–E01EF). |
| remove_tag_chars | true | Remove the Unicode Tags block (U+E0000–E007F). |
| remove_format_chars | true | Remove soft hyphen, CGJ, Hangul fillers, interlinear annotations, etc. |
| remove_braille_blank | true | Remove U+2800 BRAILLE PATTERN BLANK. |
| strip_control | true | Remove stray C0/C1 control characters (keeps TAB, LF, CR). |
| keep_emoji | true | Preserve ZWJ and variation selectors when they are part of a valid emoji sequence. |
| remove_citations | false | Remove AI-generated citations like (oaicite:5){index=5} or 【13†source】. |
| normalize_homoglyphs | false | Apply NFKC normalization to neutralize visually identical homoglyphs (requires ext-intl). |
| normalize_spaces | true | Fold unusual spaces (NBSP, thin space, ideographic space, …) to U+0020. |
| normalize_line_separators | true | U+2028 → \n, U+2029 → \n\n. |
| normalize_smart_quotes | false | Curly/angle quotes → straight ' and ". |
| normalize_dashes | false | Dashes and minus (U+2010–2015, U+2212) → -. |
| normalize_ellipsis | false | U+2026 → .... |
| collapse_whitespace | false | Collapse runs of horizontal whitespace, trim line-end spaces, cap blank lines. |
| trim | false | Trim leading/trailing whitespace of the whole string. |
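
The typography options amount to simple character mappings. The sketch below shows the general idea with `strtr()` and a hand-written map. It is an illustration under assumptions, not the library's code: `normalizeTypographySketch` is a hypothetical name, and the map covers only a few of the code points the options list.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only; the library's own mapping may differ.
function normalizeTypographySketch(string $text): string
{
    $map = [
        "\u{2018}" => "'",    // left single quotation mark
        "\u{2019}" => "'",    // right single quotation mark
        "\u{201C}" => '"',    // left double quotation mark
        "\u{201D}" => '"',    // right double quotation mark
        "\u{2013}" => '-',    // en dash
        "\u{2014}" => '-',    // em dash
        "\u{2212}" => '-',    // minus sign
        "\u{2026}" => '...',  // horizontal ellipsis
        "\u{2028}" => "\n",   // line separator
        "\u{2029}" => "\n\n", // paragraph separator
    ];

    // strtr() with an array replaces every key in a single pass over the string.
    return strtr($text, $map);
}

$input = "\u{201C}Wait\u{2026}\u{201D} she said \u{2014} it\u{2019}s fine.";
echo normalizeTypographySketch($input), PHP_EOL;
// prints: "Wait..." she said - it's fine.
```

These rewrites change visible text, which is presumably why the corresponding options default to false while the invisible-character removals default to true.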

Docker

docker build -t php-ai-text-sanitizer .
docker run --rm php-ai-text-sanitizer

Demo

php src/examples/demo.php

Statistics

  • Total downloads: 0
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Views: 2
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2026-07-04
