mortimer333/content

Allows proper iteration over multibyte strings and common operations on them.

Note: currently only UTF-8 is supported.

Install

composer require mortimer333/content

Why?

If you iterate byte by byte over this string, taken from a UTF-8 encoded file:

$string = "óźćżó→ę";
for ($i=0; $i < strlen($string); $i++) {
    echo $string[$i] . ', ';
}

Your output will be:

�, �, �, �, �, �, �, �, �, �, �, �, �, �, �,

That is because UTF-8 stores every non-ASCII symbol in multiple bytes, while PHP string offsets access one byte at a time. To iterate over this kind of string properly, you have to split it into letters manually. Or use this library:

$content = new Content\Utf8("óźćżó→ę");
for ($i=0; $i < $content->getLength(); $i++) {
    echo $content->getLetter($i) . ', ';
}

Output:

ó, ź, ć, ż, ó, →, ę,
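
For comparison, the manual separation mentioned above can be sketched with PHP's built-in PCRE functions. This only illustrates the idea and is not how the library works internally; it assumes the source file is saved as UTF-8:

```php
<?php
// Manual alternative (not the library's code): split a UTF-8 string into letters.
$string = "óźćżó→ę";

// An empty pattern with the /u modifier splits between code points, not bytes.
$letters = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

foreach ($letters as $letter) {
    echo $letter . ', ';
}
// Prints: ó, ź, ć, ż, ó, →, ę,
```

The string is 15 bytes long but contains only 7 letters, which is why the byte-wise loop printed 15 broken characters.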

Performance

Creating new objects is expensive, so you can create a single Content instance and add new strings (versions) to it:

$content = new Content\Utf8("óźćżó→ę");
echo "Version 1: " . $content; // óźćżó→ę

$content->cutAndAddContent("ÐÑÒÓÔ");
echo "Version 2: " . $content; // ÐÑÒÓÔ

// By default, Content keeps the older version in memory in case it is needed later
// (to skip the expensive operation of cutting the same string into chunks again)
$content->removeContent();
echo "Version 1: " . $content; // óźćżó→ę

This feature currently lacks some flexibility (such as assigning an ID to a version, or retrieving a version without removing the current one), but this is planned for the future.
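
The idea of keeping earlier versions around can be pictured as a stack of strings that have already been split. The class below is a hypothetical illustration (the name VersionStack and its methods are not part of the library) and assumes PHP 7.4+ for mb_str_split:

```php
<?php
// Hypothetical illustration only; not the library's implementation.
final class VersionStack
{
    /** @var array<int, array<int, string>> each entry is one string split into letters */
    private array $versions = [];

    public function push(string $text): void
    {
        // The costly splitting happens once per pushed string.
        $this->versions[] = mb_str_split($text, 1, 'UTF-8');
    }

    public function pop(): void
    {
        // Drop the current version; the previous one becomes current
        // again without being split a second time.
        array_pop($this->versions);
    }

    public function current(): string
    {
        $top = end($this->versions);
        return $top === false ? '' : implode('', $top);
    }
}

$stack = new VersionStack();
$stack->push("óźćżó→ę");
$stack->push("ÐÑÒÓÔ");
$second = $stack->current(); // ÐÑÒÓÔ
$stack->pop();
$first = $stack->current();  // óźćżó→ę
```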

Actions

To String

The class implements the __toString method, so any instance can simply be cast to a string:

$string = new Content\Utf8('Foo');
echo (string) $string; // Foo

subStr

You can also retrieve just a part of the content with subStr (start index and length) or iSubStr (start index to end index, inclusive):

$string = new Content\Utf8('Foo and Bar');
// By index and length
echo $string->subStr(0, 3); // Foo
// From index to index
echo $string->iSubStr(8, 10); // Bar
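
The two calling conventions differ only in how the end of the range is expressed. A minimal sketch with PHP's built-in mb_substr, where the helper name substrByIndex is hypothetical and the inclusive end index is inferred from the example above:

```php
<?php
// Hypothetical helper: index-to-index substring (end index inclusive),
// built on mb_substr, which takes a start index and a length.
function substrByIndex(string $text, int $start, int $end): string
{
    return mb_substr($text, $start, $end - $start + 1, 'UTF-8');
}

$text = 'Foo and Bar';
$byLength = mb_substr($text, 0, 3, 'UTF-8'); // Foo
$byIndex = substrByIndex($text, 8, 10);      // Bar
echo $byLength . ' ' . $byIndex;
```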

cutToContent

Similar to subStr, cutToContent returns part of the string, but in the form of a new Content instance:

$string = new Content\Utf8('Foo and Bar');
// By index and length
echo $string->cutToContent(0, 3); // Utf8(Foo)
// From index to index
echo $string->iCutToContent(8, 10); // Utf8(Bar)

cutToArray

Just like subStr and cutToContent, but it returns the selected part of the string as an array of letters:

$string = new Content\Utf8('Foo and Bar');
// By index and length
var_dump($string->cutToArray(0, 3)); // Array("F", "o", "o")
// From index to index
var_dump($string->iCutToArray(8, 10)); // Array("B", "a", "r")

trim

You can also trim your content if needed (it returns a trimmed version and does not replace the existing one):

$string = new Content\Utf8('==Foo and Bar==');
echo (string) $string->trim("="); // Foo and Bar
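
Outside the library, the same non-destructive trimming of a multibyte character can be sketched with PCRE. The helper name trimLetter is hypothetical, and this is not a description of how the library's trim is implemented:

```php
<?php
// Hypothetical helper: strip every leading and trailing occurrence of $char.
function trimLetter(string $text, string $char): string
{
    $quoted = preg_quote($char, '/');
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $pattern = '/^(?:' . $quoted . ')+|(?:' . $quoted . ')+$/u';
    // Fall back to the input if PCRE reports an error (preg_replace returns null).
    return preg_replace($pattern, '', $text) ?? $text;
}

$original = '→→Foo and Bar→→';
$trimmed = trimLetter($original, '→');
echo $trimmed; // Foo and Bar
```

The original string is left untouched; only the returned copy is trimmed.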

splice

If you want to remove part of the string from the current version, replace it, or inject a new string into it, you can use the splice method:

$string = new Content\Utf8('Foo #error Bar');
$string->splice(4, 6, 'and');                   // Foo and Bar
echo (string) $string->iSplice(4, 6, 'or');     // Foo or Bar
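
Conceptually, a splice on letters rather than bytes can be sketched with mb_str_split and array_splice. The helper name spliceLetters is hypothetical, it returns a new string instead of modifying an object, and it assumes PHP 7.4+:

```php
<?php
// Hypothetical helper: replace $length letters starting at letter index $start.
function spliceLetters(string $text, int $start, int $length, string $replacement = ''): string
{
    $letters = mb_str_split($text, 1, 'UTF-8');
    array_splice($letters, $start, $length, mb_str_split($replacement, 1, 'UTF-8'));
    return implode('', $letters);
}

// By index and length
$byLength = spliceLetters('ź #błąd ć', 2, 5, '→');          // ź → ć
// From index to index (inclusive): length = end - start + 1
$byIndex = spliceLetters('Foo and Bar', 4, 6 - 4 + 1, 'or'); // Foo or Bar
```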

find

If you need to find the index of a needle, you can use the find method:

$string = new Content\Utf8('Foo findme Bar');
echo "`findme` starts at index " . $string->find('findme'); // `findme` starts at index 9
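
Letter indexes and byte offsets diverge as soon as multibyte characters precede the needle. The sketch below uses only PHP built-ins and is independent of the library's find; it just shows the difference between the two ways of counting:

```php
<?php
$haystack = "óźć→ę";

// Counted in letters: ó, ź, ć come first, so the arrow is at index 3.
$letterIndex = mb_strpos($haystack, "→", 0, 'UTF-8');

// Counted in bytes: ó, ź, ć take 2 bytes each, so the arrow is at offset 6.
$byteOffset = strpos($haystack, "→");

echo $letterIndex . ' vs ' . $byteOffset; // 3 vs 6
```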

reverse

If you need to reverse your string, you can use the reverse method:

$string = new Content\Utf8('ź = ć');
echo (string) $string->reverse(); // ć = ź
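
Reversing is a case where byte-wise functions visibly break multibyte text. A short sketch with PHP built-ins (not the library's code, PHP 7.4+ assumed) that contrasts a letter-wise reversal with strrev:

```php
<?php
$text = 'ź = ć';

// Letter-wise reversal: split into letters, reverse the array, join again.
$reversed = implode('', array_reverse(mb_str_split($text, 1, 'UTF-8')));
echo $reversed; // ć = ź

// strrev reverses bytes, so the two bytes of each letter end up swapped
// and the result is no longer valid UTF-8.
$broken = strrev($text);
$stillValid = mb_check_encoding($broken, 'UTF-8'); // false
```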

Static methods

Content\[encoding]::get

Each encoding class (currently only UTF-8 exists) must implement one static method, get, which takes a string and a start position and returns the closest letter:

$string = "źćó";
$start = $nextLetter = 0;
$letter = Content\Utf8::get($string, $start, $nextLetter);
echo $letter; // "ź";

// Note that $nextLetter was updated with the position of the end byte
// of the letter that was found, so using it as the start returns the next letter
$letter = Content\Utf8::get($string, $nextLetter, $nextLetter);
echo $letter; // "ć";
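
To show what reading a single letter involves, here is a minimal sketch that derives the letter length from the UTF-8 lead byte. The function readLetter is hypothetical and is not the library's get; in particular, it assumes that the by-reference argument receives the position of the first byte after the letter, which may differ from the library's convention:

```php
<?php
// Hypothetical helper, not the library's get(): read one UTF-8 letter at byte $start.
function readLetter(string $text, int $start, int &$next): string
{
    if ($start < 0 || $start >= strlen($text)) {
        $next = $start;
        return '';
    }
    $lead = ord($text[$start]);
    if ($lead < 0x80) {
        $length = 1; // ASCII
    } elseif (($lead & 0xE0) === 0xC0) {
        $length = 2; // lead byte 110xxxxx
    } elseif (($lead & 0xF0) === 0xE0) {
        $length = 3; // lead byte 1110xxxx
    } elseif (($lead & 0xF8) === 0xF0) {
        $length = 4; // lead byte 11110xxx
    } else {
        $length = 1; // continuation or invalid byte: return it as-is
    }
    // Assumption: $next points at the first byte after the letter.
    $next = $start + $length;
    return substr($text, $start, $length);
}

$text = "źćó";
$next = 0;
$first = readLetter($text, 0, $next);      // ź, $next is now 2
$second = readLetter($text, $next, $next); // ć, $next is now 4
```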

Content\[encoding]::isWhitespace

This method checks whether the passed string consists only of whitespace characters:

Content\Utf8::isWhitespace("\n\t\r");          // true
Content\Utf8::isWhitespace(" notWhitespace "); // false
Content\Utf8::isWhitespace("");                // false

// It checks all whitespace characters defined in Unicode,
// so it is better than `trim` or `ctype_space` for validating strings
$string = "\u{2003}\u{2000}\u{2009}";
Content\Utf8::isWhitespace($string); // true
\mb_strlen(trim($string));           // 3
ctype_space($string);                // false
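
A comparable check can be sketched with a Unicode-aware regular expression. The function name isUnicodeWhitespace is hypothetical, and the character class used here (Unicode separators plus PCRE's \s) is an assumption that may not match the library's exact list of whitespace characters:

```php
<?php
// Hypothetical helper: true only for non-empty strings made up entirely of
// Unicode separator characters (\p{Z}) or ASCII whitespace (\s).
function isUnicodeWhitespace(string $text): bool
{
    return preg_match('/^[\p{Z}\s]+$/u', $text) === 1;
}

$ascii = isUnicodeWhitespace("\n\t\r");                       // true
$mixed = isUnicodeWhitespace(" notWhitespace ");              // false
$empty = isUnicodeWhitespace("");                             // false
$unicode = isUnicodeWhitespace("\u{2003}\u{2000}\u{2009}");   // true
```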

GitHub information

  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Language: PHP

Other information

  • License: GPL-2.0-only
  • Last updated: 2022-10-04
