nai-php/naipostagger

Composer install command:

composer require nai-php/naipostagger

Package description

A part of speech tagger written in PHP.

README

A lightweight, framework-agnostic library in pure PHP for part-of-speech tagging. It can be used for chatbots, personal assistants, keyword extraction, etc. Being written in PHP, it can be easily integrated into existing or new applications, giving them the ability to understand what users write.

It is based on vocabularies and predefined grammatical rules, without wrappers around third-party systems, neural networks, machine learning, or models that require huge resources.

This is the English version. Documentation and TODO are coming; more info and a demo are on n-ai.cloud
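
To make the "vocabulary plus grammatical rules" idea concrete, here is a minimal toy sketch. It is not the naipostagger implementation: the function name `toyTag`, the four-word vocabulary, the tag names, and the single contextual rule are all assumptions made up for illustration.

```php
<?php
// Toy illustration only: NOT how naipostagger is implemented.
// The vocabulary, tags and rule below are invented for this example.

/** Tag tokens by dictionary lookup, then apply one contextual rule. */
function toyTag(string $sentence): array
{
    // Hypothetical vocabulary: word => candidate tags, most likely first.
    $vocabulary = [
        'the'    => ['DET'],
        'dogs'   => ['NOUN'],
        'bark'   => ['NOUN', 'VER'],
        'loudly' => ['ADV'],
    ];

    $tokens = preg_split('/\s+/', strtolower(trim($sentence)), -1, PREG_SPLIT_NO_EMPTY);

    // Step 1: dictionary lookup; unknown words get the placeholder tag UNK.
    $tagged = [];
    foreach ($tokens as $token) {
        $candidates = $vocabulary[$token] ?? ['UNK'];
        $tagged[] = ['form' => $token, 'tag' => $candidates[0], 'candidates' => $candidates];
    }

    // Step 2: one hand-written contextual rule. A NOUN that directly follows
    // another NOUN is retagged as VER when the vocabulary allows VER for it.
    for ($i = 1; $i < count($tagged); $i++) {
        if ($tagged[$i]['tag'] === 'NOUN'
            && $tagged[$i - 1]['tag'] === 'NOUN'
            && in_array('VER', $tagged[$i]['candidates'], true)) {
            $tagged[$i]['tag'] = 'VER';
        }
    }

    // Return "form/TAG" strings so repeated words are preserved.
    return array_map(function (array $t): string {
        return $t['form'] . '/' . $t['tag'];
    }, $tagged);
}

print_r(toyTag('the dogs bark loudly'));
```

In this sketch "bark" is first tagged NOUN from the vocabulary and then corrected to VER by the rule, which is the general shape of a lookup-then-correct tagger.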

Precision

The table below reports results on different types of sentence corpora.

Corpus | Total tokens | Correctly tagged | Incorrectly tagged | % correct
"Just Shoot Me" movie subtitles | 3403 | 3381 | 22 | 99.35

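The percentage in the table follows directly from the token counts. As a quick check, a small sketch (the helper name `accuracyPercent` is hypothetical, not part of the library):

```php
<?php
// Recompute the "% correct" column from the raw counts in the table.
// accuracyPercent() is a hypothetical helper, not a library function.
function accuracyPercent(int $correct, int $total): float
{
    if ($total <= 0) {
        throw new InvalidArgumentException('Total token count must be positive.');
    }
    // Share of correctly tagged tokens, rounded to two decimals.
    return round($correct / $total * 100, 2);
}

echo accuracyPercent(3381, 3403), PHP_EOL; // 99.35
```
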
Installation

  1. In your project folder (e.g. "myproject"), install the package via Composer;

  2. create a folder named "dictionaries";

  3. inside the "dictionaries" folder, clone or download the English dictionary repository;

  4. run this example script:

<?php

use NaiPosTagger\Pipelines\PipelinePosTagging;
use NaiPosTagger\Models\NaiPosArr;

// load the Composer autoloader and the library's helper functions
require __DIR__ . '/vendor/autoload.php';
require __DIR__ . '/vendor/nai-php/naipostagger/src/Utilities/common_functions_helper.php';

// path prefix of the dictionaries and path to the library sources
define('DICTIONARIES_PATH', __DIR__ . '/dictionaries/dictionaries-');
define('TRAITS_PATH', __DIR__ . '/vendor/nai-php/naipostagger/src/');

$sentence = 'my name is Fred';

$PipelinePosTagging = new PipelinePosTagging();
$PipelinePosTagging->language = 'en';

// tag the sentence
$pos_arr = $PipelinePosTagging->transform($sentence);

// for a clear output, better hide metadata
$pos_arr = NaiPosArr::clearMetadata($pos_arr);

// and further simplify the output
$pos_arr = NaiPosArr::flatPosArr($pos_arr);

// print the result
diex($pos_arr);

And the output will be:

Array
(
    [0] => Array
        (
            [form] => .
            [lemma] => .
            [features] => SENT
            [sh-feat] => SENT
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [1] => Array
        (
            [form] => my
            [lemma] => my
            [features] => ADJ:pos+m+s
            [sh-feat] => ADJ
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [2] => Array
        (
            [form] => name
            [lemma] => name
            [features] => NOUN-m:s
            [sh-feat] => NOUN
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [3] => Array
        (
            [form] => is
            [lemma] => is
            [features] => VER:ind+pres+3+s
            [sh-feat] => VER
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [4] => Array
        (
            [form] => Fred
            [lemma] => Fred
            [features] => NPR
            [sh-feat] => NPR
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [5] => Array
        (
            [form] => .
            [lemma] => .
            [features] => SENT
            [sh-feat] => SENT
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

)
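
Since the flattened result is a plain array of tokens, it can be filtered with ordinary PHP. The sketch below picks out nouns and proper nouns, as one might for keyword extraction. It assumes only the `form` and `sh-feat` keys shown in the output above; the helper `formsByTag` is hypothetical and not part of the library, and the sample array is a trimmed hand-copy of that output.

```php
<?php
// Trimmed copy of the flattened output above (only the keys used here).
$pos_arr = [
    ['form' => '.',    'sh-feat' => 'SENT'],
    ['form' => 'my',   'sh-feat' => 'ADJ'],
    ['form' => 'name', 'sh-feat' => 'NOUN'],
    ['form' => 'is',   'sh-feat' => 'VER'],
    ['form' => 'Fred', 'sh-feat' => 'NPR'],
    ['form' => '.',    'sh-feat' => 'SENT'],
];

/** Hypothetical helper: keep the forms whose short tag is listed in $wanted. */
function formsByTag(array $pos_arr, array $wanted): array
{
    $forms = [];
    foreach ($pos_arr as $token) {
        // Tokens without a short tag are skipped.
        if (in_array($token['sh-feat'] ?? '', $wanted, true)) {
            $forms[] = $token['form'];
        }
    }
    return $forms;
}

print_r(formsByTag($pos_arr, ['NOUN', 'NPR'])); // name, Fred
```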

To do list

  • Find contributors
  • Clean, check, fix and tag terms in dictionaries
  • Clean, check, fix Brill rules
  • Add more ngrams
  • Add more tests, especially for filters
  • Collect and load frill words
  • Better OOP for some classes?
  • In the module for logical analysis (not yet published), collect synonyms and temporal expressions

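Statistics

  • Total downloads: 134
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 14
  • Clicks: 2
  • Dependent projects: 0
  • Suggesters: 0

GitHub information

  • Stars: 14
  • Watchers: 2
  • Forks: 2
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2021-08-04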
