ges/ocr

Latest stable version: 0.2.0

Composer install command:

composer require ges/ocr

Package description

Core document processing services for OCR, classification, extraction, and normalization.

README

Laravel package for document OCR, classification, extraction, and normalization.

This package is built for French business and identity documents, with current support for:

  • identity_card
  • residence_permit
  • passport
  • visa
  • crew_card
  • travel_document
  • other_identity_document
  • kbis
  • acte_propriete (land-title deed only)
  • msa (parcel table)

What This Package Does

Input pipeline:

  • detect technical input type: image, pdf_text, pdf_scan
  • transcribe images and scanned PDFs
  • classify the business document type
  • extract structured data
  • normalize values into a stable shape
  • return a ProcessedDocumentResult
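
The first step of the pipeline above can be sketched as follows. This is only an illustration, not the package's implementation: the function name `detectInputType`, the `$minTextLength` threshold, and the idea of deciding between pdf_text and pdf_scan from the amount of embedded text (for example, the output of pdftotext) are all assumptions.

```php
<?php

// Hypothetical helper, not part of ges/ocr. It maps a file to one of the
// three technical input types named above: image, pdf_text, pdf_scan.
function detectInputType(string $mimeType, string $embeddedText, int $minTextLength = 50): string
{
    if (str_starts_with($mimeType, 'image/')) {
        return 'image';
    }

    if ($mimeType !== 'application/pdf') {
        throw new InvalidArgumentException("Unsupported MIME type: {$mimeType}");
    }

    // Count non-whitespace bytes only, so a PDF whose text layer is just
    // blank lines is still treated as a scan.
    $visible = preg_replace('/\s+/', '', $embeddedText) ?? '';

    // Assumption: a PDF with very little embedded text is a scan that needs
    // vision transcription; the threshold is arbitrary.
    return strlen($visible) >= $minTextLength ? 'pdf_text' : 'pdf_scan';
}

echo detectInputType('application/pdf', " \n "), PHP_EOL; // pdf_scan
```

A text PDF can skip the vision stage entirely, which is why this decision comes first.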

Current model strategy:

  • qwen2.5vl:7b for visual transcription only
  • qwen2.5:7b for classification and structured extraction

Available AI providers:

  • ollama
  • openai

Provider strategy:

  • ollama uses a multi-step pipeline: vision transcription, classification, extraction, optional MRZ merge
  • openai uses a single structured request per document and returns classification plus extracted data in one response

Package Boundaries

This package contains:

  • OCR/transcription services
  • classifier
  • extractor
  • normalizer
  • schema factory
  • AI clients for Ollama and OpenAI
  • package DocumentProcessing model
  • package migration and factory
  • install command

This package does not own your application workflow.

Typical app-specific code stays outside:

  • accepted Document model
  • upload flow
  • matching an identity document against a user
  • deciding whether to persist a final document
  • queue jobs tied to your app domain

Install

composer require ges/ocr

Then install package assets:

php artisan ocr:install

Or install and migrate immediately:

php artisan ocr:install --migrate

Optional install flags:

php artisan ocr:install --check
php artisan ocr:install --no-config
php artisan ocr:install --no-migrations
php artisan ocr:install --force

What this command does:

  • publishes config/ges-ocr.php
  • publishes package migrations
  • optionally runs php artisan migrate
  • optionally runs php artisan ocr:health

Health check command:

php artisan ocr:health

It checks:

  • pdftotext
  • pdftoppm
  • selected AI provider connectivity
  • configured text and vision models
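
To check the two PDF binaries by hand, for example in a deployment script, a sketch like the one below may help. It is not how ocr:health is implemented. `findBinary` is a hypothetical helper that only scans the directories listed in PATH, and it does not append `.exe` on Windows.

```php
<?php

// Hypothetical helper, not the package's own check: look for an executable
// with the given name in each directory of the PATH environment variable.
function findBinary(string $name): ?string
{
    $path = getenv('PATH');
    if ($path === false || $path === '') {
        return null;
    }

    foreach (explode(PATH_SEPARATOR, $path) as $dir) {
        if ($dir === '') {
            continue;
        }
        $candidate = rtrim($dir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }

    return null;
}

// Report the two tools the health check looks for.
foreach (['pdftotext', 'pdftoppm'] as $binary) {
    echo $binary . ': ' . (findBinary($binary) ?? 'not found on PATH') . PHP_EOL;
}
```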

Configuration

Published config file:

config/ges-ocr.php

Main environment variables:

GES_OCR_AI_PROVIDER=ollama
GES_OCR_CLASSIFICATION_CONFIDENCE_THRESHOLD=0.75
GES_OCR_MAX_PAGES=0

OLLAMA_BASE_URL=http://host.docker.internal:11434
OLLAMA_TEXT_MODEL=qwen2.5:7b
OLLAMA_VISION_MODEL=qwen2.5vl:7b
OLLAMA_CONNECT_TIMEOUT=10
OLLAMA_TIMEOUT=120
OLLAMA_RETRY_TIMES=2
OLLAMA_RETRY_SLEEP_MS=500
OLLAMA_BASIC_AUTH_ENABLED=false
OLLAMA_BASIC_AUTH_USERNAME=
OLLAMA_BASIC_AUTH_PASSWORD=

OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_API_KEY=
OPENAI_TEXT_MODEL=gpt-4.1-mini
OPENAI_VISION_MODEL=gpt-4.1-mini
OPENAI_CONNECT_TIMEOUT=10
OPENAI_TIMEOUT=120
OPENAI_RETRY_TIMES=2
OPENAI_RETRY_SLEEP_MS=500

GES_OCR_MRZ_OCR_ENABLED=true
GES_OCR_CLEANUP_TEMPORARY_FILES=true

GES_OCR_AI_PROVIDER accepts ollama or openai.

GES_OCR_MAX_PAGES=0 means unlimited pages.
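
The "0 means unlimited" rule can be illustrated with a small sketch. The function name `pagesToProcess` is hypothetical. Two behaviors here are assumptions rather than documented facts: that negative values are also treated as unlimited, and that a document exceeding the limit is truncated (the package may reject it instead).

```php
<?php

// Hypothetical illustration of the GES_OCR_MAX_PAGES semantics.
// $maxPages = 0 means unlimited (from the README); treating negative
// values the same way is an assumption.
function pagesToProcess(int $totalPages, int $maxPages): int
{
    if ($maxPages <= 0) {
        return $totalPages;
    }

    // Assumption: pages beyond the limit are skipped rather than rejected.
    return min($totalPages, $maxPages);
}
```

With this rule, a 12-page PDF is processed in full when the limit is 0, and only its first 5 pages are considered when the limit is 5.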

Main config areas:

  • ai
  • ollama
  • openai
  • mrz
  • processing

Optional Ollama upstream basic auth:

  • OLLAMA_BASIC_AUTH_ENABLED=true enables HTTP basic auth on requests sent to OLLAMA_BASE_URL
  • OLLAMA_BASIC_AUTH_USERNAME sets the upstream username
  • OLLAMA_BASIC_AUTH_PASSWORD sets the upstream password
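
HTTP basic auth sends the credentials as a base64-encoded `username:password` pair in the `Authorization` header. The sketch below shows what that header looks like when the three variables above are set. `ollamaAuthHeaders` is a hypothetical function, not the package's client code.

```php
<?php

// Hypothetical helper showing the header that HTTP basic auth produces.
// The package's own Ollama client may build it differently.
function ollamaAuthHeaders(bool $enabled, string $username, string $password): array
{
    if (!$enabled) {
        return [];
    }

    return [
        'Authorization' => 'Basic ' . base64_encode($username . ':' . $password),
    ];
}
```

This is mainly useful when Ollama sits behind a reverse proxy that enforces authentication. Because base64 is an encoding and not encryption, such a proxy should be reached over HTTPS.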

Example OpenAI setup:

GES_OCR_AI_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_TEXT_MODEL=gpt-4.1-mini
OPENAI_VISION_MODEL=gpt-4.1-mini

Public API

Main service:

use Ges\Ocr\DocumentProcessor;

$result = app(DocumentProcessor::class)->processFile(
    path: $absolutePath,
    mimeType: $mimeType,
    originalName: $originalName,
);

Returned DTO:

  • originalName
  • mimeType
  • path
  • inputType
  • documentType
  • status
  • pagesCount
  • rawClassificationJson
  • rawExtractionJson
  • normalizedJson
  • errorMessage

Main statuses:

  • pending
  • processing
  • done
  • failed
  • needs_review
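
A consuming application usually branches on the status. The sketch below is one possible mapping, not something the package prescribes. The function `nextAction` and the action names are hypothetical, and it assumes the status is available as a plain string; if the DTO exposes an enum, compare against its backing value instead.

```php
<?php

// Hypothetical application-side mapping from a processing status to the
// next step. Only the status names come from the README.
function nextAction(string $status): string
{
    return match ($status) {
        'done' => 'use_normalized_json',
        'needs_review' => 'send_to_manual_review',
        'failed' => 'report_error',
        'pending', 'processing' => 'wait',
        default => throw new UnexpectedValueException("Unknown status: {$status}"),
    };
}

echo nextAction('needs_review'), PHP_EOL; // send_to_manual_review
```

Throwing on an unknown status keeps the application from silently ignoring a status added in a later package version.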

Supported Output Shapes

Identity Card

Normalized keys:

  • document_type
  • civility
  • first_name
  • last_name
  • date_of_birth
  • place_of_birth
  • document_number
  • expiry_date
  • nationality
  • sex
  • street_address
  • postal_code
  • city

Residence Permit

Normalized keys:

  • document_type
  • civility
  • first_name
  • last_name
  • date_of_birth
  • place_of_birth
  • document_number
  • expiry_date
  • nationality
  • sex
  • street_address
  • postal_code
  • city

KBIS

Normalized keys:

  • document_type
  • company_name
  • trade_name
  • legal_form
  • capital
  • registration_number
  • siret
  • sirene
  • street_address
  • postal_code
  • city
  • naf_code
  • registration_date
  • issue_date
  • registry_city
  • legal_representatives

Representative shape:

  • entity_type
  • company_name
  • legal_form
  • civility
  • first_name
  • last_name
  • street_address
  • postal_code
  • city
  • registration_number
  • registry_city
  • role

Acte Propriete

Important: acte_propriete currently covers French land-title deeds only.

Normalized keys:

  • document_type
  • cadastral_parcels
  • owners

Parcel shape:

  • prefixe
  • section
  • numero
  • street_address
  • postal_code
  • city

Owner shape:

  • entity_type
  • company_name
  • civility
  • first_name
  • last_name

Rules:

  • owners are acquirers only
  • sellers must not be returned as owners
  • municipalities and administrations are treated as company
  • lieudit / leudit may be used as parcel street_address
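
Because an owner can be either a company or a person, application code that displays owners has to branch on entity_type. The sketch below assumes the normalized owner is an associative array with the keys listed in the owner shape, and that the literal value `company` marks companies, municipalities, and administrations. Both points are assumptions, and `ownerDisplayName` is a hypothetical helper.

```php
<?php

// Hypothetical helper for displaying one entry of the owners list.
// Assumption: entity_type === 'company' marks legal entities; any other
// value is treated as a natural person.
function ownerDisplayName(array $owner): string
{
    if (($owner['entity_type'] ?? null) === 'company') {
        return trim((string) ($owner['company_name'] ?? ''));
    }

    // Keep only the non-empty name parts, in display order.
    $parts = array_filter(
        [$owner['civility'] ?? null, $owner['first_name'] ?? null, $owner['last_name'] ?? null],
        fn ($part) => is_string($part) && trim($part) !== ''
    );

    return implode(' ', array_map('trim', $parts));
}
```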

Package Model

The package provides:

Ges\Ocr\Models\DocumentProcessing

This model stores:

  • source file metadata
  • detected input type
  • business document type
  • status
  • raw classification JSON
  • raw extraction JSON
  • normalized JSON
  • error message

If your app wants its own subclass, it can extend the package model.

AI Notes

If you are an AI agent working in a project using this package:

  • Use DocumentProcessor::processFile(...) as the main entry point.
  • Treat rawClassificationJson as model output, not final truth.
  • Treat normalizedJson as the stable application-facing payload.
  • For images and scanned PDFs, the package uses two LLM stages:
    • vision transcription
    • text classification/extraction
  • Exception: when GES_OCR_AI_PROVIDER=openai, the package uses a one-shot analysis request instead.
  • Do not assume acte_propriete means a generic property deed. In this package it currently means a land-title deed only.
  • Distinguish identity_card from residence_permit.
  • Use residence_permit for French residence permits and identity_card for French identity cards.
  • For KBIS:
    • registration_number is the raw Immatriculation RCS
    • sirene is 9 digits
    • siret is optional and only if explicitly present
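
The KBIS identifier rules above lend themselves to a quick sanity check before the payload is trusted. The sketch below is application-side code, not part of the package. `kbisIdentifierProblems` is a hypothetical function, and it assumes the normalized values are digit-only strings without spaces and that a missing sirene should be reported. It relies on the fact that a SIRET is the 9-digit SIREN followed by a 5-digit establishment number.

```php
<?php

// Hypothetical format check for the KBIS identifiers in a normalized payload.
// Returns a list of human-readable problems; an empty list means the
// identifiers look well-formed.
function kbisIdentifierProblems(array $kbis): array
{
    $problems = [];
    $sirene = $kbis['sirene'] ?? null;
    $siret = $kbis['siret'] ?? null;

    if (!is_string($sirene) || preg_match('/^\d{9}\z/', $sirene) !== 1) {
        $problems[] = 'sirene must be exactly 9 digits';
    }

    // siret is optional, so it is only checked when present.
    if ($siret !== null) {
        if (!is_string($siret) || preg_match('/^\d{14}\z/', $siret) !== 1) {
            $problems[] = 'siret must be exactly 14 digits when present';
        } elseif (is_string($sirene) && !str_starts_with($siret, $sirene)) {
            $problems[] = 'siret must start with the sirene';
        }
    }

    return $problems;
}
```

A non-empty result is a reasonable trigger for routing the document to manual review instead of accepting it automatically.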

Tests

Package tests live under:

tests/Unit

Manual OCR fixture tests exist for:

  • CIN
  • titre de séjour
  • KBIS
  • land-title deeds

They are gated by:

RUN_MANUAL_OCR_TESTS=1
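
An environment gate of this kind is usually a one-line check at the top of each manual test. The sketch below shows the general pattern. `manualOcrTestsEnabled` is a hypothetical helper, and the package's tests may read the variable differently.

```php
<?php

// Hypothetical gate: manual OCR fixture tests run only when the variable
// is set to exactly "1".
function manualOcrTestsEnabled(): bool
{
    return getenv('RUN_MANUAL_OCR_TESTS') === '1';
}

// Typical use inside a PHPUnit test method:
//
//     if (!manualOcrTestsEnabled()) {
//         $this->markTestSkipped('Set RUN_MANUAL_OCR_TESTS=1 to run manual OCR tests.');
//     }
```

Gating keeps the default test run fast and independent of a reachable AI provider.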

Current Assumptions

  • documents are French documents
  • the selected AI provider is reachable from the Laravel app
  • pdftotext and pdftoppm are available for PDF handling

Non-Goals

This package does not currently provide:

  • user/document matching workflow
  • approval workflow
  • final accepted document persistence
  • domain-specific queue orchestration
  • UI components

Those belong in the consuming application.

Statistics

  • Total downloads: 101
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Views: 3
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2026-03-30
