inwebo/csv
Composer install command:
composer require inwebo/csv
Package description
This PHP class provides a simple yet powerful way to read and process CSV files. Built as an extension of PHP's SplFileObject, it offers advanced features like column name mapping, data filtering, and sanitization to streamline your CSV processing tasks.
README
This library provides two classes — Inwebo\Csv\Reader and Inwebo\Csv\Writer — for reading and writing CSV files with a low memory footprint. Both extend PHP's SplFileObject, making all native file handling methods (like setCsvControl) available on each instance.
See the PHP documentation for SplFileObject for more details.
Key Features
- Column Name Mapping: Automatically maps each line's data to an associative array using the CSV header as keys, making your code more readable and maintainable.
- Data Normalization: Apply one or more callable functions to each line to clean and format the data before it's used.
- Data Filtering: Use callable functions to validate and filter out rows that don't meet your criteria.
- Generator-based Iteration: Process large files efficiently using a Generator to iterate over lines without consuming too much memory. Iteration always starts with a rewind().
- CSV Writing: Write rows one by one or from any iterable, including Generators, for memory-efficient ETL pipelines.
- Excel Compatibility: Optional UTF-8 BOM and configurable line endings (\r\n) for correct rendering in Excel on Windows.
- Inherits SplFileObject: Leverage all the native features and performance benefits of SplFileObject for file handling.
Installation
composer require inwebo/csv
Tests
composer phpunit
PHPStan
composer phpstan
Static analysis runs at level 10.
Usage
Basic Reading
To get started, simply instantiate the Reader class with the path to your CSV file. By default, it assumes the first row contains column names, and data starts at index 1.
    use Inwebo\Csv\Reader;

    $reader = new Reader('path/to/your/file.csv');

    foreach ($reader->rows() as $row) {
        /** @var array{FirstName: string, LastName: string, Gender: string} $row */
        // $row will be an associative array, e.g.,
        // ['FirstName' => 'Philippe', 'LastName' => 'Petit', 'Gender' => 'M']
        print_r($row);
    }
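To make the idea of header mapping concrete, here is a minimal sketch built on the native SplFileObject only. It is not the library's implementation: the function name readAssoc is hypothetical, and the sketch assumes every data row has exactly as many fields as the header row.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of inwebo/csv: lazily yields each data row
 * as an associative array keyed by the header (first) row.
 * Assumes every data row has as many fields as the header.
 */
function readAssoc(string $path): Generator
{
    $file = new SplFileObject($path, 'r');
    $file->setFlags(SplFileObject::READ_CSV);
    $file->setCsvControl(',', '"', '\\');

    $headers = null;
    foreach ($file as $fields) {
        // A blank line (such as the one after a trailing newline) does not
        // parse into real fields: skip it.
        if (!is_array($fields) || $fields === [] || $fields === [null]) {
            continue;
        }
        if ($headers === null) {
            $headers = $fields; // the first line holds the column names
            continue;
        }
        yield array_combine($headers, $fields);
    }
}
```

Because the function is a generator, only one line is held in memory at a time, which is the property the Reader's rows() method is described as offering.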
Disabling Column Names
If your CSV file does not have a header row, you can disable the column name mapping by setting the hasHeaders parameter to false. In this case, indices start at 0.
    use Inwebo\Csv\Reader;

    $reader = new Reader('path/to/your/file.csv', hasHeaders: false);

    foreach ($reader->rows() as $row) {
        // $row will be a numeric array, e.g.,
        // [0 => 'Philippe', 1 => 'Petit', 2 => 'M']
        print_r($row);
    }
Manual Column Mapping
For files without a header, you can manually define column names using the setHeader() method. This allows you to treat the data as an associative array even without a header row. Indices start at 0.
    use Inwebo\Csv\Reader;

    $reader = new Reader('path/to/your/file.csv', hasHeaders: false);

    $reader
        ->setHeader(0, 'firstname')
        ->setHeader(1, 'lastname')
        ->setHeader(2, 'gender');

    foreach ($reader->rows() as $row) {
        /** @var array{firstname: string, lastname: string, gender: string} $row */
        // $row will be an associative array, e.g.,
        // ['firstname' => 'Philippe', 'lastname' => 'Petit', 'gender' => 'M']
        print_r($row);
    }
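Conceptually, manual mapping boils down to renaming numeric indices through an index-to-name table. The sketch below illustrates that idea in plain PHP. The function applyHeaders is hypothetical, and keeping the numeric key for unmapped indices is an assumption of this sketch, not documented library behaviour.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of inwebo/csv: renames numeric indices using
 * an index => name map. Unmapped indices keep their numeric key (an
 * assumption of this sketch).
 *
 * @param list<string|null>  $fields
 * @param array<int, string> $names
 * @return array<int|string, string|null>
 */
function applyHeaders(array $fields, array $names): array
{
    $row = [];
    foreach ($fields as $index => $value) {
        $row[$names[$index] ?? $index] = $value;
    }
    return $row;
}
```

Calling applyHeaders(['Philippe', 'Petit', 'M'], [0 => 'firstname', 1 => 'lastname', 2 => 'gender']) produces the associative row shown in the comment above.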
Advanced Usage: Normalizers and Filters
You can add multiple normalizers and filters to your Reader instance. They are executed sequentially in the order they are added (FIFO).
Normalizers
Normalizers modify the data. The callback receives the row array by reference, so it can alter the values directly.
    use Inwebo\Csv\Reader;

    $reader = new Reader('path/to/your/file.csv');

    // Add a normalizer to handle missing gender data
    $reader->pushNormalizer(function (array &$row): void {
        /** @var array{Gender: string} $row */
        if (empty($row['Gender'])) {
            $row['Gender'] = 'U';
        }
    });

    // Add another normalizer to format the gender column
    $reader->pushNormalizer(function (array &$row): void {
        /** @var array{Gender: string} $row */
        $gender = strtolower($row['Gender']);
        if (str_starts_with($gender, 'm')) {
            $row['Gender'] = 'M';
        } elseif (str_starts_with($gender, 'f')) {
            $row['Gender'] = 'F';
        }
    });

    // Add a normalizer to ensure Salary is an integer
    $reader->pushNormalizer(function (array &$row): void {
        /** @var array{Salary: string|int|null} $row */
        $row['Salary'] = is_null($row['Salary']) ? 0 : (int) $row['Salary'];
    });
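The FIFO, by-reference behaviour described above can be sketched in a few lines of plain PHP. This is an illustration of the general pattern rather than the library's code. The helper applyNormalizers and the two closures $trim and $default are hypothetical names introduced for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of inwebo/csv: runs each normalizer in
 * insertion order (FIFO) on the same row, passed by reference.
 *
 * @param array<string, mixed> $row
 * @param list<callable>       $normalizers each with signature (array &$row): void
 * @return array<string, mixed>
 */
function applyNormalizers(array $row, array $normalizers): array
{
    foreach ($normalizers as $normalizer) {
        $normalizer($row); // the callback mutates $row in place
    }
    return $row;
}

// Strips surrounding whitespace from the Gender column.
$trim = function (array &$row): void {
    $row['Gender'] = trim($row['Gender']);
};

// Replaces an empty Gender with 'U' (unknown).
$default = function (array &$row): void {
    if (empty($row['Gender'])) {
        $row['Gender'] = 'U';
    }
};
```

Order matters: with [$trim, $default], a whitespace-only Gender becomes 'U'. With [$default, $trim], the default check sees a non-empty string, so the trimmed value ends up as an empty string.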
Filters
Filters are used to validate and exclude entire rows. If a filter returns false, the line will be skipped and will not be yielded by the generator.
    use Inwebo\Csv\Reader;

    $reader = new Reader('path/to/your/file.csv');

    // Add a filter to only include rows where Salary is greater than 80,000 €
    $reader->pushFilter(function (array $row): bool {
        /** @var array{Salary: string|int|null} $row */
        return isset($row['Salary']) && (int) $row['Salary'] > 80000;
    });

    // Add another filter to only include users older than 25
    $reader->pushFilter(function (array $row): bool {
        /** @var array{Age: string|int|null} $row */
        return isset($row['Age']) && (int) $row['Age'] > 25;
    });
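As a rough model of how row filtering can stay lazy, the sketch below wraps any iterable in a generator that yields a row only when every filter accepts it. The function filterRows and the variables $highSalary and $over25 are hypothetical. Stopping at the first rejecting filter is an assumption of this sketch, not a documented guarantee of the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of inwebo/csv: lazily yields only the rows
 * for which every filter returns true, evaluating filters in insertion order.
 *
 * @param iterable<array<string, mixed>> $rows
 */
function filterRows(iterable $rows, callable ...$filters): Generator
{
    foreach ($rows as $row) {
        foreach ($filters as $filter) {
            if ($filter($row) === false) {
                continue 2; // reject the row; remaining filters are skipped
            }
        }
        yield $row;
    }
}

// The same two checks as in the example above, written as arrow functions.
$highSalary = fn (array $row): bool => isset($row['Salary']) && (int) $row['Salary'] > 80000;
$over25     = fn (array $row): bool => isset($row['Age']) && (int) $row['Age'] > 25;
```

A row must pass both checks to be yielded, and a row missing the Salary or Age key is rejected by the isset() guard.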
With both normalizers and filters in place, the processing loop becomes a clean, declarative statement of what you want to achieve.
    foreach ($reader->rows() as $row) {
        // This line has passed all your checks and is ready to be used
        print_r($row);
    }
Reading a Specific Range
You can also read a specific range of rows using the rows() method with from and to parameters. Both parameters must be provided together or both omitted.
Important
When hasHeaders is true (default), the first data row is at index 1. When false, it starts at 0.
    use Inwebo\Csv\Reader;

    $reader = new Reader('path/to/your/file.csv');

    // Read rows from 10 to 20
    foreach ($reader->rows(from: 10, to: 20) as $row) {
        print_r($row);
    }
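The indexing rule is easier to see in a native sketch where line 0 is the header and the first data row sits at index 1. The function rowsBetween below is hypothetical and uses SplFileObject directly. Treating both bounds as inclusive is an assumption of this sketch, since the text above does not say whether to is inclusive.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of inwebo/csv: yields the parsed lines whose
 * zero-based index lies in [$from, $to] (both bounds inclusive, an
 * assumption of this sketch), keyed by that index.
 */
function rowsBetween(string $path, int $from, int $to): Generator
{
    $file = new SplFileObject($path, 'r');
    $file->setFlags(SplFileObject::READ_CSV);
    $file->setCsvControl(',', '"', '\\');

    $index = 0;
    foreach ($file as $fields) {
        if ($index > $to) {
            break; // stop reading as soon as the range is exhausted
        }
        if ($index >= $from) {
            yield $index => $fields;
        }
        $index++;
    }
}
```

The loop stops reading once the upper bound is passed, so a small range near the top of a large file does not require scanning the rest of it.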
Reading a Specific Row
The rowAt() method allows you to retrieve a specific row by its index. If the row contains missing columns compared to the header, they will be returned as null.
    use Inwebo\Csv\Reader;

    $reader = new Reader('path/to/your/file.csv');

    // Retrieve the 5th row
    $row = $reader->rowAt(5);
    print_r($row);
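The null-filling of missing columns can be pictured with array_pad() and array_combine(). The helper combineWithHeader is hypothetical and only illustrates the idea. Dropping surplus fields when a row is longer than the header is an assumption of this sketch.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of inwebo/csv: keys $fields by $headers,
 * padding missing columns with null. Surplus fields are dropped (an
 * assumption of this sketch).
 *
 * @param list<string>      $headers
 * @param list<string|null> $fields
 * @return array<string, string|null>
 */
function combineWithHeader(array $headers, array $fields): array
{
    $width  = count($headers);
    $fields = array_slice(array_pad($fields, $width, null), 0, $width);

    return array_combine($headers, $fields);
}
```

For a header of FirstName, LastName, Gender and a line containing only Marie and Curie, the result carries 'Gender' => null instead of raising an error for mismatched lengths.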
Writer
Basic Writing
Instantiate Writer with the path to the output file. The default mode is 'w' (truncate or create). Use 'a' to append to an existing file.
    use Inwebo\Csv\Writer;

    $writer = new Writer('path/to/output.csv');

    $writer->setHeaders(['FirstName', 'LastName', 'Email']);
    $writer->row(['Philippe', 'Petit', 'philippe@example.com']);
    $writer->row(['Marie', 'Curie', 'marie@example.com']);
All configuration methods return static for fluent chaining:
    $writer
        ->setHeader(['FirstName', 'LastName'])
        ->row(['Philippe', 'Petit'])
        ->row(['Marie', 'Curie']);
Writing Multiple Rows
rows() accepts any iterable — arrays or Generator objects:
    $data = [
        ['Philippe', 'Petit'],
        ['Marie', 'Curie'],
    ];

    $writer->setHeader(['FirstName', 'LastName'])->rows($data);
Excel Compatibility
Enable the UTF-8 BOM to ensure correct character encoding when opening the file in Excel on Windows. Use \r\n as the line ending for RFC 4180 compliance.
    $writer = new Writer('path/to/output.csv', bom: true);
    $writer->setLineEnding("\r\n");
The BOM is written automatically before the first row (header or data).
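For reference, the same effect can be approximated with the native SplFileObject alone. The sketch below is not the Writer's implementation, and writeExcelCsv is a hypothetical name. It writes the three BOM bytes once, then emits each row with a CRLF line ending through the $eol argument of fputcsv(), which requires PHP 8.1 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of inwebo/csv: writes rows as CSV with a
 * leading UTF-8 BOM and CRLF line endings. Requires PHP 8.1+ for the $eol
 * argument of SplFileObject::fputcsv().
 *
 * @param iterable<list<string>> $rows
 */
function writeExcelCsv(string $path, iterable $rows): void
{
    $file = new SplFileObject($path, 'w');
    $file->fwrite("\xEF\xBB\xBF"); // UTF-8 byte order mark, written once

    foreach ($rows as $row) {
        $file->fputcsv($row, ',', '"', '\\', "\r\n");
    }
}
```

The BOM has to precede the very first byte of CSV data, which is why it is written before the loop rather than per row.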
Custom Delimiter
CSV control (delimiter, enclosure, escape) is delegated to the inherited setCsvControl() method:
    $writer = new Writer('path/to/output.csv');

    // Use semicolon as delimiter (common in France/Germany)
    $writer->setCsvControl(';');
    $writer->setHeader(['Prénom', 'Nom'])->row(['Philippe', 'Petit']);
Filters and Normalizers
Filters and normalizers on the Writer follow the same FIFO pipeline as on the Reader. For each row, filters run first — if any returns false the row is skipped entirely — then normalizers transform the data before it is written.
Filters and normalizers do not apply to the header row written by setHeader().
Validating column count
A common use case is rejecting malformed rows whose column count does not match the header before writing them:
    use Inwebo\Csv\Writer;

    $writer = new Writer('output.csv');

    $headers = ['Id', 'FirstName', 'LastName', 'Email'];
    $writer->setHeaders($headers);

    $writer->pushFilter(function (array $row) use ($headers): bool {
        return count($row) === count($headers);
    });

    $writer->rows([
        ['1', 'Alice', 'Dupont', 'alice@example.com'],     // written
        ['2', 'Bob'],                                      // skipped: only 2 columns
        ['3', 'Charlie', 'Martin', 'charlie@example.com'], // written
    ]);
Normalizing data before writing
Normalizers receive the row array by reference, allowing direct modification:
    use Inwebo\Csv\Writer;

    $writer = new Writer('output.csv');

    // Trim whitespace and normalize casing on name columns
    $writer->pushNormalizer(function (array &$row): void {
        $row['FirstName'] = mb_convert_case(trim($row['FirstName']), MB_CASE_TITLE, 'UTF-8');
        $row['LastName']  = mb_convert_case(trim($row['LastName']), MB_CASE_TITLE, 'UTF-8');
    });

    // Keep only the digits of phone numbers
    $writer->pushNormalizer(function (array &$row): void {
        $row['Phone'] = preg_replace('/\D/', '', $row['Phone'] ?? '');
    });

    $writer->setHeaders(['FirstName', 'LastName', 'Phone']);

    // $data is any iterable of rows keyed by column name
    $writer->rows($data);
Combining filters and normalizers
Filters and normalizers can be combined freely. The pipeline order is always: filter → normalize → write.
    use Inwebo\Csv\Writer;

    $headers = ['Id', 'FirstName', 'LastName', 'Email', 'Salary'];

    $writer = new Writer('output.csv');
    $writer->setHeaders($headers);

    // Skip rows with wrong column count or invalid email.
    // Arrow functions capture $headers automatically; they do not accept a use clause.
    $writer->pushFilter(fn (array $row): bool => count($row) === count($headers));
    $writer->pushFilter(fn (array $row): bool => filter_var($row['Email'], FILTER_VALIDATE_EMAIL) !== false);

    // Normalize names and cast salary to int
    $writer->pushNormalizer(function (array &$row): void {
        $row['FirstName'] = mb_convert_case(trim($row['FirstName']), MB_CASE_TITLE, 'UTF-8');
        $row['LastName']  = mb_convert_case(trim($row['LastName']), MB_CASE_TITLE, 'UTF-8');
        $row['Salary']    = (string) (int) $row['Salary'];
    });

    // $data is any iterable of rows keyed by column name
    $writer->rows($data);
ETL: Reader to Writer
Because Writer::rows() accepts any iterable, you can pipe Reader::rows() directly into it without buffering the entire file in memory:
    use Inwebo\Csv\Reader;
    use Inwebo\Csv\Writer;

    $reader = new Reader('input.csv');
    $writer = new Writer('output.csv');

    $writer->setHeaders($reader->getHeaders());
    $writer->rows($reader->rows());
Filters and normalizers applied to the Reader are evaluated lazily during iteration, so the pipeline processes one row at a time regardless of file size.
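The one-row-at-a-time property comes from chaining generators, which can be demonstrated without the library. In the sketch below, readRows, upperCased and writeRows are hypothetical helpers built on the native fgetcsv() and fputcsv(). The upper-casing step merely stands in for any lazy transformation.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: lazily yields each parsed CSV line of $path. */
function readRows(string $path): Generator
{
    $handle = fopen($path, 'r');
    try {
        while (($fields = fgetcsv($handle, null, ',', '"', '\\')) !== false) {
            if ($fields !== [null]) { // skip blank lines
                yield $fields;
            }
        }
    } finally {
        fclose($handle);
    }
}

/** Hypothetical lazy transformation: upper-cases every field of each row. */
function upperCased(iterable $rows): Generator
{
    foreach ($rows as $row) {
        yield array_map('strtoupper', $row);
    }
}

/** Hypothetical helper: writes any iterable of rows, returns the row count. */
function writeRows(string $path, iterable $rows): int
{
    $handle = fopen($path, 'w');
    $count  = 0;
    foreach ($rows as $row) {
        fputcsv($handle, $row, ',', '"', '\\');
        $count++;
    }
    fclose($handle);

    return $count;
}
```

A call such as writeRows('output.csv', upperCased(readRows('input.csv'))) pulls a line, transforms it and writes it before the next line is read, so memory use does not grow with the file size.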
Realistic Scenario: Customer Migration
This scenario reads a legacy customer CSV (tests/Fixtures/example.csv), cleans and filters the data, then writes two separate output files — one per segment — without ever loading the full file into memory.
We need to:
- Reject rows with a wrong column count.
- Clean up first and last names (trimming, casing).
- Format phone numbers.
- Normalize genders to 'M' or 'F' ('U' when unknown).
- Filter for valid email addresses.
- Output 1: Women with a salary < 10,000 → women.csv
- Output 2: Men with a salary > 22,500 → men.csv
    use Inwebo\Csv\Reader;
    use Inwebo\Csv\Writer;

    $reader = new Reader('tests/Fixtures/example.csv');

    // Shared normalizers applied by the Reader on every row
    $reader->pushNormalizer(function (array &$row): void {
        $row['FirstName'] = mb_convert_case(trim($row['FirstName']), MB_CASE_TITLE, 'UTF-8');
        $row['LastName']  = mb_convert_case(trim($row['LastName']), MB_CASE_TITLE, 'UTF-8');
    });

    $reader->pushNormalizer(function (array &$row): void {
        if (!empty($row['Phone'])) {
            $row['Phone'] = str_replace(['.', ' ', '-', '+33'], '', $row['Phone']);
            if (strlen($row['Phone']) === 9) {
                $row['Phone'] = '0' . $row['Phone'];
            }
        }
    });

    $reader->pushNormalizer(function (array &$row): void {
        $gender = strtoupper(trim($row['Gender']));
        $row['Gender'] = match (true) {
            in_array($gender, ['M', 'MALE'], true)   => 'M',
            in_array($gender, ['F', 'FEMALE'], true) => 'F',
            default                                  => 'U',
        };
    });

    // Shared Writer setup (Excel-compatible: BOM + semicolon delimiter)
    $headers = $reader->getHeaders();

    $initWriter = static function (string $filename) use ($headers): Writer {
        $writer = new Writer($filename, bom: true);
        $writer->setLineEnding("\r\n");
        $writer->setCsvControl(';');
        // Arrow functions capture $headers automatically; they do not accept a use clause.
        $writer->pushFilter(fn (array $row): bool => count($row) === count($headers));
        $writer->pushFilter(fn (array $row): bool => filter_var($row['Email'], FILTER_VALIDATE_EMAIL) !== false);
        $writer->setHeaders($headers);

        return $writer;
    };

    // Output 1: Women with salary < 10,000
    $womenWriter = $initWriter('women.csv');
    $womenWriter->pushFilter(fn (array $row): bool => $row['Gender'] === 'F' && (int) $row['Salary'] < 10000);
    $womenWriter->rows($reader->rows());

    // Output 2: Men with salary > 22,500
    $menWriter = $initWriter('men.csv');
    $menWriter->pushFilter(fn (array $row): bool => $row['Gender'] === 'M' && (int) $row['Salary'] > 22500);
    $menWriter->rows($reader->rows());
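The phone normalizer in this scenario is the least obvious step, so it may help to see it in isolation. The function below extracts the same logic as a standalone function under the hypothetical name normalizeFrenchPhone. It strips dots, spaces, hyphens and the +33 prefix, then restores the leading zero when nine characters remain.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical extraction of the phone normalizer used in the scenario:
 * strips separators and the +33 prefix, then restores the leading zero.
 */
function normalizeFrenchPhone(string $phone): string
{
    $cleaned = str_replace(['.', ' ', '-', '+33'], '', $phone);

    // A national number without its leading 0 has 9 characters.
    return strlen($cleaned) === 9 ? '0' . $cleaned : $cleaned;
}
```

With this logic, '+33 6 12 34 56 78', '06.12.34.56.78' and '06-12-34-56-78' all normalize to '0612345678'. Inputs in other formats pass through with only the listed characters removed.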
Statistics
- Total downloads: 0
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 3
- Clicks: 3
- Dependents: 0
- Suggesters: 0
Other information
- License: MIT
- Last updated: 2026-06-20