rem42/scraper
Latest stable version: v3.2.0
Composer install command:
composer require rem42/scraper
Package summary
API Scraper website
README
Lightweight toolbox to build reusable "scrapers":
- Declare a Request class annotated with the PHP attribute #[Scraper(...)].
- Provide the corresponding Api class (replace "Request" with "Api" in the name) which extends \Scraper\Scraper\Api\AbstractApi and implements execute().
- Use \Scraper\Scraper\Client with an HttpClientInterface to execute the request and retrieve the deserialized object.
Installation
composer require rem42/scraper "^3.0"
Short introduction
The package centralizes the following logic:
- A Request (under src/Request/) defines the necessary data and exposes getters used in path variables.
- The attribute #[\Scraper\Scraper\Attribute\Scraper(...)] (on the Request) describes method, scheme, host, path.
- \Scraper\Scraper\Client::send() reads this attribute (via ExtractAttribute), builds the HTTP options (headers, query, body, json, auth) according to the interfaces implemented by the Request, then performs the HTTP call.
- The matching Api class (e.g. FooApi) is instantiated and its execute() method returns the final object/array/string.
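The flow above can be sketched in plain PHP. This is only an illustration of the mechanism (reading a class-level attribute through reflection, then adding HTTP options depending on which interfaces the request implements). The names Endpoint, HasHeaders, HasQuery, readEndpoint() and buildOptions() are hypothetical stand-ins invented for this sketch; they are not the classes or functions shipped by the package.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration; not the package's real classes.
#[Attribute(Attribute::TARGET_CLASS)]
final class Endpoint
{
    public function __construct(
        public string $method,
        public string $scheme,
        public string $host,
        public string $path,
    ) {}
}

interface HasHeaders { public function getHeaders(): array; }
interface HasQuery   { public function getQuery(): array; }

#[Endpoint(method: 'GET', scheme: 'https', host: 'example.com', path: '/items')]
final class ListItemsRequest implements HasQuery
{
    public function getQuery(): array { return ['page' => 1]; }
}

/** Read the class-level Endpoint attribute through reflection. */
function readEndpoint(object $request): Endpoint
{
    $attributes = (new ReflectionClass($request))->getAttributes(Endpoint::class);
    if ($attributes === []) {
        throw new LogicException($request::class . ' has no Endpoint attribute');
    }
    return $attributes[0]->newInstance();
}

/** Add an option only when the request implements the matching interface. */
function buildOptions(object $request): array
{
    $options = [];
    if ($request instanceof HasHeaders) {
        $options['headers'] = $request->getHeaders();
    }
    if ($request instanceof HasQuery) {
        $options['query'] = $request->getQuery();
    }
    return $options;
}

$request  = new ListItemsRequest();
$endpoint = readEndpoint($request);
$url      = $endpoint->scheme . '://' . $endpoint->host . $endpoint->path;
$options  = buildOptions($request);
```

Here $url and $options are what an HTTP client would receive; the request class itself stays a plain data holder, and supporting a new option means adding one interface and one instanceof check.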
Quickstart (minimal example)
Schematic example (adapt according to your autoload/imports). The example relies on PHP use imports:
use Symfony\Component\HttpClient\HttpClient;
use Scraper\Scraper\Client;
use Scraper\Scraper\Request\ScraperRequest;
use Scraper\Scraper\Attribute\Scraper;
use Scraper\Scraper\Attribute\Method;
use Scraper\Scraper\Attribute\Scheme;
use Scraper\Scraper\Api\AbstractApi;

#[Scraper(
    method: Method::GET,
    scheme: Scheme::HTTPS,
    host: 'example.com',
    path: '/items/{id}'
)]
class ItemRequest extends ScraperRequest
{
    public function __construct(private string $id) {}

    public function getId(): string
    {
        return $this->id;
    }
}

// Provide a matching Api: ItemApi extends AbstractApi

$http = HttpClient::create();
$client = new Client($http);
$result = $client->send(new ItemRequest('42'));
Important conventions
- PSR-4 root namespace: Scraper\\Scraper\\ -> src/ (see composer.json).
- Naming convention: XRequest -> XApi (Client performs this replacement automatically using reflection).
- In the path attribute, variables {name} are replaced by calling getName() on the Request instance (see src/Attribute/ExtractAttribute.php).
- Implement the interfaces in src/Request/ to enable options: RequestHeaders, RequestQuery, RequestBody, RequestBodyJson, RequestAuthBearer, RequestAuthBasic.
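The naming and path conventions above can be illustrated with a small standalone sketch. The helpers resolvePath() and apiClassFor() are hypothetical names invented for this example; the package's actual logic lives in its Client and ExtractAttribute classes and may differ in detail (for instance in how it handles missing getters or value encoding).

```php
<?php
declare(strict_types=1);

// Hypothetical helpers for illustration; not the package's real code.

/** Replace each {name} placeholder with the result of getName() on the request. */
function resolvePath(string $path, object $request): string
{
    return preg_replace_callback(
        '/\{(\w+)\}/',
        static function (array $match) use ($request): string {
            $getter = 'get' . ucfirst($match[1]);
            if (!method_exists($request, $getter)) {
                throw new LogicException("Missing getter {$getter}()");
            }
            return (string) $request->{$getter}();
        },
        $path
    ) ?? $path;
}

/**
 * Derive the Api class name by swapping "Request" for "Api".
 * Assumption: every occurrence is replaced, so a namespace segment
 * named "Request" would change as well.
 */
function apiClassFor(object $request): string
{
    $requestClass = (new ReflectionClass($request))->getName();
    return str_replace('Request', 'Api', $requestClass);
}

final class ItemRequest
{
    public function __construct(private string $id) {}

    public function getId(): string
    {
        return $this->id;
    }
}

$request  = new ItemRequest('42');
$path     = resolvePath('/items/{id}', $request); // '/items/42'
$apiClass = apiClassFor($request);                // 'ItemApi'
```

A placeholder such as {id} therefore only works when the Request exposes a public getId() method, and the matching Api class has to exist under the derived name.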
Tests / quality / style
- Run unit tests:
composer run unit-test
# or
./vendor/bin/phpunit
- Static analysis (phpstan):
composer run static-analysis
- Check / apply coding style (php-cs-fixer):
composer run code-style-check
composer run code-style-fix
PHP compatibility
composer.json requires php: ^8.4. The code uses enums and recent type features, so run it on PHP 8.4 or later.
Resources and documentation for agents
- Agent helper file: AGENTS.md (tips, patterns, commands). See packages/scraper/AGENTS.md.
- Key code points: src/Client.php, src/Attribute/ExtractAttribute.php, src/Factory/SerializerFactory.php.
Non-exhaustive list of published scrapers
- rem42/scraper-allocine
- rem42/scraper-colissimo
- rem42/scraper-deezer
- rem42/scraper-giantbomb
- rem42/scraper-jeuxvideo
- rem42/scraper-prestashop
- rem42/scraper-shopify
- rem42/scraper-tmdb
- rem42/scraper-tnt
Contributing
See AGENTS.md for rules and patterns to follow. For PRs: green tests + highest phpstan level.
Statistics
- Total downloads: 4.21k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 5
- Clicks: 0
- Dependents: 23
- Suggesters: 0
Other information
- License: MIT
- Last updated: 2018-08-23