spatie/crawler

Composer install command:

composer require spatie/crawler

Package description

Crawl all internal links found on a website

README

Crawl the web using PHP

This package provides a powerful, easy to use class to crawl links on a website. Under the hood, Guzzle promises are used to crawl multiple URLs concurrently.

Because the crawler can execute JavaScript, it can crawl JavaScript rendered sites. Under the hood, Chrome and Puppeteer are used to power this feature.

Here's a quick example:

use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlResponse;

Crawler::create('https://example.com')
    ->onCrawled(function (string $url, CrawlResponse $response) {
        echo "{$url}: {$response->status()}\n";
    })
    ->start();

Or collect all URLs on a site:

$urls = Crawler::create('https://example.com')
    ->internalOnly()
    ->depth(3)
    ->foundUrls();

You can also test your crawl logic without making real HTTP requests:

Crawler::create('https://example.com')
    ->fake([
        'https://example.com' => '<html><a href="/about">About</a></html>',
        'https://example.com/about' => '<html>About page</html>',
    ])
    ->foundUrls();
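
To illustrate why the fake responses above would yield `/about` as a discovered URL, here is a minimal sketch of link extraction in plain PHP. It does not use the package and is not its implementation. The helper name `extractLinks` is hypothetical, and it assumes PHP's bundled DOM extension is available. Only root-relative paths such as `/about` are resolved; other relative forms are returned unchanged.

```php
<?php

// Hypothetical helper, not part of spatie/crawler: returns the href of every
// <a> element in $html, resolving root-relative paths against $baseUrl.
function extractLinks(string $html, string $baseUrl): array
{
    $dom = new DOMDocument();
    // Fake fixtures are rarely valid HTML, so parser warnings are suppressed.
    @$dom->loadHTML($html);

    $base = parse_url($baseUrl);
    $links = [];

    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');

        if ($href === '') {
            continue;
        }

        // Simplification: only a single leading slash is treated as root-relative.
        if ($href[0] === '/' && substr($href, 0, 2) !== '//') {
            $href = $base['scheme'] . '://' . $base['host'] . $href;
        }

        $links[] = $href;
    }

    return $links;
}

// Same fake pages as in the README example.
$fake = [
    'https://example.com' => '<html><a href="/about">About</a></html>',
    'https://example.com/about' => '<html>About page</html>',
];

$fromHome = extractLinks($fake['https://example.com'], 'https://example.com');
$fromAbout = extractLinks($fake['https://example.com/about'], 'https://example.com/about');

print_r($fromHome);
print_r($fromAbout);
```

The home page contributes one link and the about page contributes none, so a crawl over these two fixtures has nothing further to visit after `/about`.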

If you need to stop a crawl based on external state, you can register a callback that receives the current crawler instance and is checked before scheduling each next request:

use Spatie\Crawler\Crawler;

$shouldStop = false;

Crawler::create('https://example.com')
    ->shouldStopCallback(function (Crawler $crawler) use (&$shouldStop) {
        return $shouldStop;
    })
    ->onCrawled(function (string $url) use (&$shouldStop) {
        $shouldStop = true;
    })
    ->start();
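
The phrase "checked before scheduling each next request" can be made concrete with a small sketch in plain PHP. This is not the package's implementation: the function `crawlFakeSite` and the in-memory `$site` link map are hypothetical, and the loop is sequential, whereas the real crawler works concurrently, so more than one URL may already be in flight when the callback first returns true.

```php
<?php

/**
 * Hypothetical stand-in for a crawl loop (not spatie/crawler code).
 *
 * @param array    $site       Map of URL => list of URLs that page links to.
 * @param string   $start      URL to start from.
 * @param callable $shouldStop Returns true when the crawl should end.
 * @param callable $onCrawled  Called with each URL after it is "crawled".
 * @return string[] URLs crawled, in order.
 */
function crawlFakeSite(array $site, string $start, callable $shouldStop, callable $onCrawled): array
{
    $queue = [$start];
    $seen = [$start => true];
    $crawled = [];

    while ($queue !== []) {
        // The stop check runs before the next URL is taken from the queue.
        if ($shouldStop()) {
            break;
        }

        $url = array_shift($queue);
        $crawled[] = $url;
        $onCrawled($url);

        // Queue every link that has not been seen yet.
        foreach ($site[$url] ?? [] as $link) {
            if (!isset($seen[$link])) {
                $seen[$link] = true;
                $queue[] = $link;
            }
        }
    }

    return $crawled;
}

$site = [
    'https://example.com' => ['https://example.com/about', 'https://example.com/blog'],
    'https://example.com/about' => [],
    'https://example.com/blog' => [],
];

$shouldStop = false;

$crawled = crawlFakeSite(
    $site,
    'https://example.com',
    function () use (&$shouldStop) {
        return $shouldStop;
    },
    function (string $url) use (&$shouldStop) {
        // Mirror the README example: request a stop after the first page.
        $shouldStop = true;
    }
);

print_r($crawled);
```

In this sketch only the start URL is crawled: the flag is set while handling the first page, and the check at the top of the loop ends the crawl before either queued link is requested.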

Support us

We invest a lot of resources into creating best in class open source packages. You can support us by buying one of our paid products.

We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on our contact page. We publish all received postcards on our virtual postcard wall.

Documentation

All documentation is available on our documentation site.

Testing

composer test

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Security Vulnerabilities

Please review our security policy on how to report security vulnerabilities.

Credits

License

The MIT License (MIT). Please see License File for more information.

Statistics

  • Total downloads: 18.04M
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 2836
  • Clicks: 15
  • Dependents: 55
  • Suggesters: 1

GitHub information

  • Stars: 2822
  • Watchers: 63
  • Forks: 367
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2015-11-02
