topshelfcraft/scraper
Composer install command:
composer require topshelfcraft/scraper
Package description
Easily fetch, parse, and rejigger HTML or XML from anywhere.
Keywords:
README
Easily fetch, slice, dice, and output HTML (or XML) content from anywhere.
A Top Shelf Craft creation
Michael Rog, Proprietor
Installation
- From your project directory, use Composer to require the plugin package:
  composer require topshelfcraft/scraper
- In the Control Panel, go to Settings → Plugins and click the “Install” button for Scraper.
- There is no Step 3.
Scraper is also available for installation via the Craft CMS Plugin Store.
Usage
The Scraper plugin exposes a full-featured crawler object to your Twig template, allowing you to fetch, parse, and filter DOM elements from a remote source document.
Instantiating a client
When invoking the plugin, you can choose whether to use SimpleHtmlDom or Symfony components to instantiate your crawler:
{% set crawler = craft.scraper.using('symfony').get('https://zombo.com') %}
{% set crawler = craft.scraper.using('simplehtmldom').get('https://zombo.com') %}
I generally recommend using the Symfony components; they are more powerful and resilient to malformed source code. (The SimpleHtmlDom crawler is included to provide backwards compatibility with Craft 2 projects.)
Using the Symfony client
When you opt for Symfony components, the get method instantiates a full BrowserKit client, giving you access to all the BrowserKit and DomCrawler methods.
You can iterate over the DOM elements from your source document like this:
{% for node in crawler.filter('h2 > a') %}
{{ node.text() }}
{% endfor %}
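If you want each match as a crawler object rather than relying on iteration, an index-based loop is one option. The sketch below assumes `crawler` behaves like a Symfony DomCrawler `Crawler`, so that `count()`, `eq()`, `text()`, and `attr()` are available; the `h2 > a` selector and the list markup are only illustrative.

```twig
{# Sketch only: assumes `crawler` is a Symfony DomCrawler Crawler. #}
{% set links = crawler.filter('h2 > a') %}

{# Guard against an empty result so the range below is never 0..-1. #}
{% if links.count() > 0 %}
    <ul>
    {% for i in 0..(links.count() - 1) %}
        {# eq(i) returns a Crawler wrapping only the i-th match. #}
        {% set link = links.eq(i) %}
        <li><a href="{{ link.attr('href') }}">{{ link.text() }}</a></li>
    {% endfor %}
    </ul>
{% endif %}
```

Checking `count()` first matters because calling `text()` on an empty node list can raise an error in DomCrawler.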
Using the SimpleHtmlDom client
When you opt for the SimpleHtmlDom crawler, the get method instantiates a SimpleHtmlDom client, giving you access to all the SimpleHtmlDom methods.
You can iterate over the DOM elements from your source document like this:
{% for node in crawler.find('h1') %}
{{ node.innertext() }}
{% endfor %}
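The same pattern can pull attributes as well as text. This sketch assumes the SimpleHtmlDom nodes expose their usual `plaintext` and attribute properties to Twig; the `a[href]` selector is illustrative and restricts matches to anchors that actually carry an `href`.

```twig
{# Sketch only: assumes `crawler` came from craft.scraper.using('simplehtmldom'). #}
<ul>
{% for link in crawler.find('a[href]') %}
    {# plaintext is the node's text with tags stripped; href is read as an attribute. #}
    <li><a href="{{ link.href }}">{{ link.plaintext }}</a></li>
{% endfor %}
</ul>
```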
This is great! I still have questions.
Ask a question on StackExchange, and ping me with a URL via email or Discord.
What are the system requirements?
Craft 4.2.1+
I found a bug.
Please open a GitHub Issue, submit a PR to the 4.x.dev branch, or just email me.
Contributors:
- Plugin development: Michael Rog / @michaelrog
- Includes the "Simple HTML DOM" library, created by S. C. Chen
- Includes the Symfony DomCrawler via Goutte, created by Fabien Potencier / @fabpot
- Icon: "Upright vacuum cleaner" by Creaticca Creative Agency, via The Noun Project
Statistics
- Total downloads: 2.16k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 16
- Views: 2
- Dependents: 0
- Suggesters: 0
Other information
- License: proprietary
- Last updated: 2019-06-19