anshu-krishna/html-scraper

Composer install command:

composer require anshu-krishna/html-scraper

Package description

A set of PHP classes to simplify data extraction from HTML.

README

A set of PHP classes to simplify data extraction from HTML.

Installation

composer require anshu-krishna/html-scraper

Base code for the CSS_to_Xpath method in HTMLScraper was cloned from https://github.com/zendframework/zend-dom.
Zend Framework : http://framework.zend.com/
Repository : http://github.com/zendframework/zf2
Copyright (c) 2005-2015 Zend Technologies USA Inc. http://www.zend.com
License : https://framework.zend.com/license New BSD License

For basic documentation see the DOC file.

Example

<?php
require_once 'vendor/autoload.php';

use Krishna\DOMNodeHelper;
use Krishna\HTMLScraper;

const TrimmedText = HTMLScraper::Extract_textContentTrim;

$doc = new HTMLScraper();

if(!$doc->load_HTML_file('https://www.royalroad.com/fiction/10073/the-wandering-inn')) {
	echo 'Unable to load data';
	exit(1);
}

$data = [];

$data['title'] = $doc->querySelector_extract(TrimmedText, 'div.fic-title h1[property="name"]', 0);

$data['url'] = $doc->xpath_extract(function($meta) {
	return $meta->getAttribute('content');
}, '//meta[@property="og:url"]', 0);

$data['description'] = htmlspecialchars($doc->querySelector_extract(function($div) {
	return trim(DOMNodeHelper::innerHTML($div));
}, 'div.description div[property="description"]', 0));

$data['tags'] = $doc->querySelector_extract(TrimmedText, 'span.tags span[property="genre"]');

var_dump($data);
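
For comparison, the same kind of extraction can be sketched with PHP's built-in DOMDocument and DOMXPath classes alone. This is a minimal illustration and not the library's implementation: the inline HTML, the example.com URL, and the hand-written XPath expressions (standing in for what a CSS-to-XPath conversion might produce) are all assumptions made for this example.

```php
<?php
// Hypothetical inline markup standing in for a fetched page.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<head>
<meta property="og:url" content="https://example.com/fiction/1">
</head>
<body>
<div class="fic-title"><h1 property="name">  Sample Title  </h1></div>
<span class="tags"><span property="genre">Fantasy</span><span property="genre"> Adventure </span></span>
</body>
</html>
HTML;

$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Hand-written XPath for the CSS selector: div.fic-title h1[property="name"]
$titleNodes = $xpath->query(
	'//div[contains(concat(" ", normalize-space(@class), " "), " fic-title ")]//h1[@property="name"]'
);
$title = $titleNodes->length > 0 ? trim($titleNodes->item(0)->textContent) : null;

// Read an attribute from the first matching <meta> element.
$urlNodes = $xpath->query('//meta[@property="og:url"]');
$url = $urlNodes->length > 0 ? $urlNodes->item(0)->getAttribute('content') : null;

// Collect the trimmed text of every match, not only the first one.
$tags = [];
$tagNodes = $xpath->query(
	'//span[contains(concat(" ", normalize-space(@class), " "), " tags ")]//span[@property="genre"]'
);
foreach ($tagNodes as $node) {
	$tags[] = trim($node->textContent);
}

$data = ['title' => $title, 'url' => $url, 'tags' => $tags];
var_dump($data);
```

The length checks and the class-matching XPath idiom are the sort of repetitive detail that a helper class such as HTMLScraper is meant to hide behind a single call.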

Statistics

  • Total downloads: 19
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 3
  • Clicks: 1
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Language: HTML

Other information

  • License: MIT
  • Last updated: 2021-09-21
