tojibon/web-scraper

Composer install command:

composer require tojibon/web-scraper

Package description

A PHP web scraper class that uses PHP cURL to scrape web pages. It can fetch pages with the cURL GET and POST methods, and it can also scrape page content from ASP.NET-based websites via form POST.

README

  1. A very simple single-file PHP web scraper class that uses the cURL library to scrape web page content. It scrapes web pages using the GET or POST method, and it can also scrape page content from ASP.NET-based websites using form POST.
  2. Support for:
    1. GET Method
    2. POST Method
    3. ASP Calls
    4. Retrieve Page Contents by Markup Tag Names
    5. Retrieve Values from Form Fields
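
To make the "ASP Calls" and "Retrieve Values from Form Fields" items more concrete: ASP.NET Web Forms pages usually carry hidden inputs such as __VIEWSTATE and __EVENTVALIDATION that have to be sent back with the form POST. The following is a minimal sketch of that idea using PHP's DOM extension. The function extractFormFields and the sample markup are hypothetical and are not part of this package's API.

```php
<?php
// Hypothetical helper, not part of the package: collect the name/value
// pairs of all <input> elements so they can be posted back to the server.
function extractFormFields(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them,
    // since real-world markup is rarely well-formed.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Sample markup standing in for a fetched ASP.NET page (illustrative only).
$html = '<form><input type="hidden" name="__VIEWSTATE" value="abc123" />'
      . '<input type="hidden" name="__EVENTVALIDATION" value="xyz789" />'
      . '<input type="text" name="query" value="" /></form>';

$fields = extractFormFields($html);
$fields['query'] = 'php'; // fill in the visible field before posting

// URL-encoded body that could be passed to cURL via CURLOPT_POSTFIELDS.
$postBody = http_build_query($fields);
```

The hidden values change from page to page, so they would normally be read from a fresh GET response immediately before each POST.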

Installation

composer require juyal-ahmed/web-scraper

Getting the full content of a web page:

<?php
require 'vendor/autoload.php';

// Create a Scraper instance with only the URL specified
$scraper = new \PhpFarmer\WebScraper\Scraper('https://example.com');
$pageHtmlContent = $scraper->getPageContent('https://example.com/page.html');
?>

Getting the full content of a web page with caching:

<?php
require 'vendor/autoload.php';

// Create a Scraper instance with custom cache settings
$scraperWithCache = new \PhpFarmer\WebScraper\Scraper('https://example.com', true, './custom_cache/', 600);
$pageHtmlContent = $scraperWithCache->getPageContent('https://example.com/page.html');
?>
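
For readers unfamiliar with file-based caching, here is a rough sketch of the general technique. It assumes that the extra constructor arguments above are a cache switch, a cache directory, and a lifetime in seconds; the README does not confirm this. The function cachedFetch is hypothetical and is not the package's implementation.

```php
<?php
// Hypothetical helper, not the package's implementation: return a cached
// copy of a page if it is younger than $ttl seconds, otherwise call $fetch
// and store the result.
function cachedFetch(string $url, string $cacheDir, int $ttl, callable $fetch): string
{
    if (!is_dir($cacheDir)) {
        mkdir($cacheDir, 0775, true);
    }
    // One cache file per URL, named after a hash of the URL.
    $file = rtrim($cacheDir, '/') . '/' . md5($url) . '.html';

    clearstatcache(true, $file); // make sure file metadata is fresh
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }

    $content = $fetch($url);
    file_put_contents($file, $content);
    return $content;
}

// Demo with a fake fetcher so the sketch runs without network access.
$calls = 0;
$fetch = function (string $url) use (&$calls): string {
    $calls++;
    return "<html>content of $url</html>";
};

$dir = sys_get_temp_dir() . '/scraper_cache_demo_' . uniqid();
$first  = cachedFetch('https://example.com/page.html', $dir, 600, $fetch);
$second = cachedFetch('https://example.com/page.html', $dir, 600, $fetch);
```

In the demo the second call is served from the cache file, so the fake fetcher runs only once.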

Getting the full content of a web page through a proxy IP:

<?php
require 'vendor/autoload.php';

// Create a Scraper instance with only the URL specified
$scraper = new \PhpFarmer\WebScraper\Scraper('https://example.com');
$pageHtmlContent = $scraper->curl('https://example.com/page.html', "93.118.xx.141:8800", "6USERR:8PASS1");
?>
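
As background, routing a request through an authenticated proxy with plain PHP cURL typically comes down to two options, CURLOPT_PROXY and CURLOPT_PROXYUSERPWD. The sketch below only builds the option array; how this package's curl() method uses its arguments internally is not shown in the README. The function buildProxyCurlOptions and the proxy address are hypothetical.

```php
<?php
// Hypothetical helper, not part of the package: build cURL options for a
// GET request routed through an authenticated HTTP proxy.
function buildProxyCurlOptions(string $url, string $proxy, string $proxyAuth): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow HTTP redirects
        CURLOPT_PROXY          => $proxy,     // "host:port"
        CURLOPT_PROXYUSERPWD   => $proxyAuth, // "user:password"
        CURLOPT_TIMEOUT        => 30,         // give up after 30 seconds
    ];
}

// Placeholder proxy and credentials, for illustration only.
$options = buildProxyCurlOptions(
    'https://example.com/page.html',
    'proxy.example.com:8800',
    'user:secret'
);

// To perform the request (left commented out so the sketch needs no network):
// $ch = curl_init();
// curl_setopt_array($ch, $options);
// $html = curl_exec($ch);
// curl_close($ch);
```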

Parsing a page's HTML content:

<?php
$subHtmlContent =  $scraper->getHtmlContentBetweenTags($pageHtmlContent, '', '');
?>

How It Works:

  1. Include the class file scraper.php at the top of your working page.
  2. Set some default settings.
  3. Get the page content with the existing methods.
  4. If you are looking for a single piece of content, extract it with getHtmlContentBetweenTags.
  5. If you need grid data, split the content on a needle, e.g. with explode().
  6. Then loop over the pieces and call getHtmlContentBetweenTags again on each one to build the final array of grid data.
  7. That's all.
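
Steps 4 to 6 can be sketched roughly as follows. To keep the example self-contained and runnable without the package, it uses a hypothetical stand-in function, textBetween, in place of getHtmlContentBetweenTags, and the sample table markup is invented for illustration.

```php
<?php
// Hypothetical stand-in for a "content between two markers" helper;
// it is not the package's implementation.
function textBetween(string $html, string $start, string $end): string
{
    $from = strpos($html, $start);
    if ($from === false) {
        return '';
    }
    $from += strlen($start);
    $to = strpos($html, $end, $from);
    if ($to === false) {
        return '';
    }
    return substr($html, $from, $to - $from);
}

// Sample page content (illustrative only).
$pageHtmlContent = '<table id="grid">'
    . '<tr><td>Alice</td><td>30</td></tr>'
    . '<tr><td>Bob</td><td>25</td></tr>'
    . '</table>';

// Step 4: isolate the single block of interest.
$table = textBetween($pageHtmlContent, '<table id="grid">', '</table>');

// Step 5: split the block on a needle that starts every row.
$chunks = explode('<tr>', $table);
array_shift($chunks); // drop whatever precedes the first row

// Step 6: loop over the rows and pull the cells out of each one.
$rows = [];
foreach ($chunks as $chunk) {
    $cells = explode('<td>', $chunk);
    array_shift($cells); // drop whatever precedes the first cell
    $row = [];
    foreach ($cells as $cell) {
        // Re-attach the opening tag removed by explode() before extracting.
        $row[] = trim(textBetween('<td>' . $cell, '<td>', '</td>'));
    }
    $rows[] = $row;
}
```

String splitting like this is simple but brittle: it depends on the exact markup, so attributes on the row or cell tags would require a different needle.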

Thanks

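Statistics

  • Total downloads: 66
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 46
  • Clicks: 0
  • Dependent projects: 0
  • Recommendations: 0

GitHub information

  • Stars: 46
  • Watchers: 7
  • Forks: 33
  • Language: PHP

Other information

  • License: GPL-3.0
  • Last updated: 2017-01-05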
