Project author: nxvhm

Project description: News scraping package for Laravel
Language: PHP
Repository: git://github.com/nxvhm/newscraper.git
Created: 2020-04-11T13:43:53Z
Project page: https://github.com/nxvhm/newscraper


# News Scraper Package for Laravel

## Installation

  1. Download the package and copy it into your Laravel app, for example under ``app/Library/nxvhm/newscraper``.

  2. Add the package folder path to the ``repositories`` section of your ``composer.json`` as a local path repository, and add the package to the ``require`` section. For example:

     ```
     "require": {
         "nxvhm/newscraper": "dev-master"
     },
     "repositories": {
         "local": {
             "type": "path",
             "url": "app/Library/nxvhm/newscraper"
         }
     },
     ```

  3. Run ``composer require "nxvhm/newscraper dev-master"``. This symlinks the package into the ``vendor/`` folder, installs its dependencies, and lets Composer treat it as a regular package. At this point the package should be auto-discovered by Laravel.

  4. Publish the config and migration:

     ```
     php artisan vendor:publish --provider="Nxvhm\Newscraper\NewscraperServiceProvider"
     ```

  5. Run the migration:

     ```
     php artisan migrate
     ```

## Usage

Register the strategies in the database so that each one gets a unique id:

```
php artisan newscraper:register-sites
```

Start scraping from artisan with the following command, where ``{StrategyName}`` is the name of an existing strategy class:

```
php artisan scrape:news {StrategyName}
```

To create a scraping strategy class via the CLI:

```
php artisan newscraper:create-strategy
```

After creating it, update the autoload maps and register it in the database:

```
composer dump-autoload
php artisan newscraper:register-sites
```

### Custom article db save logic

To customize how an article is saved to the database, set the responsible class under ``custom_save`` in the config file. The class should implement the ``Nxvhm\Newscraper\Contracts\ArticleSaver`` contract. Example:

```
<?php

namespace App;

use Nxvhm\Newscraper\Strategies\Strategy;
use Nxvhm\Newscraper\Contracts\ArticleSaver as ArticleSaverInterface;
use Illuminate\Support\MessageBag;

class ArticleSaver implements ArticleSaverInterface
{
    public static function saveArticle(array $article, Strategy $strategy): MessageBag
    {
        // Custom logic goes here.
        // dd() dumps its arguments and halts execution; replace it with your
        // own logic and return a MessageBag, as the signature requires.
        dd($article, $strategy->name);
    }
}
```
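
As a rough illustration of the ``custom_save`` setting described above, the published config file might be edited along these lines. This is only a sketch: the file name ``config/newscraper.php`` is an assumption (use whichever file ``vendor:publish`` actually created), and it assumes ``custom_save`` takes a fully qualified class name.

```php
<?php

// config/newscraper.php (hypothetical file name; check the file that
// vendor:publish created in your application's config/ folder).

return [
    // Leave the other published options as they are.

    // Assumption: custom_save takes the fully qualified name of a class
    // implementing Nxvhm\Newscraper\Contracts\ArticleSaver.
    'custom_save' => \App\ArticleSaver::class,
];
```

If your Laravel configuration is cached, it will probably need to be refreshed (for example with ``php artisan config:clear``) before the change takes effect.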

## TODO

  • Define site strategies from a config file
  • ~~Allow strategy class lookup in configurable namespaces, not only in a single one~~
  • Implement more strategies
  • Implement a mechanism for flexible time/date scraping