Project author: wmentor

Project description: HTML data fetcher
Language: Go
Repository: git://github.com/wmentor/html.git
Created: 2020-03-11T20:22:48Z
Project page: https://github.com/wmentor/html

License: MIT License


HTML

Go Report Card: https://goreportcard.com/report/github.com/wmentor/html
Documentation: https://pkg.go.dev/github.com/wmentor/html
License: MIT

A simple HTML parser and data fetcher library written in Go, released under the MIT License.

Requirements

  • Go (version >= 1.20)
  • golang.org/x/net

Install

    go get github.com/wmentor/html

Usage

Fetch data from URL

    package main

    import (
        "fmt"
        "time"

        "github.com/wmentor/html"
    )

    func main() {
        src := "https://edition.cnn.com"

        parser := html.New()

        // Request options: user agent string and timeout.
        opts := &html.GetOpts{
            Agent:   "Mozilla/5.0 (compatible; MSIE 10.0)",
            Timeout: 60 * time.Second,
        }

        // Download and parse the page.
        parser.Get(src, opts)

        // Print the text extracted from the page.
        fmt.Println(string(parser.Text()))

        // Print every link, image, and iframe URL found in the page.
        parser.EachLink(func(link string) {
            fmt.Println("url=" + link)
        })

        parser.EachImage(func(link string) {
            fmt.Println("img=" + link)
        })

        parser.EachIframe(func(link string) {
            fmt.Println("iframe=" + link)
        })
    }
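
The callbacks in the example above print each URL as it is found. A common follow-up is to gather the links into a slice and drop duplicates, which a closure handles well. The sketch below is one possible approach, not part of the library: `linkWalker`, `stubWalker`, and `collectLinks` are hypothetical names introduced here, and the interface assumes `EachLink` has exactly the signature used in the example above. The stub stands in for a real parser so the snippet runs without network access.

```go
package main

import "fmt"

// linkWalker is a hypothetical interface. It assumes EachLink takes a
// func(link string) callback and returns nothing, as in the usage example.
type linkWalker interface {
	EachLink(fn func(link string))
}

// stubWalker stands in for a real parser so the sketch runs offline.
type stubWalker struct{ links []string }

func (s stubWalker) EachLink(fn func(link string)) {
	for _, l := range s.links {
		fn(l)
	}
}

// collectLinks gathers links in first-seen order, skipping duplicates.
func collectLinks(w linkWalker) []string {
	seen := make(map[string]struct{})
	var out []string
	w.EachLink(func(link string) {
		if _, dup := seen[link]; dup {
			return
		}
		seen[link] = struct{}{}
		out = append(out, link)
	})
	return out
}

func main() {
	w := stubWalker{links: []string{"/a", "/b", "/a"}}
	for _, l := range collectLinks(w) {
		fmt.Println("url=" + l)
	}
}
```

If the assumption about the signature holds, a parser value could be passed to `collectLinks` in place of the stub.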

Fetch data from file/stdin

    package main

    import (
        "fmt"
        "os"

        "github.com/wmentor/html"
    )

    func main() {
        parser := html.New()

        // Parse accepts any io.Reader; here the HTML is read from stdin.
        parser.Parse(os.Stdin)

        // Print the text extracted from the document.
        fmt.Println(string(parser.Text()))

        // Print every link, image, and iframe URL found in the document.
        parser.EachLink(func(link string) {
            fmt.Println("url=" + link)
        })

        parser.EachImage(func(link string) {
            fmt.Println("img=" + link)
        })

        parser.EachIframe(func(link string) {
            fmt.Println("iframe=" + link)
        })
    }