Project author: serpapi

Project description:
Google Search Results via SERP API Ruby Gem
Language: Ruby
Repository: git://github.com/serpapi/google-search-results-ruby.git
Created: 2017-12-15T21:31:23Z
Project community: https://github.com/serpapi/google-search-results-ruby

License: MIT License


Google Search Results in Ruby

This Ruby Gem is meant to scrape and parse results from Google, Bing, Baidu, Yandex, Yahoo, Ebay and more using SerpApi.

The following services are provided:

SerpApi.com provides a script builder to get you started quickly.

Installation

A modern version of Ruby must already be installed:

    $ gem install google_search_results

Link to the gem page

Tested Ruby versions:

  • 2.5
  • 3.0
  • 3.1
  • 3.2

See: GitHub Actions.

Quick start

    require 'google_search_results'
    search = GoogleSearch.new(q: "coffee", serp_api_key: "secret_api_key")
    hash_results = search.get_hash

This example runs a search for “coffee” using your secret API key.

The SerpApi.com service (backend)

  • searches Google using the query q = “coffee”
  • parses the messy HTML responses
  • returns a standardized JSON response

The class GoogleSearch

  • formats the request to the SerpApi.com server
  • executes the GET HTTP request
  • parses the JSON into a Ruby Hash using the JSON standard library provided by Ruby

Et voilà.
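
The client-side steps listed above can be sketched with Ruby's standard library alone. This is a minimal illustration, not the gem's actual implementation: the helper names build_search_uri, parse_results and run_search are hypothetical, and the endpoint https://serpapi.com/search is an assumption.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumed endpoint, for illustration only.
SEARCH_ENDPOINT = 'https://serpapi.com/search'

# Step 1: format the request (hypothetical helper).
def build_search_uri(params)
  uri = URI(SEARCH_ENDPOINT)
  uri.query = URI.encode_www_form(params)
  uri
end

# Step 3: parse the JSON body into a Ruby Hash with symbol keys.
def parse_results(body)
  JSON.parse(body, symbolize_names: true)
end

# Steps 1 to 3 chained together; step 2 is the GET request itself.
def run_search(params)
  parse_results(Net::HTTP.get(build_search_uri(params)))
end
```

Calling run_search(q: "coffee", api_key: "secret_api_key") would perform a real network request, so in practice the gem's GoogleSearch class remains the simpler choice.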

Alternatively, you can search:

  • Bing using BingSearch class
  • Baidu using BaiduSearch class
  • Yahoo using YahooSearch class
  • Yandex using YandexSearch class
  • Ebay using EbaySearch class
  • Home Depot using HomedepotSearch class
  • YouTube using YoutubeSearch class

See the playground to generate your code.

Guide

How to set the private API key

The api_key can be set globally using a singleton pattern.

    GoogleSearch.api_key = "secret_api_key"
    search = GoogleSearch.new(q: "coffee")

Alternatively, the api_key can be provided for each search.

    search = GoogleSearch.new(q: "coffee", api_key: "secret_api_key")

To get the key, simply copy and paste it from serpapi.com/dashboard.
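
To keep the key out of source code, it can be read from the environment instead of being hard-coded. The sketch below assumes the key is exported as API_KEY, the same variable the test instructions in this document use; fetch_api_key is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: read the key from the API_KEY environment variable.
# `env` defaults to ENV and can be replaced by a plain Hash in tests.
def fetch_api_key(env = ENV)
  key = env['API_KEY'].to_s.strip
  raise ArgumentError, 'API_KEY environment variable is not set' if key.empty?
  key
end
```

The result could then be assigned once with GoogleSearch.api_key = fetch_api_key, so a missing key fails early with a clear message.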

Search API capability for Google

    search_params = {
      q: "search",
      google_domain: "Google Domain",
      location: "Location Requested",
      device: "desktop|mobile|tablet",
      hl: "Google UI Language",
      gl: "Google Country",
      safe: "Safe Search Flag",
      num: "Number of Results",
      start: "Pagination Offset",
      api_key: "private key", # copy paste from https://serpapi.com/dashboard
      tbm: "nws|isch|shop",
      tbs: "custom to-be-searched criteria",
      async: false # true|false - set to true to allow async
    }
    # define the search
    search = GoogleSearch.new(search_params)
    # override an existing parameter
    search.params[:location] = "Portland,Oregon,United States"
    # return the results as raw HTML
    html_results = search.get_html
    # return the results as a Ruby Hash
    hash_results = search.get_hash
    # return the results as raw JSON
    json_results = search.get_json

See the full documentation: https://serpapi.com/search-api

More search APIs are documented on SerpApi.com.

You will find more hands-on examples below.

Example by specification

We love true open source, continuous integration and Test-Driven Development (TDD).
We are using RSpec to test our infrastructure around the clock to achieve the best QoS (Quality Of Service).

The directory test/ includes specification/examples.

Set your API key.

    export API_KEY="your secret key"

Install RSpec:

    gem install rspec

To run the tests:

    rspec test

or, if you prefer Rake:

    rake test

Location API

    location_list = GoogleSearch.new(q: "Austin", limit: 3).get_location
    pp location_list

It prints the first 3 locations matching Austin (Texas, Texas, Rochester).

    [
      {
        id: "585069bdee19ad271e9bc072",
        google_id: 200635,
        google_parent_id: 21176,
        name: "Austin, TX",
        canonical_name: "Austin,TX,Texas,United States",
        country_code: "US",
        target_type: "DMA Region",
        reach: 5560000,
        gps: [-97.7430608, 30.267153],
        keys: ["austin", "tx", "texas", "united", "states"]
      },
      #...
    ]
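
As a small illustration of working with this output, the sketch below picks one canonical_name out of a location list so it can be passed as the location parameter of a search. The sample data is copied from the first entry above; best_canonical_name is a hypothetical helper, and ranking by reach is an assumption made for the example rather than documented behaviour.

```ruby
# Sample data copied from the output above (first entry only, trimmed).
SAMPLE_LOCATIONS = [
  {
    id: "585069bdee19ad271e9bc072",
    name: "Austin, TX",
    canonical_name: "Austin,TX,Texas,United States",
    country_code: "US",
    target_type: "DMA Region",
    reach: 5560000
  }
].freeze

# Hypothetical helper: canonical name of the location with the largest reach,
# or nil when the list is empty.
def best_canonical_name(locations)
  best = locations.max_by { |location| location[:reach] }
  best && best[:canonical_name]
end
```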

Search Archive API

This API retrieves a previous search.
To do so, first run a search and save its search_id.

    search = GoogleSearch.new(q: "Coffee", location: "Portland")
    original_search = search.get_hash
    search_id = original_search[:search_metadata][:id]

Now let's retrieve the previous search from the archive.

    search = GoogleSearch.new
    archive_search = search.get_search_archive(search_id)
    pp archive_search

It prints the search from the archive.

Account API

    search = GoogleSearch.new
    pp search.get_account

It prints your account information.

Search Google Images

    search = GoogleSearch.new(q: 'coffee', tbm: "isch")
    image_results_list = search.get_hash[:images_results]
    image_results_list.each do |image_result|
      puts image_result[:original]
      # to download the image, un-comment the next line:
      # `wget #{image_result[:original]}`
    end

This code prints all the image links, and downloads each image if you un-comment the line with wget (a Linux/macOS tool for downloading files).

Search Google News

    search = GoogleSearch.new({
      q: 'coffee', # search query
      tbm: "nws", # news
      tbs: "qdr:d", # last 24h
      num: 10
    })
    3.times do |offset|
      search.params[:start] = offset * 10
      news_results_list = search.get_hash[:news_results]
      news_results_list.each do |news_result|
        puts "#{news_result[:position] + offset * 10} - #{news_result[:title]}"
      end
    end

This script prints the news titles from the first 3 pages of results for the last 24 hours.
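
The pagination arithmetic used in this script can be isolated into two small functions. Both helpers are hypothetical and only restate the calculation from the loop above: the start parameter of each page is the page index times the page size, and a result's overall rank is its position on the page plus that offset.

```ruby
# Hypothetical helper: values of the `start` parameter for the first `pages` pages.
def page_offsets(pages, page_size)
  Array.new(pages) { |page| page * page_size }
end

# Hypothetical helper: overall rank of a result, given its position on the
# page and the offset of that page.
def absolute_position(position, offset)
  position + offset
end
```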

Search Google Shopping

    search = GoogleSearch.new({
      q: 'coffee', # search query
      tbm: "shop", # shopping
      tbs: "p_ord:rv" # by best review
    })
    shopping_results_list = search.get_hash[:shopping_results]
    shopping_results_list.each do |shopping_result|
      puts "#{shopping_result[:position]} - #{shopping_result[:title]}"
    end

This script prints all the shopping results, ordered by review, with their positions.

Google Search By Location

With SerpApi.com, we can build Google search from anywhere in the world.
This code is looking for the best coffee shop per city.

    ["new york", "paris", "berlin"].each do |city|
      # get location from the city name
      location = GoogleSearch.new({q: city, limit: 1}).get_location.first[:canonical_name]
      # get top result
      search = GoogleSearch.new({
        q: 'best coffee shop',
        location: location,
        num: 1, # number of results
        start: 0 # offset
      })
      top_result = search.get_hash[:organic_results].first
      puts "top coffee result for #{location} is: #{top_result[:title]}"
    end

We offer two ways to boost your searches using the async parameter.

  • Non-blocking - async=true (recommended)
  • Blocking - async=false - it’s more compute intensive because the search would need to hold many connections.

    company_list = %w(microsoft apple nvidia)
    puts "submit batch of asynchronous searches"
    search = GoogleSearch.new({async: true})
    search_queue = Queue.new
    company_list.each do |company|
      # set the search query
      search.params[:q] = company
      # submit the search without waiting for it to complete (non-blocking)
      result = search.get_hash()
      if result[:search_metadata][:status] =~ /Cached|Success/
        puts "#{company}: search done"
        next
      end
      # add the pending result to the search queue
      search_queue.push(result)
    end
    puts "wait until all searches are cached or success"
    search = GoogleSearch.new
    while !search_queue.empty?
      result = search_queue.pop
      # extract search id
      search_id = result[:search_metadata][:id]
      # retrieve search from the archive (blocking)
      search_archived = search.get_search_archive(search_id)
      if search_archived[:search_metadata][:status] =~ /Cached|Success/
        puts "#{search_archived[:search_parameters][:q]}: search done"
        next
      end
      # search not finished yet: put it back in the queue
      search_queue.push(result)
    end
    search_queue.close
    puts 'all searches completed'

This code shows a simple implementation for running a batch of asynchronous searches.
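
One possible refinement of the polling loop is to pause between rounds and give up after a fixed number of attempts, so a search that never completes cannot spin forever. The sketch below is an assumption-laden variant, not part of the gem: wait_for_searches, delay and max_attempts are hypothetical names, and fetch is any callable that returns a result Hash for a search id.

```ruby
# Poll pending searches until each one reports a finished status.
# `fetch` is any callable taking a search id and returning a result Hash,
# for example ->(id) { search.get_search_archive(id) }.
def wait_for_searches(search_ids, fetch, delay: 1, max_attempts: 30)
  pending = search_ids.dup
  done = {}
  max_attempts.times do
    break if pending.empty?
    # drop every search whose status is now Cached or Success
    pending.reject! do |search_id|
      result = fetch.call(search_id)
      finished = result[:search_metadata][:status] =~ /Cached|Success/
      done[search_id] = result if finished
      finished
    end
    # pause before the next round only if something is still pending
    sleep(delay) unless pending.empty?
  end
  raise "searches still pending: #{pending.join(', ')}" unless pending.empty?
  done
end
```

Passing the fetch step as a callable keeps the loop independent of the network, which makes it easy to exercise with a stub.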

Supported search engines

Google search API

    GoogleSearch.api_key = ""
    search = GoogleSearch.new(q: "Coffee", location: "Portland")
    pp search.get_hash

https://serpapi.com/search-api

Bing search API

    BingSearch.api_key = ""
    search = BingSearch.new(q: "Coffee", location: "Portland")
    pp search.get_hash

https://serpapi.com/bing-search-api

Baidu search API

    BaiduSearch.api_key = ""
    search = BaiduSearch.new(q: "Coffee")
    pp search.get_hash

https://serpapi.com/baidu-search-api

Yahoo search API

    YahooSearch.api_key = ""
    search = YahooSearch.new(p: "Coffee")
    pp search.get_hash

https://serpapi.com/yahoo-search-api

Yandex search API

    YandexSearch.api_key = ""
    search = YandexSearch.new(text: "Coffee")
    pp search.get_hash

https://serpapi.com/yandex-search-api

Ebay search API

    EbaySearch.api_key = ""
    search = EbaySearch.new(_nkw: "Coffee")
    pp search.get_hash

https://serpapi.com/ebay-search-api

Youtube search API

    YoutubeSearch.api_key = ""
    search = YoutubeSearch.new(search_query: "Coffee")
    pp search.get_hash

https://serpapi.com/youtube-search-api

Homedepot search API

    HomedepotSearch.api_key = ""
    search = HomedepotSearch.new(q: "Coffee")
    pp search.get_hash

https://serpapi.com/home-depot-search-api

Walmart search API

    WalmartSearch.api_key = ""
    search = WalmartSearch.new(query: "Coffee")
    pp search.get_hash

https://serpapi.com/walmart-search-api

Duckduckgo search API

    DuckduckgoSearch.api_key = ""
    search = DuckduckgoSearch.new(query: "Coffee")
    pp search.get_hash

https://serpapi.com/duckduckgo-search-api

Naver search API

    search = NaverSearch.new(query: "Coffee", api_key: "secretApiKey")
    pp search.get_hash

https://serpapi.com/naver-search-api

Apple store search API

    search = AppleStoreSearch.new(term: "Coffee", api_key: "secretApiKey")
    pp search.get_hash

https://serpapi.com/apple-app-store

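Any supported engine can also be queried through the generic SerpApiSearch class by setting the engine parameter:

    SerpApiSearch.api_key = ENV['API_KEY']
    query = {
      search_query: "Coffee", # YouTube uses search_query, as in the YoutubeSearch example
      engine: "youtube"
    }
    search = SerpApiSearch.new(query)
    hash = search.get_hash
    pp hash[:video_results]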

see: google-search-results-ruby/test/search_api_spec.rb

Error management

This library follows the standard Ruby convention of raising an exception when something goes wrong.
Any networking-related exception is raised as is.
Anything related to the client layer is raised as a SerpApiException.
A SerpApiException might be caused by a bug in the library.
A networking problem is caused by either SerpApi.com or your internet connection.
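
A caller can separate the two failure categories described above with ordinary rescue clauses. The sketch below is illustrative only: search_with_rescue is a hypothetical wrapper, the list of network exception classes is an assumption and is not exhaustive, and the stand-in SerpApiException class exists only so the example runs without the gem installed.

```ruby
require 'net/http'

# Stand-in so the sketch runs without the gem; the gem defines the real class.
unless defined?(SerpApiException)
  class SerpApiException < StandardError; end
end

# Assumed, non-exhaustive list of networking exceptions.
NETWORK_ERRORS = [SocketError, Timeout::Error, Errno::ECONNREFUSED].freeze

# Hypothetical wrapper: run a search block and report which layer failed.
def search_with_rescue
  { ok: true, results: yield }
rescue SerpApiException => e
  { ok: false, layer: :client, message: e.message }
rescue *NETWORK_ERRORS => e
  { ok: false, layer: :network, message: e.message }
end
```

With the gem loaded, a call might look like search_with_rescue { GoogleSearch.new(q: "coffee").get_hash }.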

Change log

  • 2.2
    • add apple store search engine
    • add naver search engine
  • 2.1 - add more search engines: Youtube, Duckduckgo, Homedepot, Walmart
    • improve error management and documentation
  • 2.0 - API simplified (GoogleSearchResults -> GoogleSearch), fix gem issue with Ruby 2.6+, Out Of Box step to verify the package before delivery
  • 1.3.2 - rename variable client to search for naming consistency
  • 1.3 - support for all major search engines
  • 1.2 - stable version supporting Google and a few more search engines
  • 1.1 - client connection improvement to allow multi-threading and fiber support
  • 1.0 - first stable version with Google search and Google Images

Roadmap

  • 2.1 Improve exception / HTTP status handling

Conclusion

SerpApi supports all the major search engines. Google has the most advanced support, with all the major services available: Images, News, Shopping and more.
To enable a type of search, the field tbm (to be matched) must be set to:

  • isch: Google Images API.
  • nws: Google News API.
  • shop: Google Shopping API.
  • any other Google service should work out of the box.
  • (no tbm parameter): regular Google search.

The tbs field allows further customization of the search.
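
The tbm values listed above can be kept in one lookup table so that calling code refers to a search type by name. This is a minimal sketch: the symbolic names (:images, :news, :shopping, :web) and the google_params helper are hypothetical, while the tbm values themselves come from the list above.

```ruby
# tbm values from the list above; the symbolic keys are hypothetical.
TBM_BY_SEARCH_TYPE = { images: 'isch', news: 'nws', shopping: 'shop' }.freeze

# Hypothetical helper: build the parameter Hash for a given search type.
# A regular web search sends no tbm parameter at all.
def google_params(query, type = :web)
  params = { q: query }
  return params if type == :web
  params.merge(tbm: TBM_BY_SEARCH_TYPE.fetch(type))
end
```

The resulting Hash could be passed directly to GoogleSearch.new, for example GoogleSearch.new(google_params("coffee", :news)).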

The full documentation is available here.

Contributing

Contributions are welcome; feel free to submit a pull request!

To run the tests:

    export API_KEY="your api key"
    rake test