Author: fb55

Description:
Transform streaming HTML using CSS selectors
Language: JavaScript
Repository: git://github.com/fb55/cornet.git
Created: 2012-09-22T13:59:00Z
Project page: https://github.com/fb55/cornet

License: BSD 2-Clause "Simplified" License


About

The cornet is a brass instrument very similar to the trumpet,
distinguished by its conical bore, compact shape, and mellower tone quality. -
Wikipedia

This project demonstrates how a couple of my libraries can replace
substack/node-trumpet in just a
few lines of code.

Even better, there are some advantages over trumpet:

  • The number of usable CSS selectors is increased dramatically thanks to
    fb55/css-select.
  • cornet works as a handler for
    fb55/htmlparser2, probably the
    fastest HTML parser currently available for node. It is also much less
    strict than the sax module used by trumpet.
  • With the great
    cheeriojs/cheerio module, you can
    do everything with your document that would be possible with jQuery.

Please note that callbacks are fired as soon as an element has been retrieved.
That means no content past the element is available yet, so cheerio won’t find
anything after it. Also, because the element is at that moment the last child
of its parent, selectors like :nth-last-child won’t work as expected.
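
The timing caveat above can be illustrated with a small simulation. This is only a sketch: ToyBuilder is a hypothetical stand-in written for this example, not cornet's or htmlparser2's real code, and it assumes the element event fires when an element's closing tag is seen.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for a streaming DOM builder; not cornet's real code.
class ToyBuilder extends EventEmitter {
    constructor() {
        super();
        this.root = { name: "root", parent: null, children: [] };
        this.stack = [this.root];
    }
    // An opening tag attaches the element to the innermost open parent.
    open(name) {
        const parent = this.stack[this.stack.length - 1];
        const elem = { name, parent, children: [] };
        parent.children.push(elem);
        this.stack.push(elem);
    }
    // A closing tag announces the element; later siblings are not parsed yet.
    close() {
        this.emit("element", this.stack.pop());
    }
}

const builder = new ToyBuilder();
const report = [];

builder.on("element", (elem) => {
    const siblings = elem.parent.children;
    const isLast = siblings[siblings.length - 1] === elem;
    report.push(`${elem.name}#${siblings.length} last=${isLast}`);
});

// Simulates <ul><li></li><li></li><li></li></ul>
builder.open("ul");
for (let n = 0; n < 3; n++) {
    builder.open("li");
    builder.close();
}
builder.close();

console.log(report.join("\n"));
```

Every li reports itself as the last child when its callback runs, even though two of them end up with later siblings. A selector that counts from the end of the sibling list would therefore match each of them in turn.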

Install

    npm install cornet

Example

    const Parser = require("htmlparser2").WritableStream;
    const Cornet = require("cornet");
    const minreq = require("minreq");
    const $ = require("cheerio");

    const cornet = new Cornet();

    // stream the page into the parser, with cornet as its handler
    minreq.get("http://github.com/fb55").pipe(new Parser(cornet));

    cornet.remove("script"); // remove all scripts

    // show all repos
    cornet.select(".repo_list", function (elem) {
        $(elem)
            .find("h3")
            .each(function (i) {
                console.log("repo %d: %s", i + 1, $(this).text().trim());
            });
    });

    // does the same
    let i = 0; // must be reassignable, as it is incremented below
    cornet.select(".repo_list h3", function (elem) {
        console.log("repo %d: %s", ++i, $(elem).text().trim());
    });

    // sometimes, you only want to get a single element
    const onTitle = cornet.select("title", function (title) {
        console.log("Page title:", $(title).text().trim());
        // stop listening after the first match
        cornet.removeListener("element", onTitle);
    });

API

cornet(options)

The constructor. options are the same as those you can pass to
fb55/DomHandler.

It’s an EventEmitter that emits two events:

  • element is emitted whenever an element is added to the DOM.
  • dom is emitted when the DOM is complete.

cornet#select(selector | fn, cb)

Calls the callback when an element matches the selector, or when the passed
function returns true (or any truthy value) for it.

Internally, it listens for every element event and then checks whether the
selector is matched.

Returns the listening function, so you can remove it afterwards (as shown in the
example above).
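
The mechanism described above can be sketched with Node's built-in EventEmitter. MiniCornet is a hypothetical miniature written for this illustration and is not the library's actual implementation. It accepts only predicate functions, since selector compilation is left out, and the element objects are made-up plain objects.

```javascript
const { EventEmitter } = require("events");

// Hypothetical miniature of the mechanism; only predicate functions are supported.
class MiniCornet extends EventEmitter {
    select(test, cb) {
        // Check every announced element and forward the ones that match.
        const listener = (elem) => {
            if (test(elem)) cb(elem);
        };
        this.on("element", listener);
        // Hand the listener back so the caller can unsubscribe later.
        return listener;
    }
}

const mini = new MiniCornet();
const titles = [];

const onTitle = mini.select(
    (elem) => elem.name === "title",
    (elem) => {
        titles.push(elem.text);
        // Stop listening after the first match.
        mini.removeListener("element", onTitle);
    }
);

mini.emit("element", { name: "h1", text: "ignored" });
mini.emit("element", { name: "title", text: "first" });
mini.emit("element", { name: "title", text: "second" });

console.log(titles); // [ 'first' ]
```

Returning the listener rather than an opaque handle keeps the API small: unsubscribing needs nothing beyond the standard removeListener method.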

cornet#remove(selector | fn)

Removes all elements that match the selector. Also returns the listener.
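
As a rough illustration of removal during streaming, the sketch below detaches each matching element as soon as it is announced. MiniRemover, the append helper, and the plain-object elements are all hypothetical names invented for this example. How cornet itself detaches nodes is not shown in this document, so the splice-based approach here is an assumption.

```javascript
const { EventEmitter } = require("events");

// Hypothetical miniature; the real library may detach nodes differently.
class MiniRemover extends EventEmitter {
    remove(test) {
        const listener = (elem) => {
            if (!test(elem) || !elem.parent) return;
            const siblings = elem.parent.children;
            // The element was just added, so search from the end.
            const index = siblings.lastIndexOf(elem);
            if (index !== -1) siblings.splice(index, 1);
            elem.parent = null;
        };
        this.on("element", listener);
        return listener;
    }
}

const remover = new MiniRemover();
const body = { name: "body", parent: null, children: [] };

// Hypothetical helper: attach a child to body and announce it.
function append(name) {
    const elem = { name, parent: body, children: [] };
    body.children.push(elem);
    remover.emit("element", elem);
}

const onScript = remover.remove((elem) => elem.name === "script");

append("p");
append("script");
append("div");

const remaining = body.children.map((child) => child.name);
console.log(remaining); // [ 'p', 'div' ]
```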