Transform streaming HTML using CSS selectors.
"The cornet is a brass instrument very similar to the trumpet, distinguished by its conical bore, compact shape, and mellower tone quality." (Wikipedia)
This project demonstrates how to use a couple of my libraries to replace substack/node-trumpet in just a couple of lines of code.
Even better, there are some advantages over trumpet:
- CSS selectors are matched by fb55/css-select.
- cornet works as a handler for fb55/htmlparser2, which takes the place of the sax module used by trumpet.
- Combined with the cheeriojs/cheerio module, you can work with matched elements through a jQuery-like API.
Please note that callbacks are fired as soon as an element has been retrieved. That means no content past the element is available yet, so cheerio won’t find anything there. Also, because the element is at that moment the last child of its parent, selectors like :nth-last-child won’t work as expected.
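The timing caveat above can be illustrated with a small sketch. It does not use cornet itself: `TinyDomBuilder` is a hypothetical stand-in for a streaming DOM builder, and it assumes the `element` event fires once an element's closing tag has been parsed. Under that assumption, every `li` looks like the last child of its parent at the moment its callback runs, which is why position-from-the-end selectors are unreliable.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical stand-in for a streaming DOM builder; not cornet's real code.
class TinyDomBuilder extends EventEmitter {
    constructor() {
        super();
        this.root = { name: "root", children: [], parent: null };
        this.stack = [this.root];
    }

    // Called when an opening tag is parsed: attach the element to its parent.
    open(name) {
        const parent = this.stack[this.stack.length - 1];
        const elem = { name, children: [], parent };
        parent.children.push(elem);
        this.stack.push(elem);
    }

    // Called when a closing tag is parsed (assumption: this is when
    // the "element" event is emitted).
    close() {
        const elem = this.stack.pop();
        this.emit("element", elem);
    }
}

const builder = new TinyDomBuilder();
const seenAsLast = [];

builder.on("element", (elem) => {
    if (elem.name !== "li") return;
    const siblings = elem.parent.children;
    // Later siblings have not been parsed yet, so this is always true.
    seenAsLast.push(siblings[siblings.length - 1] === elem);
});

// Simulates parsing <ul><li></li><li></li><li></li></ul>
builder.open("ul");
for (let n = 0; n < 3; n++) {
    builder.open("li");
    builder.close();
}
builder.close();

console.log(seenAsLast); // [ true, true, true ]
```

If a callback needs later siblings or exact positions from the end, one option is to wait until the whole document has been parsed before inspecting the tree.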
npm install cornet
const Parser = require("htmlparser2").WritableStream;
const Cornet = require("cornet");
const minreq = require("minreq");
const $ = require("cheerio");

const cornet = new Cornet();

minreq.get("http://github.com/fb55").pipe(new Parser(cornet));

cornet.remove("script"); // remove all scripts

// show all repos
cornet.select(".repo_list", function (elem) {
    $(elem)
        .find("h3")
        .each(function (i) {
            console.log("repo %d: %s", i + 1, $(this).text().trim());
        });
});

// does the same
let i = 0; // `let`, because the counter is incremented below
cornet.select(".repo_list h3", function (elem) {
    console.log("repo %d: %s", ++i, $(elem).text().trim());
});

// sometimes, you only want to get a single element
const onTitle = cornet.select("title", function (title) {
    console.log("Page title:", $(title).text().trim());
    cornet.removeListener("element", onTitle);
});
cornet(options)
The constructor. options are the same ones you can pass to fb55/DomHandler.
It’s an EventEmitter that emits two events:
- element is emitted whenever an element is added to the DOM.
- dom is emitted when the DOM is complete.
cornet#select(selector | fn, cb)
Calls the callback when the selector is matched or the passed function returns true (or any truthy value).
Internally, it listens for every element event and then checks whether the selector matches.
Returns the listening function, so you can remove it afterwards (as shown in the example above).
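The listener pattern described here can be sketched with nothing but Node's built-in `EventEmitter`. This is a simplified illustration, not cornet's source: `MiniCornet` is a hypothetical class, a plain predicate function stands in for a compiled CSS selector, and the elements are hand-made objects rather than parsed DOM nodes.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical miniature of the select() pattern; a predicate function
// stands in for a compiled CSS selector.
class MiniCornet extends EventEmitter {
    select(test, cb) {
        const listener = (elem) => {
            if (test(elem)) cb(elem);
        };
        this.on("element", listener);
        return listener; // returned so the caller can unsubscribe later
    }
}

const mini = new MiniCornet();
const titles = [];

const onTitle = mini.select(
    (elem) => elem.name === "title",
    (elem) => {
        titles.push(elem.text);
        // Only the first match is wanted, so stop listening.
        mini.removeListener("element", onTitle);
    }
);

// Simulated events, in the order a parser might emit them.
mini.emit("element", { name: "title", text: "first" });
mini.emit("element", { name: "p", text: "ignored" });
mini.emit("element", { name: "title", text: "second" });
mini.emit("dom", []); // the DOM is complete

console.log(titles); // [ 'first' ]
```

Returning the listener rather than an opaque handle keeps the unsubscribe step a plain `removeListener("element", listener)` call, so no extra bookkeeping is needed.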
cornet#remove(selector | fn)
Removes all elements that match the selector. Also returns the listener.
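As a rough sketch of how removal can sit on top of the same `element` event, the snippet below detaches every matching node from its parent as soon as it is announced. `MiniRemover`, `addChild`, and the node shape (`name`, `parent`, `children`) are assumptions made for illustration; cornet's actual implementation may differ.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical sketch of a remove() built on "element" events.
class MiniRemover extends EventEmitter {
    remove(test) {
        const listener = (elem) => {
            if (!test(elem) || !elem.parent) return;
            const siblings = elem.parent.children;
            const index = siblings.indexOf(elem);
            if (index !== -1) siblings.splice(index, 1); // detach from parent
            elem.parent = null;
        };
        this.on("element", listener);
        return listener; // same contract as select(): the listener is returned
    }
}

// Hand-built tree standing in for a DOM that is still being parsed.
const body = { name: "body", parent: null, children: [] };
const addChild = (name) => {
    const elem = { name, parent: body, children: [] };
    body.children.push(elem);
    return elem;
};

const remover = new MiniRemover();
const onScript = remover.remove((elem) => elem.name === "script");

for (const name of ["p", "script", "div", "script"]) {
    remover.emit("element", addChild(name));
}

// The returned listener can be detached once removal is no longer wanted.
remover.removeListener("element", onScript);

const remaining = body.children.map((child) => child.name);
console.log(remaining); // [ 'p', 'div' ]
```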