Project author: spencertipping

Project description:
Say "ni" to data of any size
Primary language: Perl
Project URL: git://github.com/spencertipping/ni.git
Created: 2015-02-22T20:56:20Z
Project community: https://github.com/spencertipping/ni

License: Other


ni


Installing ni

    curl -sSL https://tipping.haus/install-ni | bash
    ni --upgrade            # update from master (stable mode)
    ni --upgrade develop    # update from develop (fun mode)

ni has no dependencies except for your system’s perl; the above installation
command just drops it into ~/bin/ and adds a path extension to ~/.profile if
you don’t have one yet.

Once ni is installed, you can run ni --upgrade to keep it up to date.

You only need to install ni on the machine you’re using. ni will
nondestructively install itself on machines you point it at, e.g. by using ssh
or Hadoop to move sections of pipelines.

What is ni?

ni is a way to write data processing pipelines in bash. It prioritizes
brevity, low latency, high throughput, portability, and ease of iteration.
Here’s an example workflow to look at attempted SSH logins in
/var/log/auth.log:

[Image: ni basics]

[Image: ni explain]

Documentation

Running ni without options will print a usage summary covering the most common
options (also included at the bottom of this README).

ni --inspect provides interactive documentation and a literate source
explorer.

Ni By Example

An excellent guide by Michael Bilow; its chapters are also listed in the usage
summary as //help/ex1 through //help/ex6.


ni license

MIT license

Copyright (c) 2016-2021 Spencer Tipping

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the “Software”), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Contributors

ni //help/usage

USAGE
  ni [commands...]             Run a data pipeline
  ni --explain [commands...]   Explain a data pipeline
  ni --inspect                 Interactive documentation and literate source
  ni --js                      Interactive 3D visualization
  ni --upgrade                 Upgrade to latest version
  ni --version

  This documentation is not exhaustive; see 'ni --inspect' for everything.

  ni outputs progress while it's running; see ni //help/monitor for details,
  or export NI_MONITOR=no to disable.

  ADVANCED
    ni --upgrade develop       Specify branch for upgrade (default is master)
                               Note that this may downgrade ni; this allows
                               you to use --upgrade to switch branches.

SYNTAX (ni //help/stream)
  ni chains commands (operators) with shell pipes, which means these two
  commands are equivalent:

    $ ni //help r5
    $ ni //help | ni r5

  Operators that write data usually append to existing streams:

    $ ni //help //help
    $ ni n10 //help

  You can omit whitespace and brackets unless the omission makes the pipeline
  ambiguous:

    $ ni //help FW g c O r5
    $ ni //help FWgcOr5

  Numbers can be written several ways:

    $ ni n100
    $ ni nE2
    $ ni n='10 * 10'

DOCUMENTATION (ni //help)
  //help/ni_by_example_1 or //help/ex1
  //help/ni_by_example_2 or //help/ex2
  ...
  //help/ni_by_example_6 or //help/ex6
  //help/ni_fu
  //help/cookbook

  ni --inspect          Webserver with interactive docs and source explorer

  ADVANCED
    //ni/keys r/^doc/   All documentation pages
    //ni/doc/net.md     Open a documentation page by addressing ni's state
    //help/net          Shorthand to open doc/net.md
    //ni                Output ni's source code
    //ni/keys           Output ni's internal data state

    $ ni //ni/keys r/^doc/    List all builtin documentation pages

    Note that //ni and //ni/keys will differ if you bind data closures or
    runtime libraries.

    --explain [commands...]       Explain a data pipeline before meta-expansion
    --explain-meta [commands...]  Explain a data pipeline after meta-expansion

GENERATING DATA (ni //help/stream)
  filename              Write the contents of a file, decompressing if necessary
  dirname               Write a list of directory contents
  http[s]://url         Write HTTP GET stream using curl
  file:///path          Write contents of a file, decompressing if necessary
  i'text'               Write 'text' as output
  i[x y z]              Write 'x y z' as a tab-delimited output line
  k'text'               Write 'text' forever
  k[x y z]              Write 'x y z' as tab-delimited output forever
  n100                  Write 1..100 to output, each on its own line
  n='3 + 4 * 5'         Write 1..23
  n0500                 Write 0..499
  n                     Write 1.. as an infinite list of integers
  n0                    Write 0.. as an infinite list of integers
  n%7                   Write 0..6,0..6,... as an infinite list of integers
  _                     Vertically align line contents within 1024-row groups

  It's common to end a stream with '_', which vertically aligns multi-column
  data.

  ADVANCED
    fileseek://64:foo   Contents of file 'foo' beginning at byte 64
    filepart://5:2:foo  Two bytes of file 'foo' starting at byte 5
    zip://file.zip      List contents of zip archive (each is a ni URL)
    tar://file.tar      List contents of tar archive (also handles tgz etc)
    7z://file.7z        List contents of 7z archive
    git://dir           List git sub-URLs for git-managed dir
    dir:///path         List URI-form filenames in a path, unsorted
    ls:///path          List plain-text filenames in a path, unsorted (fastest)
    s3u://bucket/path   Contents of 'aws s3 cp s3://bucket/path -', unsigned
    s3://bucket/path    Contents of 'aws s3 cp s3://bucket/path -', signed
    s3r://bucket/path   Same, but with --request-payer (requires NI_DANGER_MODE)
    s3lsu://path        'aws s3 ls --recursive --no-sign-request'
    s3ls://path         'aws s3 ls --recursive'
    s3lsr://path        'aws s3 ls --recursive --request-payer' (NI_DANGER_MODE)
    sqlite://file.db    List tables in database (each is a ni URL)
    wiki://JPEG         Read English-wikipedia JPEG article as source
    enws://JPEG         Read EN Wikipedia article as Source
    simplews://JPEG     Read Simple Wikipedia article as Source
    zhws://北京市       Read ZH-language article on Beijing

COLUMNS AND FIELDS (ni //help/col)
  fA or f     Select first TSV column (columns are A..Z)
  fBA         Swap first two columns, discard others
  fBA.        Swap first two columns, followed by everything else
  fAA         Duplicate first column, discard others
  fA-D        Select first four columns
  f^D         Copy fourth column to front (== ni fDABCD.)
  x           Swap first two columns, keep others (== ni fBA.)
  xE          Swap first and fifth columns (== ni fEBCDA.)

  Columns can also be specified numerically: f#0,#1,#2 == fABC.

  F:/         Use '/' as a field delimiter; i.e. split on '/'
  F/foo/      Split on text matched by /foo/
  Fm/foo/     Scan each row for /foo/, emitting each as a field
  FC          Split on commas (== ni F:,)
  FD          Split on '/' (== ni F:/)
  FV          Split CSV, except fields that contain newlines
  FS          Split on horizontal whitespace
  FW          Split on non-word characters
  FP          Split on pipe symbols
  F^S         Join fields with space characters (inverts FS)
  F^C         Join fields with commas
  F^:%        Join fields with '%'

  ADVANCED
    w[...]                Zip each line with a line from 'ni ...', joined on the right
    W[...]                Zip each line with a line from 'ni ...', joined on the left
    W[n] or Wn            Common idiom: prepend line numbers
    W[kfoo] or Wkfoo      Prepend each line with 'foo'
    w[k[x y]] or wk[x y]  Append 'x y' to each line (as TSV)

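The f-style column selectors in this section can be pictured with a short Python sketch. This is not ni's implementation: `select_columns` is a hypothetical helper, and it assumes that a trailing '.' means "every field to the right of the highest column named", which is consistent with the equivalences listed above (x == fBA., xE == fEBCDA.). Ranges such as fA-D and numeric specs are left out.

```python
from string import ascii_uppercase


def select_columns(row, spec):
    """Reorder tab-separated fields using a spec such as 'BA' or 'BA.'."""
    fields = row.split("\t")
    keep_rest = spec.endswith(".")            # trailing '.' keeps the remainder
    indexes = [ascii_uppercase.index(c) for c in spec.rstrip(".")]
    # Columns past the end of the row become empty cells (an assumption).
    picked = [fields[i] if i < len(fields) else "" for i in indexes]
    if keep_rest:
        start = max(indexes) + 1 if indexes else 0
        picked += fields[start:]
    return "\t".join(picked)


row = "a\tb\tc\td\te"
print(select_columns(row, "BA"))      # like fBA
print(select_columns(row, "BA."))     # like fBA. (or x)
print(select_columns(row, "DABCD."))  # like f^D
```
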
ROWS (ni //help/row)
  r10             Take first 10 rows: head -n10
  rs10            Print first 10 rows but consume all input
  r+10            Take last 10 rows: tail -n10
  r-10            Drop first 10 rows
  rx10            Take every 10th row, evenly spaced
  r.1             Take 10% of rows, sampled randomly but deterministically
  r/foo/          Take rows matching the Perl regex /foo/
  r'/foo b*/'     Take rows matching /foo b*/
  rp'length > 5'  Take rows for which the Perl expression 'length > 5' is true
                  (see ni //help/perl)
  rA              Take rows for which column A is non-blank
  rpa             Take rows for which column A is non-blank and nonzero
  R^              Collapse lines by replacing \n with \r
  R^4K            Collapse lines with \r until they're at least 4KiB long
  R,              Fold lines by replacing \n with \t
  R,128           Fold lines with \t until they're at least 128 bytes long
  R=foo           Replace 'foo' with \n within collapsed stream
  R/foo.*?bar/    Emit each match of /foo.*?bar/ on its own line

  ADVANCED
    JA[...]       Outer join unsorted stream with 'ni ...' on column A value
    jB[...]       Outer join sorted stream with 'ni ...' on columns A and B
    riA[...]      Take rows whose A field is included in 'ni ...'
    rIB[...]      Take rows whose B field is excluded from 'ni ...'
    rbC[...]      Take rows whose C field is in the bloom filter generated by
                  'ni ...' (see zB operator to create bloom filters)
    r^bC[...]     Take rows whose C field is _not_ in the bloom filter
                  generated by 'ni ...'
    ry'...'       Take rows for which the Python expression '...' is true
                  (see ni //help/python)
    rm'...'       Take rows for which the Ruby expression '...' is true
                  (see ni //help/ruby)
    rl'...'       Take rows for which the Lisp expression '...' is true
                  (see ni //help/lisp)
    rjs'...'      Take rows for which the NodeJS expression '...' is true

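As a rough mental model of the simplest row operators (r10, r+10, r-10, r/foo/), here is a Python sketch. The function names are hypothetical and nothing here reflects how ni implements these operators; Python's `re` syntax is also only an approximation of Perl regexes.

```python
import re
from collections import deque
from itertools import islice


def take_first(rows, n):
    """Like r10: keep the first n rows (head -n)."""
    return list(islice(rows, n))


def take_last(rows, n):
    """Like r+10: keep the last n rows (tail -n) with bounded memory."""
    return list(deque(rows, maxlen=n))


def drop_first(rows, n):
    """Like r-10: skip the first n rows."""
    return list(islice(rows, n, None))


def matching(rows, pattern):
    """Like r/foo/: keep rows in which the pattern matches anywhere."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.search(row)]


rows = [str(i) for i in range(1, 21)]
print(take_first(rows, 3), take_last(rows, 2), drop_first(rows, 18))
```
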
CELLS (ni //help/cell)
  ,Cd       Clean cells in column A as integer (remove [^-0-9])
  ,CfD      Clean cells in column D as float
  ,Cw       Clean non-word characters
  ,Cx       Clean non-hex characters
  ,s        Calculate running sum of cells in column A
  ,sAC      Calculate running sums of columns A and C
  ,d        Calculate delta of first column
  ,0        Column -= first value (offset to set first value to zero)
  ,a        Calculate running average of first column
  ,aw5      Calculate running 5-windowed average of first column
  ,q        Quantize cell values (round to nearest integer)
  ,qAB.05   Quantize cells in columns A and B to nearest 0.05
  ,l        Log-transform values in column A
  ,lC2      Calculate base-2 log of values in column C
  ,e        Exp-transform values in column A
  ,eB2      Calculate 2^x for values in column B
  ,L        Log transformation for signed data
  ,agA      Group rows with the same A-column, then calculate the
            average B-value for each group (output is A, mean(B))
  ,sgC      Group rows with the same A, B, and C columns, then calculate
            the sum of D-values for each group
            (output is A, B, C, sum(D))
  ,qgB4     Calculate quantiles for C values within each (A,B) group;
            "4" means you'll have min, 25%, 50%, 75%, max -- i.e.
            quartiles with bounds
  ,z        Dictionary-compress each distinct cell value to an integer
  ,Z        Count changes in the cell value
  ,h        Murmurhash each cell value
  ,H        Murmurhash each cell value, adjusted to unit interval
  ,m        MD5 each cell value
  ,t        Convert UNIX epochs to ISO-formatted timestamps
  ,g        Encode geohashes from "lat,lng" strings
  ,g5       Encode geohashes at precision 5
  ,G        Decode geohashes into "lat,lng" strings

  Consecutive cell operators can share the initial comma: ,ls == ,l,s

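The running-aggregate cell operators (,s ,a ,aw5) can be sketched in Python as follows. The helpers are hypothetical illustrations over a single column of numbers; in particular, how ,aw5 treats the first few rows (before a full window exists) is an assumption here, not something the usage text specifies.

```python
from collections import deque
from itertools import accumulate


def running_sum(values):
    """Like ,s: each output is the sum of all values seen so far."""
    return list(accumulate(values))


def running_mean(values):
    """Like ,a: each output is the mean of all values seen so far."""
    return [total / count
            for count, total in enumerate(accumulate(values), start=1)]


def windowed_mean(values, size):
    """Like ,aw5 with size=5: mean of the most recent `size` values.

    Assumption: early rows average over however many values exist so far.
    """
    window, out = deque(maxlen=size), []
    for value in values:
        window.append(value)
        out.append(sum(window) / len(window))
    return out


print(running_sum([1, 2, 3, 4]))
print(running_mean([2, 4, 6]))
print(windowed_mean([1, 2, 3, 4], 2))
```
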
IO AND COMPRESSION (ni //help/stream)
  \>foo     Write stream to file, then output the filename when done
  \>        Identical to \>foo, but generates a tempfile name for you
  \<        Read stream from filename (inverts \>)
  :foo      Save data checkpoint in file 'foo', reusing it if it exists
  z4        Compress with LZ4 (with lz4)
  zo        Compress with LZO (with lzop)
  z or zg   Compress as gzip
  zb        Compress with bzip2
  zx        Compress with xz
  zz        Compress with zstd
  zn        Redirect to /dev/null (lossy compression)
  zd        Decompress stream contents, autodetecting type (includes
            bare zlib/deflate streams)
  z9        Compress with gzip -9
  zz19      Compress with zstd -19
  z42       Compress with lz4 -2

  ADVANCED
    zB42      Produce a bloom filter for 10000 items with 0.01 FP rate
    \|        Lazily write stream to FIFO, output fifo name immediately
    \<#       Like \<, but unlink after reading (requires NI_DANGER_MODE)
    W\<       Like \<, but prepend the input filename to each line of data
    W\>       Write each line to filename in column A (column-A values
              should be contiguous)
    W\>[...]  Write each line to filename in column A, preprocessing each
              stream with 'ni ...' (column-A values should be contiguous)
    S\>[...]  Like W\>, but keeps files open so column A doesn't have to
              be contiguous (if you have too many distinct values, this
              will fail with "too many open files")
    \*[...]   Like S\>, but keep outputs instead of writing files -- note
              that lines are not merged (for structured merging, use the
              S[...] operator in //help/scale)

SORTING AND COUNTING (ni //help/row)
  g         Sort the whole stream using constant memory
  o         Like 'g', but numeric sort
  O         Descending numeric sort
  c         Count and merge adjacent identical rows (uniq -c)
  u         uniq
  U         uniq -c across all rows, done in memory (output rows are
            randomly shuffled)
  wcl       Shorthand for wc -l
  gBA       Sort using columns B and A to determine order
  gBnA-     Sort using B numeric, A reverse lexical to determine order
  gCg       Sort on C numerically, parsing scientific notation (not all
            'sort' programs support this)
  ggAB      Group rows with the same A-column value, and sort each group
            by value in column B
  ggABnCn-  Sort A-groups ordered by B numeric and C descending numeric

  ADVANCED
    g_100   Split input into 100-line chunks, sort each individually
    gM      Use 'sort -m' to merge sorted streams, each specified as a
            filename
    gMB-    Merge sorted streams on column B descending (streams must be
            sorted this way too)

    $ ni //ni/conf r/^row/ _
            Show current sort parameters, including compression,
            parallelism, and buffer size (used only with GNU coreutils
            sort)

    $ export NI_ROW_SORT_BUFFER=64M
            Set maximum allowed memory for a sort operation: anything
            larger will be compressed and mergesorted on disk, which
            creates IO overhead

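The difference between c (adjacent rows only, like uniq -c) and U (all rows, in memory) is easy to miss, so here is a small Python sketch of the two behaviours. The helper names are hypothetical, and placing the count before the row simply mirrors uniq -c; it is not taken from ni's source.

```python
from collections import Counter
from itertools import groupby


def count_adjacent(rows):
    """Like c: merge runs of identical neighbouring rows into (count, row)."""
    return [(len(list(run)), row) for row, run in groupby(rows)]


def count_all(rows):
    """Like U: count every distinct row regardless of position."""
    return Counter(rows)


rows = ["a", "a", "b", "a"]
print(count_adjacent(rows))          # the trailing "a" forms its own run
print(count_adjacent(sorted(rows)))  # sorting first (like g c) merges all runs
print(count_all(rows))
```

This is why pipelines in this document usually sort (g) before counting (c).
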
DATA CLOSURES (ni //help/closure)
  ::foo[...]  Store the output of 'ni ...' into "foo" memory dataclosure
  :@foo[...]  Store the output of 'ni ...' into "foo" file dataclosure
  //:foo      Output contents of "foo" memory dataclosure
  //@foo      Output contents of "foo" file dataclosure

  Data closures move across SSH and Hadoop boundaries, although you may
  overflow memory if you store large values. Data closure contents are also
  accessible to stream transformation code, e.g. p'foo()'.

  Example:

    $ ni ::words[ /usr/share/dict/words ] //help FW Z1 riA[ //:words ] g c O
    $ ni ::words /usr/share/dict/words //help FWZ1riA//:words gcO

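Read step by step, the example above splits the help text into words (FW Z1), keeps only words present in the ::words closure (riA), then sorts, counts, and orders by descending count (g c O). A loose Python analogue is sketched below; `dictionary_word_counts` is a hypothetical name, the closure is modelled as a plain set, and dropping empty fields is an assumption about FW.

```python
import re
from collections import Counter


def dictionary_word_counts(text, dictionary):
    """Count occurrences of dictionary words in text, most frequent first."""
    words = [w for w in re.split(r"\W+", text) if w]   # roughly FW Z1
    known = [w for w in words if w in dictionary]      # roughly riA[ //:words ]
    return Counter(known).most_common()                # roughly g c O


sample = "the cat and the hat, the end"
print(dictionary_word_counts(sample, {"the", "cat", "end"}))
```
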
STREAM TRANSFORMATION (ni //help/stream)
  +[...]          Append lines from 'ni ...'
  ^[...]          Prepend lines from 'ni ...'
  %[...]          Interleave lines with 'ni ...', as data is available
  %4[...]         Interleave four lines for every one from 'ni ...'
  %-4[...]        Interleave one line for every four from 'ni ...'
  =[...]          Duplicate stream into 'ni ... > /dev/null'
  :               Copy data verbatim
  e'...'          Filter stream with '...' shell command
  e[grep -v foo]  Filter stream with 'grep -v foo' shell command
  S4[...]         Run four copies of pipeline section '...', shard data across
                  them, and merge outputs -- note that this reorders your data

  ADVANCED
    p'...'        Run Perl code '...' on each input line (see //help/perl)
    pR[...]       Preload Perl code generated by 'ni ...' (see //help/perl)
    y'...'        Run Python code '...' on each input line (see //help/python)
    yR[...]       Preload Python code generated by 'ni ...'
                  (see //help/python)
    yI'...'       Identical to i'...', but indents Python code correctly when
                  used for multiline strings
    m'...'        Run Ruby code '...' on each input line (see //help/ruby)
    l'...'        Run Lisp code '...' on each input line (see //help/lisp)
    js'...'       Run NodeJS code '...' on each input line (documentation TBD)
    c99'...'      Compile '...' as C99 and run the result on entire stream
                  (see //help/c)
    c++'...'      Compile '...' as C++ and run the result on entire stream
                  (requires 'c++' compiler; see //help/c)
    Bd64M         Copy stream through a 64M disk-backed FIFO
    shost[...]    Run pipeline section '...' on 'host' via SSH (ni will
                  self-install in memory, and data closures are forwarded)

  See also //help/binary to parse non-text streams.

FUNCTIONS AND LET-BINDINGS (ni //help/fn)
  f[%x %y : ...]        For each line of input, bind %x and %y as TSV and
                        run 'ni ...' with values substituted for %x and %y
  l[%x=5 %y=10 : ...]   Replace %x with 5 and %y with 10 in 'ni ...'
  fx8[%x : ...]         Use 'xargs -P8' to run 8 parallel 'ni ...'
                        processes, each a single f[] from the input

  EXAMPLES
    f[%f : i%f \<wcl]     Filenames -> line counts
    fx8[%f : i%f \<wcl]   Same, but read 8x at a time

  Note that you can call your arguments pretty much whatever you want to. The
  % prefix is optional; it just prevents your args from colliding with ni
  operators.

  Also note that ni parses the function body only once. This means your
  function arguments need to occur in positions where they don't change how
  your function is parsed; for example '%f' as a filename needs to be written
  as 'i%f \<' rather than directly.

    ni --explain        -> explain without let-expansion
    ni --explain-meta   -> explain after let-expansion (and other expansions)

  Finally, note that because of xargs, fxN[] is subject to write corruption
  for large outputs. It's safer to have fxN[] output filenames and read them
  with \< or \<# -- for example, 'fx4[%f : ... \>] \<#'.

PERL STREAM CODE (ni //help/perl)
  Used both by p'...' and rp'...'.

  Perl stream processors run in a loop that invokes your code once per input
  line. You can use BEGIN and END blocks for cross-row state, or use multiline
  functions to read blocks of lines.

  FUNCTIONS
    a() .. l()            Values in columns A-L on current row
    F_(@indexes)          Values in indexed columns, or all if @indexes == ()
    $_                    Current line, with trailing newline
    FR($i)                All fields inclusive-rightwards from column $i
    FT($i)                All fields exclusive-until column $i
    FM()                  Index of rightmost column on this line
    r(@values)            Write an output row of TSV @values, return ()

  MULTILINE FUNCTIONS
    Note that these move the current-line context forward, so a() .. l()
    will reflect the last-read line -- not the one you started with. It's
    common to say 'my $x = a; ...' when reading ahead.

    reA() .. reL()        Read while Equivalent along A .. (A-L) -- returns a
                          list of lines
    a_(@ls) .. l_(@ls)    Extract one column of data from a list of lines
    ab_(@ls) .. kl_()     Extract two columns of data with values interleaved
    rw{/foo/}             Read and return list of lines that satisfy /foo/
    ru{/foo/}             Read and return list of lines until the next one
                          satisfies /foo/
    rl($n = 1)            Advance and return $n lines ahead of the current one
    pl($n)                Peek and return $n lines ahead of the current one
                          (does not update a() .. l())

  UTILITY FUNCTIONS
    Below is an incomplete list; use 'ni --inspect' and explore the
    'core/pl' library for source definitions.

    rf($filename)         Read file into string, return it
    rfl($filename)        Read file into list of lines, return them
    ri(my $var, "< $f")   Read file $f into $var
    ri(my $var, "ls |")   Read output of "ls" into $var
    wf($f, $contents)     Write string $contents into file $f
    af($f, $contents)     Append string $contents to file $f
    je($thing)            JSON-encode a value ($thing can be a ref)
    jd($str)              JSON-decode a value into a Perl scalar
    tpe($ts =~ /\d+/g)    Time pieces to epoch (YmdHMS ordering)
    tep($e)               Time epoch to pieces (YmdHMS)
    tef($e)               Time epoch to formatted
    max(@values)          Returns maximum value under numeric comparison
    min(@values)
    maxstr(@values)       Returns maximum value under string comparison
    minstr(@values)
    any($f, @xs)          True iff $f->($x) for any $x in @xs
    all($f, @xs)          True iff $f->($x) for all $x in @xs
    argmax($f, @xs)       Returns $x from @xs maximizing $f->($x)
    argmin($f, @xs)
    indmax($f, @xs)       Returns $i from 0..$#xs maximizing $f->($xs[$i])
    indmin($f, @xs)
    sum(@values)          Math utils; see core/pl/math.pm in 'ni --inspect'
    prod(@values)         for more definitions
    mean(@values)
    median(@values)
    uniq(@values)
    var(@values)          Variance
    std(@values)          sqrt(var(@values))
    clip($l, $u, @xs)     Returns @xs, but clips all values to range [$l, $u]
    linspace(a, b, n)     Returns N evenly spaced values spanning [a, b]

  EXAMPLES
    Many more examples in //help/ex2 .. //help/ex6.

    p'a + b'              Add the first two columns of data
    p'r "foo", a, 5'      For each input row, write (foo, a, 5) as output
    p'length $_'          Return the length of each input line
    p'r a + 1, FR 1'      Add 1 to column A, return all other columns
                          unmodified
    p'my @ls = rea;       Read all lines whose A-column value is the same...
      sum(b_(@ls))'       ...and print the sum of that group's B column
    p'r rw{/^a/}'         Read all lines While /^a/ matches, then output them
                          on a single row
    p'r ru{/^a/}'         Read all lines Until /^a/ matches
    p'r rw{1}'            Read all lines in the entire stream
    p'a > 5 ? r a : ()'   Write cell A for rows for which it's larger than 5
    p'r F_(4, 5)'         Write fields 4 and 5 on a row -- same as p'r e, f'
    p'F_(4, 5)'           Write fields 4 and 5 on separate lines
    pF_                   Idiom to flatten each row vertically
    ::dict[...] \         Store a stream into the ::dict data closure...
    p'^{%d = ab_ dict}    ...within a BEGIN block (^{}), parse cols A and B
                          from ::dict into a hash
      r a, $d{+a}'        ...for each row, write cell A and its hash
                          association

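The read-ahead idiom p'my @ls = rea; sum(b_(@ls))' consumes every consecutive line sharing the current column-A value and emits one sum per group. A Python sketch of the same pattern is shown below; `sum_b_by_a` is a hypothetical name, rows are modelled as (A, B) tuples, and, as with rea, only adjacent rows are grouped, so the input is assumed to be sorted or otherwise grouped by A.

```python
from itertools import groupby


def sum_b_by_a(rows):
    """For each run of rows with the same A value, sum the B values."""
    sums = []
    for _key, group in groupby(rows, key=lambda row: row[0]):  # like rea
        sums.append(sum(float(row[1]) for row in group))       # like sum(b_(@ls))
    return sums


rows = [("x", "1"), ("x", "2"), ("y", "5")]
print(sum_b_by_a(rows))
```
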
MATRIX TRANSFORMATION (ni //help/matrix)
  Y             Dense to sparse (each cell becomes row, col, val)
  X             Sparse to dense
  Z4            Reflow cells to be 4-wide on each row
  ZB            Flatten (a, b, c, d, e) into (a,b,c), (a,b,d), (a,b,e)
  Z^B           Invert ZB: collect (a,b,c), (a,b,d), (a,b,e) -> (a,b,c,d,e)
  N'x = x + 1'  Read whole stream into numpy matrix, use 'x = x + 1' as
                Python code to transform matrix, write resulting matrix to
                stream
  NA'x = abs(fft.fft(x))'
                Read groups of rows having the same column-A value; for each
                group, read into a numpy matrix, transform with specified
                code, and write to stream -- keeping the column-A prefix

  Note that you can write multiline Python code; ni will infer the correct
  indentation and adjust accordingly.

  If you're working with large binary matrices, by'' is likely to be more
  efficient than N''.

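To make the Y, X, and Z4 descriptions concrete, here is a Python sketch of the three reshaping ideas on lists of rows. The helper names are hypothetical, and the zero-based row and column indexes are an assumption rather than something stated in the usage text.

```python
def dense_to_sparse(rows):
    """Like Y: one (row, col, value) triple per cell."""
    return [(i, j, value)
            for i, row in enumerate(rows)
            for j, value in enumerate(row)]


def sparse_to_dense(cells):
    """Like X: rebuild a grid from (row, col, value) triples."""
    cells = list(cells)
    if not cells:
        return []
    height = max(i for i, _, _ in cells) + 1
    width = max(j for _, j, _ in cells) + 1
    grid = [[""] * width for _ in range(height)]   # missing cells stay blank
    for i, j, value in cells:
        grid[i][j] = value
    return grid


def reflow(rows, width):
    """Like Z4 with width=4: repack all cells into rows of `width` cells."""
    flat = [value for row in rows for value in row]
    return [flat[k:k + width] for k in range(0, len(flat), width)]


matrix = [[1, 2], [3, 4]]
print(dense_to_sparse(matrix))
print(sparse_to_dense(dense_to_sparse(matrix)))
print(reflow([[1, 2, 3], [4, 5, 6]], 4))
```
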
BINARY PACKING (ni //help/binary)
  bf<template>    Read fixed-length rows with pack() <template>
  bf^<template>   Read TSV and emit fixed-length rows with pack()
  bp'...'         Run '...' Perl code over binary data
  by'...'         Run '...' Python code over binary data

  Use 'perldoc -f pack' for a full list of template elements. Note that bf
  handles only fixed-length templates: 'n/a' won't work, for example. If you
  need to unpack variable-length records, use the 'rp' (read-packed) function
  in bp'...', which uses buffered readahead:

    $ ni n10 bf^n/a bp'r rp"n/a"'    # NB: bf^ allows n/a; bf does not

  Note that by'' doesn't preload NumPy the way N'' does; its only imports are
  "os" and "sys".

  Also note that you _must_ use sys.stdin.buffer when reading binary data; if
  you use sys.stdin.read() directly, its own buffering will cause premature
  EOF, potentially causing your code not to see the last N bytes of data.

  by'' is a work in progress.

  BINARY PERL FUNCTIONS
    bi()              Return current stream offset in bytes
    available()       True if stream is not at EOF
    rp($packstring)   Read packed values, returning a list
    rb($nbytes)       Read exactly $nbytes bytes into a string
    pb($nbytes)       Peek (but don't consume) exactly $nbytes bytes
    wp($pack, @xs)    Pack @xs using $pack template, then write binary
    ws($data)         Write $data as binary, return ()

  FORMAT-SPECIFIC FUNCTIONS
    rppm()            Read PPM binary header: ($bytes, $magic, $w, $h, $max)

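The 'n/a' template used in the example above is a length-prefixed record: a 16-bit big-endian length ('n') followed by that many bytes ('a'). Because each record's size depends on its own prefix, a fixed-length reader cannot split the stream. The Python sketch below shows the same layout with the standard `struct` module; `pack_records` and `unpack_records` are hypothetical helpers and are not part of ni.

```python
import struct


def pack_records(values):
    """Encode each value as a 2-byte big-endian length plus its bytes."""
    out = b""
    for value in values:
        data = str(value).encode()
        out += struct.pack(">H", len(data)) + data   # '>H' matches Perl 'n'
    return out


def unpack_records(blob):
    """Decode a concatenation of length-prefixed records."""
    records, offset = [], 0
    while offset < len(blob):
        (length,) = struct.unpack_from(">H", blob, offset)
        offset += 2
        records.append(blob[offset:offset + length].decode())
        offset += length
    return records


blob = pack_records(range(1, 11))
print(unpack_records(blob))
```
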
GNUPLOT
  G<col><term><cmd>   Use gnuplot to visualize data
  G:<term><cmd>       Plot data in a single group
  G:W%l               Plot one or two-column data with lines, interactively
  G*W                 Plot all columns of data, keyed by col A on the X axis

  term can be:
    X   (x11)
    Q   (Qt)
    W   (Wx)
    J   (jpeg)
    PC  (pngcairo)
    P   (png)

  cmd is a verbatim gnuplot command, with these shorthands:
    %l        plot "-" with lines
    %d        plot "-" with dots
    %i        plot "-" with impulses
    %v        plot "-" with vectors
    %t'...'   title "..."
    %u'...'   using ...

  G<col>, e.g. GA, causes gnuplot to be run multiple times -- one per group of
  rows for which column A is the same. This is useful when animating data.

  jpeg and png terminals will create image outputs on stdout, concatenated if
  gnuplot is run multiple times. ffmpeg can accept these concatenated image
  streams as input for video assembly. For example, to create an animated
  plot:

    $ ni n100,L p'r a, sin(a*$_/100) for 0..1000' GAP%l IVavi \>animated.avi

  If you're looping gnuplot with a column spec, ni sets a gnuplot variable
  called KEY that contains the current group value. You can use this by
  writing gnuplot code longhand:

    $ ni n100,L p'r a, sin(a*$_/100) for 0..1000' \
         GAP'set title "coefficient = " . KEY;
             plot "-" with lines' IVavi \>animated-title.avi

  NOTE: older versions of ffmpeg had a bug in the PNG image2pipe reader;
  version 4.2.4 (and possibly earlier) works correctly.

MEDIA
  yt://oHg5SJYRHA0      Stream video from youtube using youtube-dl
  v4l2:///dev/video0    Stream video from /dev/video0 v4l2 device
  x11grab://:0@640x480  Stream video from X11 display :0, clipped at 640x480
  m3u://https://...     Stream video from M3U playlist using ffmpeg
  VP                    Play video stream using ffplay
  VIppm                 Convert video to concatenated stream of PPM images
  VImjpeg               Convert video to concatenated stream of JPGs
  VIpng@1920x1080       Convert video and downsample to 1920x1080 resolution
  AE<mediaspec>         Use ffmpeg to discard video, emit audio as <mediaspec>
  IV<mediaspec>         Convert concatenated images to video (some older
                        ffmpegs fail if you use PNGs as input)
  I[...]                Split a stream of concatenated PNG, BMP, or PPM
                        images, transform each with 'ni ...'
  IC[init][fold][emit]  Left-fold a stream of concatenated PNG, BMP, or PPM
                        images using ImageMagick 'convert' (see below)

  <mediaspec> describes the container format, codec, and bitrate. The
  following examples are valid:

    IVavi                 AVI container format, default codec + bitrate
    IVgif                 GIF animated image
    IVmatroska/libvpx     Matroska with VPX codec, default bitrate
    AEogg/libvorbis/224k  Ogg container, vorbis audio codec, 224k bitrate

  m3u:// defaults to FLV, but you can specify the target media container, e.g.
  m3u+gif://URL. This may be required if the codec doesn't work with FLV.

  IC[][][] is a disk-intensive way to mix data between images within a
  sequence. It works like this:

    image 0 | convert $init > reduced.png
    while (more images)
      next image | convert reduced.png $fold > reduced.png

  Each time reduced.png is written, 'convert reduced.png $emit png:-' is run
  to emit a transformed version of it to stdout. This becomes the output image
  stream.

  IC's [] blocks are all 'convert' command-line argument lists. [init] can be
  written as : to specify no transformation. For example, to blur/fade:

    IC: [-blur 1x1 - -compose blend -define compose:args=100,98 -composite] \
        [-resize 1920x1080]

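The IC init/fold/emit loop described above is an ordinary left fold that emits every intermediate state. The Python sketch below shows only that control flow, with numbers standing in for images; `image_fold` and its three callback parameters are hypothetical stand-ins for the three 'convert' argument lists, and no ImageMagick call is made.

```python
def image_fold(images, init, fold, emit):
    """Yield emit(state) after the first image and after every later fold."""
    iterator = iter(images)
    try:
        first = next(iterator)
    except StopIteration:
        return                      # empty stream: nothing to emit
    reduced = init(first)           # image 0 | convert $init > reduced.png
    yield emit(reduced)
    for image in iterator:
        reduced = fold(reduced, image)   # convert reduced.png $fold
        yield emit(reduced)              # emit never alters the kept state


frames = image_fold(
    [1, 2, 3],
    init=lambda image: image * 10,
    fold=lambda state, image: state + image,
    emit=lambda state: state * 2,
)
print(list(frames))
```

Note that the emitted value is derived from the running state but does not feed back into it, which matches the description of $emit writing to stdout while reduced.png is kept as-is.
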
C99 JIT (ni //help/c)
  c99'C source'   Compile C source to executable, then pipe stream through it

  The c99'' operator will compile a C99 program immediately before using it as
  a stream filter. Because the C99 program remains on disk, your program
  should unlink itself by deleting argv[0].

  Your C program will have normal stdin/stdout/stderr IO available; there is
  no input preprocessing or line-splitting. Indentation is inferred as for
  Python.

  EXAMPLES
    ni c99'#include <stdio.h>
           #include <stdlib.h>
           #include <unistd.h>   /* declares unlink() */
           int main(int argc, char **argv)
           {
             unlink(argv[0]);
             printf("hi!\n");
             return 0;
           }'

HASKELL JIT
  hs'HS source'   Use /usr/bin/env stack to run Haskell source, then pipe
                  stream through it

  This requires Haskell Stack to be runnable with "/usr/bin/env stack". Like
  C99 JIT, the Haskell program has stdin/stdout/stderr IO. Indentation is
  inferred as for Python.

  EXAMPLES
    ni hs'#!/usr/bin/env stack
          -- stack --resolver lts-18.3 script
          main :: IO ()
          main = putStrLn "hi!"'