Author: transcriptaze

Description:
Renders an audio WAV file as a PNG image.
Language: Go
Repository: git://github.com/transcriptaze/wav2png.git
Created: 2020-01-19T23:37:51Z
Project page: https://github.com/transcriptaze/wav2png

License: MIT License


wav2png

Renders a WAV file as a PNG image, with options to draw a grid, custom colouring and anti-aliasing.

There are three implementations:

  • A command line version (in the go subdirectory), mostly intended for scripting and batch processing
  • An online version implemented by compiling this library to WASM, which can be found
    here.
  • A WebGPU version (in the webgpu directory) for a faster interactive experience,
    hosted on CloudFlare Pages.

Raison d’être

wav2png was originally created as a Go utility library to render an audio file as an anti-aliased waveform for
a WASM project. It just seemed like a good idea to add a standalone command line version, and now a WebGPU
implementation for better interactivity.




Releases

Version  Description
v1.1.0   Added wav2mp4 command line utility
v1.0.0   Initial release

Installation

Platform-specific executables can be downloaded from the releases page. Installation is straightforward:
download and extract the archive for your platform and place the executables in a directory in your PATH.

Building from source

Command line

Assuming you have Go and make installed:

  1. git clone https://github.com/transcriptaze/wav2png.git
  2. cd wav2png/go
  3. make build

If you prefer not to use make:

  1. git clone https://github.com/transcriptaze/wav2png.git
  2. cd wav2png/go
  3. mkdir bin
  4. go build -o bin ./...

WebGPU

Usage

wav2png

Command line:

  wav2png [--debug] [options] [--out <path>] <wav>

    <wav>                 WAV file to render.

    --out <path>          File path for PNG file - if <path> is a directory, the WAV file name is
                          used. Defaults to the WAV file base path.

    --debug               Displays occasionally useful diagnostic information.

  Options:

    --settings <file>     JSON file with the default settings for the height, width, etc. Defaults to
                          .settings.json if not specified, falling back to internal default values if
                          .settings.json does not exist.

    --width <pixels>      Width (in pixels) of the PNG image. Valid values are in the range 32 to 8192,
                          defaults to 645px.

    --height <pixels>     Height (in pixels) of the PNG image. Valid values are in the range 32 to 8192,
                          defaults to 395px.

    --padding <pixels>    Padding (in pixels) between the border of the PNG and the extent of the rendered
                          waveform. Valid values are -16 to +32, defaults to 2px.

    --palette <palette>   Palette used to colour the waveform. May be the name of one of the internal colour
                          palettes or a user provided PNG file. Defaults to 'ice'.

    --fill <fillspec>     Fill specification for the background colour, in the form type:colour
                          e.g. solid:#0000ffff. Currently the only fill type supported is 'solid', defaults
                          to solid:#000000ff.

    --grid <gridspec>     Grid specification for an optional rectilinear grid, in the form
                          type:colour:size:overlay, e.g.
                          - none
                          - square:#008000ff:~64
                          - rectangle:#008000ff:~64x48:overlay

                          The size may be preceded by a 'fit':
                          - ~  approximate
                          - =  exact
                          - ≥  at least
                          - ≤  at most
                          - >  greater than
                          - <  less than

                          If gridspec includes :overlay, the grid is rendered 'in front' of the waveform.
                          The default gridspec is 'square:#008000ff:~64'.

    --antialias <kernel>  The anti-aliasing kernel applied to soften the rendered PNG. Valid values are:
                          - none        no anti-aliasing
                          - horizontal  blurs horizontal edges
                          - vertical    blurs vertical edges
                          - soft        blurs both horizontal and vertical edges

                          The default kernel is 'vertical'.

    --scale <scale>       A vertical scaling factor to size the height of the rendered waveform. The valid
                          range is 0.2 to 5.0, defaults to 1.0.

    --mix <mixspec>       Specifies how to combine channels from a stereo WAV file. Valid values are:
                          - 'L'    Renders the left channel only
                          - 'R'    Renders the right channel only
                          - 'L+R'  Combines the left and right channels

                          Defaults to 'L+R'.

    --start <time>        The start time of the segment of audio to render, in Go time format (e.g. 10s or
                          1m5s). Defaults to 0s.

    --end <time>          The end time of the segment of audio to render, in Go time format (e.g. 10s or
                          1m5s). Defaults to the end of the audio.

  Example:

    wav2png --debug \
            --settings 'settings.json' \
            --height 390 \
            --width 641 \
            --padding 0 \
            --palette 'amber.png' \
            --fill 'solid:#0000ffff' \
            --grid 'rectangular:#800000ff:~32x128:overlay' \
            --antialias 'soft' \
            --scale 0.5 \
            --start 0.5s \
            --end 1.5s \
            --mix 'L+R' \
            --out example.png \
            example.wav
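
For readers unfamiliar with the 'Go time format' accepted by --start and --end, the following is a minimal sketch of how such values can be parsed with the standard library's time.ParseDuration and turned into a range of sample indices. It is an illustration only, not the wav2png implementation: sampleRange and toSamples are hypothetical helpers, and the sample rate and total sample count are assumed to have been read from the WAV header.

```go
package main

import (
	"fmt"
	"time"
)

// toSamples converts a duration to a sample offset at the given sample rate,
// using integer arithmetic to avoid floating point rounding.
func toSamples(d time.Duration, sampleRate int) int {
	return int(int64(d) * int64(sampleRate) / int64(time.Second))
}

// sampleRange is a hypothetical helper (not part of wav2png). It parses start
// and end in Go duration syntax ("10s", "1m5s", "0.5s") and returns the
// half-open sample range [first, last). An empty end means "end of the audio".
func sampleRange(start, end string, sampleRate, total int) (int, int, error) {
	from, err := time.ParseDuration(start)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid start time %q: %w", start, err)
	}

	first := toSamples(from, sampleRate)
	last := total
	if end != "" {
		to, err := time.ParseDuration(end)
		if err != nil {
			return 0, 0, fmt.Errorf("invalid end time %q: %w", end, err)
		}
		last = toSamples(to, sampleRate)
	}

	// Clamp to the audio length and reject empty or reversed ranges.
	if last > total {
		last = total
	}
	if first < 0 || first >= last {
		return 0, 0, fmt.Errorf("empty or invalid range %q to %q", start, end)
	}

	return first, last, nil
}

func main() {
	// Assumed values: 10 seconds of audio sampled at 44.1kHz.
	first, last, err := sampleRange("0.5s", "1.5s", 44100, 441000)
	if err != nil {
		fmt.Println(err)
		return
	}

	fmt.Printf("render samples %d to %d\n", first, last)
}
```

Note that time.ParseDuration requires a unit, so a bare number such as 10 is rejected while 10s, 1m5s and 0.5s are accepted.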

wav2mp4

Command line:

  wav2mp4 [--debug] [options] [--out <path>] --window <duration> --fps <frame rate> --cursor <cursorspec> <wav>

    <wav>                  WAV file to render.

    --out <path>           File path for MP4 file - if <path> is a directory, the WAV file name is
                           used and defaults to the WAV file base path. wav2mp4 generates a set of
                           frame files for ffmpeg in the 'frames' subdirectory of the out file directory.

    --window <duration>    The interval allotted to a single frame, in Go time format e.g. --window 1s.
                           The window interval must be less than the duration of the MP4.

    --fps <frame rate>     Frame rate for the MP4 in frames per second e.g. --fps 30

    --cursor <cursorspec>  Cursor to indicate the current play position. A cursor is specified by the
                           image source and dynamic:

                           --cursor <image>:<dynamic>

                           where image may be:
                           - none
                           - red   (internal 'red' cursor)
                           - blue  (internal 'blue' cursor)
                           - a PNG file with a custom cursor image

                           The cursor 'dynamic' defaults to 'sweep' if not specified, but may be one of
                           the following:
                           - sweep   Moves linearly from left to right over the duration of the MP4
                           - left    Fixed on left side
                           - right   Fixed on right side
                           - center  Fixed in center of frame
                           - ease    Migrates from the left to center of the frame, before moving to the
                                     right side to finish
                           - erf     Moves 'sigmoidally' from left to right over the duration of the MP4,
                                     with the sigmoid defined by the inverse error function

    --debug                Displays occasionally useful diagnostic information.

  Options:

    --settings <file>      JSON file with the default settings for the height, width, etc. Defaults to
                           .settings.json if not specified, falling back to internal default values if
                           .settings.json does not exist.

    --width <pixels>       Width (in pixels) of the PNG image. Valid values are in the range 32 to 8192,
                           defaults to 645px.

    --height <pixels>      Height (in pixels) of the PNG image. Valid values are in the range 32 to 8192,
                           defaults to 395px.

    --padding <pixels>     Padding (in pixels) between the border of the PNG and the extent of the
                           rendered waveform. Valid values are -16 to +32, defaults to 2px.

    --palette <palette>    Palette used to colour the waveform. May be the name of one of the internal
                           colour palettes or a user provided PNG file. Defaults to 'ice'.

    --fill <fillspec>      Fill specification for the background colour, in the form type:colour
                           e.g. solid:#0000ffff. Currently the only fill type supported is 'solid',
                           defaults to solid:#000000ff.

    --grid <gridspec>      Grid specification for an optional rectilinear grid, in the form
                           type:colour:size:overlay, e.g.
                           - none
                           - square:#008000ff:~64
                           - rectangle:#008000ff:~64x48:overlay

                           The size may be preceded by a 'fit':
                           - ~  approximate
                           - =  exact
                           - ≥  at least
                           - ≤  at most
                           - >  greater than
                           - <  less than

                           If gridspec includes :overlay, the grid is rendered 'in front' of the
                           waveform.
                           The default gridspec is 'square:#008000ff:~64'.

    --antialias <kernel>   The anti-aliasing kernel applied to soften the rendered PNG. Valid values are:
                           - none        no anti-aliasing
                           - horizontal  blurs horizontal edges
                           - vertical    blurs vertical edges
                           - soft        blurs both horizontal and vertical edges

                           The default kernel is 'vertical'.

    --scale <scale>        A vertical scaling factor to size the height of the rendered waveform. The
                           valid range is 0.2 to 5.0, defaults to 1.0.

    --mix <mixspec>        Specifies how to combine channels from a stereo WAV file. Valid values are:
                           - 'L'    Renders the left channel only
                           - 'R'    Renders the right channel only
                           - 'L+R'  Combines the left and right channels

                           Defaults to 'L+R'.

    --start <time>         The start time of the segment of audio to render, in Go time format (e.g. 10s
                           or 1m5s). Defaults to 0s.

    --end <time>           The end time of the segment of audio to render, in Go time format (e.g. 10s
                           or 1m5s). Defaults to the end of the audio.

  Example:

    wav2mp4 --debug --window 1s --fps 30 --cursor red:sweep --out ./example/chirp.mp4 ./samples/chirp.wav
    cd ./example/frames
    ffmpeg -framerate 30 -i frame-%05d.png -c:v libx264 -pix_fmt yuv420p -crf 23 -y out.mp4
    ffmpeg -i out.mp4 -i ../../samples/chirp.wav -c:v copy -c:a aac -y ../chirp.mp4