Author: sfischer13

Description:
:hamster: Collection of handy text manipulation tools
Language: Go
Repository: git://github.com/sfischer13/datautils.git
Created: 2017-12-03T23:57:31Z
Community: https://github.com/sfischer13/datautils

License: MIT License




datautils



The best toolbox for processing textual data.






Introduction

The Data Utilities are a collection of handy text manipulation tools. These tools are supposed to make a data wrangler’s life on the command-line easier.

Much of the functionality can be reproduced with standard command-line tools (awk, sed, cut, sort, uniq, …), but doing so often becomes tedious. Zealots of the Unix philosophy will probably not use these tools and will create a set of sophisticated aliases instead.

On the other hand, some of the tools fix actual problems. The tools use UTF-8 by default, so one does not have to deal with the quirks of sort and uniq with respect to non-ASCII input.

Installation

    go get -v github.com/sfischer13/datautils/...

Tools

These tools are part of the collection:

  • count
  • norm
  • rows
  • text
  • trim

Usage

count

    $ echo "a\na\na\nb\nb\nc"
    a
    a
    a
    b
    b
    c

    $ echo "a\na\na\nb\nb\nc" | count --keys
    3 a
    2 b
    1 c

    $ echo "a\na\na\nb\nb\nc" | count --counts
    1 c
    2 b
    3 a

    $ echo "a\na\na\nb\nb\nc" | count --flip
    a 3
    b 2
    c 1

    $ echo "a\na\na\nb\nb\nc" | count --threshold 2
    3 a
    2 b
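
For readers curious about what such a tally involves, here is a minimal Go sketch. It is not taken from the datautils source; the function names `countLines` and `sortedKeys` and the `threshold` constant are hypothetical and exist only for illustration. It assumes that a threshold of 2 keeps every line seen at least twice, which is what the last example above suggests.

```go
package main

import (
	"fmt"
	"sort"
)

// countLines tallies how often each distinct line occurs.
// Hypothetical helper, not part of datautils.
func countLines(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		counts[line]++
	}
	return counts
}

// sortedKeys returns the distinct lines in ascending byte order. For valid
// UTF-8 this is also Unicode code point order, independent of any locale.
func sortedKeys(counts map[string]int) []string {
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Same sample input as the usage examples.
	counts := countLines([]string{"a", "a", "a", "b", "b", "c"})

	// Assumed meaning of a threshold of 2: keep lines seen at least twice.
	const threshold = 2
	for _, k := range sortedKeys(counts) {
		if counts[k] >= threshold {
			fmt.Printf("%d %s\n", counts[k], k)
		}
	}
}
```

Run as written, the sketch prints `3 a` and `2 b`, matching the threshold example.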

norm

    $ echo "¹²³" | norm --nfc
    ¹²³

    $ echo "¹²³" | norm --nfkc
    123

rows

    $ echo "a\nb\nc\nd\ne" | rows --rows 2:4
    b
    c
    d

    $ echo "a\nb\nc\nd\ne" | rows --rows 1,5
    a
    e

text

    $ echo abca | text chars
    a
    b
    c
    a

    $ echo "This is a test." | text words
    This
    is
    a
    test.
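
A rough Go sketch of the two splitting modes, written for illustration only and not copied from datautils. It assumes that `chars` splits on Unicode code points (the tool may instead use grapheme clusters) and that `words` splits on white space, which would explain why the period stays attached to `test.` in the example above. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// chars splits s into one string per Unicode code point, not per byte,
// so multi-byte UTF-8 characters stay intact.
func chars(s string) []string {
	out := make([]string, 0, len(s))
	for _, r := range s {
		out = append(out, string(r))
	}
	return out
}

// words splits s on runs of Unicode white space; punctuation is left
// attached to the neighbouring word.
func words(s string) []string {
	return strings.Fields(s)
}

func main() {
	for _, c := range chars("abca") {
		fmt.Println(c)
	}
	for _, w := range words("This is a test.") {
		fmt.Println(w)
	}
}
```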

trim

    $ echo " abc" | trim --left
    abc
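
Left trimming as in the example can be approximated in Go with the standard library. This is a sketch under the assumption that any Unicode white space counts as trimmable; the helper name `trimLeft` is hypothetical and the actual tool may define white space differently.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimLeft removes leading Unicode white space and leaves the rest of the
// line, including trailing white space, untouched.
func trimLeft(s string) string {
	return strings.TrimLeftFunc(s, unicode.IsSpace)
}

func main() {
	// %q makes any remaining white space visible in the output.
	fmt.Printf("%q\n", trimLeft(" abc"))
}
```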

Credits

This project is authored and maintained by Stefan Fischer.
The source code is available under the MIT License.
See LICENSE for further details.