Project author: takumakanari

Project description:
Embulk parser plugin for xml
Language: Ruby
Repository: git://github.com/takumakanari/embulk-parser-xml.git
Created: 2015-03-14T11:52:54Z
Project page: https://github.com/takumakanari/embulk-parser-xml

License: MIT License

XML parser plugin for Embulk

Parser plugin for Embulk.

Reads the input as XML and emits each matching entry as an output record.

Overview

  • Plugin type: parser
  • Load all or nothing: yes
  • Resume supported: no

Types

  • xml: finds rows with a SAX parser.
  • xpath: finds rows with XPath, so you can select entries by more complex conditions than the xml type allows.

Configuration

XML

    parser:
      type: xml
      root: data/students/student
      schema:
        - {name: name, type: string}
        - {name: age, type: long}

  • type: specify xml to use this parser type.
  • root: path to the element that represents one entry, written in path/to/node style (required).
  • schema: the columns to extract and the data type of each (required).
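
To make the roles of root and schema concrete, here is a minimal Ruby sketch of SAX-style row extraction. It is not the plugin's code: it uses REXML's stream parser as a stand-in for whatever SAX parser the plugin uses, and `RowCollector`, `collect_rows`, `STUDENTS_XML`, and `SAX_ROWS` are hypothetical names introduced only for illustration. It also assumes that columns are read from the direct children of each entry element and that a missing child yields nil.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener for illustration only; not the plugin's class.
class RowCollector
  include REXML::StreamListener

  attr_reader :rows

  def initialize(root, schema)
    @root = root.split('/') # "data/students/student" => ["data", "students", "student"]
    @schema = schema        # column name => type ("string" or "long")
    @stack = []             # names of the elements that are currently open
    @rows = []
    @row = nil
    @text = +''
  end

  def tag_start(name, _attrs)
    @stack.push(name)
    # Entering an entry element: start a row with every column set to nil.
    @row = @schema.keys.to_h { |column| [column, nil] } if @stack == @root
    @text = +''
  end

  def text(value)
    @text << value
  end

  def tag_end(name)
    if @stack == @root
      # Leaving an entry element: the row is complete.
      @rows << @row
      @row = nil
    elsif @row && @stack.length == @root.length + 1 && @schema.key?(name)
      # Leaving a direct child of the entry that is listed in the schema.
      @row[name] = convert(@text.strip, @schema[name])
    end
    @stack.pop
  end

  private

  def convert(value, type)
    type == 'long' ? Integer(value) : value
  end
end

def collect_rows(xml, root, schema)
  listener = RowCollector.new(root, schema)
  REXML::Document.parse_stream(xml, listener)
  listener.rows
end

STUDENTS_XML = <<~XML
  <data>
    <result>true</result>
    <students>
      <student><name>John</name><age>10</age></student>
      <student><name>George</name><age>17</age></student>
    </students>
  </data>
XML

SAX_ROWS = collect_rows(STUDENTS_XML, 'data/students/student',
                        { 'name' => 'string', 'age' => 'long' })
SAX_ROWS.each { |row| p row }
```

The element stack is compared against the split root path, so only elements at exactly data/students/student open a new row; the sibling result element is ignored.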

To parse a column as a timestamp, a schema entry accepts two additional parameters:

    schema:
      - {name: timestamp_column, type: timestamp, format: "%Y-%m-%d", timezone: "+0000"}

  • format: the timestamp format used for parsing (required for timestamp columns).
  • timezone: the time zone in which the timestamp is interpreted; "+0900" is used by default.
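
The effect of timezone on a date-only format can be illustrated with plain Ruby. This is a sketch under assumptions, not the plugin's implementation: `parse_timestamp` is a hypothetical helper, it relies on Ruby's `Time.strptime` rather than Embulk's own timestamp parser, and it assumes the format string does not already contain a zone directive.

```ruby
require 'time'

# Hypothetical helper for illustration; the plugin's parsing may differ.
# Appends the configured offset to the value and parses it with "%z",
# so the wall-clock time is interpreted in that time zone.
def parse_timestamp(value, format, timezone = '+0900')
  Time.strptime("#{value} #{timezone}", "#{format} %z")
end

utc = parse_timestamp('2015-03-14', '%Y-%m-%d', '+0000')
jst = parse_timestamp('2015-03-14', '%Y-%m-%d') # falls back to "+0900"

puts utc.getutc.iso8601
puts jst.getutc.iso8601
```

The same input string maps to two different instants nine hours apart, which is why it is worth setting timezone explicitly when the source data is not in +0900.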

XPath

    parser:
      type: xpath
      root: //data/students/student
      schema:
        - {path: name, type: string, name: name}
        - {path: age, type: long, name: age}
        - {path: hobbies/hobby, type: json, name: hobbies}

  • type: specify xpath to use this parser type.
  • root: XPath expression that selects the elements to read as entries; "/" is used by default.
  • schema: the columns to extract and the data type of each (required).
  • namespaces: XML namespaces.

To parse a column as a timestamp, a schema entry accepts two additional parameters:

    schema:
      - {name: timestamp_column, type: timestamp, format: "%Y-%m-%d", timezone: "+0000"}

  • format: the timestamp format used for parsing (required for timestamp columns).
  • timezone: the time zone in which the timestamp is interpreted; "+0900" is used by default.

Here is an example XML document:

    <data>
      <result>true</result>
      <students>
        <student>
          <name>John</name>
          <age>10</age>
          <hobbies>
            <hobby>music</hobby>
            <hobby>movie</hobby>
          </hobbies>
        </student>
        <student>
          <name>Paul</name>
          <age>16</age>
          <hobbies>
            <hobby>game</hobby>
          </hobbies>
        </student>
        <student>
          <name>George</name>
          <age>17</age>
        </student>
        <student>
          <name>Ringo</name>
          <age>18</age>
        </student>
      </students>
    </data>
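
As a rough illustration of how the xpath configuration relates to a document like this, the following Ruby sketch evaluates the root expression and each column path with REXML. It is not the plugin's code: `extract_rows`, `SAMPLE_XML`, `XPATH_SCHEMA`, and `XPATH_ROWS` are hypothetical names, the sample is a shortened copy of the document above, and the handling of the json type (collecting the text of every matching node into an array) is an assumption rather than documented behavior.

```ruby
require 'rexml/document'
require 'json'

# Shortened copy of the example document (two of the four students).
SAMPLE_XML = <<~XML
  <data>
    <result>true</result>
    <students>
      <student>
        <name>John</name>
        <age>10</age>
        <hobbies>
          <hobby>music</hobby>
          <hobby>movie</hobby>
        </hobbies>
      </student>
      <student>
        <name>George</name>
        <age>17</age>
      </student>
    </students>
  </data>
XML

# Mirrors the schema entries from the xpath configuration.
XPATH_SCHEMA = [
  { path: 'name', type: 'string', name: 'name' },
  { path: 'age', type: 'long', name: 'age' },
  { path: 'hobbies/hobby', type: 'json', name: 'hobbies' }
].freeze

# Hypothetical helper: one row per node matched by root, one value per column.
def extract_rows(xml, root, schema)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, root).map do |entry|
    schema.to_h do |column|
      # Column paths are evaluated relative to the entry element.
      texts = REXML::XPath.match(entry, column[:path]).map(&:text)
      value =
        case column[:type]
        when 'long' then texts.first && Integer(texts.first)
        when 'json' then texts # assumption: every match, as an array
        else texts.first
        end
      [column[:name], value]
    end
  end
end

XPATH_ROWS = extract_rows(SAMPLE_XML, '//data/students/student', XPATH_SCHEMA)
XPATH_ROWS.each { |row| puts JSON.generate(row) }
```

In this sketch a student without a hobbies element gets an empty array for the json column; whether the plugin emits an empty array or a null there is not stated in this README.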