项目作者: kislerdm

项目描述 :
Python library to convert google bigquery table schema to jsonschema
高级语言: Python
项目地址: git://github.com/kislerdm/gbqschema_converter.git
创建时间: 2020-04-07T17:16:38Z
项目社区:https://github.com/kislerdm/gbqschema_converter

开源协议:MIT License

下载


Google BigQuery Table Schema Converter

license
pyversion
coverage
test
downloads

Python library to convert Google BigQuery table schema into draft-07 json schema and vice versa.

The library includes two main modules:

  1. gbqschema_converter
  2. ├── gbqschema_to_jsonschema.py
  3. └── jsonschema_to_gbqschema.py

Each of those modules has two main functions:

  • json_representation: corresponds to json output (input for gbqschema_to_jsonschema).
  • sdk_representation: corresponds to Google Python SDK format output (input for gbqschema_to_jsonschema).

Installation

  1. python3 -m venv env && source ${PWD}/env/bin/activate
  2. (env) pip install --no-cache-dir gbqschema_converter

Usage: CLI

Convert json-schema to GBQ table schema

  1. (env) json2gbq -h
  2. usage: json2gbq [-h] (-i INPUT | -f FILE)
  3. Google BigQuery Table Schema Converter
  4. optional arguments:
  5. -h, --help show this help message and exit
  6. -i INPUT, --input INPUT
  7. Input object as string.
  8. -f FILE, --file FILE Input object as file path.

Example: stdin

Execution:

  1. (env) json2gbq -i '{
  2. "$schema": "http://json-schema.org/draft-07/schema#",
  3. "type": "array",
  4. "items": {
  5. "$ref": "#/definitions/element"
  6. },
  7. "definitions": {
  8. "element": {
  9. "type": "object",
  10. "properties": {
  11. "att_01": {
  12. "type": "integer",
  13. "description": "Att 1"
  14. },
  15. "att_02": {
  16. "type": "number",
  17. "description": "Att 2"
  18. },
  19. "att_03": {
  20. "type": "string"
  21. },
  22. "att_04": {
  23. "type": "boolean"
  24. },
  25. "att_05": {
  26. "type": "string",
  27. "format": "date"
  28. },
  29. "att_06": {
  30. "type": "string",
  31. "format": "date-time"
  32. },
  33. "att_07": {
  34. "type": "string",
  35. "format": "time"
  36. }
  37. },
  38. "required": [
  39. "att_02"
  40. ]
  41. }
  42. }
  43. }'

Output:

  1. 2020-04-08 21:42:51.700 [INFO ] [Google BigQuery Table Schema Converter] Output (5.52 ms elapsed):
  2. [
  3. {
  4. "description": "Att 1",
  5. "name": "att_01",
  6. "type": "INTEGER",
  7. "mode": "NULLABLE"
  8. },
  9. {
  10. "description": "Att 2",
  11. "name": "att_02",
  12. "type": "NUMERIC",
  13. "mode": "REQUIRED"
  14. },
  15. {
  16. "name": "att_03",
  17. "type": "STRING",
  18. "mode": "NULLABLE"
  19. },
  20. {
  21. "name": "att_04",
  22. "type": "BOOLEAN",
  23. "mode": "NULLABLE"
  24. },
  25. {
  26. "name": "att_05",
  27. "type": "DATE",
  28. "mode": "NULLABLE"
  29. },
  30. {
  31. "name": "att_06",
  32. "type": "TIMESTAMP",
  33. "mode": "NULLABLE"
  34. },
  35. {
  36. "name": "att_07",
  37. "type": "STRING",
  38. "mode": "NULLABLE"
  39. }
  40. ]

Example: file

Execution:

  1. (env) json2gbq -f ${PWD}/data/jsonschema.json

Output:

  1. 2020-04-08 21:57:25.516 [INFO ] [Google BigQuery Table Schema Converter] Output (6.39 ms elapsed):
  2. [
  3. {
  4. "description": "Att 1",
  5. "name": "att_01",
  6. "type": "INTEGER",
  7. "mode": "NULLABLE"
  8. },
  9. {
  10. "description": "Att 2",
  11. "name": "att_02",
  12. "type": "NUMERIC",
  13. "mode": "REQUIRED"
  14. },
  15. {
  16. "name": "att_03",
  17. "type": "STRING",
  18. "mode": "NULLABLE"
  19. },
  20. {
  21. "name": "att_04",
  22. "type": "BOOLEAN",
  23. "mode": "NULLABLE"
  24. },
  25. {
  26. "name": "att_05",
  27. "type": "DATE",
  28. "mode": "NULLABLE"
  29. },
  30. {
  31. "name": "att_06",
  32. "type": "TIMESTAMP",
  33. "mode": "NULLABLE"
  34. },
  35. {
  36. "name": "att_07",
  37. "type": "STRING",
  38. "mode": "NULLABLE"
  39. }
  40. ]

Convert GBQ table schema to json-schema

  1. (env) gbq2json -h
  2. usage: gbq2json [-h] (-i INPUT | -f FILE)
  3. Google BigQuery Table Schema Converter
  4. optional arguments:
  5. -h, --help show this help message and exit
  6. -i INPUT, --input INPUT
  7. Input object as string.
  8. -f FILE, --file FILE Input object as file path.

Example: stdin

Execution:

  1. (env) gbq2json -i '[
  2. {
  3. "description": "Att 1",
  4. "name": "att_01",
  5. "type": "INTEGER",
  6. "mode": "NULLABLE"
  7. },
  8. {
  9. "description": "Att 2",
  10. "name": "att_02",
  11. "type": "NUMERIC",
  12. "mode": "REQUIRED"
  13. },
  14. {
  15. "name": "att_03",
  16. "type": "STRING",
  17. "mode": "NULLABLE"
  18. },
  19. {
  20. "name": "att_04",
  21. "type": "BOOLEAN",
  22. "mode": "NULLABLE"
  23. },
  24. {
  25. "name": "att_05",
  26. "type": "DATE",
  27. "mode": "NULLABLE"
  28. },
  29. {
  30. "name": "att_06",
  31. "type": "DATETIME",
  32. "mode": "NULLABLE"
  33. },
  34. {
  35. "name": "att_07",
  36. "type": "TIMESTAMP",
  37. "mode": "NULLABLE"
  38. }
  39. ]'

Output:

  1. 2020-04-08 21:51:05.370 [INFO ] [Google BigQuery Table Schema Converter] Output (1.08 ms elapsed):
  2. {
  3. "$schema": "http://json-schema.org/draft-07/schema#",
  4. "type": "array",
  5. "items": {
  6. "$ref": "#/definitions/element"
  7. },
  8. "definitions": {
  9. "element": {
  10. "type": "object",
  11. "properties": {
  12. "att_01": {
  13. "type": "integer",
  14. "description": "Att 1"
  15. },
  16. "att_02": {
  17. "type": "number",
  18. "description": "Att 2"
  19. },
  20. "att_03": {
  21. "type": "string"
  22. },
  23. "att_04": {
  24. "type": "boolean"
  25. },
  26. "att_05": {
  27. "type": "string",
  28. "format": "date"
  29. },
  30. "att_06": {
  31. "type": "string",
  32. "pattern": "^[0-9]{4}-((|0)[1-9]|1[0-2])-((|[0-2])[1-9]|3[0-1])(|T)((|[0-1])[0-9]|2[0-3]):((|[0-5])[0-9]):((|[0-5])[0-9])(|.[0-9]{1,6})$"
  33. },
  34. "att_07": {
  35. "type": "string",
  36. "format": "date-time"
  37. }
  38. },
  39. "additionalProperties": false,
  40. "required": [
  41. "att_02"
  42. ]
  43. }
  44. }
  45. }

Example: file

Execution:

  1. (env) gbq2json -f ${PWD}/data/gbqschema.json

Output:

  1. 2020-04-08 21:55:20.275 [INFO ] [Google BigQuery Table Schema Converter] Output (1.72 ms elapsed):
  2. {
  3. "$schema": "http://json-schema.org/draft-07/schema#",
  4. "type": "array",
  5. "items": {
  6. "$ref": "#/definitions/element"
  7. },
  8. "definitions": {
  9. "element": {
  10. "type": "object",
  11. "properties": {
  12. "att_01": {
  13. "type": "integer",
  14. "description": "Att 1"
  15. },
  16. "att_02": {
  17. "type": "number",
  18. "description": "Att 2"
  19. },
  20. "att_03": {
  21. "type": "string"
  22. },
  23. "att_04": {
  24. "type": "boolean"
  25. },
  26. "att_05": {
  27. "type": "string",
  28. "format": "date"
  29. },
  30. "att_06": {
  31. "type": "string",
  32. "pattern": "^[0-9]{4}-((|0)[1-9]|1[0-2])-((|[0-2])[1-9]|3[0-1])(|T)((|[0-1])[0-9]|2[0-3]):((|[0-5])[0-9]):((|[0-5])[0-9])(|.[0-9]{1,6})$"
  33. },
  34. "att_07": {
  35. "type": "string",
  36. "format": "date-time"
  37. }
  38. },
  39. "additionalProperties": false,
  40. "required": [
  41. "att_02"
  42. ]
  43. }
  44. }
  45. }

Usage: python program

Convert json-schema to GBQ table schema

Example: output as json

  1. from gbqschema_converter.jsonschema_to_gbqschema import json_representation as converter
  2. schema_in = {
  3. "$schema": "http://json-schema.org/draft-07/schema#",
  4. "type": "array",
  5. "items": {
  6. "$ref": "#/definitions/element",
  7. },
  8. "definitions": {
  9. "element": {
  10. "type": "object",
  11. "properties": {
  12. "att_01": {
  13. "type": "integer",
  14. "description": "Att 1"
  15. },
  16. "att_02": {
  17. "type": "number",
  18. },
  19. }
  20. "required": [
  21. "att_02",
  22. ],
  23. },
  24. },
  25. }
  26. schema_out = converter(schema_in)
  27. print(schema_out)

Output:

  1. [{'description': 'Att 1', 'name': 'att_01', 'type': 'INTEGER', 'mode': 'NULLABLE'}, {'name': 'att_02', 'type': 'NUMERIC', 'mode': 'REQUIRED'}]

Example: output as list of SchemaField (SDK format)

  1. from gbqschema_converter.jsonschema_to_gbqschema import sdk_representation as converter
  2. schema_in = {
  3. "$schema": "http://json-schema.org/draft-07/schema#",
  4. "type": "array",
  5. "items": {
  6. "$ref": "#/definitions/element",
  7. },
  8. "definitions": {
  9. "element": {
  10. "type": "object",
  11. "properties": {
  12. "att_01": {
  13. "type": "integer",
  14. "description": "Att 1"
  15. },
  16. "att_02": {
  17. "type": "number",
  18. },
  19. },
  20. "required": [
  21. "att_02",
  22. ],
  23. },
  24. },
  25. }
  26. schema_out = converter(schema_in)
  27. print(schema_out)

Output:

  1. [SchemaField('att_01', 'INTEGER', 'NULLABLE', 'Att 1', ()), SchemaField('att_02', 'NUMERIC', 'REQUIRED', None, ())]

Convert GBQ table schema to json-schema

Example: output as json

  1. from gbqschema_converter.gbqschema_to_jsonschema import json_representation as converter
  2. schema_in = [
  3. {
  4. 'description': 'Att 1',
  5. 'name': 'att_01',
  6. 'type': 'INTEGER',
  7. 'mode': 'NULLABLE'
  8. },
  9. {
  10. 'name': 'att_02',
  11. 'type': 'NUMERIC',
  12. 'mode': 'REQUIRED'
  13. }
  14. ]
  15. schema_out = converter(schema_in)
  16. print(schema_out)

Output:

  1. {'$schema': 'http://json-schema.org/draft-07/schema#', 'type': 'array', 'items': {'$ref': '#/definitions/element'}, 'definitions': {'element': {'type': 'object', 'properties': {'att_01': {
  2. 'type': 'integer', 'description': 'Att 1'}, 'att_02': {'type': 'number'}}, 'additionalProperties': False, 'required': ['att_02']}}}

Example: output as list of SchemaField (SDK format)

  1. from gbqschema_converter.gbqschema_to_jsonschema import sdk_representation as converter
  2. from google.cloud.bigquery import SchemaField
  3. schema_in = [
  4. SchemaField('att_01', 'INTEGER', 'NULLABLE', 'Att 1', ()),
  5. SchemaField('att_02', 'NUMERIC', 'REQUIRED', None, ()),
  6. ]
  7. schema_out = converter(schema_in)
  8. print(schema_out)

Output:

  1. {'$schema': 'http://json-schema.org/draft-07/schema#', 'type': 'array', 'items': {'$ref': '#/definitions/element'}, 'definitions': {'element': {'type': 'object', 'properties': {'att_01': {
  2. 'type': 'integer', 'description': 'Att 1'}, 'att_02': {'type': 'number'}}, 'additionalProperties': False, 'required': ['att_02']}}}