I am working on JSON Schema validation, and my code is below. I need to add a schema check for the "pii" element.

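Its values may only be column names defined under "columns" (field1, field2, or field3), in any combination. The column names and the number of columns are dynamic and differ for every JSON document. The sample below has 3 column names; the next one may have more.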

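  1. How do I add a schema check for pii?
  2. Is there a better way to validate that the column names (field1, field2, field3), column_description, and column_datatype actually contain a value?
  3. Can I restrict column_datatype to a fixed set of data type values, ['integer', 'string', 'object']?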

Thanks for your time!
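For points 2 and 3, one possible direction is sketched here, assuming Draft 7 keywords: `minLength` to reject empty strings, `enum` to restrict the data type, and an anchored pattern plus `additionalProperties: False` so that empty or unexpected column names fail. The variable names (`column_schema`, `columns_schema`) and the anchored pattern `^[A-Za-z0-9]+$` are my own assumptions, not part of the code in this question, and the pattern would need widening if column names can contain other characters such as underscores.

```python
from jsonschema import Draft7Validator

# Hypothetical tightened version of the "columnschema" definition.
column_schema = {
    "type": "object",
    "properties": {
        # minLength rejects the empty string, so a description must be present.
        "column_description": {"type": "string", "minLength": 1},
        # enum limits the data type to a fixed set of values.
        "column_datatype": {"enum": ["integer", "string", "object"]},
    },
    "required": ["column_description", "column_datatype"],
}

# Hypothetical schema for the "columns" object itself.
columns_schema = {
    "type": "object",
    "minProperties": 1,  # at least one column must be defined
    # Anchored pattern: the whole name must be alphanumeric and non-empty.
    "patternProperties": {"^[A-Za-z0-9]+$": column_schema},
    # Any name that does not match the pattern above is rejected.
    "additionalProperties": False,
}

validator = Draft7Validator(columns_schema)

good = {
    "field1": {
        "column_description": "Description for field1",
        "column_datatype": "string",
    }
}
bad_type = {"field1": {"column_description": "x", "column_datatype": "float"}}
empty_description = {"field1": {"column_description": "", "column_datatype": "string"}}
empty_name = {"": {"column_description": "x", "column_datatype": "string"}}

for label, candidate in [
    ("good", good),
    ("bad_type", bad_type),
    ("empty_description", empty_description),
    ("empty_name", empty_name),
]:
    print(label, validator.is_valid(candidate))
```

Only the `good` document should pass here; the other three each violate one of the added constraints.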

import json
import jsonschema

sample_data = {
    "title": "aaaa.bbbb.cccc",
    "description": "customer accounts sample",
    "owner_email": "abc.def@xyz.com",
    "output_type": "parquet",
    "version": "V1.0",
    "columns":
    {   "field1":
        {   "column_description": "Description for field1",
            "column_datatype": "string"
        },
        "field2":
        {   "column_description": "Description for field2",
            "column_datatype": "string"
        },
        "field3":
        {   "column_description": "Description for field3",
            "column_datatype": "string"
        }
    },
    "pii":["field1","field2"]
}

FileSchema = {
    "definitions":
    {
        "columnschema":
        {
            "properties":
            {
                "column_description": { "type" : "string" },
                "column_datatype": { "type": "string" }
            },
            "required": ["column_description", "column_datatype"]
        }
    },

    "type": "object",
    "required": [ "title", "description", "owner_email", "output_type", "version","columns" ],
    "properties": 
    {   "title": {"type":"string"},
        "description": {"type":"string"},
        "owner_email": {"type":"string"},
        "output_type": {"type":"string"},
        "version": {"type":"string"},
        "columns":
        {
            "type": "object",
            "patternProperties": {
                "[A-Za-z0-9]": { "$ref": "#/definitions/columnschema" }
            }
        }
    }
}

jsonschema.validate(instance=sample_data, schema=FileSchema)
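
For point 1, as far as I can tell standard JSON Schema has no keyword that lets one part of the instance constrain another, so a static schema cannot say "pii items must be keys of columns". One workaround, sketched below under that assumption, is to build the "pii" rule at validation time from the column names found in each document. The helper names `validate_with_pii` and `is_valid`, and the reduced `base_schema` standing in for FileSchema, are hypothetical.

```python
import copy

import jsonschema
from jsonschema import ValidationError


def validate_with_pii(instance, base_schema):
    """Validate instance against base_schema plus a "pii" rule built from its own columns."""
    schema = copy.deepcopy(base_schema)  # leave the caller's schema untouched
    columns = instance.get("columns") if isinstance(instance, dict) else None
    column_names = list(columns) if isinstance(columns, dict) else []

    pii_schema = {"type": "array", "uniqueItems": True}
    if column_names:
        # Each pii entry must be one of this document's column names.
        pii_schema["items"] = {"enum": column_names}
    else:
        # No columns defined, so pii may only be an empty list.
        pii_schema["maxItems"] = 0
    schema.setdefault("properties", {})["pii"] = pii_schema

    jsonschema.validate(instance=instance, schema=schema)


def is_valid(instance, base_schema):
    """Small convenience wrapper that returns a bool instead of raising."""
    try:
        validate_with_pii(instance, base_schema)
        return True
    except ValidationError:
        return False


# Reduced stand-in for FileSchema, kept short for the example.
base_schema = {
    "type": "object",
    "required": ["columns"],
    "properties": {"columns": {"type": "object"}},
}

columns = {"field1": {}, "field2": {}, "field3": {}}
ok_doc = {"columns": columns, "pii": ["field1", "field2"]}
unknown_doc = {"columns": columns, "pii": ["field1", "field9"]}
duplicate_doc = {"columns": columns, "pii": ["field1", "field1"]}

print(is_valid(ok_doc, base_schema))
print(is_valid(unknown_doc, base_schema))
print(is_valid(duplicate_doc, base_schema))
```

The trade-off is that the schema is no longer a single static document; the rule for "pii" is regenerated for every JSON file, which fits the case where column names differ per file.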
