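I'm working on JSON Schema validation; my code is below. I need a schema check for the "pii" element.
Its values may only be column names defined under "columns" (field1, field2, or field3 here), in any combination, from none of them to all of them. The column names and the number of columns are dynamic and differ for each JSON document. The sample below has 3 column names; later documents may have more.
- How can I add a schema check for pii?
- Is there a better way to verify that the column names (field1, field2, field3), column_description, and column_datatype actually have values?
- Can I restrict column_datatype to a fixed set of values: ['integer', 'string', 'object']?
Thanks for your time!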
import json
import jsonschema
sample_data = {
    "title": "aaaa.bbbb.cccc",
    "description": "customer accounts sample",
    "owner_email": "abc.def@xyz.com",
    "output_type": "parquet",
    "version": "V1.0",
    "columns": {
        "field1": {
            "column_description": "Description for field1",
            "column_datatype": "string"
        },
        "field2": {
            "column_description": "Description for field2",
            "column_datatype": "string"
        },
        "field3": {
            "column_description": "Description for field3",
            "column_datatype": "string"
        }
    },
    "pii": ["field1", "field2"]
}
FileSchema = {
    "definitions": {
        "columnschema": {
            "properties": {
                "column_description": {"type": "string"},
                "column_datatype": {"type": "string"}
            },
            "required": ["column_description", "column_datatype"]
        }
    },
    "type": "object",
    "required": ["title", "description", "owner_email", "output_type", "version", "columns"],
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "owner_email": {"type": "string"},
        "output_type": {"type": "string"},
        "version": {"type": "string"},
        "columns": {
            "type": "object",
            "patternProperties": {
                "[A-Za-z0-9]": {"$ref": "#/definitions/columnschema"}
            }
        }
    }
}
jsonschema.validate(instance=sample_data, schema=FileSchema)
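
One possible way to approach the three questions, offered as a sketch rather than a definitive answer: standard JSON Schema has no keyword that lets one part of a document (pii) refer to the keys of another part (columns), so the example below validates the structure first and then builds the pii check in Python from the column names actually present. The names COLUMN_SCHEMA, BASE_SCHEMA, and validate_file are hypothetical, the other top-level properties from the post are left out for brevity, and the choices to anchor the column-name pattern, reject empty descriptions, and forbid duplicate pii entries are assumptions about the intended rules.

```python
import jsonschema

# Hypothetical tightened column schema: "enum" limits the datatype to a fixed
# set, and "minLength" rejects an empty description (an assumed requirement).
COLUMN_SCHEMA = {
    "type": "object",
    "properties": {
        "column_description": {"type": "string", "minLength": 1},
        "column_datatype": {"enum": ["integer", "string", "object"]},
    },
    "required": ["column_description", "column_datatype"],
    "additionalProperties": False,
}

# Structural schema; only "columns" and "pii" are shown here.
BASE_SCHEMA = {
    "type": "object",
    "required": ["columns"],
    "properties": {
        "columns": {
            "type": "object",
            "minProperties": 1,  # at least one column must be defined
            # Anchored pattern: the whole column name must be alphanumeric.
            "patternProperties": {"^[A-Za-z0-9]+$": COLUMN_SCHEMA},
            "additionalProperties": False,  # reject names that do not match
        },
        "pii": {"type": "array", "items": {"type": "string"}},
    },
}


def validate_file(data):
    """Validate the structure, then check pii against this document's columns."""
    # Pass 1: after this call, data["columns"] is known to be a non-empty dict.
    jsonschema.validate(instance=data, schema=BASE_SCHEMA)

    # Pass 2: every pii entry must be one of the column names defined above.
    pii_schema = {
        "type": "array",
        "items": {"enum": sorted(data["columns"])},
        "uniqueItems": True,  # assumption: a column is listed at most once
    }
    jsonschema.validate(instance=data.get("pii", []), schema=pii_schema)
```

Calling validate_file(sample_data) in place of the jsonschema.validate call would raise jsonschema.ValidationError if pii names a column that is not defined under columns, or if a column_datatype falls outside the allowed set. An empty pii list is accepted in this sketch.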