-
Hello everyone, (this may be similar to #637).
I share a small example in example.zip. In the example I have a jsonschema file: (see: instance.json)

I see in all the examples and reference docs that you always store a reference to the "data file" (.csv, .json) inside the schema/metadata. We do not have a real coupling between them: we store the schemas (which could be PackageSchemas in Frictionless-like terminology) in our database, and we want to use them to validate datasets that arrive from many places (an API endpoint in our Python backend, a browser through our JavaScript front end, etc.).

I'm not sure how to take a JSON data object (dict-like in Python) and validate it against my Frictionless PackageSchema without having to write it to a file. I'm looking for something like what we currently do with jsonschema:

    from jsonschema import Draft7Validator

    def validate_data(data):
        # get the schema from a file or a database ....
        schema = get_schema()
        # load validator with schema:
        validator = Draft7Validator(schema)
        # feed the validator some data and check for errors
        return validator.iter_errors(data)

Is this possible? Or something that's equivalent? Thanks!
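For context, the "get the schema from a file or a database" step above might look roughly like the following. This is only a minimal sketch, assuming an SQLite table with one JSON document per schema name; the table layout and the helper names (`make_schema_store`, `save_schema`, and this `get_schema` signature) are hypothetical and only illustrate that the schema lives apart from any data file.

```python
import json
import sqlite3


def make_schema_store():
    """Create an in-memory store with one JSON schema per name (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE schemas (name TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def save_schema(conn, name, schema):
    """Serialize the schema dict to JSON and insert or overwrite it under `name`."""
    conn.execute(
        "INSERT OR REPLACE INTO schemas (name, body) VALUES (?, ?)",
        (name, json.dumps(schema)),
    )
    conn.commit()


def get_schema(conn, name):
    """Return the stored schema as a plain dict, or raise KeyError if it is missing."""
    row = conn.execute(
        "SELECT body FROM schemas WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return json.loads(row[0])
```

The point of the sketch is that whatever validates the data receives a plain dict schema and a plain dict (or list) of data, with no file path involved on either side.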
Replies: 2 comments 3 replies
-
Tagging @roll to take a look :-)

@pchtsp I was going to suggest that you look at #637, but you've already seen it! They are definitely related, so we'll try to answer you here, but I might eventually link/deduplicate these into one discussion. Thanks for bringing this up and for the clear explanation!
-
@pchtsp An excerpt from https://framework.frictionlessdata.io/docs/guides/describing-data/#describing-a-resource
So consider you store somewhere:
You can then use them like this:

    from frictionless import validate

    report = validate(input['durations'], schema='durations.schema.json')

Another approach would be to create a data package template (it can be exactly the same as the one you have created, without path/data):
{
"name": "DataSchema",
"resources": [
{
"name": "DurationsSchema",
"schema": {
"fields": [
{
"name": "duration",
"type": "number",
"constraints": {
"required": true
}
},
{
"name": "job",
"type": "number",
"constraints": {
"required": true
}
},
{
"name": "mode",
"type": "number",
"constraints": {
"required": true
}
}
]
}
},
{
"name": "JobsSchema",
"schema": {
"fields": [
{
"name": "id",
"type": "number",
"constraints": {
"required": true
}
},
{
"name": "successors",
"type": "number",
"constraints": {
"required": true
}
}
]
}
},
{
"name": "NeedsSchema",
"schema": {
"fields": [
{
"name": "job",
"type": "number",
"constraints": {
"required": true
}
},
{
"name": "mode",
"type": "number",
"constraints": {
"required": true
}
},
{
"name": "need",
"type": "number",
"constraints": {
"required": true
}
},
{
"name": "resource",
"type": "string",
"constraints": {
"required": true
}
}
]
}
},
{
"name": "ResourcesSchema",
"schema": {
"fields": [
{
"name": "available",
"type": "number",
"constraints": {
"required": true
}
},
{
"name": "id",
"type": "string",
"constraints": {
"required": true
}
},
{
"name": "type",
"type": "string"
}
]
}
}
]
}

Then you need some code on top of Frictionless to make it work:

    from frictionless import Package, validate

    def validate_package_using_template(input):
        # Load the stored template (resources with schemas but no path/data)
        package = Package('package.template.json')
        for name, data in input.items():
            # Here we link our template's resource with the actual data
            package.get_resource(name).data = data
        return validate(package)

    validate_package_using_template(input)

PS.
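If the template comes from a database rather than from `package.template.json`, the same linking step can be done on the descriptor itself before Frictionless sees it. Below is a minimal sketch in plain Python, relying only on the Data Package convention that a resource may carry inline `data` instead of a `path`. The helper name `attach_inline_data` is hypothetical, and passing the resulting dict to `Package(...)` is an assumption to check against your installed Frictionless version.

```python
import copy


def attach_inline_data(template, tables):
    """Return a copy of a package template with in-memory rows attached.

    template: Data Package descriptor (dict) whose resources have no path/data.
    tables:   mapping of resource name -> rows (for example a list of dicts).
    """
    # Work on a deep copy so the stored template is never mutated.
    descriptor = copy.deepcopy(template)
    known = {resource["name"]: resource for resource in descriptor["resources"]}

    # Fail early if the incoming data names a table the template does not describe.
    unknown = sorted(set(tables) - set(known))
    if unknown:
        raise KeyError(f"no resource in template for: {unknown}")

    for name, rows in tables.items():
        # Inline data goes under the resource's "data" property.
        known[name]["data"] = rows
    return descriptor
```

Resources that receive no data are left untouched here; whether they should be dropped before validation depends on how strict you want the check to be.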
@pchtsp
Hi, if I got you right, the migration from your JSONSchema approach to a Frictionless approach would be in creating a list of individual Table Schemas (not resources).
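To make "a list of individual Table Schemas" concrete: if you already hold a package-like descriptor, the per-table schemas can be pulled out with a few lines of plain Python. This is just a sketch; the function name `split_table_schemas` is hypothetical, and it assumes each resource keeps its Table Schema under a `schema` key.

```python
def split_table_schemas(package_descriptor):
    """Map each resource name to its Table Schema dict.

    Resources without a "schema" key are skipped.
    """
    return {
        resource["name"]: resource["schema"]
        for resource in package_descriptor.get("resources", [])
        if "schema" in resource
    }
```

Each value is then an independent Table Schema that can be stored under its own key and looked up when data for that table arrives.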