-
Hello again. I have a question about the checks that are applied to the data in DataPackages. Again, we're migrating from jsonschema (and/or marshmallow), so some concepts may not apply or may not mean the same thing. To guarantee backwards compatibility for our schemas, we rely on non-mandatory fields in our data. For example, with the following frictionless schema:
Note that the "type" field does not have a "required" constraint. For us, this implies that each object in the "resources" table of the json data may or may not include the type attribute (i.e., it is an optional field). For example, this data should comply:
But frictionless reports one "missing-label" error and four "extra-cell" errors:
Looking for a way to pass some
And both previous errors are part of the "Baseline Check" so I guess I cannot deactivate them. So I have three questions:
I understand these are more "json" questions than "csv" ones, but I believe the backwards-compatibility reasons are relevant in all cases. So now I'm wondering whether it's even possible for json objects to have missing fields with respect to the schema. I understand that this may be harder to see in csv (because the structure is more rigid in terms of number of columns), but I thought that maybe in json data it would be
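To make the intended semantics concrete, here is a minimal plain-Python sketch of the reading described above: a field without a "required" constraint may be absent from an object, and only absent required fields are reported. The helper name `find_missing_required` and the sample fields and records are invented for illustration; none of this is frictionless code or the data from the question.

```python
# Hypothetical helper illustrating "optional unless marked required".
def find_missing_required(records, fields):
    """Return (row_number, field_name) pairs for absent required fields."""
    required = [
        field["name"]
        for field in fields
        if field.get("constraints", {}).get("required")
    ]
    problems = []
    for row_number, record in enumerate(records, start=1):
        for name in required:
            if name not in record:
                problems.append((row_number, name))
    return problems


# Illustrative schema fields: "type" carries no "required" constraint.
fields = [
    {"name": "id", "type": "string", "constraints": {"required": True}},
    {"name": "type", "type": "string"},
]

# Illustrative records: the second omits the optional "type" field,
# the third omits the required "id" field.
records = [
    {"id": "r1", "type": "machine"},
    {"id": "r2"},
    {"type": "operator"},
]

print(find_missing_required(records, fields))  # [(3, 'id')]
```

Under this reading, only the third record is a problem; the second one is acceptable even though it has fewer keys than the schema has fields.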
-
Hi @pchtsp! I'll answer the bits I can here: You can skip checks (including baseline) like this:
(https://framework.frictionlessdata.io/docs/guides/validation-guide#pickskip-errors)
You should be able to find a list of the errors you can skip in the validation checks guide you linked in your question.
Similarly, you can write custom checks: https://framework.frictionlessdata.io/docs/guides/validation-guide#custom-checks (PS: please let us know if these docs don't help so we can improve them :-) )
Frictionless specs definitely allow for optional (or not-required) fields. Do the links above help with this question, though?
Please let us know if this doesn't help or if you have other questions! Also, @roll might be able to explain this in more detail.
-
@pchtsp

```python
from pprint import pprint

from frictionless import Package, Resource, Schema, Field, Detector, validate

# The data only has column "f1", while the schema also declares "f2".
package = Package(
    resources=[
        Resource(
            data=[["f1"], ["v1"], ["v2"], ["v3"]],
            schema=Schema(fields=[Field(name="f1"), Field(name="f2")]),
        ),
    ]
)
pprint(validate(package).valid)  # False


def validate_with_optional_fields(package):
    # Sync each resource's schema with the labels actually present in the data.
    for resource in package.resources:
        resource.detector = Detector(schema_sync=True)
    return validate(package)


pprint(validate_with_optional_fields(package).valid)  # True
```

The function, of course, might be combined with #675
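One way to picture why syncing the schema makes the missing column acceptable is the plain-Python sketch below. It is an assumption about the general idea (align the declared fields with the labels found in the header), not the library's implementation; `sync_fields` is a hypothetical name, and treating undeclared labels as type "any" is likewise assumed.

```python
# Hypothetical stand-in for header-based schema syncing; not frictionless code.
def sync_fields(schema_fields, labels):
    """Keep declared fields that appear in the header, in header order."""
    by_name = {field["name"]: field for field in schema_fields}
    # Labels with no declared field fall back to a permissive "any" field.
    return [by_name.get(label, {"name": label, "type": "any"}) for label in labels]


schema_fields = [
    {"name": "f1", "type": "string"},
    {"name": "f2", "type": "string"},
]

# The data above only has the "f1" column, so "f2" drops out of the synced schema.
synced = sync_fields(schema_fields, ["f1"])
print([field["name"] for field in synced])  # ['f1']
```

With the schema reduced to the columns that are actually present, there is no declared field left without a matching label, which is why the second validation in the snippet above passes.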
@pchtsp
Your idea was very smart, and only the bug was preventing it from working. With [email protected]: The function, of …