Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lenient validation support to REMReM generate #171

Closed
raja-maragani opened this issue Jul 20, 2020 · 29 comments · Fixed by #170
Closed

Add lenient validation support to REMReM generate #171

raja-maragani opened this issue Jul 20, 2020 · 29 comments · Fixed by #170
Assignees
Labels
enhancement New or extended features.

Comments

@raja-maragani
Copy link

Description

The REMReM-Generate is generating the eiffel message and validating using the Eiffel protocol schemas. As an Eiffel User, I want to continue the Eiffel generate with non-fatal fields in the message.

We argue that only truly fatal errors should cause REMReM generate failure, for example when required fields have illegal contents.
If the field is optional, it is better to generate the rest of the event anyway, perhaps with a "marker" saying that the contents failed in the generated message (either directly in the field, or a CustomData mentioning it).

Motivation

REMReM-Generate avoid the message generate failures on non-fatal field errors.

Exemplification

Benefits

The REMReM generate validation will not fail on non-fetal Eiffel message field issues. This feature is an option to the user, the user can decide the enable/disable the Lenient Validation. By default it is disabled.

Possible Drawbacks

Its user choice so no drawbacks.

@magnusbaeck
Copy link
Member

If a field's contents is invalid because it doesn't pass the validation, why does it matter if the field is optional or mandatory?

What actual problem does this issue address?

@raja-maragani
Copy link
Author

If a field's contents is invalid because it doesn't pass the validation, why does it matter if the field is optional or mandatory?

What actual problem does this issue address?

This feature is an addon to REMReM generate, and REMReM generate can support one or more Eiffel protocols. If the user wants he can create his own Eiffel protocol and use it through REMReM Generate and Publish.

One of the Eiffel user is requested this new feature.
That REMReM Generate user is trying to translating the old Eiffel protocol messages to the REMReM semantics protocol messages.
Some of old protocol messages contain the invalid field content. so using this new feature users can continue the Eiffel generate with non-fatal fields in the message.

Eiffel message mandatory fields with invalid content - it is a fatal error's. so we can not give this flexibility on mandatory fields.

@magnusbaeck
Copy link
Member

This feature is an addon to REMReM generate, and REMReM generate can support one or more Eiffel protocols. If the user wants he can create his own Eiffel protocol and use it through REMReM Generate and Publish.

Yes, but surely the process for adding a new message protocol would entail creating new schema files, a new implementation of com.ericsson.eiffel.remrem.protocol.MsgService (referencing the new schemas), and building them all into the binaries?

Eiffel message mandatory fields with invalid content - it is a fatal error's. so we can not give this flexibility on mandatory fields.

Why is a field validation error fatal only if the field is mandatory? Such a difference in behavior between optional and non-optional fields matters to event consumers; suddenly they wouldn't be able to rely on present fields to contain valid data, possibly screwing up message deserialization (see below) and/or adding client-side validation requirements.

The proposal doesn't detail which kinds of validation problems are ignored for optional fields. All problems, i.e. will a message with a string value in a field declared as an integer be allowed? Or will we still require field values to have the correct type but give leniency when it comes to restrictions on e.g. numerical ranges and regexps? If we don't even require the types to match then deserialization of the messages would fail, resulting in dropped messages or clogged queues when message processing stalls.

If we are to give clients the option to partially disable validation, "lenient validation" isn't a good name. What if additional validations should be made optional, what should that option be called? "Sloppy validation"? "Even more lenient validation"? Lenient validation 2"? Instead of one bool option for each kind of validation that should be disabled some kind of enumerated set should be used, and the value space of the enum type should describe what's being disabled.

enum ValidationOption {
    IGNORE_OPTIONAL_FIELD_ERRORS,
    IGNORE_INVALID_GIT_COMMIT_URIS,
    IGNORE_SOMETHING_ELSE
}

Finally, as a REMReM operator and Eiffel bus owner within an organization I wouldn't want to make the lenient validation option available to my clients. One of the benefits of assisted publishing is that it forces clients to supply valid messages, and I don't want them to be able to get around that (even for optional fields).

@raja-maragani
Copy link
Author

Yes, but surely the process for adding a new message protocol would entail creating new schema files, a new implementation of com.ericsson.eiffel.remrem.protocol.MsgService (referencing the new schemas), and building them all into the binaries?

In the current validation library, we can specify the multiple types on the field, but I think it won't have the option to support our requirement. So if we create new schemas and new binary's we can not list out the optional field errors into the generated message.

The proposal doesn't detail which kinds of validation problems are ignored for optional fields
As per Eiffel schema, the fields are two types

  1. Required fields
  2. optional fields

On the above two types of fields, Eiffel performing the below validations:

  1. type validations
  2. pattern validations
  3. enum
  4. format

These validation errors on optional fields we are ignoring when lenientValidation enabled.
See the example
One of meta.link type is

        
        {
            "type":"COMPOSITION",
            "target":"aaaaaaaa"
         }

to ignore this optional field validation and information to the user the output will like the below

"remremGenerateFailures":[
                  {
                     "type":"pattern",
                     "message":"ECMA 262 regex \"^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$\" does not match input string \"aaaaaaaa\"",
                     "path":"/links/2/target"
                  }
               ]

If we are to give clients the option to partially disable validation, "lenient validation" isn't a good name.

Yes, we need a better naming word here.

enum ValidationOption {
    IGNORE_OPTIONAL_FIELD_ERRORS,
    IGNORE_INVALID_GIT_COMMIT_URIS,
    IGNORE_SOMETHING_ELSE
}

on IGNORE_INVALID_GIT_COMMIT_URIS, IGNORE_SOMETHING_ELSE we don't have the requirement ready today.
To proceed with this implementation we have the below proposal's from our side.
Proposal1: Instead of lenientValidation we will continue with the param name ignoreOptionalFieldValidationErrors and its boolean.
Proposal2: Continue with enum with single option IGNORE_OPTIONAL_FIELD_ERRORS.

One of the benefits of assisted publishing is that it forces clients to supply valid messages, and I don't want them to be able to get around that (even for optional fields).

Good point, I will add a recommendation in the documentation part.

@raja-maragani
Copy link
Author

@magnusbaeck do you have any comments on Proposals? if no comment I can proceed with Proposal1.

@magnusbaeck
Copy link
Member

On the above two types of fields, Eiffel performing the below validations:

type validations
pattern validations
enum
format
These validation errors on optional fields we are ignoring when lenientValidation enabled.

Skipping type validation is a really bad idea for the reason I mentioned in my earlier comment.

on IGNORE_INVALID_GIT_COMMIT_URIS, IGNORE_SOMETHING_ELSE we don't have the requirement ready today.

No, those were just hypothetical examples.

To proceed with this implementation we have the below proposal's from our side.
Proposal1: Instead of lenientValidation we will continue with the param name ignoreOptionalFieldValidationErrors and its boolean.
Proposal2: Continue with enum with single option IGNORE_OPTIONAL_FIELD_ERRORS.

I'd prefer the second proposal since it's more future proof but I can live with the first proposal.

Good point, I will add a recommendation in the documentation part.

This isn't just a documentation thing. I want requests with ignoreOptionalFieldValidationErrors set to be rejected since I don't want users to be able to disable validation.

@raja-maragani
Copy link
Author

Skipping type validation is a really bad idea for the reason I mentioned in my earlier comment.

Ok, I agree with you. I will modify code for type validation.

I'd prefer the second proposal since it's more future proof but I can live with the first proposal.

Ok, I will proceed with solution1.

I want requests with ignoreOptionalFieldValidationErrors set to be rejected since I don't want users to be able to disable validation.

so you mean add a new configurable param to reject the request param(ignoreOptionalFieldValidationErrors)?

@magnusbaeck
Copy link
Member

so you mean add a new configurable param to reject the request param(ignoreOptionalFieldValidationErrors)?

Yes, and it should default to not allowing ignoreOptionalFieldValidationErrors.

@raja-maragani
Copy link
Author

raja-maragani commented Aug 3, 2020

@e-backmark-ericsson and @eiffel-community/eiffel-community-maintainers do you have any comments from your side?

@sodero
Copy link

sodero commented Aug 3, 2020

The proposal sounds like a very slippery slope to me.

@raja-maragani
Copy link
Author

raja-maragani commented Aug 3, 2020

The proposal sounds like a very slippery slope to me.

Hi @sodero , could you please add a detailed comment on this?
Where do you feel it is a very slippery slope?
The entire proposal? or The option ignoreOptionalFieldValidationErrors enable and disable through configuration?

@sodero
Copy link

sodero commented Aug 3, 2020

Sorry for the not so detailed comment @raja-maragani, but it reflects my general fear of this being something that could be used to move problems around, from producer to consumer in this case. I agree with Magnus point above about enforcement being a key selling point of assisted publishing (and Eiffel in general of course). Please keep in mind that my opinion / fears come from an environment with no Eiffel legacy. I realize that you're in a different situation and therefore need to be a bit more flexible. Take my comment for what it is, I'm not saying that this is a bad proposal, it just makes me slightly nervous, perhaps without reason.

@raja-maragani
Copy link
Author

@sodero thanks for the detailed comment. But as I explained in my previous comments some of our customers are looking for this flexibility. and as @magnusbaeck said this feature does not require all the customers. So we want to proceed with the configurable parameters in configuration and the user again decides to skip or not the validations through rest endpoint.

@raja-maragani
Copy link
Author

Proposal Implementation changes:

  • generate REST endpoint will have new parameter: ignoreOptionalFieldValidationErrors (true/false)
  • Generate configuration file will have a new property lenientValidationEnableToUsers (true/false), if this value false the above rest endpoint param will always false. this is customer choice enable or disable lenient validation to users.
  • Validations will skip except (required and type)

@ebopalm
Copy link

ebopalm commented Aug 4, 2020

I like the suggested extension of the functionality but i have reservations towards using the REST interface to manage the quality control of a generated event. I would prefer if we could start by only having a configurable parameter to control it and defer the modification of the REST interface for later, if it is proven useful.

@raja-maragani
Copy link
Author

For me sounds like a good idea. and our current requirement will work with only the configurable parameter. I'm ok to defer the modification of the REST interface for later.

If no other comments, I will do the required implementation.
@magnusbaeck , @sodero and @eiffel-community/eiffel-community-maintainers any comments?

@raja-maragani
Copy link
Author

Seems most of the users are agreed to continue with the only configuration param.
The name of configuration param will be ignoreOptionalFieldValidationErrors(true/false) and the default value is false(to preserve the backward compatibility).

@e-backmark-ericsson
Copy link
Member

e-backmark-ericsson commented Aug 10, 2020

Hi there, I guess I'm representing the customer requesting this feature :) . So, what we're after is a way to send correctly formed and valid messages, even though the input given might contain data that is not valid. So, what we want REMReM to do is to skip the fields that don't pass validation and instead add some info about it in customData. We do not want REMReM to publish invalid messages. And that's the reason for differing between optional and mandatory fields. Optional fields are ok to leave out, but mandatory fields are of course not ok to skip.
In other words, even if we provide some garbage in to REMReM we want REMReM to send a valid event as long as all mandatory fields are given and well formed. Any optional fields that is not well formed should NOT be added to the resulting event.
To me, the config parameter could be called 'okToLeaveOutInvalidOptionalFields' true/false, with default value set to false.

@magnusbaeck
Copy link
Member

Oh, fields failing validation would be left out from the message. That makes total sense and wasn't clear to me. In that case, type validation problems should be treated like other validation problems, i.e. enabling ignoreOptionalFieldValidationErrors in the request would not result in a rejected request, only that mistyped fields are left out.

@raja-maragani
Copy link
Author

@e-backmark-ericsson thanks for adding more clarification.
For me okToLeaveOutInvalidOptionalFields and ignoreOptionalFieldValidationErrors both config param names giving equal meaning.
I am ok to change the name from ignoreOptionalFieldValidationErrors to okToLeaveOutInvalidOptionalFields. I will do that.
@magnusbaeck I will modify the code for type validation. it means

  • all kind of validations will ignore on optional parameters
  • invalid option fields removed from message body
  • removed invalid option fields will add into customData.

@e-backmark-ericsson
Copy link
Member

@raja-maragani , that suggested name came out of me quite quickly. I haven't checked if it is inline with existing config parameters in REMReM. I just wanted it to be more clear on what it's intended to control.

Also, following the comment from @ebopalm , are you now implementing this as an instance specific config parameter and not a REST parameter? As the REMReM instances we use are often used from multiple clients I see that they don't always share the same need, so limiting this configuration to be set for the whole REMReM instance only will cause unintended side effects. Existing users of those instances will no longer get errors back from REMReM if the parameter is set to true. For what I can see from my perspective it is the REST parameters that is important to get. An instance global parameter is less important.

Also, there are now 3 PRs referencing the same REMReM Generate issue, and with almost identical text in them. Is that how you normally do it? I'd appreciate if the text in those PRs where more tailored towards the specific repos they concern. But it is not a requirement from my side if you feel it should not be that way.

@raja-maragani
Copy link
Author

@e-backmark-ericsson, We modified this feature control by only config param. at that time we do not have much info about this use case. as @ebopalm said if we have a use case we will add REST control feature also.
We just defer the REST control until get the valid use case.

If multiple customers are using the same REMReM, to manage quality control will difficult if it is a REST control, I mean some following strict validations and some of following lenient validation.

I think domain level it causes uncertainty.

@e-backmark-ericsson
Copy link
Member

Well, I guess another config param could be added to mitigate that - allowPerCallValidationControl or something similar. Default false. When true it is ok to supply the REST parameter controlling the validation and leaving out of invalid fields.
Still, I don't really see the problem of 'uncertainty' here. Whether the client provides some optional parameters or not to REMReM is nothing that an instance owner should care about I believe. If there is a need to enforce certain optional parameters to be included then another feature is needed in REMReM, allowing instance owners to enforce a more restrictive schema on some events.

@ebopalm
Copy link

ebopalm commented Aug 11, 2020

As we do not have any clear use case for the need of a REST base quality control functionality, i would not mind if we start with a configuration parameter first. The reason behind this is that i like to leave the REST interface untouched for now with respect for quality control until we have explored policy based quality control first. A change to the REST interface is connected with usage commitments and compatibility.

@e-backmark-ericsson
Copy link
Member

We had an internal meeting about this issue and we propose that one configuration parameter and one REST parameter is implemented to support this feature.

My proposal of there names are the following:
Config param (for the whole instance): lenientValidationEnabledToUsers (default false)
REST param (for each call): okToLeaveOutInvalidOptionalFields (default false)

@raja-maragani, please verify that those names are inline with existing config params and REST params in REMReM before settling on them.

If the config param is set to false and the REST param is set to true REMReM should return with an error code stating that it is not allowed to ask for lenient validation.

If both params are true then REMReM will remove the optional event fields from the input event data that does not validate successfully, and information about why it was removed will be added to customData in the generated event.

@magnusbaeck, @sodero , are you ok with this?

@ebopalm
Copy link

ebopalm commented Aug 11, 2020

Then i REST my case.

@raja-maragani
Copy link
Author

@raja-maragani, please verify that those names are inline with existing config params and REST params in REMReM before settling on them.

Yes, those param names suitable with code changes. I will do rename param names as you suggested.

@raja-maragani
Copy link
Author

Code changes are ready for review.
@e-backmark-ericsson I am unable to change the PR title's

Also, there are now 3 PRs referencing the same REMReM Generate issue, and with almost identical text in them.

I will consider this comment for future issues.

@sodero
Copy link

sodero commented Aug 12, 2020

If the config param is set to false and the REST param is set to true REMReM should return with an error code stating that it is not allowed to ask for lenient validation.

If both params are true then REMReM will remove the optional event fields from the input event data that does not validate successfully, and information about why it was removed will be added to customData in the generated event.

@magnusbaeck, @sodero , are you ok with this?

Sounds good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New or extended features.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants