-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for setting DFDL external variables #16
Add support for setting DFDL external variables #16
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
I have suggestions ti improve phrasing in help text and doc. That's all though.
// ignored and is not added as a variable. This supports expressions that determine that | ||
// a variable does not apply to a schema and to ignore it, since all varibles passed to | ||
// withExternalVariables must be valid for that schema. | ||
final HashMap<String, String> variableMap = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I harp on this I know. Please use "LinkedHashMap" for deterministic behavior.
@@ -51,6 +53,12 @@ | |||
@Tags({"xml", "json", "daffodil", "dfdl", "schema", "xsd"}) | |||
@CapabilityDescription("Use Daffodil and a user-specified DFDL schema to transform data to an infoset, represented by either XML or JSON.") | |||
@WritesAttribute(attribute = "mime.type", description = "Sets the mime type to application/json or application/xml based on the infoset type.") | |||
@DynamicProperty( | |||
name = "Name of external variable defined in a DFDL schema", | |||
value = "Value of the external variable. Variable is ignored if value evaluates to the empty string", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe say,
value = "Value to set for the DFDL external variable. May be an expression. The DFDL variable is not set if the value expression evaluates to an empty string. ",
Presumably it is clear what the expression language is from the context?
@@ -48,6 +50,12 @@ | |||
@Tags({"xml", "json", "daffodil", "dfdl", "schema", "xsd"}) | |||
@CapabilityDescription("Use Daffodil and a user-specified DFDL schema to transform an XML or JSON representation of data back to the original data format.") | |||
@WritesAttribute(attribute = "mime.type", description = "If the FlowFile is successfully unparsed, this attriute is removed, as the MIME Type is no longer known.") | |||
@DynamicProperty( | |||
name = "Name of external variable defined in a DFDL schema", | |||
value = "Value of the external variable. Variable is ignored if value evaluates to the empty string", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same rephrase as for parser.
<p> | ||
Some DFDL schemas may require the use of plugins, such as layers, user defined functions, or custom | ||
character encodings. To make these available to NiFi Daffodil processors, these plugins can be compiled to | ||
jars, and paths to the jars or parent directory may be added to the <tt>Plugins and Schemas</tt> property. For |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe say "paths to individual jars or to a parent directory containing one or more jars"
DFDL Schema File: /com/example/foo.dfdl.xsd | ||
Plugins and Schemas: /usr/share/nifi-daffodil/schemas/ | ||
</pre></code> | ||
<p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Draw people's attention to that leading slash like
"In the above notice how the paths begin with a '/' which denotes the file is to be found at that location inside a jar file. "
</dd> | ||
<dt>Cache TTL after last access</dt> | ||
<dd> | ||
<p> | ||
Defines the cache time-to-live, or how long to keep an unused compiled DFDL schemas in the cache. To avoid | ||
compilation, it is recommended that this should be larger than the amount of time it is expected | ||
for a DFDL schema to be unused. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rephrase? I didn't understand how to set this.
Maybe describe by example: if a schema is used occasionally (once a day perhaps) then set this to 24 hours to avoid recompiling it each time it is used.
You also need to say what are the units of measure for this time limit. Seconds, minutes, hours?
You also need to say what happens if it is set to zero, i.e., what is the acceptable range of values.
Does this have a useful default, such that people are unlikely to need to change this? If so you should say that this defaults to N which works for most situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix.
It turns out the default is 30 minutes, which feels like it might be too short? In fact, I wonder if we should default to zero seconds, which disables cache expiration? I imagine it could be very confusing to see random hiccups if we have to recompile a schema.
@@ -110,16 +122,20 @@ <h2>Compiled DFDL Schema Cache</h2> | |||
<dl> | |||
<dt>Cache Size</dt> | |||
<dd> | |||
<p> | |||
Defines the maximum number of DFDL schemas that can be compiled and saved for rapid reuse from the | |||
cache. To avoid compilation, it is recommended that this value be larger than the expected number | |||
of possible DFDL schemas. Setting this value to 0 disables the cache, though this is not | |||
recommended. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this have a useful default, such that people probably don't need to change this? Then you should say that to discourage people from frobbing this without reason.
@@ -111,16 +123,20 @@ <h2>Compiled DFDL Schema Cache</h2> | |||
<dl> | |||
<dt>Cache Size</dt> | |||
<dd> | |||
<p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just saw this same thing above. Does it really need to be repeated textually? Gaaak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, NiFi doesn't have a way to reduce this duplication. And we can't do a symlink or something because the first paragraph and title are specific to parse vs unparse.
The name/value of NiFi dynamic properties are now treated as DFDL external variables. Property values are allowed to be NiFi expressions, which are evaluated and set as the variable values. If a property value evaluates to the empty string, it is ignored--this helps when a NiFi expression determines that a variable does not apply for a schema, since DFDL requires that all variables passed in externally must be valid for the schema. This does mean it is not possible to have a variable with the value of an empty string, but this should be rare and worked around. This also refactored exceptions, so that functions throw correct exceptions instead of turning everything into an IOException, which worked but is not technically correct. Also added a new "External Variables" section to additional details pages documening this capability, and cleaned up the additional details pages.
ab8ddf4
to
6a2c1a5
Compare
The name/value of NiFi dynamic properties are now treated as DFDL external variables. Property values are allowed to be NiFi expressions, which are evaluated and set as the variable values. If a property value evaluates to the empty string, it is ignored--this helps when a NiFi expression determines that a variable does not apply for a schema, since DFDL requires that all variables passed in externally must be valid for the schema. This does mean it is not possible to have a variable with the value of an empty string, but this should be rare and worked around.
This also refactored exceptions, so that functions throw correct exceptions instead of turning everything into an IOException, which worked but is not technically correct.
Also added a new "External Variables" section to additional details pages documening this capability, and cleaned up the additional details pages.