Autogeneration of instances #23

Open
greenTara opened this issue Feb 25, 2014 · 5 comments


@greenTara
Member

At several points in the development, consistency between the XSD and RNC schemas can be checked by autogeneration of instances. A very brief description is given here: http://wiki.ruleml.org/index.php/MYNG#Testing_Relax_NG_schemas

  1. Instances can be generated from the XSD schemas using oXygen
    a. From the Tools menu, select Generate Sample XML Files
    b. Browse for the local XSD (dr.xsd)
    c. Select the Root element (RuleML)
    d. Select the output folder (xsd/autogen-instances)
    e. Under Options, select the checkboxes to generate optional elements and attributes
    f. Select Random in the selections (not First)

This typically produces a large but manageable file (~30 MB).
You may have to increase the Java heap size to validate it with Jing, however.
Associate the Relax NG schema dr-relaxed.rnc with this file, and validate.
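
For validation outside oXygen, the Jing call with a larger heap can be sketched as below. This is only an illustration: the `2g` heap size, the `jing.jar` location and the instance file name `instance.xml` are assumptions, not values from this project. `-Xmx` is the standard JVM heap option and `-c` tells Jing that the schema uses the compact syntax.

```python
import shlex

def jing_command(schema, instance, heap="2g", jing_jar="jing.jar"):
    """Build a Jing command line that validates `instance` against `schema`.

    `heap` and `jing_jar` are assumed defaults; adjust them to the local setup.
    """
    cmd = ["java", f"-Xmx{heap}", "-jar", jing_jar]
    # Jing needs -c when the schema is in Relax NG compact syntax (.rnc).
    if schema.endswith(".rnc"):
        cmd.append("-c")
    cmd += [schema, instance]
    return cmd

# Hypothetical instance path under the output folder named above.
cmd = jing_command("dr-relaxed.rnc", "xsd/autogen-instances/instance.xml")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` to perform the actual validation.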

Similarly, an RNC file can be converted to XSD using oXygen and then used to generate instances that should validate against the XSD. The driver schema for this is dr-autogen-compact.rnc. I first generated a simplified RNG schema in /tmp, then converted it to XSD and ran the instance generator. There are some validation errors (against the schema that was used to create the instances) because of recursion limits. These can be fixed by a global find/replace, although that is annoying.

@greenTara
Member Author

I have created a /tmp directory as a place for the artifacts of the autogeneration process. A better name might be something like /autogen.

When I autogenerate instances from an RNC schema, I use a script that creates the simplified monolithic RNG and writes it to a sibling folder with a specified name (/tmp at the moment). I then use oXygen to convert the RNG to XSD, and run the oXygen tool on that XSD to create an instance. These last two files are also put into /tmp.
PROPOSAL:

  1. Change the name of the directory to "auto-gen"
  2. Put a README file into that directory, with the above information plus additional details as the testing procedure is further developed.
  3. It would be good to document the scripts for this. I'm not sure where the best place for them is. I typically have two kinds of scripts: project-independent and project-dependent. I keep the project-independent scripts in a separate place in my file structure.

@greenTara
Member Author

I believe we should also implement the driver schema (Issue #17, steps 1 and 2) before we go further with the instance generation. Otherwise, we have a maintenance problem. For example, the changes in the most recent update of dr.rnc are not reflected in dr-relaxed.rnc and dr-autogen-compact.rnc. I will start a new issue #26 for this.

@greenTara
Member Author

I have found a subset of the DR modules that can be used to generate instances that are not too large.
The file is https://github.com/RuleML/reaction-ruleml/blob/master/relaxng/dr/dr-autogen-normal.rnc
The settings I use are:

generate optional elements (on)
generate optional attributes (off)
values of elements and attributes : Random
preferred number of repetitions = 1
maximum recursivity level = 5
Type alternative strategy: Random
Choice Strategy: Random

There is still truncation. I repair that after generation using a Find/Replace with regex:

<Interval([^<>]*)/>  becomes <Interval$1><content/></Interval>
<arg([^<>]*)/> becomes <arg$1><Ind/></arg>
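
For repairing several generated instances in a row, the same two Find/Replace rules could be scripted. The following is a minimal sketch using Python's `re` module; the function name `repair_truncated` and the sample string are made up for illustration. The patterns are copied as given above, so they match on the element-name prefix and assume no other element names begin with `arg` or `Interval`.

```python
import re

# (pattern, replacement) pairs taken from the Find/Replace rules above.
# Python writes the back-reference as \1 where the editor syntax uses $1.
REPAIRS = [
    (re.compile(r"<Interval([^<>]*)/>"), r"<Interval\1><content/></Interval>"),
    (re.compile(r"<arg([^<>]*)/>"), r"<arg\1><Ind/></arg>"),
]

def repair_truncated(xml_text):
    """Expand empty <Interval/> and <arg/> elements, keeping their attributes."""
    for pattern, replacement in REPAIRS:
        xml_text = pattern.sub(replacement, xml_text)
    return xml_text

sample = '<Atom><arg index="1"/><Interval/></Atom>'
print(repair_truncated(sample))
# prints: <Atom><arg index="1"><Ind/></arg><Interval><content/></Interval></Atom>
```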

The instance generated in this way is only 5MB, so it is feasible to make several instances.

This also works with the original dr.xsd

@greenTara
Member Author

The instance generation procedure is identifying several discrepancies between the XSD and Relax NG schemas. Instead of listing those here, I think it is better that each case have its own issue.

I have opened a wiki Issue proposing that in future we allow empty elements. Similarly, would it be possible (even at this version of Reaction RuleML) to allow empty elements (with or without attributes) as syntactic sugar?

<Interval/>    

being equivalent to a kind of Null interval:

<Interval><content/></Interval>

and

<arg/>    

being equivalent to a Null argument

<arg><Ind/></arg>
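
To make the intended equivalence concrete, a processor could expand the sugared form on the parsed tree before further handling. The sketch below uses Python's standard `xml.etree.ElementTree`; the `desugar` function and the `DEFAULT_CHILD` table are hypothetical names that only encode the two cases listed here, and nothing like this exists in the schemas today.

```python
import xml.etree.ElementTree as ET

# Empty element name -> default child name, per the two cases proposed above.
DEFAULT_CHILD = {"Interval": "content", "arg": "Ind"}

def desugar(root):
    """Give each empty Interval/arg element its default child, in place."""
    # Snapshot the elements first so the tree is not modified while iterating.
    for elem in list(root.iter()):
        # Split "{namespace}local" tags so the new child shares the namespace.
        ns, _, local = elem.tag.rpartition("}")
        child = DEFAULT_CHILD.get(local)
        is_empty = len(elem) == 0 and not (elem.text or "").strip()
        if child and is_empty:
            ET.SubElement(elem, f"{ns}}}{child}" if ns else child)
    return root

doc = ET.fromstring('<Atom><arg index="1"/><Interval/></Atom>')
print(ET.tostring(desugar(doc), encoding="unicode"))
```

Attributes on the empty element are left untouched, so both the with-attributes and without-attributes cases are covered.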

greenTara added a commit to greenTara/reaction-ruleml that referenced this issue Mar 15, 2014
@greenTara
Member Author

I have committed two config.xml files to the repository, containing the best settings I have found for instance generation. These settings can be imported into the Generate Sample XML Files tool in oXygen.
