Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Migrate comment documentation to nodedef doc attributes for std libraries #1950

Open
kwokcb opened this issue Jul 24, 2024 · 2 comments

Comments

@kwokcb
Copy link
Contributor

kwokcb commented Jul 24, 2024

Proposal

This is an off-shoot of a discussion about PugiXML custom changes with @jstone-lucasfilm and @ld-kerley
The impetus being to remove the custom comment and line spacing custom changes to the library.

The main reason to have this is to allow for documentation of node definitions using XML comments.
The proposal here is to move the comments into the nodedef doc meta-data tag which makes it publicly
visible to integrations. (@ashwinbhat , for glTF PBR spec ratification this will at least start to have docs on nodes
which can be enhanced with things like boundary conditions).

As an additional change, if nodegraph XML comments are to be preserved then the doc tag would be needed there.

Test

Here is some sample code which could work for nodedefs and the result. It's mostly a heuristic to try and
guess what comments go with which nodedefs. The comment immeiate before a nodedef(s) is the doc string.

def transferCommentsToNodeDefs(libFile):
    
    readOptions = mx.XmlReadOptions()
    readOptions.readComments = True
    readOptions.readNewlines = True    
    readOptions.upgradeVersion = False

    outputDoc = mx.createDocument()
    mx.readFromXmlFile(outputDoc, libFile,  mx.FileSearchPath(), readOptions)        

    # Extract out comments and nodedefs
    currentComment = []
    children = outputDoc.getChildren()
    for child in children:
        if child.getCategory() == 'comment':
            docstring = child.getAttribute('doc')
            if len(docstring) > 0:
                docstring = re.sub(r'\s+', ' ', docstring.replace('\n', ' ').lstrip())
                docstring = docstring.strip()
            # Repace end .. with .
            if docstring.endswith('..'):
                docstring = docstring[:-1]
            currentComment.append(['comment', docstring, child.getName() ])
        elif child.getCategory() == 'nodedef':
            currentComment.append(['nodedef', child.getName()])

    strippedComments = []
    # Heuristic to find comments for nodedefs:
    # 1. Accumulate nodedefs until a comment is found
    # 2. Add an association between the comment and nodedefs
    # 3. Skip if a comment is found immediately before a comment
    # 4. Keep track of comments to remove
    hitComment = False
    nodedefList = []
    removeComments = []
    for i in range(len(currentComment)-1, -1, -1):
        if not hitComment and currentComment[i][0] == 'comment':
            if len(nodedefList) > 0:
                # Keep track of comments to remove.
                # Add [ nodedef, comment ] pair 
                removeComments.append(currentComment[i][2])
                for nodedef in nodedefList:
                    strippedComments.append([nodedef, currentComment[i][1]])
                nodedefList.clear();
            hitComment = True
        elif currentComment[i][0] == 'nodedef':
            nodedefList.append(currentComment[i][1])
            hitComment = False

    # Apply comments to nodedefs:
    # 1. Find nodedefs
    # 2. Add new comments to existing comments
    print('nodedefs with new comments')
    for i in range(len(strippedComments)):
        print(strippedComments[i])

        nodedef = outputDoc.getChild(strippedComments[i][0])
        if nodedef is None:
            print('Cannot find nodedef:', strippedComments[i][0])
            continue
        currentDoc = nodedef.getAttribute('doc')
        newDoc = strippedComments[i][1]
        if len(currentDoc) > 0:
            newDoc = newDoc + " " + currentDoc
        nodedef.setAttribute('doc',  newDoc)

    # Remove comments
    for i in range(len(removeComments)):
        outputDoc.removeChild(removeComments[i])

    return outputDoc

Snippets from result on stdlib_defs.mtlx

  <nodedef name="ND_sign_float" node="sign" nodegroup="math" doc="Node: <sign>. Sign of each input channel: -1, 0 or +1">
    <input name="in" type="float" value="0.0" />
    <output name="out" type="float" defaultinput="in" />
  </nodedef>
  
  <nodedef name="ND_clamp_float" node="clamp" nodegroup="math" doc="Node: <clamp>. Clamp incoming value to a specified range of values.">
    <input name="in" type="float" value="0.0" />
    <input name="low" type="float" value="0.0" />
    <input name="high" type="float" value="1.0" />
    <output name="out" type="float" defaultinput="in" />
  </nodedef>
  
   <nodedef name="ND_range_color3" node="range" nodegroup="adjustment" doc="Node: <range> Supplemental Node. Remap a value from one range of float/color/vector values to another, optionally applying a gamma correction in the middle, and optionally clamping output values.">
    <input name="in" type="color3" value="0.0, 0.0, 0.0" />
    <input name="inlow" type="color3" value="0.0, 0.0, 0.0" />
    <input name="inhigh" type="color3" value="1.0, 1.0, 1.0" />
    <input name="gamma" type="color3" value="1.0, 1.0, 1.0" />
    <input name="outlow" type="color3" value="0.0, 0.0, 0.0" />
    <input name="outhigh" type="color3" value="1.0, 1.0, 1.0" />
    <input name="doclamp" type="boolean" value="false" />
    <output name="out" type="color3" defaultinput="in" />
  </nodedef>
@ld-kerley
Copy link
Contributor

I think it's a great idea to formalize the documentation, and move it away from comments in the code, and in to the actual data. Another win in the fight to make MaterialX a data driven library :). I think it perhaps also opens future doors to auto generated node library documentation in an easier way.

Another possible syntax to consider (not weighing any heavy opinion in either direction yet) would be to add a <doc> child tag to the <nodedef>

ie.

  <nodedef name="ND_sign_float" node="sign" nodegroup="math">
    <doc>Node: <sign>. Sign of each input channel: -1, 0 or +1</doc>
    <input name="in" type="float" value="0.0" />
    <output name="out" type="float" defaultinput="in" />
  </nodedef>
  
  <nodedef name="ND_clamp_float" node="clamp" nodegroup="math">
    <doc>Node: <clamp>. Clamp incoming value to a specified range of values.</doc>
    <input name="in" type="float" value="0.0" />
    <input name="low" type="float" value="0.0" />
    <input name="high" type="float" value="1.0" />
    <output name="out" type="float" defaultinput="in" />
  </nodedef>
  
   <nodedef name="ND_range_color3" node="range" nodegroup="adjustment" >
     <doc>Node: <range> Supplemental Node. Remap a value from one range of float/color/vector values to another, optionally applying a gamma correction in the middle, and optionally clamping output values.</doc>
    <input name="in" type="color3" value="0.0, 0.0, 0.0" />
    <input name="inlow" type="color3" value="0.0, 0.0, 0.0" />
    <input name="inhigh" type="color3" value="1.0, 1.0, 1.0" />
    <input name="gamma" type="color3" value="1.0, 1.0, 1.0" />
    <input name="outlow" type="color3" value="0.0, 0.0, 0.0" />
    <input name="outhigh" type="color3" value="1.0, 1.0, 1.0" />
    <input name="doclamp" type="boolean" value="false" />
    <output name="out" type="color3" defaultinput="in" />
  </nodedef>

Which might perhaps allow for multiple line documentation strings? I'm not sure if new-line characters are allowed inside of XML attributes....

I also don't know if it might allow for a slightly more optimized XML reader - if we knew we didn't need the documentation (ie. when we're doing shader gen only) we could skip the <doc> elements.

Or perhaps if we don't want XML text elements in the document, we could add

<doc helptext="Node: <range> Supplemental Node. Remap a value from one range of float/color/vector values to another, optionally applying a gamma correction in the middle, and optionally clamping output values."/>

still giving us the advantage of filtering out the read of <doc> if we want to avoid that, and also leaving the possibility of adding more attributes to the <doc> tag, like perhaps thumbnail="<path_to_image>" or others.

@kwokcb
Copy link
Contributor Author

kwokcb commented Jul 27, 2024

Currently only the doc attribute on nodes or inputs are used though a <doc> element does seem inline with earlier discussion of separating docs from structure. If there are separate <doc> then indeed we could add in more info. This is what I did when transforming the defs into JSON for my web editor and included things like USD, OpenPBR thumbnails and other stuff like ui color coding. At this point If we go this far, I'd just separate out docs from the nodedef and use a "reference" to docs instead embedding it. I had something like this going since I need to add in other "asset" meta-data like authoring info etc.


AFAIK it should be assumed there are no special characters such as newlines, tabs etc. The W3C XML/HTML spec says you should not put any special characters including < and > into the document as validators will fail / reject it as invalid XML. This is currently the case when you try it out with these.

I'd rather formalize formatting than keep newlines, tabs, etc, For instance if a mimetype was supported then you could (if allowed syntax in XML) add "standardized" formatting -- say using Markdown. You could even sneak in diagrams, formulas etc :).

Anyways, just "pie in the sky" stuff.

As a first step if moving comments into doc attributes is acceptable I can put up a PR for stdlib.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants