Include docstrings, maybe a schema? #10

tlambert03 · 2021-01-12T18:55:51Z

Love that there is now a central kernels repo. I think it would be great to put as much of the universal information here as possible.

I know it's a lot of work... but it would be great if the kernels here could be documented with a good (parseable) format ... I guess doxygen is the defacto C standard? (not sure there). And then that info wouldn't need to be duplicated (for instance, pyclesperanto.absolute and clij2 Absolute). Adding docstrings here could be a first step.

a related idea (maybe an alternative?) would be to have a very easily parseable schema that that does not require doxygen to parse. The downside there is that we'd need to make sure the kernels were in sync with the schema (we could write tests for that ... but it's possible that the logic for those tests would be as complicated as simply using doxygen to parse in the first place). An example schema:

version: 2.0.0.10
operations:
  - name: absolute
    description: Computes the absolute value of every individual pixel x in a given image.
    details: <pre>f(x) = |x| </pre>
    categories: [filter, in_assistant]
    dimension_support: [2, 3]
    tier: 1
    parameters:
      - name: source
        type: IMAGE_src_TYPE
        description: The input image to be processed.
      - name: destination
        type: IMAGE_dst_TYPE
        description: The output image where results are written into.
    references: ...
  - name: add_image_and_scalar
    ...

(this sort of thing would also make it much easier to create templates that let you convert scripts between languages very easily... as well as facilitate autogeneration of most of the .py files in pyclesperanto)

The text was updated successfully, but these errors were encountered:

tlambert03 · 2021-01-14T21:11:58Z

@haesleinhuepf ... curious how these thoughts strike you? Playing around with plugin autogeneration and general code-deduplication in pyclesperanto is definitely something I'd like to work on, but the solutions there will depend very much on the structure in this submodule.

StRigaud · 2021-01-27T18:00:32Z

I guess doxygen is the defacto C standard?

Indeed, that would be the first choice on my side.

haesleinhuepf · 2021-01-28T20:26:17Z

Hey @tlambert03 , your suggestion is amazing! We could copy-paste the information from the current pyclesperanto docstrings, curate it here and in the future, we auto-generate the doc-strings (in Python, Java, C++, HTML/website) from this repository.

Side-topic: How about the "kernels" which have no opencl-code associated directly. Do you think we could maintain their documentation here as well or should their documentation live somewhere else? I'm talking about filters like top-hat, for example.

tlambert03 · 2021-01-28T20:36:10Z

Hey @tlambert03 , your suggestion is amazing! We could copy-paste the information from the current pyclesperanto docstrings, curate it here and in the future, we auto-generate the doc-strings (in Python, Java, C++, HTML/website) from this repository.

that's the dream! You can tell me how easy/hard that will be for Java or C++ ... but for python and HTML, it should be no sweat. And then we all know we're in sync :)

Side-topic: How about the "kernels" which have no opencl-code associated directly. Do you think we could maintain their documentation here as well or should their documentation live somewhere else? I'm talking about filters like top-hat, for example.

good question. I'd hope we can find a place for them here too. It'd be pretty trivial to do that with the schema idea:

version: 2.0.0.10
operations:
  - name: top-hat
    description: Applies a top-hat filter for background subtraction to the input image.
    tier: 2
    parameters:
      - name: input
        ...
    comprises:
       - minimum_box
       - maximum_box
       - add_images_weighted

and with doxygen, there's probably some sort of stub concept we can use to the same effect. I'm not that familiar with doxygen though. @StRigaud do you have thoughts here? Does anyone have strong feelings for or against an independent schema vs strictly putting it in the cl code itself?

haesleinhuepf · 2021-01-28T21:06:55Z

but for python and HTML, it should be no sweat.

I think it's also pretty straightforward in Java. I was thinking of translating the current code-generation machinery from Java to Python anyway. Just another code translation project ... this clEsperanto project turns out to be translating a lot of code. I hope it will spare at least that amount of code translation afterwards.🤣

Joking aside: This massive refactoring gives me the chance to clean up some messy parts. Thanks for being with me. And thanks for your patience.

Minor comment: Can we leave out the version from the scheme you suggested? I think git tags / versions can do the job better.

tlambert03 · 2021-01-29T13:51:03Z

Can we leave out the version from the scheme you suggested? I think git tags / versions can do the job better.

In the sense that we don't need to hard-code it sure. We can pull the actual value from version control, but I do think the actual "schema instance" (i.e. the yml or json file "product" that other repos/users use) should have a version string in it.

If this concept of a schema outside of the CL code itself (but in this repo) is at least moderately attractive, I can start putting together a little more concrete example... which we can then populate more fully once we like the pattern. Sound good? We basically need two files: one to define the schema itself (i would probably use JSON Schema unless you have any objections... it has good support and validators for all of the languages at play here), and then we need the actual schema "instance" like the toy examples I put above. That too could be in JSON, but yaml obviously has very attractive human-readability.

Ultimately, if we regret putting this info outside of the CL code itself, I don't think it will be that hard to covert (and at that point all of the info will be nicely in one place anyway).

I was thinking of translating the current code-generation machinery from Java to Python anyway

I had some thoughts on that too, if you have time, I'd be curious to hear what your strategy is before you write it all out?

StRigaud · 2021-01-29T14:05:08Z

It looks good for me. 👍

This will also help us define the filter name and parameters across all the language (Python, C++, Fiji, Java...).

haesleinhuepf · 2021-01-31T10:01:51Z

I'm just adding an idea here: How about aiming compatibility with the common workflow language?
https://www.commonwl.org/

StRigaud · 2021-02-04T16:28:50Z

Got two questions from #12 :

What's the output: true at the end of your example in the PR?
Is this format allows newlines in the same field?

For the last, I assume yes but prefer to ask. Mainly for the description field.

tlambert03 · 2021-02-04T16:47:59Z

What's the output: true at the end of your example in the PR?

that was a super-preliminary way to indicate that this "input" parameter also serves as a kernel output. was going to raise that as a thing to discuss... Happy to represent that concept differently!

Is this format allows newlines in the same field?

yep, the easiest way to indicate that in YAML is a pipe character:

operation:
  description: |
    any long string
    with line breaks, etc...

Though, there are alternate syntaxes that can also be used. see: https://yaml-multiline.info/

StRigaud mentioned this issue Jan 27, 2021

Kernel signature uniformisation #11

Open

haesleinhuepf mentioned this issue Jan 30, 2021

Update roadmap / timeline clEsperanto/clesperanto.github.io#2

Open

tlambert03 mentioned this issue Jan 30, 2021

Add schema.json and a start to cle.yml (implementing schema) #12

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Include docstrings, maybe a schema? #10

Include docstrings, maybe a schema? #10

tlambert03 commented Jan 12, 2021 •

edited

Loading

tlambert03 commented Jan 14, 2021

StRigaud commented Jan 27, 2021

haesleinhuepf commented Jan 28, 2021

tlambert03 commented Jan 28, 2021 •

edited

Loading

haesleinhuepf commented Jan 28, 2021

tlambert03 commented Jan 29, 2021 •

edited

Loading

StRigaud commented Jan 29, 2021

haesleinhuepf commented Jan 31, 2021

StRigaud commented Feb 4, 2021

tlambert03 commented Feb 4, 2021

Include docstrings, maybe a schema? #10

Include docstrings, maybe a schema? #10

Comments

tlambert03 commented Jan 12, 2021 • edited Loading

tlambert03 commented Jan 14, 2021

StRigaud commented Jan 27, 2021

haesleinhuepf commented Jan 28, 2021

tlambert03 commented Jan 28, 2021 • edited Loading

haesleinhuepf commented Jan 28, 2021

tlambert03 commented Jan 29, 2021 • edited Loading

StRigaud commented Jan 29, 2021

haesleinhuepf commented Jan 31, 2021

StRigaud commented Feb 4, 2021

tlambert03 commented Feb 4, 2021

tlambert03 commented Jan 12, 2021 •

edited

Loading

tlambert03 commented Jan 28, 2021 •

edited

Loading

tlambert03 commented Jan 29, 2021 •

edited

Loading