Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update readme #790

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from
Open

update readme #790

wants to merge 3 commits into from

Conversation

dtsuzuku-ibm
Copy link
Collaborator

@dtsuzuku-ibm dtsuzuku-ibm commented Nov 11, 2024

following template #753 (comment)

What I didn't include

  1. Header/Author, since we can see it in github
  2. Header/Date, since we can see it in github
  3. Changelog, since we may better consider a way to generate changelog from the commits as github action.
  4. Code Examples and Documentation, since we can provide it in jupyter notebook

Why are these changes needed?

Related issue number (if any).

#753

@shahrokhDaijavad
Copy link
Member

Thanks, @dtsuzuku-ibm. I am ok with most of the decisions you have made (like not including the information we can get from github about the Author and Date. However, @agoyal26, who created the template, may insist that we still provide such information in the README file.
Since we use your README as a model for others, let me not approve it yet, until @agoyal26 has also reviewed this and we decide what is a "must" in the README file. Again, thanks for doing this before all other transform owners!

@agoyal26
Copy link
Collaborator

I have made some suggestions to the read me. We can skip adding date of revisions but I think adding the list of contributors with their email would be good.

@shahrokhDaijavad
Copy link
Member

Where are your suggestions, @agoyal26? (other than adding the list of contributors and their emails)

@agoyal26
Copy link
Collaborator

I added them in readme.md itself as comments- are they visible? @shahrokhDaijavad

@shahrokhDaijavad
Copy link
Member

They are probably on your local copy of the file, @agoyal26. I don't see them. Do you want to just copy and paste your local README file here?

Copy link
Collaborator

@agoyal26 agoyal26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see some comments on readme to improve readability

transforms/language/doc_quality/python/README.md Outdated Show resolved Hide resolved
transforms/language/doc_quality/python/README.md Outdated Show resolved Hide resolved
transforms/language/doc_quality/python/README.md Outdated Show resolved Hide resolved
@shahrokhDaijavad
Copy link
Member

Thanks, @agoyal26 Now, I see your comments!

Signed-off-by: Daiki Tsuzuku <[email protected]>
@dtsuzuku-ibm
Copy link
Collaborator Author

dtsuzuku-ibm commented Nov 13, 2024

@agoyal26

Should we not call it content then instead of description in table header ?

I'm ok with that. Should I change the name of description column of the table in Output columns annotated by this transform to content, too? Both columns are for explaining what kind of data you'll see in each column listed in table.

@agoyal26
Copy link
Collaborator

oh my apologies - I get it now. Let it be description of the column. Also please add contributors and authors name and email id to readme. Then looks like we are good to go. Thank you

Signed-off-by: Daiki Tsuzuku <[email protected]>
Copy link
Collaborator

@agoyal26 agoyal26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@dtsuzuku-ibm
Copy link
Collaborator Author

@shahrokhDaijavad Could you assign any autorized user as reviewer?

Copy link
Member

@shahrokhDaijavad shahrokhDaijavad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you @dtsuzuku-ibm and @agoyal26

@shahrokhDaijavad
Copy link
Member

@dtsuzuku-ibm I think this is ready to merge. I will ask @touma-I to do this.

@dtsuzuku-ibm
Copy link
Collaborator Author

@shahrokhDaijavad I'm afraid I'm not authorized to merge this PR... Could you merge this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shahrokhDaijavad @dtsuzuku-ibm This shows how to run the example script once we have cloned the repo. I wonder if we should also add a section to it that would explain how to use it in a notebook or a python script without cloning the repo but only with pip install . i.e.:

!pip install data-prep-toolkit
!pip install data-prep-toolkit-transform[doc_quality]

from doc_quality_transform python import ...

params = {
...
}
...

laucher.launch()

Copy link
Collaborator Author

@dtsuzuku-ibm dtsuzuku-ibm Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@touma-I

Please add example. with a few lines of code on how one can do pip install, setup the parameter block for input/output folder and then invoke the runtime launcher using the default parameters . Thanks

We can do all of these in jupyter/collab notebook which will be added in the future.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtsuzuku-ibm . that is great. Once you have the notebook, you can reference it. For now, I don't think this is complete until we have one or the other or both.

Copy link
Collaborator

@touma-I touma-I left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add example. with a few lines of code on how one can do pip install, setup the parameter block for input/output folder and then invoke the runtime launcher using the default parameters . Thanks

@shahrokhDaijavad
Copy link
Member

@dtsuzuku-ibm Since we don't have the notebook example yet, @touma-I is suggesting adding these few lines of code as an example of how we will be using all transforms in the future using pip install and not cloning the repo. Once we have the notebook example, we may remove these lines and just refer to the notebook.

@agoyal26
Copy link
Collaborator

very good point @touma-I . I agree too- as we are promoting that DPK transforms are easy to use via pip install - we should include these steps.

@dtsuzuku-ibm
Copy link
Collaborator Author

@touma-I @shahrokhDaijavad @agoyal26
User can refer to src/doc_quality_local_python.py. Is adding link enough?
I want to avoid copying&pasting duplicated codes.

@touma-I
Copy link
Collaborator

touma-I commented Nov 14, 2024

@touma-I @shahrokhDaijavad @agoyal26 User can refer to src/doc_quality_local_python.py. Is adding link enough? I want to avoid copying&pasting duplicated codes.

Hi @dtsuzuku-ibm I understand the desire not to copy/paste and hopefully we will get to the point where we can automate some of this stuff maybe via CI/CD. But for now, it is not clear to me what is the minimum that the user needs to do to use this transform. I think having it explained in 5-6 lines of code will help get us to that point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants