updated description of input features #356

sanjaydasgupta · 2024-03-27T06:10:24Z

No description provided.

alexsherstinsky · 2024-03-28T06:20:26Z

docs/configuration/large_language_model.md

+
+If input to the LLM is just the content of a single dataset column (without any other prefixed or
+suffixed text), no `prompt` template should be provided and the `name` of the feature must 
+correspond to a column in the input dataset. See the following example.


@sanjaydasgupta likely, I would say here "and the name of the feature must correspond to that specific database column".

I took the liberty of changing your "database" to "dataset" for the sake of consistency throughout the document.

alexsherstinsky

@sanjaydasgupta Thank you for doing this. I think your direction is correct in general (I just left a few clarifying suggestions). Thank you.

sanjaydasgupta · 2024-03-28T07:22:12Z

Thanks @alexsherstinsky. I will work with your suggestions during the weekend.

sanjaydasgupta · 2024-03-30T05:39:30Z

Hi @alexsherstinsky, please let me know if you think the new description is too wordy.

alexsherstinsky · 2024-03-30T07:50:24Z

docs/configuration/large_language_model.md

+
+There are a couple of things to note here:
+- the prompt `template` contains named placeholders (`context` and `question`) for content 
+from the dataset's columns 


@sanjaydasgupta Just to note, these values can also come from a dictionary. For example, we can use this prompt template and format() it with those dictionary name-value pairs. Thanks!

I wasn't sure if you wanted any change here. Please clarify.
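
To illustrate the dictionary remark above, here is a minimal sketch that assumes nothing beyond Python's built-in `str.format`. The template text and the sample values are hypothetical, chosen only to reuse the `context` and `question` placeholder names; they are not taken from the Ludwig docs or from Ludwig's internals.

```python
# Hypothetical template that reuses the two placeholder names from the docs text.
template = "Context: {context}\nQuestion: {question}\nAnswer:"

# Values supplied as a dictionary instead of being read from dataset columns.
values = {
    "context": "Ludwig is a declarative machine learning framework.",
    "question": "What is Ludwig?",
}

# Each {name} placeholder is replaced by the matching dictionary value.
prompt = template.format(**values)
print(prompt)
```

A missing key raises `KeyError`, which is the same kind of mismatch a user would hit if a placeholder did not correspond to any dataset column.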

sanjaydasgupta · 2024-03-31T01:13:55Z

Hi @alexsherstinsky, can you please review this and this.

alexsherstinsky

@sanjaydasgupta This LGTM! Thank you very much for making these clarifications in the documentation. I would like to please get a glance-over by @arnavgarg1 before merging. Thank you very much!

arnavgarg1 · 2024-04-01T16:10:43Z

docs/configuration/large_language_model.md

+If it is intended for the input to the LLM to be just the content of a single dataset column (without 
+any other prefixed or suffixed text), no `prompt` template should be provided and the `name` of the 
+feature must correspond to that specific dataset column. See the following example.


Maybe we can also mention that it is still possible to wrap the single column with a static prompt template if desired.

Please see my last comment below

arnavgarg1 · 2024-04-01T16:12:56Z

docs/configuration/large_language_model.md

+    type: text
+```
+
+There are a couple of things to note here:


nit: note in this example?

Please see my last comment below

arnavgarg1 · 2024-04-01T16:13:42Z

docs/configuration/large_language_model.md

+Also note that a prompt template can be used even with a single dataset column. 
+But that is necessary only when static text is required to be prefixed and/or 
+suffixed to the content of the dataset column.


Maybe we can move this up into the single column dataset section?

Please see my last comment below

arnavgarg1 · 2024-04-01T16:14:37Z

docs/configuration/large_language_model.md

+- the `name` of the `input_feature` (`prompt` here) is immaterial; it is not the name 
+of a dataset column as in the previous example. This example uses `prompt` just to 
+emphasize that the formatted output obtained by applying the prompt template to the 
+dataset columns is the input.


While true, I think this may be confusing to readers and first-time users. What if we just use `question` as the column name, but then say that it is a placeholder and will be substituted by the value from the prompt template? This reduces the scope for unintended user errors!

After making the extensive changes referred to in my last comment below, I felt that `prompt` sounded OK. However, if you still think `question` would be better, please let me know and I will change it.

arnavgarg1

Thanks for making this part of the docs clearer! I just left a few minor comments for consideration

sanjaydasgupta · 2024-04-02T02:39:57Z

Hi @arnavgarg1, thank you very much for your very insightful comments. I believe documentation should be widely reviewed, and having another pair of eyes was truly helpful.

Since this was a particularly troublesome section of the documentation, I thought (after reading all of your comments) that it would be best to add one more YAML example. So I have added one more case (one dataset column + template). That allowed all of your structural comments to be handled in a much better way, and should give readers a much better understanding of the possible variations.

I request you and @alexsherstinsky to please review my changes critically once more and let me know if anything has fallen between the cracks.

arnavgarg1 · 2024-04-02T02:50:04Z

docs/configuration/large_language_model.md

+    Translate into French: 
+    {english_input}
+


Love it! Can we update it to add this?

    prompt:
      template: |
        Translate into French
        Input: {english_input}
        Translation:
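
For reference, a rough sketch of how a template like the one suggested above would expand for a single row. It assumes plain Python `str.format` substitution (Ludwig's actual rendering code is not shown in this thread), and the row value is hypothetical.

```python
# The suggested template, written as a Python string with one placeholder.
template = "Translate into French\nInput: {english_input}\nTranslation:"

# Hypothetical dataset row: one column whose name matches the placeholder.
row = {"english_input": "Good morning"}

# Static text is kept as-is; only {english_input} is replaced.
rendered = template.format(**row)
print(rendered)
```

Ending the template with `Translation:` leaves the model an explicit slot to complete, which is the point of the suggested change.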

arnavgarg1

I love the clarity and the extra example YAML helps bring it home. Thanks for taking on one of the hardest parts of our documentation and simplifying it! I left one final comment, but approving since this is ready to go after that minor change.

Thanks for your contribution @sanjaydasgupta

sanjaydasgupta · 2024-04-02T03:13:09Z

@arnavgarg1 @alexsherstinsky changes completed

arnavgarg1 · 2024-04-02T04:28:55Z

Thanks! Will defer to Alex to do a final pass before merging

alexsherstinsky · 2024-04-02T07:19:11Z

docs/configuration/large_language_model.md

+### Single Dataset Column with Additional Text
+
+If the input to the LLM must be created by prefixing and/or suffixing some static text 
+to the content of one dataset column, then a `prompt` `template` should be provided to specify how 


@sanjaydasgupta Should be provided or must be provided? Sorry for seeking this level of precision -- I feel that it is important (we have come a long way to make it this good, might as well make it beyond reproach!). 😄

I agree, will change and let you know.

alexsherstinsky

@sanjaydasgupta Other than my minor comment ("should" -> "must"), I have nothing to add -- looks great. Before we merge, let us just make sure that everything, including links, renders properly (mkdocs) -- and then please merge, or let me know, and I will in the morning Pacific Time. Thanks a lot for this important documentation change!

sanjaydasgupta · 2024-04-02T08:57:33Z

I'm not sure how to complete the mkdocs part, but will read up. It will likely be late morning PT when I finish. Thanks for your help!

alexsherstinsky · 2024-04-02T14:49:43Z

> I'm not sure how to complete the mkdocs part, but will read up. It will likely be late morning PT when I finish. Thanks for your help!

No worries, @sanjaydasgupta -- there is a README file which describes how to launch mkdocs -- you basically pip install ludwig and then run mkdocs and connect to localhost on the port it reports, then navigate to your pages and make sure that things look good to you. Thank you!

sanjaydasgupta · 2024-04-02T14:55:19Z

> > I'm not sure how to complete the mkdocs part, but will read up. It will likely be late morning PT when I finish. Thanks for your help!
>
> No worries, @sanjaydasgupta -- there is a README file which describes how to launch mkdocs -- you basically pip install ludwig and then run mkdocs and connect to localhost on the port it reports, then navigate to your pages and make sure that things look good to you. Thank you!

Hi @alexsherstinsky, all complete now. I have checked using mkdocs too, and the text displayed has all the changes we made. The index in the right margin is also linking correctly to the three subheadings. Thanks for your help.

updated description of input features

4d63585

alexsherstinsky reviewed Mar 28, 2024

alexsherstinsky reviewed Mar 28, 2024

alexsherstinsky requested changes Mar 28, 2024

updates based on comments from Alex

de4fd80

sanjaydasgupta marked this pull request as ready for review March 30, 2024 05:40

sanjaydasgupta requested review from w4nderlust, tgaddair, justinxzhao, arnavgarg1, geoffreyangus, jeffkinnison and Infernaught as code owners March 30, 2024 05:40

alexsherstinsky reviewed Mar 30, 2024

sanjaydasgupta added 2 commits March 30, 2024 14:45

A few more updates based on comments from Alex

46a9d43

Another check for more suggested changes

a9d3574

alexsherstinsky approved these changes Apr 1, 2024

arnavgarg1 reviewed Apr 1, 2024

arnavgarg1 reviewed Apr 1, 2024

arnavgarg1 reviewed Apr 1, 2024

Changes after Arnav's comments

baa4dac

arnavgarg1 reviewed Apr 2, 2024

arnavgarg1 approved these changes Apr 2, 2024

More changes after Arnav's comments

7f309bf

alexsherstinsky reviewed Apr 2, 2024

alexsherstinsky approved these changes Apr 2, 2024

Minor change suggested by Alex

b7ebc43

alexsherstinsky merged commit fd21ef6 into ludwig-ai:master Apr 2, 2024
1 check passed
