[docs] add versionadded notes for v4.0.0 features #5948

Merged: 8 commits, Jul 6, 2023

jameslamb (Collaborator) commented Jun 27, 2023

Contributes to #5153.

Adds notes to docs pointing out things that are new as of v4.0.0.

For the Python package, I'm proposing doing this via Sphinx .. versionadded:: directives: https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-versionadded.

For the docs generated from config.h and for the R package, I added plain notes in italics, in exactly the same format that .. versionadded:: renders ("New in version 4.0.0").
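As a rough illustration of the two approaches described above, here is a minimal sketch. The names `new_in_note` and `train_stub` are hypothetical and do not exist in LightGBM; only the `.. versionadded::` directive and the "New in version 4.0.0" wording come from this description.

```python
def new_in_note(version: str) -> str:
    """Return an italic reST note with the same wording ``.. versionadded::`` renders."""
    return f"*New in version {version}*"


def train_stub(validate_features: bool = False) -> bool:
    """Hypothetical function, shown only to illustrate where the directive goes.

    Parameters
    ----------
    validate_features : bool, optional (default=False)
        Illustrative parameter.

        .. versionadded:: 4.0.0
    """
    return validate_features


# The italic note is what would be written by hand where Sphinx directives
# are not available (for example in generated parameter docs).
print(new_in_note("4.0.0"))
```

The directive sits inside the parameter's description block so that Sphinx attaches the note to that parameter rather than to the whole function.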

[Screenshot: example of the R docs in RStudio]
[Screenshot: example of the Python API docs]
[Screenshot: example of the parameter docs]

Notes for Reviewers

I've temporarily enabled this branch on readthedocs.

jameslamb added the doc label Jun 27, 2023
@@ -233,6 +233,8 @@ You could edit your firewall rules to allow communication between any of the wor
Using Custom Objective Functions with Dask
******************************************

.. versionadded:: 4.0.0

@@ -145,6 +145,8 @@ Core Parameters

- ``goss``, Gradient-based One-Side Sampling

- *New in version 4.0.0*

@@ -670,6 +672,8 @@ Learning Control Parameters

- **Note**: can be used only with ``device_type = cpu``

- *New in version 4.0.0*

@@ -678,6 +682,8 @@ Learning Control Parameters

- **Note**: can be used only with ``device_type = cpu``

- *New in version 4.0.0*

@@ -686,10 +692,14 @@ Learning Control Parameters

- **Note**: can be used only with ``device_type = cpu``

- *New in version 4.0.0*

- ``stochastic_rounding`` :raw-html:`<a id="stochastic_rounding" title="Permalink to this parameter" href="#stochastic_rounding">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

- whether to use stochastic rounding in gradient quantization

- *New in version 4.0.0*

@@ -908,6 +918,8 @@ Dataset Parameters

- **Note**: ``lightgbm-transform`` is not maintained by LightGBM's maintainers. Bug reports or feature requests should go to `issues page <https://github.com/microsoft/lightgbm-transform/issues>`__

- *New in version 4.0.0*

@@ -931,6 +931,8 @@ def predict(
If True, ensure that the features used to predict match the ones used to train.
Used only if data is pandas DataFrame.

.. versionadded:: 4.0.0

@@ -2840,6 +2842,8 @@ def num_feature(self) -> int:
def feature_num_bin(self, feature: Union[int, str]) -> int:
"""Get the number of bins for a feature.

.. versionadded:: 4.0.0

@@ -4149,19 +4153,34 @@ def refit(
will use ``leaf_output = decay_rate * old_leaf_output + (1.0 - decay_rate) * new_leaf_output`` to refit trees.
reference : Dataset or None, optional (default=None)
Reference for ``data``.

.. versionadded:: 4.0.0

weight : list, numpy 1-D array, pandas Series or None, optional (default=None)
Weight for each ``data`` instance. Weights should be non-negative.

.. versionadded:: 4.0.0

group : list, numpy 1-D array, pandas Series or None, optional (default=None)
Group/query size for ``data``.
Only used in the learning-to-rank task.
sum(group) = n_samples.
For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups,
where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.

.. versionadded:: 4.0.0
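The ``group`` example in the docstring above can be checked with a short sketch. The helper name `group_ranges` is hypothetical and is not part of LightGBM; it only converts group sizes into 1-based record ranges.

```python
from itertools import accumulate


def group_ranges(group):
    """Return 1-based inclusive (first, last) record ranges, one per group size."""
    ends = list(accumulate(group))  # running totals mark the last record of each group
    starts = [end - size + 1 for end, size in zip(ends, group)]
    return list(zip(starts, ends))


group = [10, 20, 40, 10, 10, 10]
ranges = group_ranges(group)

# The sizes must add up to the number of records (100 in the docstring example).
assert sum(group) == 100
print(ranges)
```

The first three ranges correspond to records 1-10, 11-30, and 31-70, matching the docstring.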

init_score : list, list of lists (for multi-class task), numpy array, pandas Series, pandas DataFrame (for multi-class task), or None, optional (default=None)
Init score for ``data``.

.. versionadded:: 4.0.0

feature_name : list of str, or 'auto', optional (default="auto")
Feature names for ``data``.
If 'auto' and data is pandas DataFrame, data columns names are used.

.. versionadded:: 4.0.0

@@ -4172,13 +4191,25 @@ def refit(
All negative values in categorical features will be treated as missing values.
The output cannot be monotonically constrained with respect to a categorical feature.
Floating point numbers in categorical features will be rounded towards 0.

.. versionadded:: 4.0.0

dataset_params : dict or None, optional (default=None)
Other parameters for Dataset ``data``.

.. versionadded:: 4.0.0

free_raw_data : bool, optional (default=True)
If True, raw data is freed after constructing inner Dataset for ``data``.

.. versionadded:: 4.0.0

validate_features : bool, optional (default=False)
If True, ensure that the features used to refit the model match the original ones.
Used only if data is pandas DataFrame.

.. versionadded:: 4.0.0
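The ``decay_rate`` formula quoted at the top of this ``refit`` hunk can be worked through in plain Python. This is only an arithmetic sketch of the documented formula, not LightGBM's internal code; the function name `blend_leaf_output` and the sample values are made up for illustration.

```python
import math


def blend_leaf_output(old_leaf_output: float, new_leaf_output: float, decay_rate: float) -> float:
    """Apply the documented refit formula to a single leaf value."""
    return decay_rate * old_leaf_output + (1.0 - decay_rate) * new_leaf_output


# With decay_rate = 0.9 the old value dominates: 0.9 * 1.0 + 0.1 * 3.0
blended = blend_leaf_output(1.0, 3.0, decay_rate=0.9)
assert math.isclose(blended, 1.2)

# The extremes keep the old value (decay_rate = 1.0) or take the new one (decay_rate = 0.0).
assert blend_leaf_output(1.0, 3.0, decay_rate=1.0) == 1.0
assert blend_leaf_output(1.0, 3.0, decay_rate=0.0) == 3.0
```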

@@ -4270,6 +4301,8 @@ def set_leaf_output(
) -> 'Booster':
"""Set the output of a leaf.

.. versionadded:: 4.0.0

@@ -407,6 +407,9 @@ def early_stopping(stopping_rounds: int, first_metric_only: bool = False, verbos
If float, this single value is used for all metrics.
If list, its length should match the total number of metrics.

# https://github.com/microsoft/LightGBM/pull/4580
.. versionadded:: 4.0.0
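The float-versus-list rule in the docstring above can be sketched as follows. The helper `expand_min_delta` is hypothetical and is not how LightGBM implements the callback; it only illustrates the documented contract that a single float applies to every metric and a list must have one entry per metric.

```python
from typing import List, Union


def expand_min_delta(min_delta: Union[float, List[float]], n_metrics: int) -> List[float]:
    """Return one threshold per metric, following the documented float/list rule."""
    if isinstance(min_delta, list):
        if len(min_delta) != n_metrics:
            raise ValueError(
                f"min_delta has {len(min_delta)} values but there are {n_metrics} metrics"
            )
        return list(min_delta)
    # A single float is reused for every metric.
    return [float(min_delta)] * n_metrics


print(expand_min_delta(0.01, 3))
print(expand_min_delta([0.01, 0.0], 2))
```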

@@ -656,6 +656,9 @@ def create_tree_digraph(
example_case : numpy 2-D array, pandas DataFrame or None, optional (default=None)
Single row with the same structure as the training data.
If not None, the plot will highlight the path that sample takes through the tree.

.. versionadded:: 4.0.0

@@ -672,6 +675,8 @@ def create_tree_digraph(
graph = lgb.create_tree_digraph(clf, max_category_values=5)
HTML(graph._repr_image_svg_xml())

.. versionadded:: 4.0.0

@@ -484,6 +484,9 @@ def __init__(
threads configured for OpenMP in the system. A value of ``None`` (the default) corresponds
to using the number of physical cores in the system (its correct detection requires
either the ``joblib`` or the ``psutil`` util libraries to be installed).

.. versionchanged:: 4.0.0

@@ -968,6 +971,9 @@ def n_estimators_(self) -> int:

This might be less than parameter ``n_estimators`` if early stopping was enabled or
if boosting stopped early due to limits on complexity like ``min_gain_to_split``.

# https://github.com/microsoft/LightGBM/pull/4753
.. versionadded:: 4.0.0

@@ -979,6 +984,9 @@ def n_iter_(self) -> int:

This might be less than parameter ``n_estimators`` if early stopping was enabled or
if boosting stopped early due to limits on complexity like ``min_gain_to_split``.

# https://github.com/microsoft/LightGBM/pull/4753
.. versionadded:: 4.0.0

jameslamb mentioned this pull request on Jul 2, 2023 (19 tasks)
jameslamb changed the title from "WIP: [docs] add versionadded notes for v4.0.0 features" to "[docs] add versionadded notes for v4.0.0 features" on Jul 4, 2023
jameslamb marked this pull request as ready for review on July 4, 2023 03:55
jameslamb merged commit 99ac1ef into master on Jul 6, 2023
38 checks passed
jameslamb deleted the docs/version-added branch on July 6, 2023 17:15
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

github-actions bot locked this pull request as resolved and limited conversation to collaborators on Oct 11, 2023