Add StepLR scheduler #109
base: main
Conversation
Walkthrough: The pull request introduces significant enhancements to the configuration documentation and files related to learning rate scheduling.
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
## main #109 +/- ##
==========================================
+ Coverage 96.64% 97.37% +0.73%
==========================================
Files 23 38 +15
Lines 1818 3701 +1883
==========================================
+ Hits 1757 3604 +1847
- Misses 61 97 +36
☔ View full report in Codecov by Sentry.
Actionable comments posted: 11
🧹 Outside diff range and nitpick comments (13)
tests/assets/minimal_instance/initial_config.yaml (1)
13-14: Consider parameterizing the crop dimensions. The crop dimensions are hardcoded to 160x160. Consider making these values configurable through environment variables or command-line arguments for better flexibility.
tests/assets/minimal_instance_bottomup/initial_config.yaml (2)
5-6: Document the new data configuration parameters. Please add comments explaining:
- The impact of `user_instances_only: True` on data processing
- How `chunk_size: 100` affects performance and memory usage

70-70: Document the purpose of bin_files_path. Please add a comment explaining the purpose of this configuration field and whether an empty value is expected in this test configuration.
docs/config_bottomup.yaml (1)
6-6: Document the purpose of the chunk_size parameter. The newly added `chunk_size` parameter lacks documentation explaining its purpose and impact on data processing. Consider adding a comment to clarify its usage and recommended values.

docs/config_topdown_centered_instance.yaml (1)
6-6: Document new parameters and provide a default value for min_crop_size. The newly added parameters need documentation:
- `chunk_size: 100` - What does this value represent and how was it determined?
- `min_crop_size` - This parameter is empty. Please provide a default value or document if it's optional.

Also applies to: 15-15
initial_config.yaml (1)
1-104: Fix file format consistency. The file uses incorrect newline characters. Ensure the file uses Unix-style line endings (\n) instead of Windows-style (\r\n).
sleap_nn/data/streaming_datasets.py (1)
154-155: Consider adding input validation for `input_scale`. To prevent potential issues, consider adding validation for the `input_scale` parameter in `__init__` to ensure it's positive.

  def __init__(
      self,
      confmap_head: DictConfig,
      crop_hw: Tuple[int],
      max_stride: int,
      apply_aug: bool = False,
      augmentation_config: DictConfig = None,
      input_scale: float = 1.0,
      *args,
      **kwargs,
  ):
      """Construct a CenteredInstanceStreamingDataset."""
      super().__init__(*args, **kwargs)
+     if input_scale <= 0:
+         raise ValueError("input_scale must be positive")
      self.confmap_head = confmap_head
      self.crop_hw = crop_hw
      self.max_stride = max_stride
      self.apply_aug = apply_aug
      self.aug_config = augmentation_config
      self.input_scale = input_scale

tests/training/test_model_trainer.py (2)
103-110: Enhance test coverage for StepLR scheduler. While the test configures the StepLR scheduler, it doesn't verify that the learning rate actually decreases according to the schedule. Consider adding assertions to check the learning rate values at different epochs.

Add assertions like this after training:

  trainer = ModelTrainer(config)
  trainer.train()
+
+ # Verify learning rate schedule
+ df = pd.read_csv(Path(config.trainer_config.save_ckpt_path).joinpath("lightning_logs/version_0/metrics.csv"))
+ initial_lr = config.trainer_config.optimizer.lr
+ # Check that LR is halved every 10 epochs
+ assert abs(df.loc[0, "learning_rate"] - initial_lr) <= 1e-4
+ assert abs(df.loc[10, "learning_rate"] - initial_lr * 0.5) <= 1e-4
337-342: Enhance exception testing for invalid schedulers. The test only verifies one invalid scheduler case. Consider testing multiple invalid cases and verifying the error message content.

Add more test cases:

  # check exception for lr scheduler
- OmegaConf.update(config, "trainer_config.lr_scheduler.scheduler", "ReduceLR")
- with pytest.raises(ValueError):
-     trainer = ModelTrainer(config)
-     trainer.train()
+ invalid_schedulers = ["ReduceLR", "CustomLR", "InvalidScheduler"]
+ for invalid_scheduler in invalid_schedulers:
+     OmegaConf.update(config, "trainer_config.lr_scheduler.scheduler", invalid_scheduler)
+     with pytest.raises(ValueError, match=f"Unsupported scheduler: {invalid_scheduler}"):
+         trainer = ModelTrainer(config)
+         trainer.train()

docs/config.md (3)
179-189: Fix markdown formatting inconsistencies. The indentation of list items is inconsistent with the rest of the document. Please adjust the indentation to match the document's style:
- Reduce indentation of `scheduler` and main sections to 4 spaces
- Reduce indentation of subsections to 8 spaces

Apply this formatting:

- `lr_scheduler`
    - `scheduler`: (str) Name of the scheduler to use. Valid schedulers: `"StepLR"`, `"ReduceLROnPlateau"`.
    - `step_lr`:
        - `step_size`: (int) Period of learning rate decay. If `step_size`=10, then every 10 epochs, learning rate will be reduced by a factor of `gamma`.
        - `gamma`: (float) Multiplicative factor of learning rate decay. *Default*: 0.1.
    - `reduce_lr_on_plateau`:
        - `threshold`: (float) Threshold for measuring the new optimum, to only focus on significant changes. *Default*: 1e-4.
        - `threshold_mode`: (str) One of "rel", "abs". In rel mode, dynamic_threshold = best * ( 1 + threshold ) in max mode or best * ( 1 - threshold ) in min mode. In abs mode, dynamic_threshold = best + threshold in max mode or best - threshold in min mode. *Default*: "rel".
        - `cooldown`: (int) Number of epochs to wait before resuming normal operation after lr has been reduced. *Default*: 0.
        - `patience`: (int) Number of epochs with no improvement after which learning rate will be reduced. For example, if patience = 2, then we will ignore the first 2 epochs with no improvement, and will only decrease the LR after the third epoch if the loss still hasn't improved then. *Default*: 10.
        - `factor`: (float) Factor by which the learning rate will be reduced. new_lr = lr * factor. *Default*: 0.1.
        - `min_lr`: (float or List[float]) A scalar or a list of scalars. A lower bound on the learning rate of all param groups or each group respectively. *Default*: 0.
185-185: Fix emphasis markers in threshold_mode description. Remove spaces inside emphasis markers in the description of `threshold_mode`. Apply this change:

- In * rel * mode, dynamic_threshold = best * ( 1 + threshold ) in max mode or best * ( 1 - threshold ) in min mode.
+ In *rel* mode, dynamic_threshold = best * ( 1 + threshold ) in max mode or best * ( 1 - threshold ) in min mode.
179-189: Consider adding examples for each scheduler configuration. To improve documentation clarity, consider adding example configurations for both StepLR and ReduceLROnPlateau schedulers. This would help users understand how to properly configure these schedulers for their use cases.
Would you like me to generate example configurations for both schedulers?
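For illustration only, such example configurations might look like the following sketch. The key names (`scheduler`, `step_lr`, `reduce_lr_on_plateau`, and their fields) are taken from the documentation and configs discussed in this PR, but the exact nesting under `trainer_config` should be verified against the project's config schema.

```yaml
# Sketch: select StepLR and decay the LR by `gamma` every `step_size` epochs.
trainer_config:
  lr_scheduler:
    scheduler: StepLR
    step_lr:
      step_size: 10   # reduce the LR every 10 epochs
      gamma: 0.5      # multiply the LR by 0.5 at each step
---
# Sketch: select ReduceLROnPlateau and reduce the LR when val_loss plateaus.
trainer_config:
  lr_scheduler:
    scheduler: ReduceLROnPlateau
    reduce_lr_on_plateau:
      threshold: 1.0e-07
      threshold_mode: abs
      cooldown: 3
      patience: 5
      factor: 0.5
      min_lr: 1.0e-08
```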
sleap_nn/training/model_trainer.py (1)
Line range hint 675-684: Consider simplifying the return statement structure. The current implementation has two separate return statements with similar structures. This could be simplified to improve readability and maintainability.

- if self.trainer_config.lr_scheduler.scheduler is None:
-     return {
-         "optimizer": optimizer,
-     }
-
- return {
-     "optimizer": optimizer,
-     "lr_scheduler": {
-         "scheduler": scheduler,
-         "monitor": "val_loss",
-     },
- }
+ config = {"optimizer": optimizer}
+ if self.trainer_config.lr_scheduler.scheduler is not None:
+     config["lr_scheduler"] = {
+         "scheduler": scheduler,
+         "monitor": "val_loss",
+     }
+ return config
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (15)
- docs/config.md (1 hunks)
- docs/config_bottomup.yaml (2 hunks)
- docs/config_centroid.yaml (2 hunks)
- docs/config_topdown_centered_instance.yaml (3 hunks)
- initial_config.yaml (1 hunks)
- sleap_nn/data/streaming_datasets.py (1 hunks)
- sleap_nn/training/model_trainer.py (1 hunks)
- tests/assets/minimal_instance/initial_config.yaml (3 hunks)
- tests/assets/minimal_instance/training_config.yaml (4 hunks)
- tests/assets/minimal_instance_bottomup/initial_config.yaml (3 hunks)
- tests/assets/minimal_instance_bottomup/training_config.yaml (3 hunks)
- tests/assets/minimal_instance_centroid/initial_config.yaml (3 hunks)
- tests/assets/minimal_instance_centroid/training_config.yaml (3 hunks)
- tests/fixtures/datasets.py (2 hunks)
- tests/training/test_model_trainer.py (5 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/config.md
[uncategorized] ~180-~180: Loose punctuation mark.
Context: ...ReduceLROnPlateau". - `step_lr`: - `step_size`: (int) Period...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~183-~183: Loose punctuation mark.
Context: ...*: 0.1. - `reduce_lr_on_plateau`: - `threshold`: (float) Thre...
(UNLIKELY_OPENING_PUNCTUATION)
🪛 Markdownlint
docs/config.md
179-179: Expected: 4; Actual: 8
Unordered list indentation
(MD007, ul-indent)
180-180: Expected: 4; Actual: 8
Unordered list indentation
(MD007, ul-indent)
181-181: Expected: 6; Actual: 12
Unordered list indentation
(MD007, ul-indent)
182-182: Expected: 6; Actual: 12
Unordered list indentation
(MD007, ul-indent)
183-183: Expected: 4; Actual: 8
Unordered list indentation
(MD007, ul-indent)
184-184: Expected: 6; Actual: 12
Unordered list indentation
(MD007, ul-indent)
185-185: Expected: 6; Actual: 12
Unordered list indentation
(MD007, ul-indent)
186-186: Expected: 6; Actual: 12
Unordered list indentation
(MD007, ul-indent)
187-187: Expected: 6; Actual: 12
Unordered list indentation
(MD007, ul-indent)
188-188: Expected: 6; Actual: 12
Unordered list indentation
(MD007, ul-indent)
189-189: Expected: 6; Actual: 12
Unordered list indentation
(MD007, ul-indent)
185-185: null
Spaces inside emphasis markers
(MD037, no-space-in-emphasis)
185-185: null
Spaces inside emphasis markers
(MD037, no-space-in-emphasis)
🪛 yamllint
initial_config.yaml
[error] 1-1: wrong new line character: expected \n
(new-lines)
🔇 Additional comments (21)
tests/assets/minimal_instance_centroid/initial_config.yaml (3)
5-6: Verify if data config changes are intended for this PR.
The addition of `user_instances_only` and `chunk_size` parameters appears unrelated to the PR's objective of adding StepLR scheduler support. Please confirm if these changes should be part of this PR.
#!/bin/bash
# Check if these parameters are referenced in related changes
echo "Searching for related changes to user_instances_only and chunk_size..."
rg -l "user_instances_only|chunk_size" --type yaml
84-91: Restructure scheduler config to support both ReduceLROnPlateau and StepLR.
The current changes only implement ReduceLROnPlateau configuration, but the PR's objective is to add StepLR support. Consider restructuring to support both scheduler types:
lr_scheduler:
- scheduler: ReduceLROnPlateau
- reduce_lr_on_plateau:
- threshold: 1.0e-07
- threshold_mode: abs
- cooldown: 3
- patience: 5
- factor: 0.5
- min_lr: 1.0e-08
+ type: "reduce_lr_on_plateau" # or "step_lr"
+ params:
+ # Common parameters
+ factor: 0.5
+ # ReduceLROnPlateau specific
+ threshold: 1.0e-07
+ threshold_mode: abs
+ cooldown: 3
+ patience: 5
+ min_lr: 1.0e-08
+ # StepLR specific (commented until needed)
+ # step_size: 30
+ # gamma: 0.5
This structure:
- Makes the scheduler type explicit via the `type` field
- Supports both scheduler types in a clean way
- Keeps common parameters at the top level
- Maintains backward compatibility
#!/bin/bash
# Check for any existing StepLR references or tests
echo "Searching for StepLR implementation..."
rg "StepLR|step_lr" --type python
64-64: Clarify the purpose of `bin_files_path`.
This new parameter is added without a value and seems unrelated to the StepLR scheduler implementation. Please provide documentation about its purpose or remove it if not needed for this PR.
tests/assets/minimal_instance/initial_config.yaml (3)
89-96: Add support for StepLR configuration.
The PR's objective is to add StepLR support, but the configuration only includes ReduceLROnPlateau. Consider:
- Making the scheduler type configurable rather than hardcoded
- Adding a parallel configuration section for StepLR parameters
Here's a suggested structure:
lr_scheduler:
- scheduler: ReduceLROnPlateau
+ scheduler: ${scheduler_type} # Can be "ReduceLROnPlateau" or "StepLR"
reduce_lr_on_plateau:
threshold: 1.0e-07
threshold_mode: abs
cooldown: 3
patience: 5
factor: 0.5
min_lr: 1.0e-08
+ step_lr:
+ step_size: 30
+ gamma: 0.5
+ last_epoch: -1
Let's check for any existing StepLR configurations in the codebase:
#!/bin/bash
# Search for StepLR-related configurations
rg -i "step_lr|steplr|step_size|gamma"
69-69: Specify a value for bin_files_path or document if it's optional.
The `bin_files_path` parameter is added but left empty. Please document whether this is an optional parameter.
Let's check how this parameter is used:
#!/bin/bash
# Search for bin_files_path usage to determine if it's required
rg -i "bin_files_path"
15-15: Specify a value for min_crop_size or document if it's optional.
The `min_crop_size` parameter is added but left empty. This could lead to validation issues if the parameter is required.
Let's check if this parameter is used elsewhere in the codebase:
✅ Verification successful
The empty min_crop_size parameter is valid as it's an optional configuration.
Based on the codebase analysis:
- `min_crop_size` is an optional parameter that defaults to `None` or `null` in YAML configs
- It's used in the `find_instance_crop_size()` function to set a minimum boundary for automatic crop size calculation
- The documentation in `config.md` confirms it's only used when `crop_hw` is `None`
- Other test configurations also set this to `null`, indicating this is a common pattern
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for min_crop_size usage to determine if it's required
rg -i "min_crop_size"
Length of output: 1636
Script:
#!/bin/bash
# Search for min_crop_size usage with case-insensitive pattern and include context
rg -i "min.?crop.?size" -A 3 -B 3
Length of output: 8970
Script:
#!/bin/bash
# Search for crop-related configurations in YAML files
fd -e yaml -e yml -x grep -i "min.?crop.?size\|crop" {}
Length of output: 269
Script:
#!/bin/bash
# Look for data configuration or preprocessing files that might use crop size
fd -t f -e py | xargs grep -l -i "crop"
Length of output: 883
tests/assets/minimal_instance_bottomup/initial_config.yaml (1)
92-97: Verify the ReduceLROnPlateau parameters.
Please confirm:
- Is `threshold: 1.0e-07` sensitive enough for your loss function?
- Is `cooldown: 3` sufficient to observe stable improvements?
- Is `patience: 5` appropriate given your typical training dynamics?
- Will `min_lr: 1.0e-08` allow sufficient model refinement?
Let's check for consistency across other configurations:
✅ Verification successful
Based on the code analysis, I can now provide a final response:
The ReduceLROnPlateau parameters appear appropriate for MSE loss training
The codebase reveals:
- The model uses MSE loss (`nn.MSELoss()`) as the primary loss function
- The threshold of 1.0e-07 is suitable for MSE loss, which typically produces small values
- Cooldown of 3 epochs provides sufficient stabilization time between LR adjustments
- Patience of 5 epochs allows reasonable time to observe genuine improvements vs fluctuations
- The min_lr of 1.0e-08 is appropriate, as it's an order of magnitude below the threshold
The parameters are also consistent across all configurations in the codebase, suggesting they've been validated through usage.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Compare lr_scheduler settings across configs
fd -e yaml -e yml . -x grep -A 6 "lr_scheduler:"
Length of output: 1682
Script:
#!/bin/bash
# Let's look for loss function implementations and typical values
rg -A 5 "loss|criterion" --type py
# Also check for any training logs or documentation that might show typical loss values
fd "train|log|readme" -e md -e txt -e log
Length of output: 42950
tests/assets/minimal_instance_centroid/training_config.yaml (2)
75-75: LGTM!
The empty `bin_files_path` is appropriate for a test configuration.
5-6: Verify the data loading parameters.
The new data loading parameters might affect training behavior:
- `user_instances_only: True` restricts processing to user-created instances
- `chunk_size: 100` defines the size of data chunks
Please verify that these values are appropriate for the test configuration.
✅ Verification successful
Data loading parameters are consistent with project standards
The parameters `user_instances_only: True` and `chunk_size: 100` are:
- Used consistently across all test configurations and documentation
- Properly documented in `docs/config.md` with clear default values
- Correctly implemented in the codebase with proper handling in `sleap_nn/training/model_trainer.py` and data providers
- Part of the standard data loading configuration across different model types (centroid, bottomup, etc.)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other test configurations to ensure consistency
rg -l "user_instances_only|chunk_size" "tests/assets/**/training_config.yaml"
Length of output: 156
Script:
#!/bin/bash
# Let's first find the correct path to test configurations
fd "training_config.yaml" tests/
# Then search for usage of these parameters in the codebase
rg "user_instances_only|chunk_size" -A 2 -B 2
Length of output: 23238
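For reference, the data-loading keys discussed in this comment typically appear together in the training configs, roughly as in this sketch (the section name and exact nesting here are assumptions and should be checked against the project's YAML files):

```yaml
data_config:                    # assumed section name
  provider: LabelsReader
  user_instances_only: True     # only use user-labeled instances
  chunk_size: 100               # number of samples per data chunk
```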
tests/assets/minimal_instance_bottomup/training_config.yaml (3)
5-6: LGTM: Data configuration parameters are well-structured.
The addition of `user_instances_only` and `chunk_size` parameters is clear and follows the existing configuration pattern.
85-85: LGTM: Optional path configuration.
The `bin_files_path` parameter is appropriately added as an optional configuration field.
105-112: Verify scheduler implementation consistency across the codebase.
Let's ensure that both schedulers are properly supported across all configuration files and the corresponding implementation.
✅ Verification successful
Scheduler implementation is consistent across the codebase
The ModelTrainer implementation properly supports both `ReduceLROnPlateau` and `StepLR` schedulers, and all configuration files consistently use the same scheduler configuration structure. The changes in the test configuration align with the implementation.
- All YAML configs use the same structure for scheduler configuration
- ModelTrainer explicitly handles both schedulers and validates scheduler names
- Test suite includes coverage for scheduler configuration validation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for scheduler-related configurations and implementations
# Check all YAML configs for scheduler configurations
echo "Checking YAML configurations..."
fd -e yaml -e yml | xargs rg -l "scheduler|lr_scheduler"
# Check ModelTrainer implementation for scheduler support
echo -e "\nChecking ModelTrainer implementation..."
rg -A 10 "configure_optimizers|lr_scheduler" src/
Length of output: 777
Script:
#!/bin/bash
# Let's check the content of these config files to understand scheduler configurations
echo "Checking scheduler configurations in YAML files..."
for file in $(fd -e yaml -e yml); do
echo -e "\n=== $file ==="
rg -A 5 "scheduler:" "$file" || true
done
# Let's also search for Python files that might contain scheduler implementation
echo -e "\nSearching for Python files with scheduler implementation..."
fd -e py | xargs rg -l "scheduler|ReduceLROnPlateau|lr_scheduler"
Length of output: 4879
Script:
#!/bin/bash
# Let's examine the scheduler implementation in the model trainer
echo "Checking ModelTrainer implementation..."
rg -B 2 -A 10 "lr_scheduler|ReduceLROnPlateau" sleap_nn/training/model_trainer.py
echo -e "\nChecking test implementation..."
rg -B 2 -A 10 "lr_scheduler|ReduceLROnPlateau" tests/training/test_model_trainer.py
Length of output: 4612
tests/assets/minimal_instance/training_config.yaml (2)
5-6: Verify the necessity of new data configuration parameters.
These new parameters (`user_instances_only`, `chunk_size`, `min_crop_size`) appear unrelated to the PR's objective of adding StepLR scheduler support. Please clarify their purpose and whether they should be part of this PR.
Also applies to: 15-15
82-82: Verify the necessity of the bin_files_path parameter.
The addition of the `bin_files_path` parameter appears unrelated to the StepLR scheduler implementation. Please clarify its purpose and whether it should be part of this PR.
docs/config_topdown_centered_instance.yaml (1)
107-114: Restructure lr_scheduler config to support both ReduceLROnPlateau and StepLR.
The current configuration structure doesn't align with the PR objective of adding StepLR support. Consider restructuring to allow selecting between different scheduler types.
Here's a suggested structure:
lr_scheduler:
type: "ReduceLROnPlateau" # or "StepLR"
params:
# Common parameters
factor: 0.5
min_lr: 1.0e-08
# ReduceLROnPlateau specific
threshold: 1.0e-07
threshold_mode: abs
cooldown: 3
patience: 5
# StepLR specific
step_size: 30 # number of epochs between learning rate decays
This structure would:
- Make it clear which scheduler type is being used
- Support both ReduceLROnPlateau and StepLR configurations
- Maintain backward compatibility while adding new functionality
Let's verify if any other configuration files need similar updates:
initial_config.yaml (1)
26-52: LGTM: Model configuration is well-structured.
The backbone and head configurations are properly defined with appropriate default values.
docs/config_centroid.yaml (1)
6-6: LGTM: Consistent chunk_size configuration.
The addition of the `chunk_size` parameter is well-placed and maintains consistency with other configuration files.
tests/fixtures/datasets.py (1)
51-55: Verify unrelated changes to data provider configuration.
These changes appear unrelated to the StepLR scheduler implementation:
- Changed provider from "LabelsReaderDP" to "LabelsReader"
- Added new configurations for user_instances_only and chunk_size
sleap_nn/data/streaming_datasets.py (1)
154-155: Performance improvement: Moved crop size scaling to initialization.
Good optimization! Moving the crop size scaling calculation from `__getitem__` to `__init__` avoids redundant calculations on every item fetch while maintaining the same functionality. This is especially beneficial when processing large datasets.
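As a generic illustration of that pattern (a hypothetical dataset class, not `CenteredInstanceStreamingDataset` itself), precomputing a value that never changes per item keeps `__getitem__` cheap:

```python
import torch
from torch.utils.data import Dataset


class ScaledCropDatasetSketch(Dataset):
    """Hypothetical example: compute the scaled crop size once in __init__."""

    def __init__(self, images, crop_hw=(160, 160), input_scale=1.0):
        self.images = images
        # Done once here, instead of being recomputed on every __getitem__ call.
        self.scaled_crop_hw = tuple(int(round(d * input_scale)) for d in crop_hw)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        h, w = self.scaled_crop_hw
        # Crop to the precomputed size; no per-item scaling arithmetic.
        return self.images[idx][..., :h, :w]


# Example usage with dummy data: crops of 80x80 when input_scale=0.5.
ds = ScaledCropDatasetSketch([torch.zeros(1, 320, 320)], input_scale=0.5)
print(ds[0].shape)
```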
docs/config.md (1)
179-182: LGTM! Clear and well-documented StepLR configuration.
The new StepLR scheduler configuration is well-structured and includes essential parameters with clear descriptions. The implementation aligns with the PR objectives to add StepLR support.
sleap_nn/training/model_trainer.py (1)
651-674: LGTM: Scheduler selection logic is well-structured.
The implementation correctly:
- Handles both `StepLR` and `ReduceLROnPlateau` schedulers
- Provides clear error messages for invalid scheduler types
- Maintains consistent configuration access patterns
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
sleap_nn/training/model_trainer.py (1)
660-662: Standardize quotation marks in error message.
The error message uses inconsistent quotation marks around the scheduler names. For clarity and consistency, consider standardizing the quotation marks.
Apply this diff to improve the error message formatting:

  raise ValueError(
-     f"{self.trainer_config.lr_scheduler.scheduler} is not a valid scheduler. Valid schedulers: `'StepLR'`, `'ReduceLROnPlateau'`"
+     f"{self.trainer_config.lr_scheduler.scheduler} is not a valid scheduler. Valid schedulers: 'StepLR', 'ReduceLROnPlateau'"
  )
Currently, we support `ReduceLROnPlateau` for reducing the learning rate by monitoring the `val_loss`. In this PR, we add support for `StepLR`, to have a fixed learning rate schedule (e.g., reducing the lr by 0.5 every 30 epochs). A minimal sketch of this behavior is included after the summary below.

Summary by CodeRabbit
New Features
- New data configuration options `chunk_size` and `user_instances_only`.
- Support for the `StepLR` learning rate scheduler in addition to `ReduceLROnPlateau`.
Bug Fixes
Documentation
Tests
- Added tests for `ModelTrainer` to validate new learning rate scheduler configurations and error handling.
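As referenced in the description above, a minimal standalone sketch (using plain PyTorch, not this repository's code) shows the fixed schedule that `StepLR` produces:

```python
import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scheduler = StepLR(optimizer, step_size=30, gamma=0.5)  # halve the LR every 30 epochs

for epoch in range(90):
    # ...one epoch of training would run here...
    optimizer.step()
    scheduler.step()

# After 90 epochs the LR has been halved three times: 1e-3 * 0.5**3 = 1.25e-4
print(scheduler.get_last_lr())
```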