Support dumping training logs for TensorBoard visualization toolkit. #1144

Status: Open, 13 commits to merge into base: main

classicsong
Contributor

Issue #, if available:
#988

Description of changes:
Add a TensorBoard tracker that will save training loss, validation scores and test scores into TensorBoard logs.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@classicsong classicsong added ready able to trigger the CI 0.4.1 labels Jan 23, 2025
@classicsong classicsong added this to the 0.4.1 release milestone Jan 23, 2025
@thvasilo
Contributor

@classicsong, to support optional dependencies we can do the following.

pandas has its own helper, pandas.compat._optional.import_optional_dependency, and we can use a similar mechanism.
See https://pandas.pydata.org/pandas-docs/version/1.4/development/contributing_codebase.html#optional-dependencies and https://pandas.pydata.org/pandas-docs/version/1.4/getting_started/install.html#install-optional-dependencies

Optional imports are done inline, using something like

def generate_numba_apply_func(
    func, nogil=True, nopython=True, parallel=False
) -> Callable[[npt.NDArray, Index, Index], dict[int, Any]]:
    numba = import_optional_dependency("numba")
    jitted_udf = numba.extending.register_jitable(func)

So in this case we could do

import importlib

class GSTensorBoardTracker(GSSageMakerTaskTracker):
    def __init__(self, log_report_frequency, log_dir=None):
        super().__init__(log_report_frequency, log_dir)
        # Import lazily so tensorboard is only required when this tracker is used.
        try:
            tensorboard = importlib.import_module("torch.utils.tensorboard")
        except ImportError as err:
            msg = (
                "GSTensorBoardTracker requires tensorboard to run. "
                "Please install the tensorboard Python package.")
            raise ImportError(msg) from err

        self._writer = tensorboard.SummaryWriter(log_dir)

This ensures the dependency is imported only when a GSTensorBoardTracker object is instantiated, not at module import time.
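
If more optional packages get added later, the same lazy-import logic could be factored into a small helper, loosely modeled on the pandas function linked above. The following is only a minimal sketch: the helper name `import_optional_dependency` mirrors pandas, but this implementation and its `extra` parameter are hypothetical and not part of GraphStorm.

```python
import importlib
from types import ModuleType


def import_optional_dependency(name: str, extra: str = "") -> ModuleType:
    """Import `name` on demand and raise a descriptive ImportError if it is missing.

    `extra` is a hypothetical parameter naming the pip extra that provides
    the package, used only to build the install hint in the error message.
    """
    try:
        return importlib.import_module(name)
    except ImportError as err:
        hint = f" Install it with: pip install graphstorm[{extra}]" if extra else ""
        raise ImportError(f"Missing optional dependency '{name}'.{hint}") from err
```

With a helper like this, the tracker's constructor would shrink to a single call such as `import_optional_dependency("torch.utils.tensorboard", extra="visualization")`, and the error message would stay consistent across trackers.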

Then we can modify our setup.py to include tensorboard as an extra dependency:

setup(
    # Metadata
    name='graphstorm',
    version=VERSION,
    python_requires='>=3.8',
    description='GraphStorm',
    long_description_content_type='text/markdown',
    license='Apache-2.0',

    # Package info
    packages=find_packages(where="python", exclude=(
        'tests',
    )),
    package_dir={"": "python"},
    package_data={'': [os.path.join('datasets', 'dataset_checksums', '*.txt')]},
    zip_safe=True,
    include_package_data=True,
    install_requires=requirements,
    ext_modules=extensions,
    cmdclass=cmdclass,
    
    extras_require={
        'visualization': [
            'tensorboard', # Add minimum version if needed, e.g. 'tensorboard>=2.11.2'
        ]
    },
)

Then, when users want TensorBoard visualization, they can run pip install graphstorm[visualization] to pull in the extra tensorboard dependency.
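
Since the extra is optional, calling code may want to check whether it is installed before selecting the TensorBoard tracker. A minimal sketch of such a check, assuming only the standard library, is below; the function name `is_installed` is hypothetical and nothing like it is confirmed to exist in GraphStorm.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found, without importing it."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Report whether the optional visualization dependency is present.
    print("tensorboard available:", is_installed("tensorboard"))
```

This only works reliably for top-level package names; for a dotted name such as `torch.utils.tensorboard`, `find_spec` imports the parent package first and raises if that parent is missing.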

Xiang Song added 2 commits January 23, 2025 22:56
classicsong added a commit that referenced this pull request Jan 24, 2025
…ndency (#1146)

*Issue #, if available:*
#988

*Description of changes:*
Follow #1144 to update graphstorm dependency.


By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

Co-authored-by: Xiang Song <[email protected]>
@classicsong classicsong requested a review from thvasilo January 25, 2025 01:19