
Spectrogram Integration #235

Open
kkappler opened this issue Dec 23, 2024 · 2 comments
kkappler commented Dec 23, 2024

While adding spectrogram capabilities to mth5, there are some things that could be clarified:

  1. The term decimation is used more than it needs to be. A refactoring of the metadata to separate decimation from short-time Fourier transforms (STFTs, i.e. spectrograms) would add clarity. By decimation we refer to the cascading decimation process used in EMTF: a time-series operation that downsamples data after applying an anti-alias filter (AAF). The downsampled time series are often referred to as "decimation levels", but the concept of a decimation level has not been formalized in a clear way. There are currently the following decimation-related files in mt_metadata:

1A
./processing/fourier_coefficients/standards/decimation.json
./processing/fourier_coefficients/decimation.py
1B
./processing/aurora/decimation.py
./processing/aurora/standards/decimation.json
1C
./processing/aurora/decimation_level.py
./processing/aurora/standards/decimation_level.json

1A mixes the concepts of decimation and STFT, i.e. it holds both the time-domain decimation information and the STFT parameters.

1B specifies only the decimation factor and an integer "decimation level", but has no information about the STFT. N.B. it is not clear that this is needed or used.

1C has only the information about the STFT that would be applied to a time series.

Part of the complexity here stems from the fact that in the old days, when disk space was more of a concern, it was not common to store the decimated time series. The decimated time series were ephemeral and did not persist after FCs were computed. This means that an FC object that is not at the zeroth decimation level would not in general have a time-domain representation available. If the decimated time series were desired, they needed to be recomputed on demand.

A better way to manage the metadata could be to make it standard practice to compute and store the decimated time series.

Pros: Processing workflow is more modular.
Cons: Takes more disk space.
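To make the cascading decimation concrete, here is a minimal sketch of how the levels could be recomputed on demand. This is illustrative only, not mth5's implementation; it assumes scipy is available and uses `scipy.signal.decimate`, which applies an anti-alias filter before downsampling, matching the AAF-then-downsample process described above:

```python
import numpy as np
from scipy.signal import decimate


def cascade_decimate(ts, factors):
    """Apply an EMTF-style cascading decimation.

    Each stage low-pass filters (anti-alias) and then downsamples by
    the given factor.  Returns the list of "decimation levels";
    level 0 is the raw series.
    """
    levels = [ts]
    for q in factors:
        # decimate() applies an anti-alias filter before downsampling
        levels.append(decimate(levels[-1], q, ftype="iir", zero_phase=True))
    return levels


# Example: three cascaded stages, each downsampling by 4
raw = np.random.randn(4096)
levels = cascade_decimate(raw, [4, 4, 4])
print([len(x) for x in levels])  # [4096, 1024, 256, 64]
```

Storing `levels` rather than discarding everything but the FCs is exactly the pro/con trade-off noted above: more disk space, but each level is directly available for later processing.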

Other notes:

  • it would be nice to tag time series with the operations that have been performed on them, i.e. whether they are directly recorded field data or derived by application of an AAF and downsampling. The decimation and decimation-level classes we have do store this metadata, but it could be better organized.
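The tagging idea above could be sketched as a small provenance object. All names here are hypothetical, not existing mt_metadata classes:

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeriesProvenance:
    """Hypothetical metadata tag recording what was done to a series."""

    source: str = "field_recorded"  # vs. "derived"
    operations: list = field(default_factory=list)  # ordered history


# A series derived from field data by one decimation stage:
tag = TimeSeriesProvenance()
tag.source = "derived"
tag.operations += ["anti_alias_filter", "downsample:4"]
print(tag.operations)  # ['anti_alias_filter', 'downsample:4']
```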
kkappler commented Dec 24, 2024

After some review, it seems that

  • 1C should be made into a "TimeSeriesDecimation" metadata object, as it contains no information about the STFT.

    • To keep things modular, the decimation class from aurora (which is not aurora specific) can be promoted up one level, to processing (along with its standards/decimation.json). This can be done before removing the existing class.
    • We should add the AntiAliasFilter block of 1A to TimeSeriesDecimation.
    • Temporarily add a "decimation_method" block to 1C. At that point there are five common metadata fields. In 1C they are called ["level", "factor", "method", "sample_rate", "anti_alias_filter"], and in 1A they are called ["decimation_level", "decimation_factor", "decimation_method", "decimation_sample_rate", "anti_alias_filter"].
  • Then both 1A and 1B can have a TimeSeriesDecimation as part of their composition.

    • Replace mt_metadata/transfer_functions/processing/aurora/decimation_level.py usage of Decimation with TSDecimation
    • Then we can leave decimation embedded in decimation_level, but also embed decimation in the FCDecimation class (fourier_coefficients.py).
    • The FCDecimation class can temporarily be given passthrough methods for its previous "decimation_level", "decimation_factor", "sample_rate_decimation", etc. attributes, which return the corresponding data from self.time_series_decimation.
  • Cleanup

    • Remove tests/tf/processing/aurora/test_decimation.py
    • Remove mt_metadata/transfer_functions/processing/aurora/decimation.py
    • TODO: There is still an id field in FCDecimation. It is not clear if this belongs with decimation or FC.
    • mt_metadata updates: replace sample_rate_decimation with time_series_decimation.sample_rate
      • mt_metadata/tests/tf/processing/fcs/test_fc.py
      • fcs/test_decimation
    • aurora updates:
      • replace direct access to decimation attrs in FCDecimation with access via fcdec.time_series_decimation in fc_decimations_creator in pipelines/fourier_coefficients.py
    • Make all tests pass
      • mt_metadata
      • mth5
      • aurora (branch fix_mt_metadata_issue_235 is unable to import from mt_metadata the new path. Not sure if this is something to do with not overwriting mtpy-v2.)
      • mtpy-v2
  • Review minor conflicts encountered when merging the Decimations and confirm they are OK:

    • aurora's decimation.json had the default value for level=0, but in FCDecimation, the default decimation_level was null. Currently this is set to 0.
    • aurora's decimation.json had the dtype of factor as float, but in FCDecimation it was integer. Currently set to float.
  • This raises the question of whether FCChannel should have a sample_rate_decimation_level value, or just a sample_rate

A future PR could recursively grep for sample_rate_decimation and fix the remaining occurrences.
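The composition-plus-passthrough plan above can be sketched as follows. This is a minimal illustration of the intended pattern, not the actual mt_metadata classes; field names follow the five common fields listed above, and the passthrough properties stand in for the temporary backward-compatibility methods:

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeriesDecimation:
    """Sketch of the promoted metadata class (the '1C' fields)."""

    level: int = 0
    factor: float = 1.0
    method: str = "default"
    sample_rate: float = 1.0
    anti_alias_filter: str = "default"


@dataclass
class FCDecimation:
    """Sketch of FC metadata composed with a TimeSeriesDecimation."""

    time_series_decimation: TimeSeriesDecimation = field(
        default_factory=TimeSeriesDecimation
    )

    # Temporary passthroughs so old attribute names keep working
    @property
    def decimation_level(self):
        return self.time_series_decimation.level

    @property
    def sample_rate_decimation(self):
        return self.time_series_decimation.sample_rate


fc = FCDecimation(TimeSeriesDecimation(level=2, sample_rate=16.0))
print(fc.decimation_level, fc.sample_rate_decimation)  # 2 16.0
```

Once callers migrate to `fc.time_series_decimation.sample_rate`, the passthrough properties can be deleted without touching the composed class.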

kkappler added a commit that referenced this issue Dec 26, 2024
- this is one of two main replacements that needs to be done on #235
- the next main replacement will be in FCDecimation
kkappler commented Dec 27, 2024

After getting most of the way through this, it seems that processing/fc/decimation.py would be more appropriately named spectrogram.py, and the Decimation class inside should be Spectrogram.

There is another potential factoring: aurora's decimation_level.json and fc/decimation.json have much in common that could be merged into an STFT mt_metadata object.
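A merged STFT object might look something like the sketch below. The field names are assumptions for illustration (chosen to resemble typical windowed-FFT parameters), not the actual schema that would come out of merging the two json files:

```python
from dataclasses import dataclass


@dataclass
class STFT:
    """Hypothetical shared STFT metadata block that both aurora's
    decimation_level.json and fc/decimation.json could embed."""

    window_type: str = "hamming"
    num_samples_window: int = 128
    num_samples_overlap: int = 32
    prewhitening_type: str = "first difference"
    recoloring: bool = True


cfg = STFT(num_samples_window=256)
print(cfg.window_type, cfg.num_samples_window)  # hamming 256
```

With such an object in place, both an aurora DecimationLevel and an FC Spectrogram could hold a TimeSeriesDecimation plus an STFT, making the two schemas differ only in their processing-specific extras.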

@kkappler kkappler self-assigned this Dec 28, 2024