Chapter 10: Discrepancy between problem statement and Keras implementation in timeseries_dataset_from_array() #238

Open
juandevprojects opened this issue Apr 4, 2024 · 1 comment

Comments

@juandevprojects

Description:
While reading Chapter 10, "Deep learning for timeseries", I noticed a potential discrepancy between the problem statement and the actual implementation.

Problem Statement:
The problem statement, as described in section 10.2.1, outlines a scenario where temperature data and other variables for 5 days, sampled once per hour, are provided. The objective is to predict the temperature 24 hours ahead.

Concern:
According to the problem statement, there are 120 samples in 5 days (24 samples per day). The dataset should consist of sequences representing 5 days of data, with each sequence containing a maximum of 120 samples.

Keras Implementation:
However, when utilizing the timeseries_dataset_from_array() function with parameters sampling_rate = 6 and sequence_length = 120, it generates sequences corresponding to 30 days (4 samples per day). This seems to deviate from the problem statement's objective of predicting temperature with data from 5 days, not 30.
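For reference, here is a minimal pure-Python sketch of which rows end up in a single window. It assumes that a window starting at row `i` takes `sequence_length` rows spaced `sampling_rate` rows apart, which is how the Keras utility appears to behave. `window_indices` is a hypothetical helper written for illustration, not part of Keras.

```python
def window_indices(start, sequence_length, sampling_rate):
    """Row indices of one window: `sequence_length` rows, `sampling_rate` apart.

    Hypothetical helper that mimics the assumed index selection of
    timeseries_dataset_from_array(); it does not call Keras.
    """
    stop = start + sequence_length * sampling_rate
    return list(range(start, stop, sampling_rate))


idx = window_indices(0, sequence_length=120, sampling_rate=6)
print(len(idx))   # number of timesteps in the window
print(idx[:4])    # first few raw row indices
print(idx[-1])    # last raw row index used by the window
```

Under that assumption, each window holds 120 timesteps drawn from a span of 720 raw rows. How much real time that covers depends on how often the raw rows were recorded.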

Proposed Solution:
One potential solution could be adjusting the sequence_length parameter to 20. This adjustment would ensure that sequences contain data from 5 consecutive days (4 samples per day using sampling_rate = 6), aligning with the problem statement's requirements.

Request for Clarification:
I'd appreciate clarification on whether my analysis is accurate and if the implementation aligns with the intended problem statement. If not, guidance on how to correctly utilize the timeseries_dataset_from_array() function for the specified problem would be valuable.

Thank you for your attention to this matter.

@juandevprojects juandevprojects changed the title Discrepancy between problem statement and Keras implementation in timeseries_dataset_from_array() Chapter 10: Discrepancy between problem statement and Keras implementation in timeseries_dataset_from_array() Apr 4, 2024
@shenchenbing

The original data contains 6 records per hour (one every 10 minutes), so sampling_rate=6 keeps 1 record per hour. With sequence_length=120, each sequence therefore covers 120 hours, i.e. 5 days. The book's description is correct.
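The arithmetic can be checked with a short sketch. It assumes one raw record every 10 minutes (6 per hour, as stated above). `window_duration_hours` is a hypothetical helper name used only for this illustration.

```python
def window_duration_hours(sequence_length, sampling_rate, minutes_per_record):
    """Hours of real time covered by one window (hypothetical helper)."""
    # Time between two consecutive timesteps kept in the window.
    hours_between_timesteps = sampling_rate * minutes_per_record / 60
    return sequence_length * hours_between_timesteps


# Book settings, with raw data recorded every 10 minutes (assumption).
book = window_duration_hours(120, 6, 10)
# Same settings if the raw data were hourly, as the issue assumed.
hourly = window_duration_hours(120, 6, 60)
# Proposed sequence_length=20 with 10-minute raw data.
proposed = window_duration_hours(20, 6, 10)

print(book / 24, hourly / 24, proposed / 24)  # window lengths in days
```

With 10-minute records the book's parameters give 5-day windows. The 30-day figure only arises if the raw data is assumed to be hourly, and sequence_length=20 would shrink each window to 20 hours.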
