Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

General cleanup and refactoring in preparation for batch transmission #434

Merged
merged 11 commits into from
Sep 14, 2023

Conversation

tpwrules
Copy link
Contributor

My group has been working on a project to enhance ModularSensors with batch data transmission. We have successfully implemented it, it's been running for over a month, and we wish to contribute it back.

This PR contains some preliminary refactorings which make our project possible. Let's get these in first to minimize the set of changes to be reviewed. Please see individual commit messages for details.

@aufdenkampe aufdenkampe changed the base branch from master to develop March 16, 2023 20:35
@aufdenkampe
Copy link
Member

@tpwrules, thanks for your Pull Request with your contributions toward batch data transmission! I've done a quick look at your commits and really appreciate their level of granularity and documentation describing why you made the changes.

The only concern I had was with commit 1498d3b, which removed some print functions that we use for some debugging.

@SRGDamia1, will you review this?

@tpwrules
Copy link
Contributor Author

tpwrules commented Mar 16, 2023

Thank you for the vote of confidence.

The only concern I had was with commit 1498d3b, which removed some print functions that we use for some debugging.

I did some digging through the commit history and it looked to me like these were remnants of an earlier design of the data publisher stuff. I had all my debugging needs served by the printout of the request to the debug serial port, which I preserved.

Do you actively use them for debugging today? If so, then I think they would be better served with a different structure anyway. Having multiple functions to do the same thing is hard to maintain and unreliable to debug with.

@aufdenkampe
Copy link
Member

@tpwrules, that great to hear that commit 1498d3b only removed redundant functions that we aren't using for the serial monitor outputs. I hadn't looked closely enough in my quick review!

I'll try to test it out in the next week or so. In the meanwhile, we'll want @SRGDamia1 to look it over, as she's the main developer behind ModularSensors.

@neilh10
Copy link
Contributor

neilh10 commented Mar 17, 2023

Seems some reasonable refactoring. :)
Some comments

@aufdenkampe
Copy link
Member

@tpwrules, as @neilh10 alluded, we're still using a GitFlow workflow, where the main branch reflects the latest release, and we're mostly working in the develop branch. Yesterday I reoriented your PR toward the develop branch, then initiated the automated testing that @SRGDamia1 set up with GitHub Actions. Most of those are failing. Can you look through those errors to see if you can fix them?

@neilh10, indeed your similar PR has languished for a while, mostly because of lack of funding for @SRGDamia1's time on this work. The good news is that Stroud & LimnoTech have a promising proposal pending right now that is likely to reinvigorate our development efforts again!

@tpwrules
Copy link
Contributor Author

@neilh10 Batch uploading is simply storing data in RAM and transmitting multiple data points per request. The overhead in power consumption, HTTP protocol, UUID transmission, etc. is extreme. I had seen your work and while it improves reliability, it does not improve on the data cost or power consumption aspects.

I don't have the code public yet mostly for flexibility in my working style. It takes additional time to extract changes and prepare to submit them to you. I will look into publishing the other parts soon and writing up some more detailed motivation.

@aufdenkampe I looked at the tests and it looks like they are broken for PRs from a fork. They always assume the branch is in EnviroDIY's repo. I don't know that there's a way to fix this, I don't know much about GitHub Actions.

@neilh10
Copy link
Contributor

neilh10 commented Mar 17, 2023

@aufdenkampe thanks. Fantastic to hear porposal in works for Stroud/LimnoTeach. Wow.
I'm bit surprised the automated build in GitHub Actions can be called "testing" - surely "system testing" is end to end, from sensor reading to reliably delivery and viewing on MMW. Of course there are many versions of software verification testing. Maybe I'm being too pedantic as I'm mentoring engineering students :) or maybe I'm not understanding what is happening in the github actions. My version of automated testing (I used to do a lot of it for paid work) - #435

@tpwrules thanks for the feature description. I do think its good open workflow to add an issue to track a new feature/goal. Just my 2cents.
Thanks for the good individual description on each commit.

Great to see a discussion on power usage. My analysis of power overhead, which is greatly improved with the LTE CAT-M1 modems over previous generation, is the slow and often failed/timeouts of response from MMW.
Be happy to model this as coloumbs/mAsecs in a spreadsheet if its of interest.
Of course failures in one area like MMW, is a great opportunity for testing - and my fork/systems just been through a rigorous 4 weeks of testing of reliable delivery from the IoT node - couldn't want for a better test bed of a soggy MMW for testing the end node :)
Described here ODM2/ODM2DataSharingPortal#641

@tpwrules
Copy link
Contributor Author

This is the issue describing the new protocol: ODM2/ODM2DataSharingPortal#649

@SRGDamia1
Copy link
Contributor

I'm sorry I haven't had a chance to dig into this. But I think I've fixed the GitHub action to compile pull requests, so if you push an empty commit, it will re-run checks for you based on the latest action.

Correctly initialize response code variable terminator.
Saves 750 bytes of flash as the buffer can be placed in .bss to be
zero-initialized instead of .data.
Saves ~110 bytes of flash and substantial stack
Saves ~200 bytes of RAM and ~360 bytes of flash.

The equations reproduce the tables previously found in the source code
exactly. The reason for the values in the original tables is unknown.
The test for "15 seconds before the next logging interval" has been wrong
for years, possibly since this code was written, with no apparent
consequence. The behavior is additionally confusing to users deploying the
devices and causes problems with logging as the modem won't get turned
off for a long time.

Remove it completely to solve the problems.
Use cleaner interface and common functions that avoid repeated snprintf
and strlen usage to save ~2.5KB of flash and dozens of lines of code.

Removes extra \r\n from HTTP requests as a side effect, which were against
spec and caused spurious 400 Bad Request status messages from servers.
sendEveryX will be used in the future for data buffering functionality. Increase
its width to an int to allow larger buffers when desired.

Delete sendOffset completely as there is little reason for that particular
functionality. The offset will be in effect set randomly using the time the
datalogger initially powers on.
The C++ standard specifies that all objects in the same translation unit
(i.e. source file) are constructed in order of declaration. Since this is
the most common case when using Modular Sensors, the described case cannot
occur.
@tpwrules
Copy link
Contributor Author

I rebased on top of the current develop branch. Please approve running of the workflows.

@tpwrules
Copy link
Contributor Author

Reminder to approve running of the workflows.

@tpwrules
Copy link
Contributor Author

I think the workflows might still be broken, it says that 0 ran instead of saying that they failed.

@SRGDamia1
Copy link
Contributor

Do you have the code for the next step of creating the batch upload formatted data available?

@SRGDamia1
Copy link
Contributor

I've been slowly working through @neilh10's methods of queuing data onto an SD card to re-attempt failed posts, but I suspect his method wouldn't be compatible with creating your suggested batch post format here. I would like to see how you're queueing and batching the points, if you have it figured out somewhere.

@tpwrules
Copy link
Contributor Author

tpwrules commented Sep 13, 2023

The changes in this PR should not conflict with his methods at all. I also don't see how @neilh10's work would be incompatible with my suggested batch post format. It would make his method better if he could modify his code to take advantage of it but it will work without changes as the format is designed to be backwards compatible. Though of course there would be a compatibility question with the batch transmission on the ModularSensors side.

I've pushed our current production code to this branch for your review. It has been running for months now like a dream. Please see the commit messages for details. Some are contained in this PR but some are not. The big ones are here and here.

There are a number of other not-batch-related changes too you might like. But my plan was to get this PR in, then consider the best method for batch transmission with your team and if so send a follow up PR, possibly with separate ones for the other changes. These other changes should not conflict in spirit with @neilh10's work though there may be some code conflicts to resolve. There are advantages and disadvantages to each approach to batch transmission and maybe you, I, and him can discuss those somewhere before subsequent PRs.

But none of that should prevent this PR from being merged.

@SRGDamia1 SRGDamia1 merged commit a40eb7f into EnviroDIY:develop Sep 14, 2023
@tpwrules
Copy link
Contributor Author

tpwrules commented Sep 14, 2023

Thanks for merging this and looking through the other commits but they are not ready for a proper review. I should have said they are ready for reference. Please wait until I submit a PR to comment.

Maybe there is a forum for real-time discussion we could talk in?

@tpwrules tpwrules deleted the batch-refactoring branch September 14, 2023 18:12
@SRGDamia1
Copy link
Contributor

I don't have Slack or anything similar. Would you like to set up a zoom call?

@tpwrules
Copy link
Contributor Author

Yes, that would be great! If you could send an invite to my GitHub contributor e-mail that would be good. I am available for the next few hours today and any time after 2pm central time tomorrow.

@neilh10
Copy link
Contributor

neilh10 commented Sep 16, 2023

There is "Discussions" that can be enabled per repo. IHMO much better than slack. :)
As I understand this "batch transmission" is bufferin the messages in ram memory.
My onus is on reliability of all transmissions - as reliable as the boot net, (that is walking up to it and reading off values ) A recent example case for me, was where the transmission went done for some months and then I replaced the parts last week and its now working with transmitting all outstanding readings.
To me the only reliable method for delayed transmission is buffering of the readings to the uSD.
Seems to me the @tpwrules method would be layered between getting multiple readings from the uSD for the transaction of generating a JSON batch method.
Practically speaking, this creates a larger packet - and for the edge of cell radio zones with marginal reception (like https://monitormywatershed.org/sites/TUCA_GV01/), a larger messages is more unlikely to succeed. In my recent measurements the biggest power (watts) user is the time take to access the cellphone and setup tcp/ip link ~ 30seconds.
I've also documented a recent test time for uploading 9000 messages - ODM2/ODM2DataSharingPortal#673 (comment) with the server taking a lot of batches of "100 messages" in one cellphone call with no failures. Very nice to see

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants