setWatchdog safety checks #63

Open
jdavid opened this issue Mar 28, 2019 · 7 comments
Comments

@jdavid
Member

jdavid commented Mar 28, 2019

Just like in WaspPWR::deepSleep, after setting the alarm we should read it back and verify that the values are as expected. If they're not, return an error.

Also, in WaspPWR::deepSleep I think sleep_disable() should be called in case of an error.
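A minimal sketch of the read-back check, assuming a stand-in RTC object. `AlarmFields`, `FakeRtc` and `setAlarm2Checked` are hypothetical names made up for illustration; none of them belong to the Waspmote API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical alarm 2 fields (date, hour, minute); not a Waspmote type.
struct AlarmFields {
    uint8_t date;
    uint8_t hour;
    uint8_t minute;
};

inline bool operator==(const AlarmFields& a, const AlarmFields& b) {
    return a.date == b.date && a.hour == b.hour && a.minute == b.minute;
}

// Stand-in for the RTC. `corrupt` simulates a write that did not stick.
struct FakeRtc {
    AlarmFields stored{0, 0, 0};
    bool corrupt = false;

    void writeAlarm2(const AlarmFields& f) {
        stored = f;
        if (corrupt) stored.minute = 0;  // simulated bad write
    }
    AlarmFields readAlarm2() const { return stored; }
};

// Writes the alarm, reads it back and compares.
// Returns 0 on success, 1 if the value read back differs from what was written.
template <typename Rtc>
int setAlarm2Checked(Rtc& rtc, const AlarmFields& wanted) {
    assert(wanted.date >= 1 && wanted.date <= 31);
    assert(wanted.hour < 24 && wanted.minute < 60);
    rtc.writeAlarm2(wanted);
    return rtc.readAlarm2() == wanted ? 0 : 1;
}
```

A caller would treat a non-zero return as the error case and skip going to sleep.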

@jdavid
Member Author

jdavid commented Mar 28, 2019

Also, the RTC DS1337C specification and the API source code don't match Libelium's documentation. It looks like they did something in hardware in v15 to implement a different behaviour.

From the specification and source code setWatchdog(5) adds 5 minutes to the current time and sets alarm 2 to trigger when the date, hour and minute match. So it's an "absolute" alarm actually (except it doesn't include month and year information).

According to Libelium's documentation, setWatchdog(5) is supposed to trigger the alarm when the minute changes 5 times. See the technical guide, though it's better explained in http://www.libelium.com/development/waspmote/examples/rtc-10-set-watchdog/

Test we can do:

  • Call setWatchdog(3) and then change the time to sometime in the past, then see whether the watchdog is triggered in 3 minutes or not.
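To make the difference concrete, here is a small simulation of the "absolute" reading, under simplifying assumptions: a single month of fixed length and a clock with only date, hour and minute. `Clock`, `tickMinute`, `watchdogTarget` and `minutesUntilMatch` are hypothetical helpers, not part of the Waspmote API.

```cpp
#include <cassert>

struct Clock {
    int date;    // day of month, 1..daysInMonth
    int hour;    // 0..23
    int minute;  // 0..59
};

// Advance the clock by one minute, wrapping inside a month of `daysInMonth` days.
inline void tickMinute(Clock& c, int daysInMonth) {
    if (++c.minute < 60) return;
    c.minute = 0;
    if (++c.hour < 24) return;
    c.hour = 0;
    if (++c.date > daysInMonth) c.date = 1;
}

// "Absolute" reading: the target is now + minutes, stored as date/hour/minute.
inline Clock watchdogTarget(Clock now, int minutes, int daysInMonth) {
    assert(minutes >= 0);
    for (int i = 0; i < minutes; ++i) tickMinute(now, daysInMonth);
    return now;
}

// Number of minute rollovers until date, hour and minute all match the target.
// Returns -1 if no match happens within one full month.
inline int minutesUntilMatch(Clock now, const Clock& target, int daysInMonth) {
    const int limit = daysInMonth * 24 * 60;
    for (int ticks = 1; ticks <= limit; ++ticks) {
        tickMinute(now, daysInMonth);
        if (now.date == target.date && now.hour == target.hour &&
            now.minute == target.minute) {
            return ticks;
        }
    }
    return -1;
}
```

If the alarm is absolute, moving the clock back after setting it delays the trigger by however far the clock was moved. If it is relative, as the Libelium documentation describes, it would still fire after the requested number of minute changes.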

@jdavid
Member Author

jdavid commented Apr 2, 2019

Pushed sketches/test_scripts/test_watchdog/test_watchdog.pde

Output is:

H#
Watchdog: test whether it's absolute or relative.
- Some Libelium docs say it's relative.
- The RTC DS1337C specification says it's absolute.

RTC time: Tue, 19/04/02, 12:06:20
Watchdog Alarm matches [Date, hh:mm] --> [02, 12:07]

RTC time: Mon, 19/04/01, 00:00:55
Watchdog Alarm matches [Date, hh:mm] --> [02, 12:07]

Will reboot at next 00s or wait for the absolute date?

Inside infinite loop. Time: Mon, 19/04/01, 00:00:55
Inside infinite loop. Time: Mon, 19/04/01, 00:00:56
Inside infinite loop. Time: Mon, 19/04/01, 00:00:57
Inside infinite loop. Time: Mon, 19/04/01, 00:00:58
Inside infinite loop. Time: Mon, 19/04/01, 00:00:59
Inside infinite loop. Time: Mon, 19/04/01, 00:01:00
Inside infinite loop. Time: Mon, 19/04/01, 00:01:01
Inside infinite loop. Time: Mon, 19/04/01, 00:01:02
Inside infinite loop. Time: Mon, 19/04/01, 00:01:03
Inside infinite loop. Time: Mon, 19/04/01, 00:01:04
Inside infinite loop. Time: Mon, 19/04/01, 00:01:05
[...]

This is bad (but expected) news. What we're testing here is what happens if for some reason the RTC goes back in time.

For instance, if the RTC time is reset to 00/01/01, the watchdog won't trigger until the far distant future. Though if the time is reset, the alarms will probably reset as well (that's another test).

This has been tested with firmware H, maybe it's different with firmware J.

@johnhulth
Contributor

This is what I get.

J#
Watchdog: test whether it's absolute or relative.

- Some Libelium docs say it's relative.
- The RTC DS1337C specification says it's absolute.

RTC time: Tue, 19/04/02, 12:06:20
Watchdog Alarm matches [Date, hh:mm] --> [02, 12:07]

RTC time: Mon, 19/04/01, 00:00:55
Watchdog Alarm matches [Date, hh:mm] --> [02, 12:07]

Will reboot at next 00s or wait for the absolute date?

Inside infinite loop. Time: Mon, 19/04/01, 00:00:55
Inside infinite loop. Time: Mon, 19/04/01, 00:00:56
Inside infinite loop. Time: Mon, 19/04/01, 00:00:57
Inside infinite loop. Time: Mon, 19/04/01, 00:00:58
Inside infinite loop. Time: Mon, 19/04/01, 00:00:59
Inside infinite loop. Time: Mon, 19/04/01, 00:01:00
Inside infinite loop. Time: Mon, 19/04/01, 00:01:01
Inside infinite loop. Time: Mon, 19/04/01, 00:01:02
Inside infinite loop. Time: Mon, 19/04/01, 00:01:03
Inside infinite loop. Time: Mon, 19/04/01, 00:01:04

@jdavid
Member Author

jdavid commented Apr 2, 2019

Thanks! It's the same behaviour.

@jdavid
Member Author

jdavid commented Apr 2, 2019

A new test, with the same sketch. This one tests what happens when both alarms are triggered at the same time.

H#
What happens when both alarm 1 and 2 are triggered at the same time?

RTC time: Tue, 19/04/02, 13:05:40
Alarm 1  Alarm matches [Date, hh:mm:ss] --> [02, 13:06:00]
Watchdog Alarm matches [Date, hh:mm] --> [02, 13:06]

Sleep..

H#
What happens when both alarm 1 and 2 are triggered at the same time?

RTC time: Tue, 19/04/02, 13:05:40
Alarm 1  Alarm matches [Date, hh:mm:ss] --> [02, 13:06:00]
Watchdog Alarm matches [Date, hh:mm] --> [02, 13:06]

Sleep..

H#

The watchdog wins.

EDIT: Tests by John with J firmware show that most often Alarm 1 wins, but sometimes the watchdog wins.

@jdavid
Member Author

jdavid commented Apr 2, 2019

TODO:

  • We should set alarm 2 (the watchdog) ourselves. The problem with RTC.setWatchdog(...) is that it uses RTC_ALM2_MODE2, so it triggers once a month. We should use RTC_ALM2_MODE3 so it triggers once a day (or even RTC_ALM2_MODE4 for once an hour). This way, even if the RTC time somehow goes to the past, it won't take too long to trigger the watchdog (unless of course the watchdog is reset).

  • May call PWR.sleep(ALL_OFF) instead of PWR.deepSleep(...) and set the alarm 1 ourselves.

  • Maybe set alarm 1 to trigger at second 01 so it never triggers at the same time as alarm 2, because alarm 2 always triggers at second 00.

  • Maybe remove the cooldown feature. When the battery is low it makes the mote sleep for longer (2x or 3x). But we only have battery problems when stuck in networking or GPS. So maybe remove this or replace it with something simpler (e.g. if the battery is low, sleep for a fixed time of about 24h).

  • Re-read the interruptions chapter and check for instance whether there may be conflicts with the ACC.
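As a rough illustration of the first TODO item, the sketch below computes how long the watchdog waits after the clock jumps back, depending on which fields the alarm matches. The `Match` enum and helper names are hypothetical and only stand for the once-a-month, once-a-day and once-an-hour choices discussed above. It assumes a fixed month length and ignores month and year.

```cpp
#include <cassert>

// Hypothetical names for which alarm fields must match the clock.
enum class Match { DateHourMinute, HourMinute, Minute };

// Minutes since the start of the month for a 1-based date, hour and minute.
inline int minuteOfMonth(int date, int hour, int minute) {
    assert(date >= 1 && hour >= 0 && hour < 24 && minute >= 0 && minute < 60);
    return ((date - 1) * 24 + hour) * 60 + minute;
}

// How often a given match mode repeats, in minutes.
inline int periodMinutes(Match mode, int daysInMonth) {
    switch (mode) {
        case Match::Minute:     return 60;            // once an hour
        case Match::HourMinute: return 24 * 60;       // once a day
        default:                return daysInMonth * 24 * 60;  // once a month
    }
}

// Minutes from `now` until the next match of `target` under the given mode.
// Both arguments are minute-of-month values; a current match counts as a
// full period away, since the alarm fires on the rollover into the match.
inline int minutesUntilNextMatch(int now, int target, Match mode, int daysInMonth) {
    const int period = periodMinutes(mode, daysInMonth);
    const int wait = (((target - now) % period) + period) % period;
    return wait == 0 ? period : wait;
}
```

With the values from the earlier test output (alarm at [02, 12:07], clock moved back to 01 00:00), the wait is 2167 minutes when the date must match, 727 minutes when only hour and minute must match, and 7 minutes when only the minute must match.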

@jdavid
Member Author

jdavid commented Apr 5, 2019

Many changes have been made already, mostly details. Here are a couple of highlights:

  • We now log some messages early in the boot process, so if something goes wrong at that stage we'll get a clue. This change is visible: instead of the three dots ... while booting, we now see some messages.

  • We now set the alarms ourselves, so we've more control and it's clearer what values the alarms are set to. No need to set alarm 1 to 01s I think since now it's pretty obvious that they won't trigger at the same time.

Updated and extended TODO list:

  • Review WaspPWR::sleep, check differences with WaspPWR::deepSleep which we used before. Maybe implement it ourselves, see the Atmel comments in source code for sleep_enable(), sleep_disable(), etc. I think they don't match exactly what Libelium does.

  • Since the RTC is so critical, we should wrap writing to it. Whenever a register is written, read it back and check the value is correct. Maybe retry, and if it still fails, log a panic message.

  • Reread interrupts documentation, check for issues with ACC and other I2C devices, or any devices generating interrupts, review sleep/awake again.

  • Check that I2C-attached devices are on when working with the RTC. Do we need this?
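The write wrapper from the TODO list could look something like the sketch below. It is only an illustration: `FlakyBus` is a hypothetical in-memory stand-in for the I2C access to the RTC, and `writeVerified` is an invented name, not an existing Waspmote function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register bus. A real one would talk I2C to the RTC.
struct FlakyBus {
    uint8_t regs[16] = {};
    int failuresLeft = 0;  // number of upcoming writes that get corrupted

    void write(uint8_t addr, uint8_t value) {
        assert(addr < 16);
        if (failuresLeft > 0) {
            --failuresLeft;
            regs[addr] = static_cast<uint8_t>(~value);  // simulated bad write
        } else {
            regs[addr] = value;
        }
    }
    uint8_t read(uint8_t addr) const {
        assert(addr < 16);
        return regs[addr];
    }
};

// Write a register, read it back, and retry up to `maxAttempts` times.
// Returns false if the value never sticks; the caller would then log a
// panic message.
template <typename Bus>
bool writeVerified(Bus& bus, uint8_t addr, uint8_t value, int maxAttempts) {
    assert(maxAttempts > 0);
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        bus.write(addr, value);
        if (bus.read(addr) == value) return true;
    }
    return false;
}
```

Templating on the bus type keeps the retry logic testable on a desktop without the hardware.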
