
refactor(loop) separate client and loop #31

Open
wants to merge 65 commits into base: master
Conversation

@Tieske (Contributor) commented Oct 26, 2021

So here's a PR with a LOT of changes, more than I had anticipated, and definitely breaking. I'll try to give an overview by functionality.

ioloop

The ioloop "runs" the process: in a loop, it reads on all clients and runs some functions. Each read operation, however, had a small but fixed timeout. This means that with 20 clients, a single iteration would always take at least 20 * 0.005 seconds (0.1 second). If one device is busy and the others are idle, the idle ones would slow down the busy one.

Writes and connecting were always blocking operations. Meaning:

  • connect: could hang with unlimited timeout
  • send: would operate under a regular timeout

Changes: a client iteration now returns the maximum amount of time the client wants to sleep, and the same goes for any added functions. The actual sleep is then the minimum of all those reported "maximums". The read uses timeout 0, so it is fully non-blocking. If a single client responds with "requested delay = 0", the loop does not sleep but immediately runs the next iteration.
Clients and functions can also report that they are idle, in which case the wait slowly increases with each idle iteration, up to a specified maximum. Once new data is received (the client is no longer idle), the delay returns to 0.
Writing and connecting are still blocking, albeit connecting now uses a proper timeout.
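A minimal sketch of the new scheduling idea (function and variable names here are hypothetical, not luamqtt's actual API): every client or function iteration returns the maximum time it is willing to sleep, the loop sleeps the minimum of those, and a reported 0 forces an immediate next iteration.

```lua
-- Hypothetical sketch of the ioloop sleep calculation; not the actual
-- luamqtt API. Each entry in `tasks` is a function returning the maximum
-- time (in seconds) it is willing to sleep before its next iteration.
local function step(tasks, max_sleep)
  local sleep = max_sleep
  for _, task in ipairs(tasks) do
    local requested = task() or max_sleep  -- nil means "no preference"
    if requested < sleep then
      sleep = requested
    end
  end
  return sleep  -- 0 means: run the next iteration immediately
end

-- one busy client (wants 0) and one idle client (can wait 5 seconds):
local s = step({ function() return 0 end, function() return 5 end }, 30)
assert(s == 0)  -- busy client wins; no sleep at all
```

The idle back-off described above would then be layered on top: an idle client reports a slightly larger delay on each idle iteration, capped at a maximum, and drops back to 0 as soon as data arrives.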

connector signals

The client code had hardcoded checks for some error messages, and didn't check all of them. The only code that can reliably interpret the actual error messages is the connector itself, since it was written for that specific implementation (luasocket, ssl/luasec, copas or openresty).
So instead, 2 "signals" are now defined: the first is "closed", for when a read fails because the connection was closed by the server; the second is "idle", indicating no data was available to be read, so the ioloop can increase the sleep.
The connectors now implement the translation from their error messages to the common signals.
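A sketch of how such a translation might look (the signal names and function are illustrative, not necessarily luamqtt's actual identifiers). LuaSocket reports errors as the strings "closed" and "timeout", and LuaSec additionally uses "wantread" for non-blocking reads; a connector built on them would map those onto the two common signals:

```lua
-- Hypothetical connector-signal sketch; unique sentinel tables so signals
-- can never collide with ordinary error strings.
local signals = {
  CLOSED = setmetatable({}, { __tostring = function() return "signal: closed" end }),
  IDLE   = setmetatable({}, { __tostring = function() return "signal: idle" end }),
}

-- A LuaSocket/LuaSec-based connector translates that library's error
-- strings into the common signals, so the client never has to inspect
-- implementation-specific messages itself.
local function translate_error(err)
  if err == "closed" then
    return signals.CLOSED               -- server closed the connection
  elseif err == "timeout" or err == "wantread" then
    return signals.IDLE                 -- nothing to read; ioloop may back off
  end
  return err                            -- anything else is a real error
end

assert(translate_error("timeout") == signals.IDLE)
assert(translate_error("closed") == signals.CLOSED)
```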

accommodating yielding and blocking sockets

The client should be agnostic to the different socket types: yielding ones (copas and openresty, which are schedulers, so instead of blocking they go off and run other threads until the read is complete) and blocking ones (luasocket and luasec/ssl, which simply block the ioloop while they process socket io).
In either case the idea is to prevent a busy-loop while keeping the client responsive. With the yielding types, the scheduler creates independent threads for each task and calls them without any delays; the yielding in the connector implementation allows the scheduler to process other threads while the current one is busy.
With the blocking socket types, the connectors read with a 0 timeout to ensure they never block and return an idle-signal, and the ioloop should take care of "sleeping" enough to prevent a busy-loop.
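For the blocking case, a connector read might look like the following sketch (the function name and "idle" return value are assumptions for illustration; LuaSocket's `settimeout(0)` and the `nil, "timeout", partial` return shape of `receive` are real LuaSocket behavior):

```lua
-- Hypothetical read for a blocking (LuaSocket-style) connector: timeout 0
-- so the call never blocks, reporting "idle" when no data is available and
-- leaving all sleeping to the ioloop.
local function receive_nonblocking(sock, size)
  sock:settimeout(0)                    -- never block the ioloop
  local data, err, partial = sock:receive(size)
  if data then
    return data
  elseif err == "timeout" then
    return nil, "idle", partial         -- ioloop may increase its sleep
  end
  return nil, err                       -- "closed" or a real error
end

-- stub socket, for illustration only:
local stub = {
  settimeout = function(self, t) end,
  receive = function(self, size) return nil, "timeout", "" end,
}
local data, sig = receive_nonblocking(stub, 10)
assert(data == nil and sig == "idle")
```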

Note that the change is NOT complete yet.

@Tieske Tieske changed the title refactor(keepalive) remove keepalive logic from ioloop refactor(loop) separate client and loop Nov 1, 2021
@Tieske (Contributor, Author) commented Nov 1, 2021

I also added logging (using LuaLogging), not required for library use, but is required when running the tests on this branch.

(actually had to write a new LuaLogging appender to enable it for the MQTT nginx connector)

@xHasKx (Owner) commented Nov 1, 2021

Wow, that's huge work, thanks for your efforts!
And it looks like there's huge review work ahead :)

@Tieske (Contributor, Author) commented Nov 3, 2021

yes, a bit more than anticipated 😄

anyway, I don't expect it to be fully functional yet. Just starting to implement some devices; that will probably be the proof of the pudding. (It would also be a final step in testing the Copas beta, so that can be released as well.)

@@ -99,17 +101,6 @@ describe("MQTT client", function()
username = "stPwSVV73Eqw5LSv0iMXbc4EguS7JyuZR9lxU5uLxI5tiNM8ToTVqNpu85pFtJv9",
}
},
{
xHasKx (Owner) commented:
Why was this test case removed?

Tieske (Contributor, Author) replied:
the "sync" loop was broken and should not be used. It might even be dangerous: it could hang copas in a spin (when a connection fails and there is an auto-reconnect). So I removed the sync loop from init.lua, and hence had to remove the tests as well (I just fixed all the tests; see the last one, the copas one, for a correct implementation).

Essentially the different loops (copas, nginx) require their own setup; they are too different wrt loops, timers, sleeping, etc. The sync loop was also too limited, as it didn't do keepalives etc.

I was thinking of maybe adding modules for copas and nginx with add/remove methods, such that

  • ioloop:add(...), copas:add(...), nginx:add(...), and
  • ioloop:remove(...), copas:remove(...), nginx:remove(...)

would be identical in function signatures, just implementing it in different environments. It would actually be pretty easy to detect the environment automatically.
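The automatic detection mentioned above could look something like this sketch (module names and the `detect_loop` function are hypothetical; the `ngx` global in OpenResty and `package.loaded` are standard):

```lua
-- Hypothetical sketch of automatic environment detection for picking a
-- loop module; not actual luamqtt code.
local function detect_loop()
  if ngx ~= nil then
    return "nginx"                 -- running inside OpenResty
  elseif package.loaded.copas then
    return "copas"                 -- Copas scheduler already loaded
  else
    return "ioloop"                -- fall back to the built-in ioloop
  end
end

assert(detect_loop() == "ioloop")  -- in a plain Lua interpreter
```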

xHasKx (Owner) replied:
The sync loop can be used in the simplest mqtt case: connect, subscribe, publish command, receive answer, disconnect.
This zero-dependence scenario has to be preserved.

@xHasKx (Owner) commented Nov 4, 2021
The MQTT docs even describe a simpler case: connect, publish, disconnect, without even waiting for and checking CONNACK. Periodically publishing its value this way is typical behavior for an IoT sensor.

See "Non normative comment" above http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/errata01/os/mqtt-v3.1.1-errata01-os-complete.html#_Toc442180846

@Tieske (Contributor, Author) commented Nov 4, 2021

🎉 all green now 😄

Comment on lines 36 to 50
-- adding another thread; copas handlers should return quickly, anything
-- that can wait should be off-loaded from the handler to a thread.
-- Especially anything that yields; socket reads/writes and sleeps, and the
-- code below does both, sleeping, and writing (implicit in 'publish')
copas.addthread(function()
  for i = 1, num_pings do
    copas.sleep(delay)
    print("ping", i)
    assert(ping:publish{ topic = "luamqtt/copas-ping/"..suffix, payload = "ping"..i, qos = 1 })
  end

  print("ping done")
  assert(ping:publish{ topic = "luamqtt/copas-ping/"..suffix, payload = "done", qos = 1 })
  ping:disconnect()
end)
Tieske (Contributor, Author) commented:
@xHasKx fyi, these examples might have accidentally worked, but were essentially broken in a bad way.

@Tieske (Contributor, Author) commented Nov 5, 2021

@xHasKx some review comments would be nice if you can spare some time.

I need to review the nginx examples, as they probably suffer from the same problems as the copas one (and then there's a host of nginx specifics to watch for)

When that is done, docs need checking and updating.

@Tieske (Contributor, Author) commented Nov 5, 2021

The sync loop can be used in the simplest mqtt case: connect, subscribe, publish command, receive answer, disconnect.
This zero-dependence scenario has to be preserved.

The differences between the ioloop and sync loop examples are negligible. And since ioloop is available, the zero-dependency scenario is still working. Right?

@Tieske Tieske force-pushed the keepalive branch 4 times, most recently from ac7a11f to 3aaab29 on November 9, 2021
@Tieske Tieske force-pushed the keepalive branch 12 times, most recently from eee1306 to 10080da on August 6, 2022
3 participants