Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lights not turning off reliably #22

Open
sladey opened this issue Jan 7, 2021 · 21 comments
Open

Lights not turning off reliably #22

sladey opened this issue Jan 7, 2021 · 21 comments

Comments

@sladey
Copy link

sladey commented Jan 7, 2021

I am integrating with Home Assistant and this is such a great addition to any CBUS setup. Logging from HA suggests that the MQTT messages are being sent and received as expected. I have a HA script that is trying to turn 4 lights on and off. The 'on' script works but the 'off' script does not turn all the lights off. Here is the debug from cmqttd:

2021-01-07 14:37:03,635 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:37:03,635 send: b'\0538000119A9p\r'
2021-01-07 14:37:03,636 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:37:03,636 send: b'\05380001259Dq\r'
2021-01-07 14:37:03,639 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:37:03,639 send: b'\0538000120A2r\r'
2021-01-07 14:37:03,640 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:37:03,640 send: b'\053800011AA8s\r'
2021-01-07 14:37:03,794 Incoming data: b'p.q.r.!'
2021-01-07 14:37:03,794 Got packet: ConfirmationPacket(code=b'p', success=True)
2021-01-07 14:37:03,794 recv: confirmation: code = b'p', success = True
2021-01-07 14:37:03,795 Incoming data: b'q.r.!'
2021-01-07 14:37:03,795 Got packet: ConfirmationPacket(code=b'q', success=True)
2021-01-07 14:37:03,795 recv: confirmation: code = b'q', success = True
2021-01-07 14:37:03,795 Incoming data: b'r.!'
2021-01-07 14:37:03,795 Got packet: ConfirmationPacket(code=b'r', success=True)
2021-01-07 14:37:03,796 recv: confirmation: code = b'r', success = True
2021-01-07 14:37:03,796 Incoming data: b'!'
2021-01-07 14:37:03,796 Got packet: PCIErrorPacket()
2021-01-07 14:37:03,796 recv: PCI cannot accept data
2021-01-07 14:37:05,197 Incoming data: b'053FFF001EFFE794058CF59F\r\n'
2021-01-07 14:37:05,198 Got packet: InvalidPacket(payload=b'\xff\x00\x1e\xff\xe7\x94\x05\x8c\xf5', exception=NotImplementedError('Unknown status request type 0x1e'))
2021-01-07 14:37:05,199 hcp: unhandled other packet: InvalidPacket(payload=b'\xff\x00\x1e\xff\xe7\x94\x05\x8c\xf5', exception=NotImplementedError('Unknown status request type 0x1e'))

As you can see I get and invalid packet. When I run the script a second time I get the same issue.

2021-01-07 14:45:01,462 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:45:01,462 send: b'\0538000119A9z\r'
2021-01-07 14:45:01,463 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:45:01,463 send: b'\05380001259Dg\r'
2021-01-07 14:45:01,465 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:45:01,465 send: b'\0538000120A2h\r'
2021-01-07 14:45:01,467 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:45:01,467 send: b'\053800011AA8i\r'
2021-01-07 14:45:01,616 Incoming data: b'z.g.h'
2021-01-07 14:45:01,616 Got packet: ConfirmationPacket(code=b'z', success=True)
2021-01-07 14:45:01,617 recv: confirmation: code = b'z', success = True
2021-01-07 14:45:01,617 Incoming data: b'g.h'
2021-01-07 14:45:01,617 Got packet: ConfirmationPacket(code=b'g', success=True)
2021-01-07 14:45:01,617 recv: confirmation: code = b'g', success = True
2021-01-07 14:45:01,617 Incoming data: b'h'
2021-01-07 14:45:01,716 Incoming data: b'h.!'
2021-01-07 14:45:01,716 Got packet: ConfirmationPacket(code=b'h', success=True)
2021-01-07 14:45:01,716 recv: confirmation: code = b'h', success = True
2021-01-07 14:45:01,717 Incoming data: b'!'
2021-01-07 14:45:01,717 Got packet: PCIErrorPacket()
2021-01-07 14:45:01,717 recv: PCI cannot accept data

To get the light to turn off I need to activate another scene that includes the light in question, then try turning off again and this mostly works.

Unfortunately this does not happen every time I run the on and off scripts. Here is what a successful lights on and then lights off looks like:

2021-01-07 14:46:38,944 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:46:38,944 send: b'\053800021AB4F3j\r'
2021-01-07 14:46:38,988 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:46:38,989 send: b'\0538000219BEEAk\r'
2021-01-07 14:46:38,989 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:46:38,989 send: b'\0538000220B4EDl\r'
2021-01-07 14:46:38,989 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:46:38,989 send: b'\053800792525m\r'
2021-01-07 14:46:39,106 Incoming data: b'j.k.l.'
2021-01-07 14:46:39,106 Got packet: ConfirmationPacket(code=b'j', success=True)
2021-01-07 14:46:39,107 recv: confirmation: code = b'j', success = True
2021-01-07 14:46:39,107 Incoming data: b'k.l.'
2021-01-07 14:46:39,107 Got packet: ConfirmationPacket(code=b'k', success=True)
2021-01-07 14:46:39,107 recv: confirmation: code = b'k', success = True
2021-01-07 14:46:39,107 Incoming data: b'l.'
2021-01-07 14:46:39,107 Got packet: ConfirmationPacket(code=b'l', success=True)
2021-01-07 14:46:39,107 recv: confirmation: code = b'l', success = True
2021-01-07 14:46:39,250 Incoming data: b'm.'
2021-01-07 14:46:39,251 Got packet: ConfirmationPacket(code=b'm', success=True)
2021-01-07 14:46:39,251 recv: confirmation: code = b'm', success = True

2021-01-07 14:46:45,611 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:46:45,611 send: b'\0538000119A9n\r'
2021-01-07 14:46:45,618 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:46:45,619 send: b'\05380001259Do\r'
2021-01-07 14:46:45,665 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:46:45,665 send: b'\0538000120A2p\r'
2021-01-07 14:46:45,665 send: <PointToMultipointPacket object: application=56, source_address=None, sals=[]>
2021-01-07 14:46:45,665 send: b'\053800011AA8q\r'
2021-01-07 14:46:45,778 Incoming data: b'n.'
2021-01-07 14:46:45,779 Got packet: ConfirmationPacket(code=b'n', success=True)
2021-01-07 14:46:45,779 recv: confirmation: code = b'n', success = True
2021-01-07 14:46:46,194 Incoming data: b'o.p.q.'
2021-01-07 14:46:46,195 Got packet: ConfirmationPacket(code=b'o', success=True)
2021-01-07 14:46:46,195 recv: confirmation: code = b'o', success = True
2021-01-07 14:46:46,195 Incoming data: b'p.q.'
2021-01-07 14:46:46,195 Got packet: ConfirmationPacket(code=b'p', success=True)
2021-01-07 14:46:46,196 recv: confirmation: code = b'p', success = True
2021-01-07 14:46:46,196 Incoming data: b'q.'
2021-01-07 14:46:46,196 Got packet: ConfirmationPacket(code=b'q', success=True)
2021-01-07 14:46:46,196 recv: confirmation: code = b'q', success = True

Any help would be really appreciated. I am running cmqttd on

Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal

Python version: 3.8.5

@beerygaz
Copy link

I'm seeing this issue too. It seems especially prevalent when I run a script or call a light.turn_off service for multiple groups simultaneously. I see them hit MQTT and cmqttd but only the first one or two lights go off, the rest do not. I see the same PCIErrorPacket error as you do.

Could it be that there are too many messages in rapid succession?

2022-06-19 19:30:00,010 send: b'\\05380001487Aq\r'
2022-06-19 19:30:00,153 Incoming data: b'm.n.o.'
2022-06-19 19:30:00,154 Got packet: ConfirmationPacket(code=b'm', success=True)
2022-06-19 19:30:00,154 recv: confirmation: code = b'm', success = True
2022-06-19 19:30:00,154 Incoming data: b'n.o.'
2022-06-19 19:30:00,154 Got packet: ConfirmationPacket(code=b'n', success=True)
2022-06-19 19:30:00,154 recv: confirmation: code = b'n', success = True
2022-06-19 19:30:00,154 Incoming data: b'o.'
2022-06-19 19:30:00,154 Got packet: ConfirmationPacket(code=b'o', success=True)
2022-06-19 19:30:00,154 recv: confirmation: code = b'o', success = True
2022-06-19 19:30:00,297 Incoming data: b'p.'
2022-06-19 19:30:00,298 Got packet: ConfirmationPacket(code=b'p', success=True)
2022-06-19 19:30:00,298 recv: confirmation: code = b'p', success = True
2022-06-19 19:33:29,724 send: <PointToMultipointPacket object: application=223, source_address=None, sals=[ClockUpdateSAL(val=datetime.date(2022, 6, 19)), ClockUpdateSAL(val=datetime.time(19, 33, 29, 724450))]>
2022-06-19 19:33:29,724 send: b'\\05DF000E0207E60613060D0113211DFFA2r\r'
2022-06-19 19:33:29,867 Incoming data: b'!'
2022-06-19 19:33:29,867 Got packet: PCIErrorPacket()
2022-06-19 19:33:29,867 recv: PCI cannot accept data
2022-06-19 19:33:32,871 Incoming data: b'0565380002230039\r\n'
2022-06-19 19:33:32,871 Got packet: <PointToMultipointPacket object: application=56, source_address=101, sals=[<LightingRampSAL object: group_address=35, duration=0, level=0>]>

@bdnstn
Copy link

bdnstn commented Aug 28, 2022

I'm also seeing an issue where the lights don't turn off reliably when set off from an automation with a large number of other lights. The lights show as off in Home Assistant, but some actually remain on.

@sladey
Copy link
Author

sladey commented Aug 28, 2022

I am seeing the same thing. I am putting lights in a light group then turning off the group. Some lights stay on. If I trigger the Home Assistant script multiple times I typically get additional lights turn off (from the original group).

@rbswift
Copy link

rbswift commented May 21, 2023

Did anyone have any luck resolving this? It would be really great to be able to control several groups in a batch reliably.

@bdnstn
Copy link

bdnstn commented May 21, 2023

I didn’t. I went back to using my cgateweb based solution, which has been very stable.

@mitchell-johnson
Copy link

I am seeing the same issue on my setup. running scenes with multiple lights doesn't work consistently

@rbswift
Copy link

rbswift commented Jun 19, 2023

I'm using cbus for a new home and this is on my todo list to investigate and hopefully fix as I need reliable bulk group changes. I'd love to eliminate c-gate and the java runtime from my final environment as these seem quite redundant. If anyone has any hints on where to start looking that would be great! Cheers

@Nick-Adams-AU
Copy link

If you have a 5500SHAC or similar, check out https://github.com/autoSteve/acMqtt

@rbswift
Copy link

rbswift commented Jun 19, 2023

@Nick-Adams-AU that looks great but I only have the old Ethernet CNI

@mitchell-johnson
Copy link

After some investigation I believe that the issue is just that its sending commands too quickly for the CNI to handle. Here is the comment from pciprotocol.py
` def on_pci_cannot_accept_data(self):
"""
Event called whenever the PCI cannot accept the supplied data. Common
reasons for this occurring:

    * The checksum is incorrect.
    * The buffer in the PCI is full.

    Unfortunately the PCI does not tell us which requests these are
    associated with.

    This error can occur if data is being sent to the PCI too quickly, or
    if the cable connecting the PCI to the computer is faulty.

    While the PCI can operate at 9600 baud, this only applies to data it
    sends, not to data it receives.

    """
    logger.debug('recv: PCI cannot accept data')`

Actions are to figure out what the best rate is and implement a message queue

@rbswift
Copy link

rbswift commented Jun 22, 2023

That's a great lead @mitchell-johnson thanks. I wonder if someone with a SHAC or similar could send a batch of commands through and snoop for the timing it uses? Come to think of it even the toolkit can batch switch multiple groups can't it. Maybe c-gate could be a point of reference for timing?

@mitchell-johnson
Copy link

It seems its a known thing. In the setup code there is already. I have forked this PR #36 that adds a throttled queue to get the status of all devices on connection. I dunno where 97 ms came from but works for that so im going to see if I can just reuse the same thing. https://github.com/CABrouwers/cbus/blob/c8c4000b81cfa1258c466d0c45b313f8ff6759d1/cbus/daemon/cmqttd.py#L174

@rbswift
Copy link

rbswift commented Jun 22, 2023

That sounds promising! I'm usually bit averse to the old just add a sleep as a fix but sometimes working with undocumented protocols force a bit of improvisation.

@mitchell-johnson
Copy link

I have pushed the changes here if you wanted to try them out. Just adds a 100ms delay between each light. I don't love it. I believe I should be able to figure out which command fails and retry it but at least its working for me now

@rbswift
Copy link

rbswift commented Jun 22, 2023

Awesome thanks. My cbus kit is packed up ready for install at a new build but next time I fire it up I will definitely test it. Hopefully someone else will too! Great work.

@mitchell-johnson
Copy link

Feel free to email me mitchell at johnson dot fyi if you need a hand with the setup

@sladey
Copy link
Author

sladey commented Jun 23, 2023

Hey Mitchell, I tried to install your fork of cmqttd and things went pretty crazy. Previously I was running the micolous code from 2021. In your fork there have been several updates around how labels are handled. I am not using a project.cbz file and with the new version I was getting errors in publish_all_lights. Are you using a project.cbz file? Do you think I can just cherry pick your queue related changes?

@mitchell-johnson
Copy link

Hey @sladey Yeah my code is based on the changes made in PR #36 so is pretty different to what you would be running. I dont see why not. You would need to pull in the class Periodic and the delay code. let me know if you want some help.

@sladey
Copy link
Author

sladey commented Jun 26, 2023

Thanks @mitchell-johnson I have made the changes and it is working like a charm. The delay give a nice ripple effect through the house as I operate on the big groups we have here. Thanks so much for nutting out this problem I have been frustrated by it (but not enough to debug myself) for the last two years !! Thanks again.

@mitchell-johnson
Copy link

There are 2 areas that I might get to researching. The CNI obviously has some sort of buffer that gets full up. In theory rather than waiting 100ms to send each command we should be able to do 2-3 at a time then wait some amount of MS then continue. Also there is confirmation code logic that is not fully implemented that should allow you to check if a given command is executed successfully. Doing both of those would be the best solution.

@damon-atkins
Copy link

Might find this useful for debugging/tracing communications https://github.com/robert/how-to-build-a-tcp-proxy/blob/master/tcp_proxy.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants