turn old gateway smart assistant into extension add-on #1040
Comments
Just FYI, @lissyx had initially done this with the voice-addon when he was rewriting it to use DeepSpeech. However, getting voice data from a (remote) browser across an IPC connection to the add-on is non-trivial. He worked around that by using a separate, hard-coded local WebSocket, which prevented it from working on outside networks.
That's a step ahead of what I was imagining as a starting point. I was thinking of simply using the same smart assistant UI (except change the fox to a talking house), same intent-parser, and same STT back-end as before (processed in the Cloud, like Firefox Voice), except putting it all into an optional add-on. For step 2, it would be great to enable the user to configure an option of processing the STT locally on the gateway, no cloud required. If both options are available, we'd need a way to be clear to users where the STT command is being processed. Perhaps the smart assistant page could be split in half. One side = cloud, the other side = local. And if that wouldn't be technically feasible, the user could simply pick one or the other (and ideally switch at any time).
Hi all, Could we have the assistant running on the gateway with a web UI like before, but without the STT engine, so it just accepts written text, as it did before? STT can be run on whatever device is being used to access the web UI; that way the STT runs locally on the client and the assistant runs on the gateway. Everyone can use whichever STT engine they prefer, and it's not something the WebThings project needs to worry about; we only need to maintain the assistant. Cheers 🙂
Great idea. Then would the assistant have the option of which STT add-on (text input) could be configured to tie into it? Or would it accept the text from any/multiple STT engines? I could even imagine a configuration to allow the user to select the old cloud-based STT engine of the original assistant, in case they don't have a local STT option.
@madb1lly Yes, actually we considered that architecture when we first built it, by using the Web Speech API in the browser. Unfortunately the STT (speech recognition) part of the Web Speech API is still only supported in Chromium-based browsers as far as I know, despite years of work to try to get it turned on in Firefox by default with a choice of back ends. I would personally support using that approach in a smart assistant extension and only support text-based commands on browsers which don't support speech input. That also shifts the hardware requirements of speech recognition away from the gateway to the client, which may have better suited hardware. One side effect of also relying on the browser for the TTS (speech synthesis) part (which we never implemented) is that the assistant will have a different voice depending on which browser you're using, but that could be OK.
@kgiori Running the STT on the client side would mean relying on the browser for speech recognition. It might be feasible to send audio directly to a cloud service from the client as a fallback, rather than going via the gateway.
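To make the "speech where supported, text everywhere else" idea concrete, here is a minimal TypeScript sketch of the feature check an extension page might do. It is only an illustration under stated assumptions: `pickInputMode`, `SpeechCapableGlobal` and `InputMode` are hypothetical names, not part of the gateway or any add-on, and the sketch relies only on the fact that browsers with speech recognition expose a `SpeechRecognition` constructor (or the prefixed `webkitSpeechRecognition` in Chromium-based browsers) on the global object.

```typescript
// Hypothetical shape of the global object we inspect; a real page would
// pass its own global object (e.g. `window`) here.
interface SpeechCapableGlobal {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Hypothetical type: how the assistant UI would accept commands.
type InputMode = "speech" | "text";

// Returns "speech" when a speech recognition constructor is exposed
// (standard or webkit-prefixed), otherwise falls back to "text".
function pickInputMode(globalLike: SpeechCapableGlobal): InputMode {
  const ctor =
    globalLike.SpeechRecognition ?? globalLike.webkitSpeechRecognition;
  return typeof ctor === "function" ? "speech" : "text";
}
```

With something like this, the UI could show a microphone button only when the result is `"speech"` and always keep the text box, so the assistant on the gateway only ever receives text.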
Hi @kgiori, Well, the user would have the option of which STT engine to use, but the options wouldn't necessarily be presented by the Gateway, e.g. they can use voice typing on a smartphone, for example, or perhaps Firefox Voice on desktop. These options would be completely for the user to decide, independently of the gateway, and the gateway would not care. The way I see it working is:
As an option, the Gateway admin could opt to use a cloud-hosted STT engine, but the only way I can see this working easily is for the WebThings infrastructure to host that and make its optional use part of the add-on... which is how I think it worked before, isn't it? Configuring the gateway to use any arbitrary cloud-hosted STT engine would probably be too difficult to be worth it. Cheers 🙂 PS - I see that @benfrancis gave a far more eloquent answer whilst I was typing! 😆
Count me in. I'd love to have STT feed a smart assistant add-on (extension add-on with its own GUI page). And if I have to use Chrome or manually enable the Web Speech API in Firefox to use those for STT processing, that's fine too. With respect to my local gateway, I currently use the Voice-controller add-on, and I've tried Voco too. Would it be possible to connect the Voice-controller add-on or the Voco add-on to this smart assistant extension add-on? And if so, would it be as an additional STT input? Or would it only be possible as a replacement for the STT processing done by the Web Speech API?
In an earlier version of the gateway, there was an integrated voice assistant experiment that was taken out of the UI. I'd like to see it come back as an add-on. And does anyone know how to make a "talking home" animated GIF (to replace the fox)?