
Asynchronous Deferred Response Support #74

Closed · wants to merge 1 commit

Conversation

@jsetton (Collaborator) commented May 20, 2018

Updates related to #73

I have implemented deferred response support using a DynamoDB table to store the user access tokens, allowing asynchronous messages to be sent to the Alexa event gateway. The feature is configurable through the deferredResponse=<timeInSeconds> metadata parameter. During the deferral period, the skill streams openHAB events, looking for state updates of one or multiple items depending on the command(s) initially sent. If the updates are found, the deferral period ends and the deferred response is sent; otherwise it times out with a deferred error response.

That implementation required switching the Lambda function return method to use the handler callback instead of the context functions, since the latter doesn't allow async functions to keep running after being called. The Lambda function timeout has been increased to 45 seconds to accommodate the 30-second maximum deferral time.

Additionally, in coordination with the deferred response feature, I added the ability to use a specific item (called a "sensor") for property state reporting instead of the actionable item's state. It is configurable through the itemSensor=<itemName> metadata parameter. Here is an example of a lock item with both features configured.

Contact doorLockStatus "Front Door Status"
Switch doorLock "Front Door" {alexa="Lock" [deferredResponse=20,itemSensor="doorLockStatus"]}
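The deferral window configured above can be sketched as a race between an item-state watcher and the configured timeout (names are hypothetical; the actual skill streams openHAB server-sent events to detect the update):

```javascript
// Race an item-state watcher against the configured deferral window.
// `watchItemState` is a hypothetical function resolving once the sensor item
// (e.g. doorLockStatus) reports the expected post-command state.
const waitForStateUpdate = (watchItemState, deferredSeconds) => {
  const timeout = new Promise((resolve) =>
    setTimeout(() => resolve({ timedOut: true }), deferredSeconds * 1000));
  return Promise.race([
    watchItemState().then((state) => ({ timedOut: false, state })),
    timeout,
  ]);
};
```

If the watcher wins, the deferred response is sent with the new state; if the timeout wins, a deferred error response goes out instead.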

Some extensive code refactoring and changes were made to the REST functions to accommodate the new Alexa API and event gateway calls. I have updated the config structure as well and improved the property map object with prototype functions, including enforcing item type per capability.

I updated the supported item mapping documentation (bottom part) along with the install procedure, adding steps to create the database and the Lambda function role.

@jsetton force-pushed the v3 branch 7 times, most recently from caec13b to 4b3e2a5 on May 20, 2018 16:37
@digitaldan (Collaborator)

@jsetton this may take me a bit to review as there's a ton of changes (but that's OK). One thing that comes to mind is the extended runtime of the Lambda function. Right now on our v2 skill we are averaging over 1 million requests per month, and with the short timeout we have, we don't pay for any duration costs. I need to understand how this would affect our monthly costs and whether there's anything we could do to protect ourselves from runaway costs. If you have any thoughts there, it would be greatly appreciated. I'll find out what our monthly duration amount is and report back.

* added ability to define deferred response time for a given item through metadata parameter
* added authorization directive support managing granted user access tokens in a dynamodb table
* changed the lambda function return method to use handler callback instead of context to accommodate the async logic
* updated alexa response property state accuracy logic with ability to associate an item sensor through metadata parameter
* added ability to define lock property state mapping when using an item sensor through metadata parameter
* updated temperature mode binding mapping to differentiate legacy bindings from latest one
* improved property map object with prototype functions along with item type per capability enforcement
* updated config structure and refactored rest functions code to accommodate new alexa api & gateway calls
* updated report state function to use generic alexa response property state logic
* added deploy instructions for creating aws dynamodb table and lambda function role.
@digitaldan (Collaborator)

@jsetton, with the small (128 MB) Lambda instance we get 3,200,000 seconds per month under the free tier, and we are currently WAY under this: last month we did 1.7 million requests and ~150,000 seconds of compute time. So while it's still good to watch our execution time, I think we are OK :-)

@jsetton (Collaborator, Author) commented May 20, 2018

@digitaldan, I am not sure about that. Based on the number you provided, that gives us a maximum average of 1.88 seconds per request, if my calculation is correct. There is definitely more computation done in v3 compared to v2, especially with the improved property state accuracy logic I added. So pushing the timeout to 45 seconds would certainly be a stretch. Also keep in mind that with v3, Alexa requests report state since we set the capability retrievable property to true, which will eat up some of that compute time.

Maybe we could have a lighter version of the deferred response feature for the official skill, where it would create a CloudWatch scheduled event, at the time of the request, that would trigger the Lambda function at the end of the time period. This would be a less intelligent way but still functional. We can also limit the deferred response feature to the lockController capability only for now and also disable Alexa state polling which will just affect the control functionality in the Alexa app.

In terms of DynamoDB, it looks like the usage would be covered by the free tier. Anyway, it is only used on request for deferred response enabled items and when a user initially enables the skill under his account.

@jsetton (Collaborator, Author) commented May 20, 2018

On a side note, I have been under this Alexa promotional credit program for my own skills. Might be worth looking into. Although, I am not sure how long it will last.

@digitaldan (Collaborator)

> Based on the number you provided, that gives us a maximum of 1.88 seconds per request if my calculation is correct.

Sorry, I was not clear. This is the GB-seconds, which is how Amazon reports billable usage; they don't report the actual runtime (at least under billing). If I work backwards from their pricing formula, I think it's actually about 12 million seconds worth of execution time, which makes a lot more sense given 1.7 million requests. This is about half of our free usage plan. If we did go over per month and had to pay a few bucks, then that's fine too. See https://aws.amazon.com/lambda/pricing/

@jsetton (Collaborator, Author) commented May 20, 2018

Unless I am missing something, your initial numbers are correct. In the free tier, you get 400,000 GB-seconds. So if using a 128MB function, that gives you 400,000 * 1,024 / 128 = 3,200,000 seconds per month of execution time.

Using the Lambda pricing calculator, 1.7M requests of 128MB function with a 5,000 ms average execution time is estimated at about $11 per month including the free tier. So I think the deferred response feature I implemented would be costly in its current form when deployed for mass usage.
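The estimate above can be reproduced with a small helper. This is a rough sketch using the 2018 list prices from the AWS pricing page linked earlier ($0.20 per million requests, $0.0000166667 per GB-second, with 1M requests and 400,000 GB-seconds free each month); actual billing has more nuance:

```javascript
// Rough monthly Lambda cost estimate (2018 list pricing: $0.20 per 1M requests,
// $0.0000166667 per GB-second, free tier of 1M requests and 400,000 GB-seconds).
const estimateMonthlyCost = (requests, memoryMB, avgSeconds) => {
  const gbSeconds = requests * avgSeconds * (memoryMB / 1024);
  const billableGbSeconds = Math.max(0, gbSeconds - 400000);
  const billableRequests = Math.max(0, requests - 1000000);
  return billableGbSeconds * 0.0000166667 + (billableRequests / 1000000) * 0.20;
};
```

Plugging in 1.7M requests at 128 MB with a 5-second average runtime lands around $11/month, matching the calculator figure; at the current ~0.7-second average, duration stays inside the free tier and only the requests over 1M are billed.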

@digitaldan (Collaborator)

The 150k on our bill was GB-seconds, out of a total of 400k GB-seconds at the free tier. From the pricing page, I used the equation { seconds * 128/1024 = 150,000 } to come up with the number of seconds we actually ran (the 12 mil). This means an avg run time of 7 seconds. Plugging 7 seconds @ 1.7 mill requests into the calculator would mean we are paying $18.27 a month, which we are not. So clearly I messed up somewhere :-)

@jsetton (Collaborator, Author) commented May 20, 2018

150,000 GB-seconds * 1,024 MB = 153,600,000 MB-seconds / 128 MB = 1,200,000 seconds of execution time used. So that's 1,200,000 seconds / 1,700,000 requests = 0.706 seconds on average per request currently.

Based on the current number of monthly requests, the max average execution time per request for a 128MB function should be just under 2 seconds if we want to stay in the free zone. At 3 seconds, monthly cost would be around $4 and that's with no request increase.

So having the Lambda function wait for a state update before sending the deferred response is dangerous, especially if users start abusing that feature. I am looking at ways to improve this, potentially using AWS Step Functions, but we will probably have to lose a bit of accuracy in regards to deferred responses if we want to keep costs low. I am not familiar with the cloud connector implementation, but it would be nice if we could set one-time state update subscriptions that would send back a push notification to the AWS API gateway.

@digitaldan (Collaborator)

I need to wear my reading glasses, 12,000,000 != 1,200,000; that was my mistake.

The cloud service does have the ability to push item change events as you probably know; this was primarily done for IFTTT. Since I am also the steward of the myopenhab service, I can also make changes there for us, although I have purposely kept the skill designed in such a way that all it relied on was the OH REST API and not anything in the cloud service. Let me think about it a bit.

@digitaldan (Collaborator)

Ok, here's a thought. Ideally what we want is to have the user's OH watch the state item and push to the gateway as soon as it changes. OH 2.x has a new rule engine that's API-based. It would be ideal if we could schedule a one-time rule through the API that triggers on an item state change event and sends it to the Alexa service. Unfortunately, I don't think there is the concept of scheduling a rule to run just once, so the rule would need to delete itself. Also, the new rules engine is not yet enabled by default, so this would be a requirement; and finally, the rule would need the token in it, which doesn't sound secure.

Another thought: we create a simple OH 2 binding which would do this; my guess is this would be trivial to write (and I would volunteer to do it). The binding would be optional and only needed for deferred events. Again, I was trying to make this work using just the REST APIs, but if that's not going to work, then I think it's OK to require a binding to get all the functionality available.

@jsetton (Collaborator, Author) commented May 20, 2018

I think creating a new binding would make sense. It would actually, to a certain extent, bring it in line with the HomeKit integration, although not for the same reason. My guess is we would have a simple REST API available to communicate with the Lambda function through the cloud connector (e.g. https://myopenhab.org/alexa) where we could have the main REST API items endpoint available as a pass-through and have additional endpoints specific to the skill's needs, along with the ability to send push notifications back to Alexa as you mentioned.

The more I think about it, the more I believe it makes more sense to have this integrated into the cloud connector binding directly since it will already be installed to allow the Alexa integration. I am sure the Google Assistant integration will need similar functionalities down the line as well and I think having a separate binding for just one specific feature may be overkill.

@digitaldan (Collaborator)

With a binding installed, the OH instance could send the events directly to the API gateway, as long as it receives and stores the user access token and the regional API endpoint to hit from the skill? I'm not sure that's what Amazon had in mind; I wonder if there are any restrictions around rate limiting or other protections around usage? It would certainly be the most efficient route.

@jsetton (Collaborator, Author) commented May 21, 2018

Based on their smart home skill high level diagram, they expect the smart device manufacturer to handle asynchronous responses while the Lambda function just handles the synchronous responses.

Looking at their transaction flow diagram for sending events to the Alexa gateway, they are also expecting the smart device manufacturer to store and manage the access tokens. Although, I still think the managing part could be done in the Lambda function as long as it has access to the storage system.

Regarding rate limiting, there isn't much information, but I would say the limit is high since they are expecting to receive ChangeReport responses, which I think could be pretty chatty if you have a lot of devices integrated with Alexa.

@digitaldan (Collaborator)

> they expect the smart device manufacturer to handle asynchronous responses

Right, but in our case this would be coming from thousands of different endpoints instead of being funneled through the manufacturer's cloud service, which is what I don't think Amazon planned for (maybe, who knows). I think it's still a good plan, and worth trying out.

For the binding, what functionality are you thinking we would need? Here are my thoughts, just brainstorming at this point.

For the binding, it would first be an EventSubscriber, so it would be notified of item state events and could push those to the gateway if the item has alexa metadata and a valid API token. That part seems straightforward.
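The gating decision described here can be sketched as a pure function. The real binding would be Java; this is written in JavaScript for consistency with the skill code, and all names (the event shape, the property map from the last discovery, the stored auth record) are illustrative:

```javascript
// Decide whether an item state-change event should be pushed to the Alexa
// event gateway: the item must carry alexa metadata (here, an entry in the
// property map from the last discovery) and the stored gateway access token
// must be present and not expired. All names are illustrative.
const shouldPushChangeReport = (event, propertyMap, auth, now) =>
  Object.prototype.hasOwnProperty.call(propertyMap, event.itemName) &&
  auth.accessToken != null &&
  now < auth.expireTime;
```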

The binding would also have a REST endpoint that would allow the skill to update/refresh the token and the regional API endpoint it should hit. I have not looked much at how the tokens work here; is this something we can have the binding do on its own once it has a refresh token? Or do we need to have the skill periodically update the OH access token with the refresh token? I'll look a little more this week, but I figured you have already thought about how this would work?

What else?

@digitaldan (Collaborator)

> is this something we can have the binding do on its own once it has a refresh token?

Never mind, I see this requires the skill's client secret, so the skill would need to play a part here.

@jsetton (Collaborator, Author) commented May 21, 2018

> For the binding, it would be first an EventSubscriber, so it would be notified of item state events, and could push those to the gateway if the item has alexa metadata and a valid api token. That part seems straight forward

One thing I realized is that, if we intend to have the binding send the asynchronous response messages directly to the gateway, the property response generation logic will also need to be included inside the binding. At that point, it doesn't make sense to have that logic in both the skill and the binding. (For reference, the format of an asynchronous response is exactly the same as a synchronous one.)

So we could either simplify the skill to just pass all directives directly to the binding, which would generate the response and then either send the result back to the skill (if synchronous), to be passed on to Alexa, or send it directly to the event gateway (if asynchronous). Or, we could keep the current logic in the skill and pass a pre-formatted response to the binding, including markers to be replaced with the proper item states before sending it to the gateway. A nice advantage of the first approach is that we could generate ChangeReport events based on a user setting, and simplifying the skill may also reduce the overall compute time usage.

In terms of determining which items should be selected, we could use the property map generated from the last discovery response and, obviously, a valid access token when sending messages to the gateway.

> The binding would also have a REST endpoint that would allow the skill to update/refresh the token and regional API endpoint that it should hit. I have not looked much at how the tokens work here, is this something we can have the binding do on its own once it has a refresh token? Or do we need to have the skill periodically update the OH access token with the refresh token? I'll look a little more this week, but I figured you have already thought about how this would work?

In terms of communication, using REST API endpoints between the skill and the binding seems the best approach I think.

In regards to the gateway access token management, based on my thoughts above, we might as well push that part into the binding. Basically, the skill would pass on the initial authorization directive to get the initial tokens, which would be stored in the binding, and then the binding would only need to refresh these tokens at the time it needs to send a message directly to the gateway. This is how I have it implemented in the skill currently: I use the stored refresh token to get a new set of tokens (accessToken + refreshToken + expireTime) once the stored expiry time has passed. The process is straightforward and it is completely separate from the OH access token that Alexa provides in the endpoint scope. The reason I used the latter in my implementation is for multi-user support, where you need to manage access tokens per user. In that case, a call to the LWA API can be made to identify the user using that token.
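That refresh flow maps to a standard OAuth2 refresh_token exchange against the LWA token endpoint (https://api.amazon.com/auth/o2/token). A sketch of the decision and the request body it would post (helper names and the stored-record shape are hypothetical; the grant type and form fields are the standard ones):

```javascript
// Return the stored access token if still valid; otherwise return the form
// body for an OAuth2 refresh_token exchange against the LWA token endpoint
// (https://api.amazon.com/auth/o2/token). Names are illustrative.
const tokenAction = (stored, clientId, clientSecret, now) => {
  if (now < stored.expireTime) {
    return { refresh: false, accessToken: stored.accessToken };
  }
  return {
    refresh: true,
    body: [
      'grant_type=refresh_token',
      `refresh_token=${encodeURIComponent(stored.refreshToken)}`,
      `client_id=${encodeURIComponent(clientId)}`,
      `client_secret=${encodeURIComponent(clientSecret)}`,
    ].join('&'),
  };
};
```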

> Nm, I see this requires the skill's client secret, so the skill would need to play a part here.

The client id and secret could just be a skill configuration setting that is passed to the binding, maybe with authorization directives, along with the (region-based) Alexa event gateway address, and this information would be stored in the binding for future use. These are pretty much static credentials on a per-skill basis, so they would just be different for each regional skill. (For reference, authorization directives are only generated when a user initially enables the skill on their account.)

I am willing to help with the development of such a binding. Unfortunately, I am not very familiar with binding development. Maybe if you set up the initial skeleton, I can help with moving some of the logic I have added in the skill to the binding.

@digitaldan (Collaborator)

Ok, let me also think about this some more, as I can see this changing the nature of the skill by shifting the logic to the binding more and more. I'm traveling for work this week, so I will be on and off looking at OH stuff. BTW, I want to mention @devkid and his comment here #54 (comment): he suggested something similar, which I was reluctant about, but maybe he was right :-)

@digitaldan (Collaborator)

@jsetton So one issue with moving logic into a binding is existing Alexa users. We definitely do not want to break this for them, but if a binding is required, breakage would happen for all users until they upgraded. We would also need to provide 2.2 and 2.3/2.4 versions of the binding, as not all users upgrade right away. I would be curious if you have thoughts on this? Before we go down this path I want to make sure we have a plan to handle it.

@jsetton (Collaborator, Author) commented May 25, 2018

How about just leaving a "minimal" fallback v3 logic in the skill, based only on v2 tags and in line with what the v2 skill currently supports, in case the skill isn't able to communicate with the new binding? In terms of support, since the binding will rely on metadata, I don't think we should make it backward compatible unless the official 2.3 release is far away.

@digitaldan (Collaborator)

I like that idea a lot. I would also vote to only support it for 6 - 12 months, and then require the binding at that point.

@jsetton (Collaborator, Author) commented May 25, 2018

Agreed. Although, once the fallback v3 logic is in place, I am not sure there will be much support needed unless Amazon makes another major Skill API change. At that point, it would make sense to require users to use the binding. I think emphasizing that new functionality will only be included with the binding should do the trick.

In terms of the fallback v3 logic, in order to keep compute time low, we should just report the posted value, similar to what the v2 logic does today, along with setting the retrievable property to false so that Alexa won't request state reports.

@digitaldan (Collaborator)

I'll try and get a binding going this weekend. I think we should keep this separate from the cloud binding for a couple reasons, so I'll start a new one.

This is probably a good point to talk about what the API structure will look like. The most simplistic way to pass directives to the binding would be to have a /alexa/directive endpoint, this would move nearly all the logic to the binding, including discovery.

I'm also still not clear about how the binding would update the token needed to send events to the gateway. As discussed, in order to refresh the access token, we need the clientId and client secret for the skill; this is not something we can share or push to our OH users when offered as a service, or am I missing something? If that is the case, I would think we will need to have our skill also act as a REST endpoint from which OH users can somehow request a refresh token (and we would need to somehow authenticate them).

@jsetton (Collaborator, Author) commented May 26, 2018

> I'll try and get a binding going this weekend. I think we should keep this separate from the cloud binding for a couple reasons, so I'll start a new one.

I totally agree. My earlier comment was before we decided to move all the skill logic into the binding.

> This is probably a good point to talk about what the API structure will look like. The most simplistic way to pass directives to the binding would be to have a /alexa/directive endpoint, this would move nearly all the logic to the binding, including discovery.

That's fine by me. We could maybe distinguish standard directives (post a command to OH and return the updated state) from authorization, discovery and report state ones by having separate endpoints, but that's probably just a minor design preference, as the sorting could also be done in the binding as you mentioned. I don't have a preference for either one.

> I'm also still not clear about how the binding would update the token needed to send events to the gateway. As discussed, in order to refresh the access token, we need the clientId and client secret for the skill, this is not something we can share or push to our OH users when offered as a service, or am I missing something? If this is the case, I would think we will need to have our skill also act as a REST endpoint that OH users can somehow request a refresh token (and we would need to somehow authenticate them).

In regards to the management of access tokens logic, here are my thoughts:

When a user activates the skill on their account:

  1. Skill forwards the authorization directive to the binding, attaching the clientId, clientSecret and (regional) gatewayUrl information
  2. Binding requests the access tokens with the information provided and stores the tokens, expiry time and the parameters from the previous step in the OH database
  3. Binding sends a response to Alexa via the skill: authorization granted if successful, failed on error (in case of an error response, the user will receive a message saying that the skill could not be enabled)

When a deferred response enabled directive is received by the binding:

  1. Retrieve the access tokens from the OH database and check the token expiry time
  2. If expired, refresh the tokens using the stored refreshToken, clientId and clientSecret
  3. If the access token is valid:
    a. Send a deferred response confirmation to Alexa via the skill
    b. Wait for the item state based on the property parameters
    c. Once the item state is updated, send the deferred response directly to the Alexa event gateway using the valid access token and the stored gatewayUrl
  4. Otherwise, send a synchronous property state response to Alexa via the skill
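The deferred-response steps above can be sketched as a small orchestrator with injected stand-ins for the OH database, the token refresher, and the two response paths (every name here is hypothetical):

```javascript
// Sketch of the deferred-response flow above, with injected stand-ins for the
// OH database, token refresher, and the two response paths (all hypothetical).
const handleDeferred = async (directive, deps, now) => {
  let auth = await deps.loadTokens();                   // step 1
  if (now >= auth.expireTime) {
    auth = await deps.refreshTokens(auth);              // step 2
  }
  if (!auth.accessToken) {
    return deps.sendSyncResponse(directive);            // step 4 fallback
  }
  await deps.sendDeferredConfirmation(directive);       // step 3a
  const state = await deps.waitForItemState(directive); // step 3b
  return deps.sendToEventGateway(state, auth);          // step 3c
};
```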

Regarding clientId, clientSecret and gatewayUrl (regional) information, they would just be skill settings included in config.js or lambda function environment variables.

One thing to note is that when implementing the "minimal" fallback v3 logic, we would need to directly accept authorization directives, without requesting the tokens, if the user doesn't have the binding installed. Otherwise, these users wouldn't be able to enable the skill on their account, unless we want to use this case to force the binding to be installed. Either way, this will be an issue for users that install the binding after having the skill already enabled on their account. Currently, there is no automated way provided by the Alexa API to request a new authorization directive for a given user, so the user would have to disable and re-enable the skill for the initial access token request to be processed by the binding.

As an alternative solution, we could keep the access token management logic in the skill, store the information in DynamoDB as I have implemented in this PR, and pass the valid access token and gateway URL to the binding when a directive is received for an item configured for deferred response. However, doing so would most likely increase the overall Lambda function compute time, and we wouldn't be able to have the binding automatically send change reports to the Alexa event gateway on Alexa-enabled item state changes.

@digitaldan (Collaborator)

> Skill forwards authorization directive to binding attaching clientId, clientSecret and gatewayUrl (regional) information
> ...
> Regarding clientId, clientSecret and gatewayUrl (regional) information, they would just be skill settings included in config.js or lambda function environment variables.

Sorry if I'm missing something, but clientSecret is something that is generated by Amazon for the skill; it is not something that is made public. So we cannot (or rather should not) give this to OH instances. Or am I misunderstanding?

@jsetton (Collaborator, Author) commented May 26, 2018

It's generated by Amazon for a given skill. That information is accessible in the developer console once the capability to send Alexa events to the gateway is enabled, which also activates the authorization directive mechanism when users enable the skill on their account.

[screenshot: developer console page showing the skill's client credentials]

As far as storing that information in OH, based on the OAuth guidelines related to clientId and clientSecret, we should store a hashed/encrypted version of the secret in OH (we could use the cloud connector uuid for that). Unless you have another idea for retrieving that information from a central location, besides the skill 😄

Anyway, these credentials have a limited scope and can only be used with either an authorization code or a refresh token, which are associated with the user. So, at most, users can fiddle with their own access token.

@digitaldan (Collaborator)

I'm not sure how they could be used maliciously, but sharing the skill's secret credentials for the event gateway to all of our users sounds like a big risk that I am not comfortable taking. This would essentially mean making our gateway client_id and secret public to the whole world.

To keep this truly secret, I would propose we have the client request a refresh token through us. We could set up the skill to handle REST API gateway invocations or create some other simple service to do this. I have a few thoughts about how, but I need to noodle on it for a bit.

@jsetton (Collaborator, Author) commented May 27, 2018

> I'm not sure how they could be used maliciously, but sharing the skill's secret credentials for the event gateway to all of our users sounds like a big risk that I am not comfortable taking. This would essentially mean making our gateway client_id and secret public to the whole world.

As I mentioned in my previous message, these credentials have a limited scope and are available for the sole purpose of requesting access tokens from the LWA API. They are different from the OH access token or any other OAuth2-related tokens. The credentials cannot be used alone; they must be combined with an authorization code or a refresh token, which are both user-based. So, from my understanding, the only thing a user can do maliciously is mess with their own binding access token, preventing asynchronous messages from being sent to the Alexa event gateway successfully on their behalf.

> To keep this truly secret, I would propose we have the client request a refresh token through us. We could set up the skill to handle REST API gateway invocations or create some other simple service to do this. I have a few thoughts about how, but I need to noodle on it for a bit

If you want to go that route, I could see the cloud connector handle the access token request portion by injecting the client id and secret. This is similar to how I implemented it in this PR.

Having the skill handle these requests would defeat the purpose of what we are trying to achieve in the first place, which is to limit the overall number of requests and total compute time usage. Additionally, it would involve API gateway calls, which have a cost as well. Just for reference, each access token is valid for one hour before it needs to be refreshed. Understandably, we won't need to refresh that often for deferred responses, but if we plan to enable change reports then we would certainly need to request these tokens more frequently.

@jsetton (Collaborator, Author) commented Oct 16, 2018

@digitaldan It's been a while since we discussed potentially having a new binding to interact with the Alexa skill. Is this still something you think we should pursue? There have been a bunch of Alexa Skill API functionality updates since then and I was thinking of adding them, but I am unsure whether I should split this pull request into a different branch and update the current v3 branch with new functionality that doesn't involve the skill waiting for a delayed answer, or maybe add a flag in this PR to disable the deferred response support by default.

@digitaldan (Collaborator)

Sorry for the late response. I have just started getting back to developing on this; I'm currently working on Crowdin stuff at the moment.

I would still like to pursue this idea of using a binding to send events, but again, I can't release our secret credentials to the whole world. If there is another alternative, then I would be open to that. I need to come back and review our conversation and get back up to speed on it. I would start another branch based on v3 for the time being.

@jsetton (Collaborator, Author) commented Feb 16, 2019

Closing this PR. I have migrated all the functionality from this change that is not related to the deferred response feature into the PR referenced above. We will continue discussing how to implement this feature in issue #73.

@jsetton jsetton closed this Feb 16, 2019
@jsetton jsetton deleted the v3 branch June 7, 2019 23:45