First steps to lambda deploy #13

Merged: 7 commits from cloudformation-first-steps into master, Aug 12, 2021
Conversation

@GeoWill (Contributor) commented on Jun 16, 2021

Notes on workflow

Log in to AWS SSO via the CLI
aws sso login --profile dc-lgsf-dev

Build
sam build --template sam-template.yaml

Test a function locally
sam local invoke ScraperWorkerFunction --event lgsf/aws_lambda/fixtures/sqs-message.json --profile dc-lgsf-dev

NB: sqs-message.json is adapted from the output of sam local generate-event sqs receive-message
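For reference, a minimal sketch of the envelope shape such an SQS fixture follows (shown as a Python dict for illustration; every value below is a placeholder rather than the contents of sqs-message.json):

# Hypothetical illustration of the SQS event envelope passed to
# ScraperWorkerFunction; all values are placeholders.
sqs_event = {
    "Records": [
        {
            "messageId": "00000000-0000-0000-0000-000000000000",
            "receiptHandle": "placeholder-receipt-handle",
            "body": "placeholder scraper job payload",
            "attributes": {},
            "messageAttributes": {},
            "md5OfBody": "placeholder-md5",
            "eventSource": "aws:sqs",
            "eventSourceARN": "arn:aws:sqs:<region>:<account-id>:<queue-name>",
            "awsRegion": "<region>",
        }
    ]
}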

Deploy to dev
sam deploy --profile dc-lgsf-dev

ToDo

(turn into issues)

  • Commands in the aws_lambda app to send jobs to the queue
  • Use environment variables instead of self.options["aws_lambda"] (see the sketch after this list)
  • Build a layer with just the Python dependencies installed, for quicker iterations
  • Use CircleCI to rebuild and deploy the stack
  • Reinstate the requests cache
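As a rough illustration of the environment-variable item above (names like SCRAPER_QUEUE_URL are hypothetical, not existing settings; AWS_LAMBDA_FUNCTION_NAME is set by the Lambda runtime):

import os

# Hypothetical sketch: read Lambda-related settings from the environment
# rather than from self.options["aws_lambda"].
def lambda_config():
    return {
        "queue_url": os.environ.get("SCRAPER_QUEUE_URL"),
        "in_lambda": bool(os.environ.get("AWS_LAMBDA_FUNCTION_NAME")),
    }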

@GeoWill force-pushed the cloudformation-first-steps branch from 82ae813 to 3482e26 on June 16, 2021 at 11:45
Review thread on requirements.txt (outdated, resolved)
@GeoWill mentioned this pull request on Jul 3, 2021
@GeoWill (Contributor, Author) commented on Jul 3, 2021

I also added the AmazonSQSFullAccess policy to the LGSFLambdaExecutionRole, as without it I was getting 'Queue not found, or insufficient permissions' type errors from the QueueBuilder function.

Review thread on lgsf/commands/base.py (outdated, resolved)
Review thread on lgsf/aws_lambda/handlers.py (outdated, resolved)
Review thread on sam-template.yaml (outdated, resolved)
This is because requests_cache uses SQLite on disk, which won't be
possible in AWS Lambda. Ideally this will be reinstated for use when
running locally.
The Makefile is mostly there to create requirements.txt.
sam-template.yaml defines a CodeCommit repository and a scraper queue.
The plan is to load scrapers into the queue with one lambda function, then use
the queue to trigger another lambda per scraper to actually run it.

Scraped data will then be committed to the CodeCommit repo.
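A rough sketch of that fan-out plan, to make the shape concrete (the handler name, queue variable and message body below are hypothetical, not taken from sam-template.yaml):

import json
import os

import boto3

# Hypothetical queue-builder handler: push one SQS message per scraper,
# so each message triggers a separate worker Lambda invocation.
def queue_builder_handler(event, context):
    sqs = boto3.client("sqs")
    queue_url = os.environ["SCRAPER_QUEUE_URL"]  # illustrative variable name
    for council_id in ["placeholder-council-1", "placeholder-council-2"]:
        sqs.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps({"council": council_id}),
        )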


-class BaseCouncillorScraper(ScraperBase):
+class BaseCouncillorScraper(CodeCommitMixin, ScraperBase):

@GeoWill (Contributor, Author):
This feels a bit brittle, as the order matters here. But I guess that's just multiple inheritance.
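For reference, a minimal sketch of why the order matters (the save() method here is hypothetical; Python's MRO is resolved left to right, so the mixin's overrides only take effect if it comes first):

class ScraperBase:
    def save(self):
        return "local save"  # base behaviour

class CodeCommitMixin:
    def save(self):
        return "codecommit save"  # mixin override

class BaseCouncillorScraper(CodeCommitMixin, ScraperBase):
    pass

# MRO: BaseCouncillorScraper -> CodeCommitMixin -> ScraperBase -> object,
# so the mixin's save() wins. With the bases swapped it would not.
assert BaseCouncillorScraper().save() == "codecommit save"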


-requests_cache.install_cache("scraper_cache", expire_after=60 * 60 * 24)
+# import requests_cache

@GeoWill (Contributor, Author):
Commented out because it should probably be reinstated behind a check for the Lambda environment.
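One possible way to reinstate it behind such a check (a minimal sketch; AWS_LAMBDA_FUNCTION_NAME is set automatically by the Lambda runtime, and the cache name and expiry come from the commented-out line):

import os

# Only install the SQLite-backed cache when not running inside AWS Lambda.
if not os.environ.get("AWS_LAMBDA_FUNCTION_NAME"):
    import requests_cache

    requests_cache.install_cache("scraper_cache", expire_after=60 * 60 * 24)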

@@ -97,3 +100,193 @@ def save_raw(self, filename, content):
     def save_json(self, obj):
         file_name = "{}.json".format(obj.as_file_name())
         self._save_file("json", file_name, obj.as_json())
+
+
+class CodeCommitMixin:

@GeoWill (Contributor, Author):
Had a stab at pulling out the CodeCommit logic into a mixin for the scrapers. Not sure it's the best way of doing it, and I haven't made it clear which methods the child classes need to implement, but I thought I'd get a better handle on whether it's a working system when I do the polling station scrapers.
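If making the required methods explicit is wanted later, one hypothetical option is abstract hooks (the method names below are illustrative, not the mixin's actual interface):

from abc import ABC, abstractmethod

class CodeCommitMixin(ABC):
    # Subclasses that forget one of these hooks fail at instantiation
    # rather than midway through a scrape.
    @abstractmethod
    def repository_name(self):
        """Name of the CodeCommit repository to commit scraped data to."""

    @abstractmethod
    def data_to_commit(self):
        """Return the scraped content that should be committed."""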

A Member commented:
This seems ok to me — the only other pattern we could investigate is the way Django does pluggable storage in some places: define a storage interface that is subclassed and then set that storage class in settings / globally / by some other logic.

So for example, we'd have

class LocalFileSystemStorage(Storage):
    pass

class CodeCommitStorage(Storage):
    pass

And then the BaseCouncillorScraper could assign self.storage = CodeCommitStorage() and later do self.storage.save() or whatever.

Happy to talk more about this pattern if you think it's useful. Some more reading:
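To flesh that suggestion out slightly, a hypothetical sketch of the pluggable-storage pattern being described (the save() signature and class wiring are illustrative, not code from this PR):

class Storage:
    def save(self, filename, content):
        raise NotImplementedError


class LocalFileSystemStorage(Storage):
    def save(self, filename, content):
        with open(filename, "w") as f:
            f.write(content)


class CodeCommitStorage(Storage):
    def save(self, filename, content):
        ...  # would call the CodeCommit API here


class BaseCouncillorScraper:
    def __init__(self, storage=None):
        # backend chosen in settings / globally / by some other logic
        self.storage = storage or LocalFileSystemStorage()

    def save_json(self, filename, content):
        self.storage.save(filename, content)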

Review thread on lgsf/aws_lambda/handlers.py (outdated, resolved)
@GeoWill merged commit 030023b into master on Aug 12, 2021