lambda binary responses #19

Open
dazza-codes opened this issue Jun 6, 2021 · 0 comments
@dazza-codes
Owner
Notes and references on Lambda binary responses

https://docs.aws.amazon.com/apigateway/latest/developerguide/lambda-proxy-binary-media.html

Use the Content-Type header in the response so the client can parse the data correctly.

Does binary stream data fetched via async aiohttp already get handled differently? See the client sketch below the Accept note.

import base64
import random

import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Randomly return either a binary (PNG) or a text response.
    number = random.randint(0, 1)
    if number == 1:
        response = s3.get_object(
            Bucket='bucket-name',
            Key='image.png',
        )
        image = response['Body'].read()
        # Binary payloads must be base64-encoded and flagged with
        # isBase64Encoded so that API Gateway decodes them on the way out.
        return {
            'headers': {'Content-Type': 'image/png'},
            'statusCode': 200,
            'body': base64.b64encode(image).decode('utf-8'),
            'isBase64Encoded': True,
        }
    else:
        return {
            'headers': {'Content-Type': 'text/html'},
            'statusCode': 200,
            'body': '<h1>This is text</h1>',
        }

The request might need to ask explicitly for a binary media type, e.g.:
Accept: application/octet-stream
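
A minimal client sketch under those notes (the endpoint URL is hypothetical). aiohttp does not decode a binary body on its own: resp.read() returns the raw bytes, while resp.text() applies charset decoding, so branching on the response Content-Type keeps the two cases separate.

import asyncio

import aiohttp

async def fetch(url: str):
    # Ask explicitly for a binary media type; API Gateway can use the
    # Accept header to decide whether to return the decoded binary body.
    headers = {'Accept': 'application/octet-stream'}
    async with aiohttp.ClientSession() as session:
        async with session.get(url, headers=headers) as resp:
            content_type = resp.headers.get('Content-Type', '')
            if content_type.startswith('image/'):
                # Raw bytes; aiohttp applies no decoding here.
                return await resp.read()
            # Text responses are decoded using the charset from Content-Type.
            return await resp.text()

# Hypothetical API Gateway endpoint, for illustration only.
# result = asyncio.run(fetch('https://example.execute-api.us-east-1.amazonaws.com/prod/image'))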

https://pypi.org/project/pbjson/
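
A quick sketch of what a packed binary JSON encoding could save, assuming pbjson mirrors the standard json module's dumps/loads API as its PyPI page suggests; the record below is made up.

import json

import pbjson  # assumed to expose a json-style dumps/loads API

record = {'id': 42, 'tags': ['a', 'b'], 'coords': [12.5, 55.7]}

text_encoding = json.dumps(record).encode('utf-8')
packed_encoding = pbjson.dumps(record)  # packed binary bytes (assumed)

print('json bytes:  ', len(text_encoding))
print('pbjson bytes:', len(packed_encoding))
print('round-trip:  ', pbjson.loads(packed_encoding))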

https://github.com/mapbox/geobuf

https://www.compose.com/articles/faster-operations-with-the-jsonb-data-type-in-postgresql/

And this has some immediate benefits:

- more efficiency,
- significantly faster to process,
- supports indexing (which can be a significant advantage, as we'll see later),
- simpler schema designs (replacing entity-attribute-value (EAV) tables with jsonb columns, which can be queried, indexed, and joined, allowing for performance improvements of up to 1000x!).

And some drawbacks:

- slightly slower input (due to added conversion overhead),
- it may take more disk space than plain json due to a larger table footprint, though not always,
- certain queries (especially aggregate ones) may be slower due to the lack of statistics.

The reason behind this last issue is that, for any given column, PostgreSQL saves descriptive statistics such as the number of distinct and most common values, the fraction of NULL entries, and (for ordered types) a histogram of the data distribution. None of this is available when the values are stored as JSON fields, and you will pay a heavy performance penalty, especially when aggregating (COUNT, AVG, SUM, etc.) across many JSON fields.

To avoid this, consider storing data that you plan to aggregate later in regular columns.
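
A minimal sketch of that hybrid layout (table and column names are hypothetical, psycopg2 is assumed as the driver): keep the value you aggregate in a regular column, and put the sparse attributes in jsonb.

import psycopg2
from psycopg2.extras import Json

conn = psycopg2.connect('dbname=test')
cur = conn.cursor()

# Regular column for the frequently aggregated value; jsonb for the rest.
cur.execute("""
    CREATE TABLE IF NOT EXISTS events (
        id     serial PRIMARY KEY,
        amount numeric NOT NULL,  -- aggregated often: keep it relational
        attrs  jsonb              -- sparse, variable attributes
    )
""")

cur.execute(
    'INSERT INTO events (amount, attrs) VALUES (%s, %s)',
    (19.99, Json({'source': 'web', 'campaign': 'spring'})),
)

# The aggregate runs on a regular column (full statistics available),
# while the jsonb containment operator @> filters on the attributes.
cur.execute(
    'SELECT avg(amount) FROM events WHERE attrs @> %s',
    (Json({'source': 'web'}),),
)
print(cur.fetchone())
conn.commit()

A GIN index on attrs (CREATE INDEX ... USING gin (attrs)) is the usual companion for the @> filter.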

https://www.bizety.com/2018/11/12/protocol-buffers-vs-json/
