504 Timeout error when accessing API #6

Open
eastjames opened this issue Jun 28, 2024 · 3 comments
Describe the bug
I recently started receiving a 504 Server Error when accessing the Carbon Mapper API. This is new behavior. In addition, the download seems very slow. Can the rate limit be increased?

To Reproduce
This python code block reproduces the error:

import requests

bbox = [
    -129.0625,
    10.5,
    -60.9375,
    59.25
]

q_opts = lambda dlimit, offset: [
    'bbox='+'&bbox='.join([str(i) for i in bbox]),
    'plume_gas=CH4',
    f'limit={dlimit}',
    f'offset={offset}',
]
base_url = 'https://api.carbonmapper.org'
endpoint = '/api/v1/catalog/plumes/annotated'

# first check how much data
q_params = '?' + '&'.join(q_opts(1,0))
url = base_url + endpoint + q_params
response = requests.get(url)
# Raise an exception if the API call returns an HTTP error status
response.raise_for_status()  
# Process the API response
data = response.json()

# total num data
ndata = data['bbox_count']
limit = 1000
npages = ndata // limit + 1
print(f'Found {ndata} Carbon Mapper plumes to download.')

rawdat = []
print('Downloading')
for ioffset in range(npages):
    offset = ioffset * limit
    q_params = '?' + '&'.join(q_opts(limit,offset))
    url = base_url + endpoint + q_params
    print(url)
    print()
    response = requests.get(url)
    # Raise an exception if the API call returns an HTTP error status
    response.raise_for_status() 
    # Process the API response
    data = response.json()
    rawdat += data['items']

Expected behavior
Expected to complete the download successfully. Data is saved in the list rawdat. This was the behavior until recently (problem first noticed 6/27/2024).

Screenshots
HTTP error:

HTTPError: 504 Server Error: Gateway Time-out for url: https://api.carbonmapper.org/api/v1/catalog/plumes/annotated?bbox=-129.0625&bbox=10.5&bbox=-60.9375&bbox=59.25&plume_gas=CH4&limit=1000&offset=3000

Configuration (please complete the following information):
Python version 3.11.6
requests version 2.32.2

$ cat /etc/os-release
NAME="Rocky Linux"
VERSION="8.9 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.9"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.9 (Green Obsidian)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2029-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-8"
ROCKY_SUPPORT_PRODUCT_VERSION="8.9"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.9"

delpaul commented Jul 8, 2024

Hi @eastjames ,

I am also facing the same issue: I get a 504 Gateway Timeout error. The APIs were working just fine until last month, and as of now I am still running into the same problem. Does it work for you now, or were you able to find another way around it?


demiurg commented Jul 9, 2024

Looking into this.

demiurg self-assigned this Jul 9, 2024

demiurg commented Jul 9, 2024

So, I noticed a few things. First, a woefully inefficient query plan (a full table index scan) was being chosen for joining S3 asset keys to the list output; running the SQL command "analyze catalog_asset;" fixed this. You can try the API again and see whether you still get 504s.

Second, the query to select 1000 plumes and annotate them with all their attributes is pretty expensive, and it is less reliable than running more queries with a lower limit, so I recommend lowering the limit when you run into issues.

Another optimization I would recommend: if you do not need the image files, add exclude_columns = ["plume_tif", "plume_png", "con_tif", "rgb_tif", "rgb_png"] to your query; that should speed things up.
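A minimal sketch combining both suggestions: a smaller page size and the exclude_columns option. Note build_url is a hypothetical helper, and passing exclude_columns as a repeated query parameter (like bbox) is an assumption for illustration, not confirmed API behavior:

```python
from urllib.parse import urlencode

BASE_URL = "https://api.carbonmapper.org"
ENDPOINT = "/api/v1/catalog/plumes/annotated"
BBOX = (-129.0625, 10.5, -60.9375, 59.25)

# Columns holding large raster assets; skipping them should speed up the query.
EXCLUDE_COLUMNS = ("plume_tif", "plume_png", "con_tif", "rgb_tif", "rgb_png")

def build_url(limit, offset):
    """Build a catalog query URL with a reduced page size and excluded columns."""
    params = [("bbox", v) for v in BBOX]
    params += [("plume_gas", "CH4"), ("limit", limit), ("offset", offset)]
    # Assumption: exclude_columns is accepted as a repeated query parameter.
    params += [("exclude_columns", c) for c in EXCLUDE_COLUMNS]
    return BASE_URL + ENDPOINT + "?" + urlencode(params)

# Example: page through results 250 at a time instead of 1000.
url = build_url(limit=250, offset=0)
```

Each page is then fetched as in the original script (requests.get followed by raise_for_status), just with more, cheaper requests.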
