'/1.0/instances?recursion=2' Endpoint has missing information. #14277

Kxiru · 2024-10-15T13:00:44Z

Required information

Distribution:
Distribution version:
The output of "snap list --all lxd core20 core22 core24 snapd":
The output of "lxc info" or if that fails:
- Kernel version: 6.8.0-45-generic
- LXC version: 5.21.2 LTS
- LXD version: 5.21.2 LTS
- Storage backend in use:

Issue description

This issue documents the findings of an investigation into the limitations and issues with using certain API endpoints for fetching instance data, specifically disk and memory information. The investigation focused on two main endpoints: the /1.0/metrics endpoint and the /1.0/instances?recursion=2 endpoint. Both endpoints exhibit shortcomings in their ability to provide the required information, especially when instances are in a "stopped" state.

Findings

1. Issues with the /1.0/metrics Endpoint
The metrics endpoint is currently used on the "Detail Instances" page to calculate various instance-related metrics.

Problem 1: It does not provide data when an instance is stopped. Certain metrics, such as "lxd_memory_MemFree_bytes" and "lxd_memory_MemTotal_bytes", are absent from the API response when the instance is not running, so disk and memory totals cannot be calculated.
Problem 2: The metrics endpoint returns a large amount of data, much of which is filtered out after retrieval. This can impose a significant load on larger systems, making it a suboptimal choice for regular use. LXD-UI uses lazy loading to mitigate this, but that may not be a sustainable solution.

Given these limitations, it is not feasible to rely on the metrics endpoint for obtaining instance data, especially when aiming for a lightweight solution that works regardless of instance status.
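
To illustrate Problem 1, here is a minimal sketch that checks whether the two memory metrics are present for a given instance in a metrics payload. Only the two metric names come from the findings above. The sample payload, the instance names "c1" and "c2", the `name` label, and the helper `missing_memory_metrics` are assumptions made for illustration.

```python
import re

# Metric names quoted in the findings above.
REQUIRED = ("lxd_memory_MemFree_bytes", "lxd_memory_MemTotal_bytes")

# Matches labelled sample lines such as: metric_name{label="value",...} 123
LINE_RE = re.compile(
    r'^(?P<metric>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)


def missing_memory_metrics(metrics_text, instance):
    """Return the required metric names that have no sample for `instance`."""
    seen = set()
    for line in metrics_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # comment, blank line, or unlabelled sample
        # Assumption: the instance is identified by a `name` label.
        if f'name="{instance}"' in match.group("labels"):
            seen.add(match.group("metric"))
    return [metric for metric in REQUIRED if metric not in seen]


# Hypothetical payload: "c1" is running; "c2" is stopped and has no memory samples.
SAMPLE = """\
# TYPE lxd_memory_MemTotal_bytes gauge
lxd_memory_MemTotal_bytes{name="c1",project="default",type="container"} 7823340000
lxd_memory_MemFree_bytes{name="c1",project="default",type="container"} 7000000000
"""

print(missing_memory_metrics(SAMPLE, "c1"))
print(missing_memory_metrics(SAMPLE, "c2"))
```

A consumer that gets a non-empty list back for an instance has no way to compute a memory total for it from this endpoint.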

2. Issues with the /1.0/instances?recursion=2 Endpoint

The /1.0/instances?recursion=2 endpoint is designed to fetch comprehensive data on all instances. It should ideally return all necessary details, including disk and memory metrics, irrespective of the instance state.

Problem 1: When an instance is stopped, the total field for both disk and memory is returned as 0, so the totals are not accurately reported. Please see the responses below for context.
Problem 2: When the instance is running, the disk total is still returned as 0, which makes this endpoint unreliable for fetching disk usage metrics.

Note: when this endpoint is used to provide memory usage totals, it is understandable that it returns no data for a stopped instance, as memory would not be in use.

`/1.0/instances?recursion=2` on a running instance

{
    "status": "Running",
    "status_code": 103,
    "disk": {
        "root": {
            "usage": 1183744,
            "total": 0
        }
    },
    "memory": {
        "usage": 1310720,
        "usage_peak": 0,
        "total": 7823340000,
        "swap_usage": 331776,
        "swap_usage_peak": 0
    },
...
}

(Note that the disk 'total' is returned as 0 even though the instance is running.)

`/1.0/instances?recursion=2` on a stopped instance

{
    "status": "Stopped",
    "status_code": 102,
    "disk": {
        "root": {
            "usage": 1182720,
            "total": 0
        }
    },
    "memory": {
        "usage": 0,
        "usage_peak": 0,
        "total": 0,
        "swap_usage": 0,
        "swap_usage_peak": 0
    },
...
}

Note: disk data should still be available here, and perhaps the memory total as well.
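
The two responses above can be compared with a small check that flags every total reported as 0. This is only a sketch: the helper name `find_missing_totals` is hypothetical, and the payloads are trimmed copies of the excerpts in this issue.

```python
def find_missing_totals(state):
    """Return a list of findings for disk and memory totals reported as 0."""
    findings = []
    for device, disk in (state.get("disk") or {}).items():
        if disk.get("total", 0) == 0:
            usage = disk.get("usage", 0)
            findings.append(f"disk '{device}': total is 0 (usage {usage})")
    memory = state.get("memory") or {}
    if memory.get("total", 0) == 0:
        findings.append("memory: total is 0")
    return findings


# Trimmed copies of the two responses shown in this issue.
RUNNING = {
    "status": "Running",
    "disk": {"root": {"usage": 1183744, "total": 0}},
    "memory": {"usage": 1310720, "total": 7823340000},
}
STOPPED = {
    "status": "Stopped",
    "disk": {"root": {"usage": 1182720, "total": 0}},
    "memory": {"usage": 0, "total": 0},
}

for state in (RUNNING, STOPPED):
    print(state["status"], find_missing_totals(state))
```

For the running instance only the disk total is flagged. For the stopped instance both the disk and memory totals are flagged.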

Steps to reproduce

Create an instance in LXD-UI
Attempt to view its disk/memory usage when the instance is running vs when it is stopped.
Review the API responses from the /1.0/instances?recursion=2 endpoint.

Or

Call the API on a running or stopped instance.
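
For the second option, a minimal sketch of calling the API from a script, assuming the `lxc` client is installed and using `lxc query` to fetch the endpoint. The helper names `fetch_instances` and `summarize` are hypothetical. The sketch also assumes that each returned instance carries its state (the `disk` and `memory` objects shown above) under a `state` key; adjust this if the response is shaped differently.

```python
import json
import subprocess


def fetch_instances(runner=subprocess.run):
    """Query the endpoint through the lxc client and return the decoded JSON."""
    result = runner(
        ["lxc", "query", "/1.0/instances?recursion=2"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def summarize(instances):
    """Return (name, status, root disk total, memory total) for each instance."""
    rows = []
    for inst in instances:
        # Assumption: the per-instance state is nested under "state".
        state = inst.get("state") or {}
        root = (state.get("disk") or {}).get("root") or {}
        memory = state.get("memory") or {}
        rows.append(
            (inst.get("name"), inst.get("status"), root.get("total"), memory.get("total"))
        )
    return rows
```

Running `print(summarize(fetch_instances()))` once with the instance started and once with it stopped should be enough to compare the reported totals.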

Information to attach

Any relevant kernel output (dmesg)
Container log (lxc info NAME --show-log)
Container configuration (lxc config show NAME --expanded)
Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
Output of the client with --debug
Output of the daemon with --debug (alternatively output of lxc monitor while reproducing the issue)

tomponline added the Bug Confirmed to be a bug label Oct 15, 2024

tomponline added this to the lxd-6.2 milestone Oct 15, 2024

tomponline assigned hamistao Oct 21, 2024
