feat: Create new array pagination, and apply it to path contents #1009

Merged
merged 9 commits into from
Dec 2, 2024

RulaKhaled
Contributor

Purpose/Motivation

What is the feature? Why is this being done?
Since we cannot paginate the data at the database level, we’ll implement fake pagination. This approach will simulate pagination by chunking the data on the backend and rendering it incrementally on the frontend.
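
A minimal sketch of the chunking idea, assuming the full list is already in memory. `paginate_list` and its parameters are hypothetical names for illustration, not the code in this PR:

```python
from typing import Any, List, Optional, Tuple


def paginate_list(
    data: List[Any], first: int, after: Optional[int] = None
) -> Tuple[List[Any], bool]:
    """Return one page of `data` and whether more items follow.

    `after` is the index of the last item the client already has.
    """
    start = 0 if after is None else after + 1
    end = min(start + first, len(data))
    return data[start:end], end < len(data)


files = ["a.py", "b.py", "c.py", "d.py", "e.py"]
page, has_next = paginate_list(files, first=2)
# page holds the first two entries; has_next is True because three remain
```

The frontend would then request the next chunk by passing the index of the last item it received as `after`.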

More context can be found here https://github.com/codecov/internal-issues/issues/656#issuecomment-2358606170

Follow-up to the https://github.com/codecov/internal-issues/issues/656 investigation

Links to relevant tickets

#833

What does this PR do?

Include a brief description of the changes in this PR. Bullet points are your friend.

  • Create new array pagination
  • Create a new deprecated path contents, which we will remove once the frontend has migrated to the new functionality

Notes to Reviewer

Anything to note to the team? Any tips on how to review, or where to start?

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@RulaKhaled RulaKhaled requested a review from a team as a code owner November 29, 2024 11:51
@@ -68,6 +69,112 @@ def page_info(self, *args, **kwargs):
}


class ArrayPaginator:
    """Cursor-based paginator for in-memory arrays."""
Contributor Author

I was debating between using stringified random cursors and numeric ones for pagination. I decided to go with numeric cursors because they are simple, efficient, and directly map to the array indices (we have cases where customers have over 1,000 files), making pagination faster and easier to debug. Since the dataset is static, there’s no need for the added complexity or security of random strings. If you have other perspectives, lmk
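
To illustrate the numeric-cursor choice, here is a rough sketch in which each cursor is simply the item's index in the full array. `build_edges` and the edge dictionary shape are assumptions for illustration; the PR's actual edge format is not shown in this thread:

```python
from typing import Any, Dict, List


def build_edges(page: List[Any], start_index: int) -> List[Dict[str, Any]]:
    """Attach a numeric cursor (the item's index in the full array) to each node."""
    return [
        {"cursor": str(start_index + offset), "node": node}
        for offset, node in enumerate(page)
    ]


# A page that starts at index 2 of the full array
edges = build_edges(["c.py", "d.py"], start_index=2)
```

Because the cursor is the index itself, resuming after a given edge only needs `int(cursor) + 1` as the next start position.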

@RulaKhaled RulaKhaled changed the title feat: Create new array pagination, and apply this to path contents feat: Create new array pagination, and apply it to path contents Nov 29, 2024
codecov bot commented Nov 29, 2024

Codecov Report

Attention: Patch coverage is 93.44262% with 8 lines in your changes missing coverage. Please review.

Project coverage is 96.03%. Comparing base (022c44b) to head (02a5361).
Report is 2 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines              Patch %   Lines
graphql_api/types/commit/commit.py    80.48%    8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1009      +/-   ##
==========================================
- Coverage   96.06%   96.03%   -0.03%     
==========================================
  Files         828      828              
  Lines       19275    19412     +137     
==========================================
+ Hits        18516    18643     +127     
- Misses        759      769      +10     
Flag                   Coverage Δ
unit                   92.34% <93.44%> (+<0.01%) ⬆️
unit-latest-uploader   92.34% <93.44%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

codecov-staging bot commented Nov 29, 2024

Codecov Report

Attention: Patch coverage is 93.44262% with 8 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines              Patch %   Lines
graphql_api/types/commit/commit.py    80.48%    8 Missing ⚠️

📢 Thoughts on this report? Let us know!

codecov-qa bot commented Nov 29, 2024

❌ 3 Tests Failed:

Tests completed   Failed   Passed   Skipped
2700              3        2697     6
View the top 3 failed tests by shortest run time
graphql_api/tests/test_account.py::AccountTestCase::test_fetch_organizations_desc
Stack Traces | 0.436s run time
self = <graphql_api.tests.test_account.AccountTestCase testMethod=test_fetch_organizations_desc>

    def test_fetch_organizations_desc(self) -> None:
        account = AccountFactory(name="account")
        owner = OwnerFactory(
            username="owner-0",
            plan_activated_users=[],
            account=account,
        )
        OwnerFactory(
            username="owner-1",
            plan_activated_users=[0],
            account=account,
        )
        OwnerFactory(
            username="owner-2",
            plan_activated_users=[0, 1],
            account=account,
        )
    
        query = """
            query {
                owner(username: "%s") {
                    account {
                        organizations(first: 20, orderingDirection: DESC) {
                            edges {
                                node {
                                    username
                                    activatedUserCount
                                }
                            }
                        }
                    }
                }
            }
        """ % (owner.username)
    
        result = self.gql_request(query, owner=owner)
    
        assert "errors" not in result
    
        orgs = [
            node["node"]["username"]
>           for node in result["owner"]["account"]["organizations"]["edges"]
        ]
E       TypeError: 'NoneType' object is not subscriptable

graphql_api/tests/test_account.py:171: TypeError
graphql_api/tests/test_account.py::AccountTestCase::test_fetch_organizations
Stack Traces | 0.446s run time
self = <graphql_api.tests.test_account.AccountTestCase testMethod=test_fetch_organizations>

    def test_fetch_organizations(self) -> None:
        account = AccountFactory(name="account")
        owner = OwnerFactory(
            username="owner-0",
            plan_activated_users=[],
            account=account,
        )
        OwnerFactory(
            username="owner-1",
            plan_activated_users=[0],
            account=account,
        )
        OwnerFactory(
            username="owner-2",
            plan_activated_users=[0, 1],
            account=account,
        )
    
        query = """
            query {
                owner(username: "%s") {
                    account {
                        organizations(first: 20) {
                            edges {
                                node {
                                    username
                                    activatedUserCount
                                }
                            }
                        }
                    }
                }
            }
        """ % (owner.username)
    
        result = self.gql_request(query, owner=owner)
    
        assert "errors" not in result
    
        orgs = [
            node["node"]["username"]
>           for node in result["owner"]["account"]["organizations"]["edges"]
        ]
E       TypeError: 'NoneType' object is not subscriptable

graphql_api/tests/test_account.py:125: TypeError
graphql_api/tests/test_account.py::AccountTestCase::test_fetch_organizations_pagination
Stack Traces | 0.452s run time
self = <graphql_api.tests.test_account.AccountTestCase testMethod=test_fetch_organizations_pagination>

    def test_fetch_organizations_pagination(self) -> None:
        account = AccountFactory(name="account")
        owner = OwnerFactory(
            username="owner-0",
            plan_activated_users=[],
            account=account,
        )
        OwnerFactory(
            username="owner-1",
            plan_activated_users=[0],
            account=account,
        )
        OwnerFactory(
            username="owner-2",
            plan_activated_users=[0, 1],
            account=account,
        )
    
        query = """
            query {
                owner(username: "%s") {
                    account {
                        organizations(first: 2) {
                            edges {
                                node {
                                    username
                                    activatedUserCount
                                }
                            }
                            totalCount
                            pageInfo {
                                hasNextPage
                            }
                        }
                    }
                }
            }
        """ % (owner.username)
    
        result = self.gql_request(query, owner=owner)
    
        assert "errors" not in result
    
>       totalCount = result["owner"]["account"]["organizations"]["totalCount"]
E       TypeError: 'NoneType' object is not subscriptable

graphql_api/tests/test_account.py:219: TypeError

To view more test analytics, go to the Test Analytics Dashboard
📢 Thoughts on this report? Let us know!

Contributor

github-actions bot commented Nov 29, 2024

✅ All tests successful. No failed tests were found.

📣 Thoughts on this report? Let Codecov know! | Powered by Codecov

self.end_index = len(data)

if first and last:
    raise ValueError("Cannot provide both 'first' and 'last'")
Contributor

@JerrySentry JerrySentry Nov 29, 2024

Use ValidationError instead so it can be handled cleanly in the response. I have a feeling that, worst case, this will be an internal server error and, best case, something ugly in the response for the user.
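
A sketch of what that suggestion could look like. The `ValidationError` class below is a stand-in defined only for this example (the project's real error type and how the API layer renders it are assumptions), and `check_first_last` is a hypothetical helper:

```python
class ValidationError(Exception):
    """Stand-in for the project's validation error type (assumption)."""


def check_first_last(first=None, last=None):
    """Reject requests that ask for both a leading and a trailing page."""
    # Mirrors the truthiness check in the diff above
    if first and last:
        raise ValidationError("Cannot provide both 'first' and 'last'")
```

Raising a dedicated error type gives the API layer something specific to catch and turn into a readable message, rather than letting a bare `ValueError` bubble up.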

self.start_index = int(after) + 1

if before is not None:
    self.end_index = min(self.end_index, int(before))
Contributor

Let's also validate that before and after can be cast to int; we can wrap this in try/except or use isdecimal() or something.
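
One way the try/except variant might look. `parse_cursor` is a hypothetical helper and `ValidationError` is again a local stand-in for whatever error type the project uses; the message format follows the one quoted later in this review:

```python
from typing import Optional


class ValidationError(Exception):
    """Stand-in for the project's validation error type (assumption)."""


def parse_cursor(value: Optional[str], name: str) -> Optional[int]:
    """Convert a cursor to int, or raise a validation error naming the argument."""
    if value is None:
        return None
    try:
        return int(value)
    except (TypeError, ValueError):
        raise ValidationError(f"'{name}' cursor must be an integer") from None
```

Compared with `isdecimal()`, the try/except form also accepts values that are already integers and negative numbers, which the bounds check can then clamp.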


# Ensure bounds remain valid
self.start_index = max(self.start_index, 0)
self.end_index = min(self.end_index, len(data))
Contributor

This final safeguard is great!
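
For reference, a small standalone sketch of why the clamp helps. `clamp_bounds` is a hypothetical function that restates the two lines above outside the class:

```python
from typing import Tuple


def clamp_bounds(start_index: int, end_index: int, length: int) -> Tuple[int, int]:
    """Keep the slice bounds inside [0, length]."""
    start_index = max(start_index, 0)
    end_index = min(end_index, length)
    return start_index, end_index


data = ["a", "b", "c", "d", "e"]
# Out-of-range cursors collapse to the valid range instead of wrapping around
start, end = clamp_bounds(-3, 99, len(data))
page = data[start:end]
```

Without the lower clamp, a negative start index would be interpreted by Python slicing as an offset from the end of the list.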

class ArrayConnection:
"""Connection wrapper for array pagination."""

def __init__(self, data: List[Any], paginator: ArrayPaginator, page: List[Any]):
Contributor

You don't actually need data and page right?
Total count will be len(self.paginator.data) and references of self.page will be self.paginator.page.
IMO adding these two params to the ArrayConnection class just adds more confusion
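
A rough sketch of that simplification, under the assumption that the paginator exposes `data` and `page` attributes. The `ArrayPaginator` here is a trimmed-down stand-in so the example runs on its own, and the property names are illustrative:

```python
from typing import Any, List


class ArrayPaginator:
    """Trimmed-down stand-in: holds the full data and the current page."""

    def __init__(self, data: List[Any], first: int = 2):
        self.data = data
        self.page = data[:first]


class ArrayConnection:
    """Connection wrapper that reads everything from the paginator."""

    def __init__(self, paginator: ArrayPaginator):
        self.paginator = paginator

    @property
    def total_count(self) -> int:
        return len(self.paginator.data)

    @property
    def page(self) -> List[Any]:
        return self.paginator.page


connection = ArrayConnection(ArrayPaginator(["a.py", "b.py", "c.py"]))
```

With a single constructor argument there is only one source of truth for the data and the current page.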

Contributor Author

You're right, I was trying to decouple the two to some extent, but it's not necessary at this point, as I doubt we'll use it anywhere besides the query_to_connection function

Contributor

@JerrySentry JerrySentry left a comment

Looks good!
Codecov is just asking for a test case for raise ValidationError("'after' cursor must be an integer")

@RulaKhaled RulaKhaled enabled auto-merge December 2, 2024 15:34
@RulaKhaled RulaKhaled added this pull request to the merge queue Dec 2, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 2, 2024
@RulaKhaled RulaKhaled added this pull request to the merge queue Dec 2, 2024
Merged via the queue into main with commit 98032a4 Dec 2, 2024
16 of 19 checks passed
@RulaKhaled RulaKhaled deleted the array-pagination branch December 2, 2024 19:01