if one test fails everything after fails #99

mgheorghe · 2023-05-02T19:11:03Z

run create/remove only both will pass
run create/get/remove since get fails remove will fail also

run #94 all at once notice that half way through they will start failing
run #94 one file at a time and notice all will pass

observed same behavior with real switch and in the CI pipeline with saivs

from pprint import pprint

import pytest


class TestSaiVlan:
    # object with no parents

    def test_vlan_create(self, npu):
        commands = [
            {
                'name': 'vlan_1',
                'op': 'create',
                'type': 'SAI_OBJECT_TYPE_VLAN',
                'attributes': ['SAI_VLAN_ATTR_VLAN_ID', '10'],
            }
        ]

        results = [*npu.process_commands(commands)]
        print('======= SAI commands RETURN values create =======')
        pprint(results)
        assert all(results), 'Create error'

    def test_sai_vlan_attr_vlan_id_get(self, npu):
        commands = [
            {
                'name': 'sai_vlan_attr_vlan_id_get',
                'vlan_oid': '$vlan_1',
                'op': 'get',
                'type': 'SAI_OBJECT_TYPE_VLAN',
                'attributes': ['SAI_VLAN_ATTR_VLAN_ID', '111'],
            }
        ]
        results = [*npu.process_commands(commands)]
        print('======= SAI commands RETURN values get =======')
        pprint(results)
        assert all([result[1].value() == '10' for result in results]), 'Get error'

    def test_vlan_remove(self, npu):
        commands = [
            {
                'name': 'vlan_1',
                'op': 'remove',
                'type': 'SAI_OBJECT_TYPE_VLAN',
                'attributes': ['SAI_VLAN_ATTR_VLAN_ID', '10'],
            }
        ]

        results = [*npu.process_commands(commands)]
        print('======= SAI commands RETURN values remove =======')
        pprint(results)
        assert all(
            [result == 'SAI_STATUS_SUCCESS' for result in results]
        ), 'Remove error'

The text was updated successfully, but these errors were encountered:

andriy-kokhan · 2023-05-03T07:01:55Z

I'm assuming you are running this test with sairedis RPC. If so, syncd application is used as a SAI driver. Often, in case of some issue with SAI call, syncd application simply exits which causes subsequent TCs to fail.

If you check the logs (/var/log/syslog), at the very end, you will find something like this:

May  3 06:39:21 234e2ae01e6b syncd: :- sai_deserialize_object_id: invalid oid SAI_OBJECT_TYPE_VLAN
May  3 06:39:21 234e2ae01e6b syncd: :- run: Runtime error: :- sai_deserialize_object_id: invalid oid SAI_OBJECT_TYPE_VLAN

There are two issues with get TCs:

get command should use a key from create command as a name
result[1].value() should be replaced with result.value()

    def test_sai_vlan_attr_vlan_id_get(self, npu):
        commands = [
            {
                'name': 'vlan_1',
                'op': 'get',
                'attributes': ['SAI_VLAN_ATTR_VLAN_ID', ''],
            }
        ]
        results = [*npu.process_commands(commands)]
        print('======= SAI commands RETURN values get =======')
        pprint(results)
        assert all([result.value() == '10' for result in results]), 'Get error'

With these changes I was able to pass all 3 TCs.

# pytest --testbed=saivs_standalone -v -k "TestSaiVlan" 

collected 575 items / 572 deselected / 3 selected       

test_vlan_dd.py::TestSaiVlan::test_vlan_create PASSED                           [ 33%]
test_vlan_dd.py::TestSaiVlan::test_sai_vlan_attr_vlan_id_get PASSED             [ 66%]
test_vlan_dd.py::TestSaiVlan::test_vlan_remove PASSED                           [100%]

Obviously, SAI-C checks should be extended to catch misconfiguration like this. Will try to address this in subsequent PRs.

andriy-kokhan · 2023-05-03T14:00:52Z

As per the latest updates, the TC should be changed as follows:

    def test_sai_vlan_attr_vlan_id_get(self, npu):
        commands = [
            {
                'name': 'vlan_1',
                'op': 'get',
                'attributes': ['SAI_VLAN_ATTR_VLAN_ID'],
            }
        ]
        results = [*npu.process_commands(commands)]
        print('======= SAI commands RETURN values get =======')
        pprint(results)
        assert all([result[0].value() == '10' for result in results]), 'Get error'

mgheorghe · 2023-05-03T15:06:36Z

this issue is not about the get test case, it is to show that once one test fails everything after it fails.
the grpc explanation makes sense, if that is the root cause, can we reconnect in case we get disconnected ?
something like before each sai call check if connection if still active and if not reconnect

andriy-kokhan · 2023-05-03T15:10:37Z

It's not just about the re-connection. In fact, syncd app restarts. So, the next TC should start from switch init phase again. The approach how you can deal with it is already proposed in some TCs. E.g.,

SAI-Challenger/tests/test_l2_basic.py

Line 14 in ba5a4b1

@pytest.fixture(autouse=True)

mgheorghe · 2023-05-03T19:00:27Z

i'll have to look if the autouse will solve this:
probably too late but i finally i generated 2 examples please see will fail mgheorghe#2 will pass mgheorghe#3 using saivs

andriy-kokhan · 2023-05-03T19:41:00Z

In your example the get operation cause the failure because of redundant parameters in the command.
Should be reduced to

        commands = [
            {
                'name': 'vlan_1',
                'op': 'get',
                'attributes': ['SAI_VLAN_ATTR_VLAN_ID'],
            }
        ]

That's why I'm saying that SAI-C checks should be extended to catch misconfiguration like this. The test should not fail because of the extra parameters even they are wrong.

mgheorghe · 2023-05-04T04:36:41Z

i'm not concerned about the get test, my concern is with the remove failing when it should not.
while i see the example with the fixture, i'll give it a try for now but i still believe this should be handled at the framework level and user should not have to call npu.reset after every failure.

for now let me give the fixture a try, see how it goes and i'll come back with more feedback

andriy-kokhan · 2023-05-05T11:56:01Z

That's the way how syncd works. So, in case of Redis RPC, we simply can not handle all possible scenarios in SAI-C itself.
Adding a few more check to avoid possible misconfigurations - #105

mgheorghe · 2023-05-05T16:49:28Z

when running saivs i noticed sometimes after a test fail dyncD gets into a bad state and anything after that fails and i need to restart the containers.
ERROR api/test_vlan.py::TestSaiVlan::test_vlan_create - AssertionError: SyncD has not started yet...

what will be the best place to raise a syncd issue because this may not be a sai-challenger issue and looks more like syncd reliability issue.

andriy-kokhan · 2023-06-22T09:58:25Z

Please check #154 . It should address this issue.

andriy-kokhan · 2023-06-29T08:33:55Z

Closing as per comment above.

andriy-kokhan mentioned this issue May 3, 2023

Updated process_command() to be able to get multiple attrs #104

Merged

mgheorghe mentioned this issue May 5, 2023

redis.exceptions.ConnectionError: Error 99 connecting to localhost:6379. Cannot assign requested address. #107

Closed

andriy-kokhan mentioned this issue Jun 22, 2023

Updated "makereport" fixture description #154

Merged

andriy-kokhan closed this as completed Jun 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

if one test fails everything after fails #99

if one test fails everything after fails #99

mgheorghe commented May 2, 2023

andriy-kokhan commented May 3, 2023 •

edited

Loading

andriy-kokhan commented May 3, 2023

mgheorghe commented May 3, 2023

andriy-kokhan commented May 3, 2023

mgheorghe commented May 3, 2023

andriy-kokhan commented May 3, 2023

mgheorghe commented May 4, 2023 •

edited

Loading

andriy-kokhan commented May 5, 2023

mgheorghe commented May 5, 2023

andriy-kokhan commented Jun 22, 2023

andriy-kokhan commented Jun 29, 2023

if one test fails everything after fails #99

if one test fails everything after fails #99

Comments

mgheorghe commented May 2, 2023

andriy-kokhan commented May 3, 2023 • edited Loading

andriy-kokhan commented May 3, 2023

mgheorghe commented May 3, 2023

andriy-kokhan commented May 3, 2023

mgheorghe commented May 3, 2023

andriy-kokhan commented May 3, 2023

mgheorghe commented May 4, 2023 • edited Loading

andriy-kokhan commented May 5, 2023

mgheorghe commented May 5, 2023

andriy-kokhan commented Jun 22, 2023

andriy-kokhan commented Jun 29, 2023

andriy-kokhan commented May 3, 2023 •

edited

Loading

mgheorghe commented May 4, 2023 •

edited

Loading