Idea: New way to organize many tests #1353
I like this idea a lot. We'll be able to document this dataset as well, and theoretically it could be a useful research asset in its own right. 👍 💪
Great. I hacked out an early version to help me find seeds for #1288 by watching all the test invocations and dumping out entries like the following:

---
coplayer:
  init_kwargs: {}
  name: Cooperator
expected_outcome:
  coplayer_actions: CCCCCCCCCCCCCC
  player_actions: CCCCCCDDDDDDDD
  player_attributes: null
match_parameters:
  game: null
  noise: null
  prob_end: null
  seed: null
  turns: null
player:
  init_kwargs:
    initial_plays: null
  name: Adaptive
---
coplayer:
  init_kwargs: {}
  name: Defector
expected_outcome:
  coplayer_actions: DDDDDDDDDDDDDD
  player_actions: CCCCCCDDDDDDDD
  player_attributes: null
match_parameters:
  game: null
  noise: null
  prob_end: null
  seed: null
  turns: null
player:
  init_kwargs:
    initial_plays: null
  name: Adaptive

From there I was able to cook up a script to run these matches and look for new seeds, or potentially new opponents. It's rough, but here's how it works:

import axelrod
from axelrod import load_matches


def verify_match_outcomes(match, expected_actions1, expected_actions2, attrs):
    # Check that the sequence of plays from the match is as expected.
    player1, player2 = match.players
    for play, expected_play in zip(player1.history, expected_actions1):
        if play != expected_play:
            return False
    for play, expected_play in zip(player2.history, expected_actions2):
        if play != expected_play:
            return False
    # Check that the final player attributes are as expected.
    if attrs:
        for attr, value in attrs.items():
            if getattr(player1, attr) != value:
                return False
    return True


def run_matches():
    match_configs = list(load_matches())
    for match_config in match_configs:
        try:
            match = match_config()
        except AttributeError:
            continue
        player, coplayer = match.players
        # Skip interactive players.
        if isinstance(player, axelrod.Human) or isinstance(coplayer, axelrod.Human):
            continue
        print(match_config)
        seed = match_config.match_parameters.seed
        attrs = match_config.expected_outcome.player_attributes
        player_actions = match_config.expected_outcome.player_actions
        coplayer_actions = match_config.expected_outcome.coplayer_actions
        if seed is None:
            # Deterministic case: just play and verify.
            match.play()
            print(verify_match_outcomes(match, player_actions, coplayer_actions, attrs))
        else:
            # Stochastic case: search for a seed reproducing the expected outcome.
            for seed in range(1, 200000):
                match.set_seed(seed)
                match.play()
                if verify_match_outcomes(match, player_actions, coplayer_actions, attrs):
                    print("Seed found:", seed)
                    break
        print()


if __name__ == "__main__":
    run_matches()
I'd like to do the same thing for full tournaments and Moran processes.
Sounds good to me.
In the course of finding new seeds for many tests for #1288, it occurs to me that we can probably organize these tests in a more useful way. There are many tests in the library like the following:
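(A minimal sketch of the pattern, assuming the library's usual TestPlayer / versus_test interface; the strategy, opponent, and expected actions are taken from the Adaptive-vs-Cooperator YAML entry above rather than from an actual test in the suite.)

import axelrod as axl
from axelrod.tests.strategies.test_player import TestPlayer

C, D = axl.Action.C, axl.Action.D


class TestAdaptive(TestPlayer):
    name = "Adaptive"
    player = axl.Adaptive

    def test_strategy(self):
        # Expected joint (player, opponent) history against a fixed opponent.
        actions = [(C, C)] * 6 + [(D, C)] * 8
        self.versus_test(axl.Cooperator(), expected_actions=actions)
        # Stochastic cases additionally pin a seed, e.g.
        # self.versus_test(axl.Random(), expected_actions=..., seed=...)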
where the player is given by self.Player() from TestPlayer. Sometimes these tests use MockPlayer as an opponent rather than an actual opponent. Searching for a new seed involves manually extracting the info (opponent, histories, etc.) and looping over seeds until a new one producing the same behavior is found.

I think a better approach would be to have a large file of expected matches, encoding the range of expected behaviors of every strategy, with rows of the form
Player1Class, Player2Class, expected_history_1, expected_history_2, seed, other params (like noise), ...
e.g.
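(An illustrative row only, using the Adaptive-vs-Cooperator entry from the YAML above; the exact column layout is just a sketch.)

Adaptive, Cooperator, CCCCCCDDDDDDDD, CCCCCCCCCCCCCC, seed=None, noise=None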
Essentially it's a dataframe of tests. We could include a description of the test and other metadata. Maybe some other format would be better but hopefully you get the idea.
Such a structure has a few benefits:
The file can encode all expected behaviors of a strategy, by the histories it should at some point yield, in a simple way rather than being scattered across various tests as they currently are, reducing a lot of redundant code (regardless of whether a seed is required).
It's easier to find new seeds when we need them. An auxiliary script can easily scan for new seeds if we change something about how seeding works, or how a strategy works, etc. Right now there's no easy way to extract all the expected tests to systematically find new seeds because the necessary data is hard-coded into functions, and often there is more than one "row" per test function.
Similarly, when adding a new strategy, generic search functions can look for an opponent, a seed, etc. that generates a specific sequence of outcomes (see the sketch after this list). I think we're all currently doing these as one-offs, with MockPlayers, etc.
The associated tests will be more single-purpose, rather than like some of the test_strategy functions we have that test several different things at once. This will surface all the failures at once, rather than one at a time as each subtest of a compound test fails.

The collection of expected matches might itself be useful somehow.
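A generic version of those one-off seed and opponent hunts could be a small helper; a rough sketch only, reusing the Match.set_seed call from the script above (find_seed and its parameters are hypothetical names):

import axelrod as axl


def find_seed(player, opponent, target_actions, turns, max_seed=200000):
    # Scan seeds until a match reproduces the target joint action sequence.
    # `target_actions` is a list of (player_action, opponent_action) pairs.
    for seed in range(1, max_seed):
        match = axl.Match((player.clone(), opponent.clone()), turns=turns)
        match.set_seed(seed)
        if match.play() == target_actions:
            return seed
    return None

A similar loop over a list of candidate strategies, instead of seeds, would do the opponent search.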
Similarly, we have a lot of example tournaments and Moran processes with expected outputs that are seed dependent; perhaps they could be encoded in a similar manner. Not every test can be written this way, but I would guess that the majority could be. This bumps up against #421: having a way to configure a tournament in a code-free way.
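To sketch what such an entry might look like (the field names below are illustrative guesses mirroring the match entries above, not an existing format):

---
tournament:
  players: [Cooperator, Defector, Tit For Tat]
  turns: 200
  repetitions: 10
  noise: null
  seed: null
expected_outcome:
  # e.g. the ranking expected for that seed, filled in by a search script
  ranked_names: null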
Thoughts?