
Conversation


@tturocy tturocy commented Dec 23, 2025

Trying to understand the output from the recently failing Nash equilibrium tests just about did my head in, so I decided it was high time we took a serious look at our tests. I have pulled together a number of notes and thoughts I had been accumulating (alongside reading up on some pytest features).

This draft pull request is a proposal for discussion.

In short - I have rewritten what used to be the test for enummixed_solve with rational probabilities into a generic Nash equilibrium solver tester (!!!)

Highlights:

  • Test cases are represented as a dataclass
  • Games are constructed using a factory function (this addresses a problem we noted previously: if game construction fails at collection time, it takes down the whole test run)
  • The solver is likewise specified as a callable
  • Tests are given an identifier of our choosing. I have been lazy and just called them test1, test2, etc. for now; they will need better names in production, of course.
  • Tolerances for regrets and probabilities are configurable on a per-test-case basis. No global tolerance is hard-coded, so we can handle both float and rational solvers with the same test function.
  • The tests now use pytest-subtests (which has been added as a dependency in the test requirements.txt). This isolates the individual checks within a test, so even if one fails the others still run - e.g. if max_regret is somehow correct but the probabilities change for one profile, all profiles are still checked.
  • I am experimenting with Q() as a shorthand for rationals, just to make our intent precise, and a helper d() which visually represents a probability distribution - purely for visual layout, so we are not lost in a maze of square brackets, all alike. A sketch pulling these pieces together follows below.
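
To make the discussion concrete, here is a minimal sketch of the shape I have in mind - it is not the code in this PR. The names (NashTestCase, Q, d, matching_pennies, CASES, test_nash_solver) are placeholders of my choosing, and the pygambit calls are assumed to follow the current API (Game.from_arrays, nash.enummixed_solve returning a result with .equilibria, MixedStrategyProfile.max_regret, indexing a profile by a Strategy):

```python
# A minimal sketch, not the actual code in this PR.  Names like NashTestCase,
# Q, d, matching_pennies and CASES are illustrative; the pygambit calls are
# assumed to behave as in recent pygambit releases.
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Sequence

import pytest
import pygambit as gbt


def Q(num, den=1):
    """Shorthand for an exact rational, so intent is visible in the test data."""
    return Fraction(num, den)


def d(*probs):
    """Visual helper: one player's probability distribution laid out on one line."""
    return list(probs)


@dataclass
class NashTestCase:
    """One solver test: how to build the game, how to solve it, what to expect."""
    game_factory: Callable[[], gbt.Game]   # deferred, so a construction failure
                                           # fails the test, not the collection step
    solver: Callable[[gbt.Game], list]     # returns a list of mixed strategy profiles
    expected: Sequence[Sequence[list]]     # one d() per player, per equilibrium
    prob_tol: float = 0                    # 0 => exact comparison (rational solvers)
    regret_tol: float = 0


def matching_pennies() -> gbt.Game:
    return gbt.Game.from_arrays([[1, -1], [-1, 1]], [[-1, 1], [1, -1]])


CASES = {
    "test1": NashTestCase(
        game_factory=matching_pennies,
        solver=lambda g: gbt.nash.enummixed_solve(g, rational=True).equilibria,
        expected=[
            [d(Q(1, 2), Q(1, 2)),    # player 1 mixes 50:50
             d(Q(1, 2), Q(1, 2))],   # player 2 mixes 50:50
        ],
    ),
}


@pytest.mark.parametrize("case", list(CASES.values()), ids=list(CASES.keys()))
def test_nash_solver(case: NashTestCase, subtests):
    game = case.game_factory()
    equilibria = case.solver(game)
    with subtests.test(msg="number of equilibria"):
        assert len(equilibria) == len(case.expected)
    # Each check is its own subtest, so one failure does not mask the others.
    for i, (profile, expected) in enumerate(zip(equilibria, case.expected)):
        with subtests.test(msg=f"max regret, equilibrium {i}"):
            assert profile.max_regret() <= case.regret_tol
        with subtests.test(msg=f"probabilities, equilibrium {i}"):
            for player, dist in zip(game.players, expected):
                for strategy, prob in zip(player.strategies, dist):
                    assert abs(profile[strategy] - prob) <= case.prob_tol
```

The `subtests` fixture comes from pytest-subtests: each `with subtests.test(...)` block is reported separately, so a wrong probability in one profile does not stop the regret and probability checks on the remaining profiles.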

This is an all-hands request for everyone to have a look - not least because in the new year I will be asking everyone to kick in on rationalising our test suite, using either this technique or some other one. It is clear our test suite is reaching the edge of maintainability without investments like the ones suggested here!

