Test suite v0.4 #637

teocns · 2025-01-10T21:22:57Z

Overview

This PR reorganizes the test infrastructure by implementing a comprehensive testing strategy that separates unit and integration tests, with a focus on provider integrations and API functionality testing.

Infrastructure Changes

Split tests into dedicated unit/ and integration/ directories
Configure separate CI jobs for unit and integration tests
Add VCR.py for HTTP interaction recording/replay
Centralize test fixtures in conftest.py
Add development tools (pytest-sugar, pdbpp)
Update pytest configuration for optimal test isolation
Set up proper test timeouts (5 minutes)
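
As a rough illustration of the VCR.py setup listed above, a configuration could look like the sketch below. The helper name `build_vcr_config`, the header names, and the ignored hosts are assumptions made for illustration and are not taken from this PR; the option keys (`filter_headers`, `ignore_hosts`, `record_mode`, `match_on`) are standard VCR.py settings.

```python
from typing import Any


def build_vcr_config() -> dict[str, Any]:
    """Return keyword options for VCR.py (all values are hypothetical)."""
    return {
        # Replace secrets before they are written to a cassette.
        "filter_headers": [
            ("authorization", "REDACTED"),
            ("x-api-key", "REDACTED"),
        ],
        # Never record or replay traffic to these hosts (assumed list).
        "ignore_hosts": ["localhost", "127.0.0.1"],
        # Record only when no cassette file exists yet; otherwise replay.
        "record_mode": "once",
        # Request attributes used to match a live request to a recording.
        "match_on": ["method", "scheme", "host", "port", "path", "query"],
    }
```

In a conftest.py, a dictionary like this would typically be returned from a `vcr_config` fixture; the commits in this PR indicate that fixture ended up at session scope.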

Test Fixtures

Implement JWT token management
Add provider response spying capabilities
Centralize mock_req fixture
Add session management utilities
Add package availability control
Set up VCR ignore hosts and options
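
One possible way to implement the package availability control mentioned above is to hide a module from the import system for the duration of a test. This is a minimal sketch using only the standard library; the names `package_unavailable` and `is_importable` are hypothetical and may not match the fixture actually added in this PR.

```python
import importlib
import sys
from contextlib import contextmanager
from typing import Iterator
from unittest import mock


@contextmanager
def package_unavailable(name: str) -> Iterator[None]:
    """Make importing `name` fail inside the block, then restore it."""
    # A None entry in sys.modules makes the import system raise ImportError.
    # patch.dict puts the original sys.modules contents back on exit.
    with mock.patch.dict(sys.modules, {name: None}):
        yield


def is_importable(name: str) -> bool:
    """Report whether `name` can currently be imported."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False
```

A test could wrap the code under test in `with package_unavailable("some_provider"):` to exercise the path taken when an optional provider SDK is not installed.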

Core Functionality Tests

Improve session handling and teardown
Enhance async loop lifecycle management
Add concurrent API request handling tests
Implement proper singleton cleanup
Move telemetry tests to unit test directory
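
The singleton cleanup item above can be pictured with the following sketch. `SingletonMeta`, `clear_singletons`, and `Client` are hypothetical stand-ins, not the classes used in the codebase; the point is only that cached instances must be dropped between tests so state does not leak.

```python
from typing import Any


class SingletonMeta(type):
    """Hypothetical metaclass that caches one instance per class."""

    _instances: dict[type, Any] = {}

    def __call__(cls, *args: Any, **kwargs: Any) -> Any:
        # Build the instance on first use, then always return the cached one.
        if cls not in SingletonMeta._instances:
            SingletonMeta._instances[cls] = super().__call__(*args, **kwargs)
        return SingletonMeta._instances[cls]


def clear_singletons() -> None:
    """Drop every cached instance so the next test starts fresh."""
    SingletonMeta._instances.clear()


class Client(metaclass=SingletonMeta):
    """Stand-in for a process-wide client object."""

    def __init__(self) -> None:
        self.sessions: list[str] = []
```

In a conftest.py, a helper like `clear_singletons()` would typically be called from the teardown half of an autouse fixture, after any open sessions have been ended.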

Provider Integration Tests

OpenAI

Basic sync/async completions
Streaming responses
Assistants API integration

Mistral

Sync completions
Async completions
Streaming responses

Cohere

Chat completions (sync)
Chat completions (async)
Stream handling
Instrumentation control

AI21

Sync completions
Async completions
Stream management

Groq

Basic completions
Session management
Stream handling

LlamaStack

Agent configuration
Shield management
Model selection

CrewAI Integration Tests

Basic Setup
- Test initialization
- Test session creation
- Test auto-end behavior
Session Management
- Test crew lifecycle
- Test task completion tracking
- Test manual session control
Agent Monitoring
- Test single agent tracking
- Test multi-agent tracking
- Test task delegation tracking
Tool Integration
- Test built-in tools
- Test custom tools
- Test tool error handling
Example Workflows
- Test job posting flow
- Test markdown validator
- Test Instagram post creation

API Server Tests

Session lifecycle in API context
Tool recording
Event validation
Multi-session handling

Documentation

Update CONTRIBUTING.md with new test structure
Document VCR usage and best practices
Add provider test documentation

Integration

Adapt and integrate tests from PR #603 (Add LLM Integration Tests)
Update LangChain handler tests for current API
Implement remaining provider integration tests

CrewAI Integration Tests

Basic Integration Tests

Test initialization with CrewAI
- Verify AgentOps initialization before Crew constructor
- Test auto_start_session behavior
- Test skip_auto_end_session parameter
- Verify proper session creation

Session Management Tests

Test session lifecycle with CrewAI
- Verify session starts with Crew initialization
- Test automatic session ending when tasks complete
- Verify session state after crew.kickoff()
- Test manual session ending

Event Recording Tests

Test LLM event recording
- Verify LLM calls are properly tracked
- Test streaming responses tracking
- Verify async LLM calls tracking
- Test multiple LLM calls within one session

Multi-Agent Tests

Test multi-agent scenarios
- Verify each agent's actions are tracked
- Test inter-agent communication tracking
- Verify task delegation and completion tracking
- Test parallel agent execution monitoring

Tool Usage Tests

Test tool integration
- Verify custom tool execution tracking
- Test built-in tool usage monitoring
- Verify tool error handling
- Test tool result recording

Example Workflow Tests

Test job posting workflow
- Verify researcher agent tracking
- Test writer agent monitoring
- Verify review agent tracking
- Test complete workflow execution

Error Handling Tests

Test error scenarios
- Verify failed task tracking
- Test exception handling
- Verify session state after errors
- Test recovery mechanisms

Integration with Other Tools

Test CrewAI with other integrations
- Test OpenAI provider integration
- Verify LangChain compatibility
- Test custom tool implementations
- Verify third-party tool usage

Special Features Tests

Test CrewAI-specific features
- Verify task dependency tracking
- Test sequential vs parallel execution
- Verify agent role assignments
- Test custom agent configurations

Known Issues & Mitigations

Tests time out when integration and unit tests run together
- Solution: Separated test runs in CI
Complex patching layers management
- Solution: Centralized mock configuration
VCR initialization conflicts
- Solution: Scoped VCR config to session level

codecov · 2025-01-10T21:30:41Z

Codecov Report

Attention: Patch coverage is 2.43902% with 40 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
agentops/llms/providers/openai.py	2.43%	40 Missing ⚠️

teocns · 2025-01-11T01:45:46Z

Some funny behavior with what I believe is the async loop lifecycle management. Tests time out, or maybe one particular test is making the whole suite time out. Running each test individually works so far. I tried different combinations of async event loop configuration without luck.

teocns · 2025-01-11T03:12:34Z

I found out the issue is hit only when vcr is initialized (i.e., imported).

Tests keep going until a vcr replay kicks in.

I think this has to do with async httpx.

the-praxs · 2025-01-14T11:21:53Z

So far we have the integration tests for all providers except Llama Stack and the partners (crewAI, Autogen, TaskWeaver, LangChain).

I have Llama Stack configured with Fireworks to make it work on my local machine, but that's unreliable. I need to get it working to produce the cassette recordings.

For the partner integration tests, I am working on the comprehensive suite mentioned above.

teocns force-pushed the feat/optimal-test-suite branch 3 times, most recently from a086fa0 to 9044b65 Compare January 10, 2025 21:29

teocns force-pushed the feat/optimal-test-suite branch 2 times, most recently from 4e51bb1 to 0650e92 Compare January 12, 2025 00:23

teocns added 18 commits January 12, 2025 03:12

test: add __init__ to make tests/ a package

051a85b

test: add llm_event_spy fixture for tests

2d97251

test: add VCR.py fixture for HTTP interaction recording

4a19dab

deps: group integration-testing

51e2da2

test: add fixture to mock package availability in tests

0180e2c

test: Add integration tests for OpenAI provider and features

73a0110

test: add tests for concurrent API requests handling

538bf98

Improve vcr.py configuration

d679b93

Signed-off-by: Teo <[email protected]>

ruff

93744ce

Signed-off-by: Teo <[email protected]>

chore(pyproject): update pytest options and loop scope

512c95d

chore(tests): update vcr.py ignore_hosts and options

e29d2b2

pyproject.toml

8f02961

Signed-off-by: Teo <[email protected]>

centralize teardown in conftest.py (clear singletons, end all sessions)

a012a0f

Signed-off-by: Teo <[email protected]>

change vcr_config scope to session

f51850e

Signed-off-by: Teo <[email protected]>

integration: auto start agentops session

e22513b

Signed-off-by: Teo <[email protected]>

Move unit tests to dedicated folder (tests/unit)

cb014b2

Signed-off-by: Teo <[email protected]>

Isolate vcr_config import into tests/integration

2c3b19d

Signed-off-by: Teo <[email protected]>

configure pytest to run only unit tests by default, and include integration tests only when explicitly specified

6dbe54b

Signed-off-by: Teo <[email protected]>

teocns force-pushed the feat/optimal-test-suite branch from 86205ba to 6dbe54b Compare January 12, 2025 02:12

ci(python-tests): separate job between unit-integration tests

fb2be21

teocns force-pushed the feat/optimal-test-suite branch from a606ea6 to fb2be21 Compare January 12, 2025 02:34

set python-tests timeout to 5 minutes

caa08df

Signed-off-by: Teo <[email protected]>

the-praxs and others added 5 commits January 14, 2025 01:48

add integration tests for other providers

558848a

remove openai version limitation

fa7325b

add providers as deps

64ce1d0

chore: add mistralai to test dependencies

9af78b9

remove mistral from dependency since its incorrect

3af0cd6

the-praxs and others added 6 commits January 14, 2025 16:52

ruff

6b94f37

re-record cassettes

80c6a07

tests/fixtures/providers: fallback to test-api-key if no provider is found

bcac9b8

All provider fixtures will:
- Use the actual API key if it's set in the environment
- Fall back to "test-api-key" if no environment variable is found

Signed-off-by: Teo <[email protected]>
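
The fallback described in this commit message can be sketched as below. The function name `resolve_api_key` and the example variable name are hypothetical; treating an empty value the same as a missing one is also an assumption, not something the commit states.

```python
import os


def resolve_api_key(env_var: str, default: str = "test-api-key") -> str:
    """Return the real key if set, else a placeholder for cassette replay."""
    # `or` also covers a variable that is set but empty (assumed behavior).
    return os.environ.get(env_var) or default
```

A provider fixture could then call, for example, `resolve_api_key("OPENAI_API_KEY")`, so that replaying recorded cassettes works on machines where no real key is configured.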

set keys for litellm

3eb9bc9

Improve tests/integration/test_llm_providers.py openai assistants

7a6ac5a

Signed-off-by: Teo <[email protected]>

Make integration tests appropriately skip, regenerate x1 cassette

6ea858c

Signed-off-by: Teo <[email protected]>

teocns force-pushed the feat/optimal-test-suite branch from dc72575 to 0a8a5f7 Compare January 14, 2025 23:08

explicit tests/integration/conftest fixtures import

8f1a958

Signed-off-by: Teo <[email protected]>

teocns force-pushed the feat/optimal-test-suite branch 2 times, most recently from 6b9e3f0 to 8f1a958 Compare January 15, 2025 09:12

the-praxs self-requested a review January 15, 2025 13:10

teocns force-pushed the feat/optimal-test-suite branch 3 times, most recently from 6a7ff37 to 29c29d4 Compare January 15, 2025 13:41

teocns added 4 commits January 15, 2025 14:47

deps: improve dev packages versionings

ce740c1

Make integration tests run with python 3.12

9b21b5e

Signed-off-by: Teo <[email protected]>

add uv.lock

82e2105

Signed-off-by: Teo <[email protected]>

test concurrent api requests: remove matcher on method, possibly causing contingent error

98c325c

Signed-off-by: Teo <[email protected]>

teocns force-pushed the feat/optimal-test-suite branch from 29c29d4 to 98c325c Compare January 15, 2025 13:47

Run static-analysis with python 3.12.2

dd3c402

Signed-off-by: Teo <[email protected]>

teocns merged commit ae0f11b into main Jan 15, 2025
9 of 10 checks passed

teocns deleted the feat/optimal-test-suite branch January 15, 2025 14:15

teocns added a commit that referenced this pull request Jan 15, 2025

move tests/telemetry to tests/unit/telemetry/ to align with #637

ca84cd5

Signed-off-by: Teo <[email protected]>

teocns added the v0.4 label Jan 16, 2025
