
AgentLite Comprehensive Test Suite Plan

Context

AgentLite is a lightweight, async-first Agent component library for LLM applications. It provides:

  • Agent: Main agent class with tool calling loop and streaming support
  • OpenAIProvider: OpenAI-compatible provider implementation
  • Tool System: @tool decorator, CallableTool, CallableTool2, SimpleToolset
  • MCPClient: MCP server integration
  • Message Types: ContentPart, Message, ToolCall, etc.
  • Configuration: Pydantic-based config models

Test Location

/home/tcmofashi/proj/general_agent/agentlite/tests/

Task Dependency Graph

Task | Depends On | Reason
1. Test Configuration Setup | None | Foundation for all tests
2. Message Types Unit Tests | Task 1 | Core data structures
3. Tool System Unit Tests | Task 1 | Core tool abstractions
4. Configuration Unit Tests | Task 1 | Config validation
5. Provider Protocol Unit Tests | Task 1 | Provider interface
6. Mock Provider Implementation | Task 1 | Required for integration tests
7. Agent Integration Tests | Tasks 2, 3, 6 | Tests agent with mocked provider
8. Tool Calling Loop Tests | Tasks 3, 6 | Tests tool execution flow
9. Streaming Response Tests | Tasks 2, 6 | Tests streaming functionality
10. Conversation History Tests | Task 7 | Tests history management
11. Real-World Scenario: Data Quality Agent | Tasks 7, 8 | Practical use case
12. Real-World Scenario: Fact-Checking Agent | Tasks 7, 8 | Practical use case
13. Real-World Scenario: Multi-Agent Workflow | Tasks 7, 10 | Practical use case
14. MCP Mock Tests | Tasks 3, 6 | Tests MCP integration with mocks
15. Error Handling Tests | Tasks 6, 7 | Tests error scenarios
16. Test Coverage Analysis | All above | Verify coverage targets

Parallel Execution Graph

Wave 1 (Foundation - Task 1 starts immediately; Tasks 2-6 run in parallel once it lands):
├── Task 1: Test Configuration Setup
├── Task 2: Message Types Unit Tests
├── Task 3: Tool System Unit Tests
├── Task 4: Configuration Unit Tests
├── Task 5: Provider Protocol Unit Tests
└── Task 6: Mock Provider Implementation

Wave 2 (Core Integration - After Wave 1):
├── Task 7: Agent Integration Tests (depends: 1, 2, 3, 6)
├── Task 8: Tool Calling Loop Tests (depends: 3, 6)
└── Task 9: Streaming Response Tests (depends: 2, 6)

Wave 3 (Advanced Features - After Wave 2):
├── Task 10: Conversation History Tests (depends: 7)
├── Task 14: MCP Mock Tests (depends: 3, 6)
└── Task 15: Error Handling Tests (depends: 6, 7)

Wave 4 (Real-World Scenarios - After Wave 3):
├── Task 11: Data Quality Agent Scenario (depends: 7, 8)
├── Task 12: Fact-Checking Agent Scenario (depends: 7, 8)
└── Task 13: Multi-Agent Workflow Scenario (depends: 7, 10)

Wave 5 (Finalization - After Wave 4):
└── Task 16: Test Coverage Analysis (depends: all)

Critical Path: Task 1 → Task 6 → Task 7 → Task 10 → Task 13 → Task 16
Estimated parallel speedup: roughly 60% faster than fully sequential execution

Tasks

Task 1: Test Configuration Setup

Description: Create pytest configuration, conftest.py with shared fixtures, and test utilities.

Delegation Recommendation:

  • Category: quick - Configuration setup is straightforward
  • Skills: [python-programmer] - Python testing infrastructure knowledge

Skills Evaluation:

  • INCLUDED python-programmer: Required for pytest configuration and fixture design
  • OMITTED git-master: No git operations needed for this task
  • OMITTED frontend-ui-ux: No UI work involved

Depends On: None

Acceptance Criteria:

  • pytest.ini configured with asyncio mode
  • conftest.py with shared fixtures (mock_provider, sample_messages, temp_agent)
  • Test utilities module for common assertions
  • All tests can be run with pytest tests/

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/pytest.ini
  • /home/tcmofashi/proj/general_agent/agentlite/tests/conftest.py
  • /home/tcmofashi/proj/general_agent/agentlite/tests/utils.py

Commit: YES

  • Message: test: setup pytest configuration and shared fixtures
  • Files: pytest.ini, tests/conftest.py, tests/utils.py

Task 2: Message Types Unit Tests

Description: Test all message types: ContentPart, TextPart, ImageURLPart, AudioURLPart, ToolCall, ToolCallPart, Message.

Delegation Recommendation:

  • Category: quick - Unit tests for data structures
  • Skills: [python-programmer] - Python testing patterns

Skills Evaluation:

  • INCLUDED python-programmer: Required for writing unit tests
  • OMITTED frontend-ui-ux: No UI involved

Depends On: Task 1

Acceptance Criteria:

  • ContentPart polymorphic validation works correctly
  • TextPart merge_in_place works for streaming
  • ToolCall merge_in_place works with ToolCallPart
  • Message content coercion from string works
  • Message.extract_text() returns correct text
  • Message.has_tool_calls() returns correct boolean
  • All edge cases covered (empty content, None values)

Test Cases:

  1. test_content_part_registry - Verify subclass registration
  2. test_text_part_creation - Basic TextPart instantiation
  3. test_text_part_merge - Streaming text merge
  4. test_image_url_part - ImageURLPart creation and serialization
  5. test_audio_url_part - AudioURLPart creation and serialization
  6. test_tool_call_creation - ToolCall instantiation
  7. test_tool_call_merge - ToolCall merging with ToolCallPart
  8. test_message_string_content - Message with string content coercion
  9. test_message_list_content - Message with list of ContentParts
  10. test_message_extract_text - Text extraction from mixed content
  11. test_message_has_tool_calls - Tool call detection
  12. test_message_serialization - Pydantic model_dump works
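
As a concrete reference for the merge and extraction cases above, here is a minimal sketch of tests 3 and 10. It assumes TextPart.merge_in_place takes another part and Message.extract_text() concatenates textual parts, as implied by the acceptance criteria; the import path follows the module names in the coverage table.

from agentlite.message import Message, TextPart

def test_text_part_merge():
    """Streamed chunks should accumulate into a single part."""
    part = TextPart(text="Hel")
    part.merge_in_place(TextPart(text="lo!"))
    assert part.text == "Hello!"

def test_message_extract_text():
    """extract_text() should concatenate only the textual parts."""
    msg = Message(role="assistant", content=[TextPart(text="Hel"), TextPart(text="lo!")])
    assert msg.extract_text() == "Hello!"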

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/unit/test_message.py

Commit: YES

  • Message: test: add unit tests for message types
  • Files: tests/unit/test_message.py

Task 3: Tool System Unit Tests

Description: Test tool system: Tool, CallableTool, CallableTool2, SimpleToolset, @tool decorator, ToolResult types.

Delegation Recommendation:

  • Category: unspecified-low - Moderate complexity with async patterns
  • Skills: [python-programmer] - Python async testing

Skills Evaluation:

  • INCLUDED python-programmer: Required for async tool testing
  • OMITTED frontend-ui-ux: No UI involved

Depends On: Task 1

Acceptance Criteria:

  • Tool JSON schema validation works
  • CallableTool validates arguments against schema
  • CallableTool2 uses Pydantic for validation
  • SimpleToolset manages tools correctly
  • @tool decorator creates valid tools
  • Tool execution handles errors gracefully
  • Async tool execution works correctly

Test Cases:

  1. test_tool_schema_validation - Invalid schema raises ValueError
  2. test_tool_ok_result - ToolOk creation and properties
  3. test_tool_error_result - ToolError creation and properties
  4. test_callable_tool_validation - Argument validation against schema
  5. test_callable_tool_execution - Successful tool execution
  6. test_callable_tool_error_handling - Exception handling in tools
  7. test_callable_tool2_pydantic_validation - Pydantic model validation
  8. test_callable_tool2_execution - Type-safe tool execution
  9. test_simple_toolset_add_remove - Tool management
  10. test_simple_toolset_handle - Tool call handling
  11. test_simple_toolset_tool_not_found - Missing tool error
  12. test_tool_decorator_basic - @tool creates valid tool
  13. test_tool_decorator_with_params - @tool with custom name/description
  14. test_tool_decorator_type_hints - Type hint to schema conversion
  15. test_tool_concurrent_execution - Multiple tools execute concurrently
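
A minimal sketch of the decorator case follows. The attribute names (name, description) are assumptions about the Tool interface, not confirmed API; the sketch only shows the intended shape of the assertion.

from agentlite.tool import tool

@tool()
async def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

def test_tool_decorator_basic():
    # Attribute names are assumed; adjust to the real Tool interface.
    assert add.name == "add"
    assert "Add two numbers" in add.description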

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/unit/test_tool.py

Commit: YES

  • Message: test: add unit tests for tool system
  • Files: tests/unit/test_tool.py

Task 4: Configuration Unit Tests

Description: Test Pydantic configuration models: ProviderConfig, ModelConfig, ToolConfig, AgentConfig.

Delegation Recommendation:

  • Category: quick - Pydantic model validation tests
  • Skills: [python-programmer] - Pydantic testing

Skills Evaluation:

  • INCLUDED python-programmer: Required for Pydantic validation tests

Depends On: Task 1

Acceptance Criteria:

  • ProviderConfig validates base_url format
  • ProviderConfig stores api_key as SecretStr
  • ModelConfig validates temperature range
  • ModelConfig validates provider is not empty
  • AgentConfig validates default_model exists in models
  • AgentConfig validates all model providers exist
  • get_provider_config and get_model_config work correctly

Test Cases:

  1. test_provider_config_validation - Valid config creation
  2. test_provider_config_invalid_url - Invalid base_url raises error
  3. test_provider_config_secret_str - API key is SecretStr
  4. test_model_config_validation - Valid model config
  5. test_model_config_temperature_range - Temperature bounds checking
  6. test_model_config_empty_provider - Empty provider raises error
  7. test_agent_config_validation - Valid agent config
  8. test_agent_config_missing_default_model - Missing default_model raises error
  9. test_agent_config_unknown_provider - Unknown provider raises error
  10. test_agent_config_get_provider - get_provider_config works
  11. test_agent_config_get_model - get_model_config works
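
A minimal sketch of the SecretStr and invalid-URL cases, assuming the base_url and api_key field names from the acceptance criteria (SecretStr masking and get_secret_value are standard Pydantic behavior):

import pytest
from pydantic import ValidationError

from agentlite.config import ProviderConfig

def test_provider_config_secret_str():
    cfg = ProviderConfig(base_url="https://api.example.com/v1", api_key="sk-test")
    # SecretStr must not leak the raw key through repr/str.
    assert "sk-test" not in repr(cfg.api_key)
    assert cfg.api_key.get_secret_value() == "sk-test"

def test_provider_config_invalid_url():
    with pytest.raises(ValidationError):
        ProviderConfig(base_url="not-a-url", api_key="sk-test")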

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/unit/test_config.py

Commit: YES

  • Message: test: add unit tests for configuration models
  • Files: tests/unit/test_config.py

Task 5: Provider Protocol Unit Tests

Description: Test provider protocol and exception types: ChatProvider, StreamedMessage, TokenUsage, exception hierarchy.

Delegation Recommendation:

  • Category: quick - Protocol and exception testing
  • Skills: [python-programmer] - Python protocol testing

Skills Evaluation:

  • INCLUDED python-programmer: Required for protocol testing

Depends On: Task 1

Acceptance Criteria:

  • TokenUsage calculates total correctly
  • Exception hierarchy is correct
  • APIStatusError stores status_code
  • ChatProvider protocol can be implemented

Test Cases:

  1. test_token_usage_total - Total token calculation
  2. test_token_usage_defaults - Default cached_tokens = 0
  3. test_chat_provider_error_base - Base exception class
  4. test_api_connection_error - APIConnectionError creation
  5. test_api_timeout_error - APITimeoutError creation
  6. test_api_status_error - APIStatusError with status_code
  7. test_api_empty_response_error - APIEmptyResponseError creation
  8. test_chat_provider_protocol - Protocol implementation check
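
A sketch of the TokenUsage and status-error cases. Field, property, and constructor names are assumptions inferred from the test case descriptions above; only cached_tokens defaulting to 0 is stated explicitly.

from agentlite.provider import APIStatusError, TokenUsage

def test_token_usage_total():
    # Field and property names here are assumed; adjust to the real model.
    usage = TokenUsage(prompt_tokens=10, completion_tokens=5)
    assert usage.cached_tokens == 0
    assert usage.total == 15

def test_api_status_error():
    # Constructor shape is an assumption; only status_code storage is asserted.
    err = APIStatusError("rate limited", status_code=429)
    assert err.status_code == 429
    assert isinstance(err, Exception)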

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/unit/test_provider.py

Commit: YES

  • Message: test: add unit tests for provider protocol
  • Files: tests/unit/test_provider.py

Task 6: Mock Provider Implementation

Description: Create a comprehensive mock provider for testing that simulates OpenAI API responses without real API calls.

Delegation Recommendation:

  • Category: unspecified-low - Requires understanding of streaming and async patterns
  • Skills: [python-programmer] - Async generator implementation

Skills Evaluation:

  • INCLUDED python-programmer: Required for mock provider implementation

Depends On: Task 1

Acceptance Criteria:

  • MockProvider implements ChatProvider protocol
  • Can simulate text responses
  • Can simulate tool calls
  • Can simulate streaming responses
  • Can simulate errors
  • Configurable response sequences
  • Tracks calls for verification

Implementation Details:

class MockProvider:
    """Mock provider for testing.
    
    Usage:
        provider = MockProvider()
        provider.add_text_response("Hello!")
        provider.add_tool_call("add", {"a": 1, "b": 2}, "3")
        
        agent = Agent(provider=provider)
        response = await agent.run("Hi")
        
        assert provider.calls == [...]
    """

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/mocks/provider.py

Commit: YES

  • Message: test: add mock provider for testing
  • Files: tests/mocks/provider.py

Task 7: Agent Integration Tests

Description: Test Agent class with mocked provider: initialization, run(), generate(), history management.

Delegation Recommendation:

  • Category: unspecified-low - Integration testing with async
  • Skills: [python-programmer] - Async integration testing

Skills Evaluation:

  • INCLUDED python-programmer: Required for agent integration testing

Depends On: Tasks 1, 2, 3, 6

Acceptance Criteria:

  • Agent initializes correctly with provider
  • Agent.run() returns string response
  • Agent.run(stream=True) returns async iterator
  • Agent.generate() returns Message
  • Agent adds messages to history
  • Agent.clear_history() clears history
  • Agent respects max_iterations

Test Cases:

  1. test_agent_initialization - Basic agent creation
  2. test_agent_with_tools - Agent with toolset
  3. test_agent_run_simple - Simple non-streaming run
  4. test_agent_run_streaming - Streaming response
  5. test_agent_generate - Generate without tool loop
  6. test_agent_history_tracking - Messages added to history
  7. test_agent_clear_history - History cleared correctly
  8. test_agent_max_iterations - Respects iteration limit
  9. test_agent_system_prompt - System prompt used
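
A minimal sketch of the simple-run case, wiring the agent to the MockProvider from Task 6. The top-level Agent import and the mocks import path are assumptions (pytest's default import mode puts tests/ on sys.path, making the mocks package importable):

from agentlite import Agent
from mocks.provider import MockProvider

async def test_agent_run_simple():
    provider = MockProvider()
    provider.add_text_response("Hello!")
    agent = Agent(provider=provider)

    result = await agent.run("Hi")

    assert result == "Hello!"
    assert len(provider.calls) == 1
    # The user turn should now be recorded in history.
    assert agent.history[0].role == "user"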

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_agent.py

Commit: YES

  • Message: test: add agent integration tests
  • Files: tests/integration/test_agent.py

Task 8: Tool Calling Loop Tests

Description: Test the complete tool calling loop: agent requests tool, tool executes, result returned.

Delegation Recommendation:

  • Category: unspecified-low - Complex async flow testing
  • Skills: [python-programmer] - Async flow testing

Skills Evaluation:

  • INCLUDED python-programmer: Required for tool loop testing

Depends On: Tasks 3, 6

Acceptance Criteria:

  • Agent calls tool when requested by LLM
  • Tool result is added to history
  • Agent continues conversation after tool result
  • Multiple tool calls in one response handled
  • Tool errors are handled gracefully
  • Tool calls are concurrent

Test Cases:

  1. test_single_tool_call - One tool call in conversation
  2. test_multiple_tool_calls - Multiple tools in one response
  3. test_tool_call_chain - Sequential tool calls
  4. test_tool_error_handling - Tool returns error
  5. test_tool_not_found - Unknown tool requested
  6. test_tool_concurrent_execution - Tools execute concurrently
  7. test_tool_result_in_history - Tool results in conversation history
  8. test_tool_call_with_arguments - Arguments passed correctly
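
A sketch of the single-call case: the mock is scripted so the first model turn requests the tool and the second produces the final answer, with the add_tool fixture from conftest.py. The exact shape of the tool-result message in history is an assumption.

from agentlite import Agent
from mocks.provider import MockProvider

async def test_single_tool_call(add_tool):
    provider = MockProvider()
    provider.add_tool_call("add", {"a": 1, "b": 2}, "3")
    provider.add_text_response("The answer is 3.")

    agent = Agent(provider=provider, tools=[add_tool])
    result = await agent.run("What is 1 + 2?")

    assert result == "The answer is 3."
    # The executed tool's result should appear as a tool-role message.
    assert any(m.role == "tool" for m in agent.history)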

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_tool_loop.py

Commit: YES

  • Message: test: add tool calling loop tests
  • Files: tests/integration/test_tool_loop.py

Task 9: Streaming Response Tests

Description: Test streaming responses: text streaming, tool call streaming, mixed content.

Delegation Recommendation:

  • Category: unspecified-low - Async streaming testing
  • Skills: [python-programmer] - Async generator testing

Skills Evaluation:

  • INCLUDED python-programmer: Required for streaming testing

Depends On: Tasks 2, 6

Acceptance Criteria:

  • Text streams in chunks
  • Tool calls stream correctly
  • Mixed content (text + tool) streams correctly
  • Complete response can be reconstructed
  • Streaming works with tool calling loop

Test Cases:

  1. test_stream_text_only - Simple text streaming
  2. test_stream_tool_call - Tool call streaming
  3. test_stream_mixed_content - Text then tool call
  4. test_stream_reconstruction - Rebuild full response
  5. test_stream_with_tool_loop - Streaming in tool loop
  6. test_stream_empty_response - Empty stream handling
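
A sketch of the reconstruction case. It assumes run(stream=True) returns the async iterator directly (per Task 7's acceptance criteria) and that chunks are plain text deltas; adjust the join if chunks are richer objects.

from agentlite import Agent
from mocks.provider import MockProvider

async def test_stream_reconstruction():
    provider = MockProvider()
    provider.add_text_response("Hello, world!")
    agent = Agent(provider=provider)

    # Assumes run(stream=True) yields text deltas without an extra await.
    chunks = [chunk async for chunk in agent.run("Hi", stream=True)]
    assert "".join(chunks) == "Hello, world!"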

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_streaming.py

Commit: YES

  • Message: test: add streaming response tests
  • Files: tests/integration/test_streaming.py

Task 10: Conversation History Tests

Description: Test conversation history management: message ordering, role tracking, history limits.

Delegation Recommendation:

  • Category: quick - History management testing
  • Skills: [python-programmer] - State management testing

Skills Evaluation:

  • INCLUDED python-programmer: Required for history testing

Depends On: Task 7

Acceptance Criteria:

  • Messages added in correct order
  • Roles tracked correctly (user, assistant, tool)
  • Tool call IDs preserved
  • History can be inspected
  • History can be cleared
  • History persists across multiple runs

Test Cases:

  1. test_history_message_order - Messages in correct order
  2. test_history_roles - Correct role tracking
  3. test_history_tool_responses - Tool call IDs preserved
  4. test_history_persistence - History across multiple runs
  5. test_history_clear - Clear history works
  6. test_history_manual_add - Manually add messages
  7. test_history_copy - history property returns copy
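
A minimal sketch of the persistence and clear cases, using the history and clear_history surface named in the acceptance criteria:

from agentlite import Agent
from mocks.provider import MockProvider

async def test_history_persistence():
    provider = MockProvider()
    provider.add_text_response("First.")
    provider.add_text_response("Second.")
    agent = Agent(provider=provider)

    await agent.run("one")
    await agent.run("two")

    # Two full user/assistant exchanges should be on record, in order.
    assert [m.role for m in agent.history] == ["user", "assistant", "user", "assistant"]

    agent.clear_history()
    assert agent.history == []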

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_history.py

Commit: YES

  • Message: test: add conversation history tests
  • Files: tests/integration/test_history.py

Task 11: Real-World Scenario - Data Quality Agent

Description: Test a realistic data quality improvement agent that validates and cleans data.

Delegation Recommendation:

  • Category: unspecified-high - Complex scenario testing
  • Skills: [python-programmer] - Complex test scenario design

Skills Evaluation:

  • INCLUDED python-programmer: Required for scenario implementation

Depends On: Tasks 7, 8

Acceptance Criteria:

  • Agent validates data format
  • Agent identifies data quality issues
  • Agent suggests corrections
  • Uses multiple tools (validate, clean, analyze)
  • Handles edge cases (empty data, invalid format)

Scenario:

# Data Quality Agent validates CSV data
# Tools: validate_csv, detect_anomalies, suggest_fixes
# Test with sample data containing errors

Test Cases:

  1. test_data_quality_valid_data - Clean data passes validation
  2. test_data_quality_detects_errors - Errors detected and reported
  3. test_data_quality_suggests_fixes - Corrections suggested
  4. test_data_quality_empty_data - Handles empty input
  5. test_data_quality_invalid_format - Handles format errors

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/scenarios/test_data_quality.py

Commit: YES

  • Message: test: add data quality agent scenario tests
  • Files: tests/scenarios/test_data_quality.py

Task 12: Real-World Scenario - Fact-Checking Agent

Description: Test a fact-checking agent that verifies claims using tools.

Delegation Recommendation:

  • Category: unspecified-high - Complex scenario testing
  • Skills: [python-programmer] - Complex test scenario design

Skills Evaluation:

  • INCLUDED python-programmer: Required for scenario implementation

Depends On: Tasks 7, 8

Acceptance Criteria:

  • Agent extracts claims from text
  • Agent uses search tool to verify
  • Agent provides verdict with evidence
  • Handles uncertain claims appropriately
  • Multiple claims in one text handled

Scenario:

# Fact-Checking Agent verifies statements
# Tools: search_facts, calculate_statistics, check_date
# Test with verifiable and unverifiable claims

Test Cases:

  1. test_fact_check_true_claim - Correctly identifies true claim
  2. test_fact_check_false_claim - Correctly identifies false claim
  3. test_fact_check_multiple_claims - Multiple claims in one text
  4. test_fact_check_uncertain - Handles uncertain claims
  5. test_fact_check_with_evidence - Provides supporting evidence

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/scenarios/test_fact_checking.py

Commit: YES

  • Message: test: add fact-checking agent scenario tests
  • Files: tests/scenarios/test_fact_checking.py

Task 13: Real-World Scenario - Multi-Agent Workflow

Description: Test multiple agents collaborating on a complex task.

Delegation Recommendation:

  • Category: unspecified-high - Complex multi-agent testing
  • Skills: [python-programmer] - Complex scenario design

Skills Evaluation:

  • INCLUDED python-programmer: Required for multi-agent testing

Depends On: Tasks 7, 10

Acceptance Criteria:

  • Multiple agents can share a provider
  • Agents maintain separate histories
  • Workflow stages execute in order
  • Output from one agent feeds into next
  • Each agent has specialized role

Scenario:

# Research → Write → Edit workflow
# Researcher gathers facts
# Writer creates content
# Editor reviews and improves

Test Cases:

  1. test_multi_agent_research_write - Research to writer flow
  2. test_multi_agent_with_editor - Three-agent workflow
  3. test_multi_agent_isolated_histories - Histories don't leak
  4. test_multi_agent_shared_provider - Provider shared correctly
  5. test_multi_agent_error_handling - Errors don't break workflow
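
A sketch of the isolated-histories case. The system_prompt parameter name is an assumption (see Task 7's test_agent_system_prompt); the key point is that two agents share one provider yet keep separate conversations.

from agentlite import Agent
from mocks.provider import MockProvider

async def test_multi_agent_isolated_histories():
    provider = MockProvider()
    provider.add_text_response("Facts: water boils at 100 C at sea level.")
    provider.add_text_response("Draft: At sea level, water boils at 100 C.")

    researcher = Agent(provider=provider, system_prompt="You gather facts.")
    writer = Agent(provider=provider, system_prompt="You write prose.")

    facts = await researcher.run("Research: boiling point of water")
    await writer.run(f"Write a paragraph using: {facts}")

    # Each agent sees only its own exchange, even though the provider is shared.
    assert len(researcher.history) == 2
    assert len(writer.history) == 2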

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/scenarios/test_multi_agent.py

Commit: YES

  • Message: test: add multi-agent workflow scenario tests
  • Files: tests/scenarios/test_multi_agent.py

Task 14: MCP Mock Tests

Description: Test MCP integration with mocked MCP server.

Delegation Recommendation:

  • Category: unspecified-low - MCP protocol mocking
  • Skills: [python-programmer] - Protocol mocking

Skills Evaluation:

  • INCLUDED python-programmer: Required for MCP mocking

Depends On: Tasks 3, 6

Acceptance Criteria:

  • MCPClient connects to mock server
  • Tools load from mock server
  • MCP tools execute correctly
  • MCP errors handled gracefully
  • Connection cleanup works

Test Cases:

  1. test_mcp_connect_stdio - STDIO connection mock
  2. test_mcp_connect_sse - SSE connection mock
  3. test_mcp_load_tools - Load tools from mock
  4. test_mcp_tool_execution - Execute MCP tool
  5. test_mcp_error_handling - MCP errors handled
  6. test_mcp_context_manager - Async context manager works

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/mocks/mcp_server.py
  • /home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_mcp.py

Commit: YES

  • Message: test: add MCP integration tests with mocks
  • Files: tests/mocks/mcp_server.py, tests/integration/test_mcp.py

Task 15: Error Handling Tests

Description: Test error scenarios: provider errors, tool errors, timeout, connection issues.

Delegation Recommendation:

  • Category: unspecified-low - Error scenario testing
  • Skills: [python-programmer] - Error testing patterns

Skills Evaluation:

  • INCLUDED python-programmer: Required for error testing

Depends On: Tasks 6, 7

Acceptance Criteria:

  • APIConnectionError handled correctly
  • APITimeoutError handled correctly
  • APIStatusError handled correctly
  • Tool execution errors don't crash agent
  • Invalid tool arguments handled
  • Max iterations prevents infinite loops

Test Cases:

  1. test_provider_connection_error - Connection failure
  2. test_provider_timeout_error - Request timeout
  3. test_provider_status_error - HTTP error status
  4. test_provider_empty_response - Empty response handling
  5. test_tool_execution_error - Tool raises exception
  6. test_tool_invalid_arguments - Invalid args to tool
  7. test_tool_not_found_error - Unknown tool called
  8. test_max_iterations_reached - Loop prevention
  9. test_json_decode_error - Invalid JSON in tool args
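
A sketch of the timeout case, driven through MockProvider.add_error. Whether the agent re-raises provider errors unchanged or wraps them is an assumption; if it wraps, assert on the wrapper type instead.

import pytest

from agentlite import Agent
from agentlite.provider import APITimeoutError
from mocks.provider import MockProvider

async def test_provider_timeout_error():
    provider = MockProvider()
    provider.add_error(APITimeoutError("request timed out"))
    agent = Agent(provider=provider)

    with pytest.raises(APITimeoutError):
        await agent.run("Hi")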

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_errors.py

Commit: YES

  • Message: test: add error handling tests
  • Files: tests/integration/test_errors.py

Task 16: Test Coverage Analysis

Description: Analyze test coverage and ensure targets are met.

Delegation Recommendation:

  • Category: quick - Coverage analysis
  • Skills: [python-programmer] - Coverage tooling

Skills Evaluation:

  • INCLUDED python-programmer: Required for coverage analysis

Depends On: All previous tasks

Acceptance Criteria:

  • Overall coverage >= 80%
  • Core modules (message, tool, agent) >= 90%
  • Provider module >= 70%
  • MCP module >= 60%
  • Coverage report generated
  • Missing coverage documented

Coverage Targets:

Module | Target | Priority
agentlite.message | 95% | P0
agentlite.tool | 95% | P0
agentlite.agent | 90% | P0
agentlite.config | 90% | P0
agentlite.provider | 80% | P1
agentlite.providers.openai | 70% | P1
agentlite.mcp | 60% | P2
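
A plausible starting point for the .coveragerc, using standard coverage.py options. Note that fail_under is global, so the per-module targets above still need to be checked by reading the report.

[run]
source = agentlite
branch = True

[report]
fail_under = 80
show_missing = True
exclude_lines =
    pragma: no cover
    if TYPE_CHECKING: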

Files to Create:

  • /home/tcmofashi/proj/general_agent/agentlite/tests/.coveragerc

Commit: YES

  • Message: test: add coverage configuration and analysis
  • Files: tests/.coveragerc

Test File Structure

/home/tcmofashi/proj/general_agent/agentlite/tests/
├── conftest.py                    # Shared fixtures and configuration
├── utils.py                       # Test utilities and helpers
├── .coveragerc                    # Coverage configuration
├── unit/                          # Unit tests
│   ├── __init__.py
│   ├── test_message.py           # Message types tests
│   ├── test_tool.py              # Tool system tests
│   ├── test_config.py            # Configuration tests
│   └── test_provider.py          # Provider protocol tests
├── integration/                   # Integration tests
│   ├── __init__.py
│   ├── test_agent.py             # Agent integration tests
│   ├── test_tool_loop.py         # Tool calling loop tests
│   ├── test_streaming.py         # Streaming tests
│   ├── test_history.py           # History management tests
│   ├── test_mcp.py               # MCP integration tests
│   └── test_errors.py            # Error handling tests
├── scenarios/                     # Real-world scenario tests
│   ├── __init__.py
│   ├── test_data_quality.py      # Data quality agent
│   ├── test_fact_checking.py     # Fact-checking agent
│   └── test_multi_agent.py       # Multi-agent workflow
└── mocks/                         # Mock implementations
    ├── __init__.py
    ├── provider.py               # Mock OpenAI provider
    └── mcp_server.py             # Mock MCP server

Test Fixtures (conftest.py)

Core Fixtures

# Import paths are assumptions based on the layout sketched in this plan;
# pytest's default import mode puts tests/ on sys.path, so the mocks
# package (which has an __init__.py) is importable from conftest.
import pytest

from agentlite import Agent
from agentlite.message import Message, ToolCall
from agentlite.tool import tool
from mocks.provider import MockProvider

# Mock provider fixtures
@pytest.fixture
def mock_provider():
    """Create a mock provider with no responses configured."""
    return MockProvider()

@pytest.fixture
def mock_provider_with_response():
    """Create a mock provider that returns a simple text response."""
    provider = MockProvider()
    provider.add_text_response("Hello!")
    return provider

# Sample message fixtures
@pytest.fixture
def sample_text_message():
    """Create a sample text message."""
    return Message(role="user", content="Hello!")

@pytest.fixture
def sample_tool_call():
    """Create a sample tool call."""
    return ToolCall(
        id="call_123",
        function=ToolCall.FunctionBody(
            name="add",
            arguments='{"a": 1, "b": 2}'
        )
    )

# Tool fixtures
@pytest.fixture
def add_tool():
    """Create a simple add tool."""
    @tool()
    async def add(a: float, b: float) -> float:
        """Add two numbers."""
        return a + b
    return add

@pytest.fixture
def error_tool():
    """Create a tool that raises an error."""
    @tool()
    async def error() -> str:
        """Always raises an error."""
        raise ValueError("Test error")
    return error

# Agent fixtures
@pytest.fixture
async def simple_agent(mock_provider):
    """Create a simple agent with mocked provider."""
    return Agent(provider=mock_provider)

@pytest.fixture
async def agent_with_tools(mock_provider, add_tool):
    """Create an agent with tools."""
    return Agent(provider=mock_provider, tools=[add_tool])

Mock Implementations

MockProvider

import json
from dataclasses import dataclass

# Import path assumed from the module names in the coverage table.
from agentlite.message import TextPart, ToolCall

@dataclass
class MockCall:
    """Record of one generate() invocation, kept so tests can assert on it."""

    system_prompt: str
    tools: list
    history: list

class MockStreamedMessage:
    """Minimal stand-in for the provider's StreamedMessage.

    The real interface is assumed to be an async iterable of content parts;
    adjust if StreamedMessage exposes more than iteration.
    """

    def __init__(self, parts):
        self.parts = parts

    def __aiter__(self):
        # Stream each pre-built part as a single chunk.
        async def _gen():
            for part in self.parts:
                yield part

        return _gen()

class MockProvider:
    """Mock provider for testing AgentLite without real API calls.
    
    This provider simulates OpenAI API responses and allows:
    - Configuring response sequences
    - Simulating tool calls
    - Simulating errors
    - Tracking all calls for verification
    
    Example:
        provider = MockProvider()
        provider.add_text_response("Hello!")
        provider.add_tool_call("add", {"a": 1, "b": 2}, "3")
        
        agent = Agent(provider=provider, system_prompt="You are helpful.")
        response = await agent.run("Hi")
        
        # Verify calls
        assert len(provider.calls) == 1
        assert provider.calls[0].system_prompt == "You are helpful."
    """
    
    def __init__(self):
        self.responses = []
        self.calls = []
        self.model = "mock-model"
    
    def add_text_response(self, text: str):
        """Queue a plain text response."""
        self.responses.append({"type": "text", "content": text})
    
    def add_tool_call(self, name: str, arguments: dict, result: str):
        """Queue a response in which the model requests a tool call."""
        self.responses.append({
            "type": "tool_call",
            "name": name,
            "arguments": arguments,
            "result": result,
        })
    
    def add_error(self, error: Exception):
        """Queue an exception to raise on the next generate() call."""
        self.responses.append({"type": "error", "error": error})
    
    async def generate(self, system_prompt, tools, history):
        """Return the next queued response, recording the call."""
        self.calls.append(MockCall(
            system_prompt=system_prompt,
            tools=tools,
            history=list(history),
        ))
        
        if not self.responses:
            return MockStreamedMessage([TextPart(text="Mock response")])
        
        response = self.responses.pop(0)
        
        if response["type"] == "error":
            raise response["error"]
        if response["type"] == "text":
            return MockStreamedMessage([TextPart(text=response["content"])])
        if response["type"] == "tool_call":
            return MockStreamedMessage([
                ToolCall(
                    id="call_123",
                    function=ToolCall.FunctionBody(
                        name=response["name"],
                        arguments=json.dumps(response["arguments"]),
                    )
                )
            ])
        raise ValueError(f"Unknown mock response type: {response['type']}")

Test Configuration (pytest.ini)

[pytest]
testpaths = tests
asyncio_mode = auto
asyncio_default_fixture_loop_scope = function
pythonpath = src
addopts = -v --tb=short --strict-markers
markers =
    unit: Unit tests
    integration: Integration tests
    scenario: Real-world scenario tests
    slow: Slow tests

Running Tests

# Run all tests
cd /home/tcmofashi/proj/general_agent/agentlite
pytest tests/

# Run with coverage
pytest tests/ --cov=agentlite --cov-report=html --cov-report=term

# Run specific test categories
pytest tests/unit/ -v
pytest tests/integration/ -v
pytest tests/scenarios/ -v

# Run with markers
pytest -m unit
pytest -m integration
pytest -m "not slow"

# Run specific test file
pytest tests/unit/test_message.py -v

# Run with debugging
pytest tests/ -v --pdb

Commit Strategy

After Task | Commit Message | Files
Task 1 | test: setup pytest configuration and shared fixtures | pytest.ini, tests/conftest.py, tests/utils.py
Task 2 | test: add unit tests for message types | tests/unit/test_message.py
Task 3 | test: add unit tests for tool system | tests/unit/test_tool.py
Task 4 | test: add unit tests for configuration models | tests/unit/test_config.py
Task 5 | test: add unit tests for provider protocol | tests/unit/test_provider.py
Task 6 | test: add mock provider for testing | tests/mocks/provider.py
Task 7 | test: add agent integration tests | tests/integration/test_agent.py
Task 8 | test: add tool calling loop tests | tests/integration/test_tool_loop.py
Task 9 | test: add streaming response tests | tests/integration/test_streaming.py
Task 10 | test: add conversation history tests | tests/integration/test_history.py
Task 11 | test: add data quality agent scenario tests | tests/scenarios/test_data_quality.py
Task 12 | test: add fact-checking agent scenario tests | tests/scenarios/test_fact_checking.py
Task 13 | test: add multi-agent workflow scenario tests | tests/scenarios/test_multi_agent.py
Task 14 | test: add MCP integration tests with mocks | tests/mocks/mcp_server.py, tests/integration/test_mcp.py
Task 15 | test: add error handling tests | tests/integration/test_errors.py
Task 16 | test: add coverage configuration and analysis | tests/.coveragerc

Success Criteria

Verification Commands

# All tests pass
pytest tests/ -v

# Coverage meets targets
pytest tests/ --cov=agentlite --cov-report=term-missing

# No import errors
python -c "import agentlite; print('OK')"

# Type checking passes (if mypy configured)
mypy src/agentlite/

Final Checklist

  • All unit tests pass
  • All integration tests pass
  • All scenario tests pass
  • Coverage >= 80% overall
  • Core modules >= 90% coverage
  • All mocks work correctly
  • Tests run without real API keys
  • Tests are deterministic
  • Tests are well-documented
  • Test files follow naming convention

Notes

  1. No Real API Calls: All tests must work without real API keys using mocks
  2. Deterministic: Tests should produce consistent results
  3. Fast: Unit tests should complete in < 1 second each
  4. Isolated: Tests should not depend on each other
  5. Documented: Complex scenarios should have docstrings explaining the use case