# AgentLite Comprehensive Test Suite Plan

## Context

AgentLite is a lightweight, async-first agent component library for LLM applications. It provides:

- **Agent**: Main agent class with tool-calling loop and streaming support
- **OpenAIProvider**: OpenAI-compatible provider implementation
- **Tool System**: `@tool` decorator, `CallableTool`, `CallableTool2`, `SimpleToolset`
- **MCPClient**: MCP server integration
- **Message Types**: `ContentPart`, `Message`, `ToolCall`, etc.
- **Configuration**: Pydantic-based config models

## Test Location

`/home/tcmofashi/proj/general_agent/agentlite/tests/`
## Task Dependency Graph
| Task | Depends On | Reason |
|---|---|---|
| 1. Test Configuration Setup | None | Foundation for all tests |
| 2. Message Types Unit Tests | Task 1 | Core data structures |
| 3. Tool System Unit Tests | Task 1 | Core tool abstractions |
| 4. Configuration Unit Tests | Task 1 | Config validation |
| 5. Provider Protocol Unit Tests | Task 1 | Provider interface |
| 6. Mock Provider Implementation | Task 1 | Required for integration tests |
| 7. Agent Integration Tests | Tasks 2, 3, 6 | Tests agent with mocked provider |
| 8. Tool Calling Loop Tests | Tasks 3, 6 | Tests tool execution flow |
| 9. Streaming Response Tests | Tasks 2, 6 | Tests streaming functionality |
| 10. Conversation History Tests | Task 7 | Tests history management |
| 11. Real-World Scenario: Data Quality Agent | Tasks 7, 8 | Practical use case |
| 12. Real-World Scenario: Fact-Checking Agent | Tasks 7, 8 | Practical use case |
| 13. Real-World Scenario: Multi-Agent Workflow | Tasks 7, 10 | Practical use case |
| 14. MCP Mock Tests | Tasks 3, 6 | Tests MCP integration with mocks |
| 15. Error Handling Tests | Tasks 6, 7 | Tests error scenarios |
| 16. Test Coverage Analysis | All above | Verify coverage targets |
## Parallel Execution Graph

```
Wave 1 (Foundation - Start immediately):
├── Task 1: Test Configuration Setup
├── Task 2: Message Types Unit Tests
├── Task 3: Tool System Unit Tests
├── Task 4: Configuration Unit Tests
├── Task 5: Provider Protocol Unit Tests
└── Task 6: Mock Provider Implementation

Wave 2 (Core Integration - After Wave 1):
├── Task 7: Agent Integration Tests (depends: 1, 2, 3, 6)
├── Task 8: Tool Calling Loop Tests (depends: 3, 6)
└── Task 9: Streaming Response Tests (depends: 2, 6)

Wave 3 (Advanced Features - After Wave 2):
├── Task 10: Conversation History Tests (depends: 7)
├── Task 14: MCP Mock Tests (depends: 3, 6)
└── Task 15: Error Handling Tests (depends: 6, 7)

Wave 4 (Real-World Scenarios - After Wave 3):
├── Task 11: Data Quality Agent Scenario (depends: 7, 8)
├── Task 12: Fact-Checking Agent Scenario (depends: 7, 8)
└── Task 13: Multi-Agent Workflow Scenario (depends: 7, 10)

Wave 5 (Finalization - After Wave 4):
└── Task 16: Test Coverage Analysis (depends: all)
```

**Critical Path**: Task 1 → Task 6 → Task 7 → Task 10 → Task 13 → Task 16

**Parallel Speedup**: estimated ~60% faster than fully sequential execution

## Tasks
### Task 1: Test Configuration Setup

**Description**: Create the pytest configuration, `conftest.py` with shared fixtures, and a test utilities module.

**Delegation Recommendation**:
- Category: `quick` - Configuration setup is straightforward
- Skills: [`python-programmer`] - Python testing infrastructure knowledge

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for pytest configuration and fixture design
- OMITTED `git-master`: No git operations needed for this task
- OMITTED `frontend-ui-ux`: No UI work involved

**Depends On**: None

**Acceptance Criteria**:
- `pytest.ini` configured with asyncio mode
- `conftest.py` with shared fixtures (`mock_provider`, `sample_messages`, `temp_agent`)
- Test utilities module for common assertions
- All tests can be run with `pytest tests/`

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/conftest.py`
- `/home/tcmofashi/proj/general_agent/agentlite/tests/utils.py`

**Commit**: YES
- Message: `test: setup pytest configuration and shared fixtures`
- Files: `tests/conftest.py`, `tests/utils.py`
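One possible shape for `tests/utils.py`, shown only as a sketch: the helper names and the `Message` attributes they touch (`extract_text()`, `role`) are assumptions taken from the message types referenced later in this plan, not a confirmed API.

```python
# Hypothetical tests/utils.py sketch; Message and its extract_text()/role
# attributes are assumed from the types described elsewhere in this plan.
from agentlite.message import Message


def assert_message_text(message: Message, expected: str) -> None:
    """Assert that a message's extracted text matches the expected string."""
    actual = message.extract_text()
    assert actual == expected, f"expected {expected!r}, got {actual!r}"


def assert_role_sequence(history: list[Message], *roles: str) -> None:
    """Assert that a history contains exactly the given sequence of roles."""
    assert [m.role for m in history] == list(roles)
```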
### Task 2: Message Types Unit Tests

**Description**: Test all message types: ContentPart, TextPart, ImageURLPart, AudioURLPart, ToolCall, ToolCallPart, Message.

**Delegation Recommendation**:
- Category: `quick` - Unit tests for data structures
- Skills: [`python-programmer`] - Python testing patterns

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for writing unit tests
- OMITTED `frontend-ui-ux`: No UI involved

**Depends On**: Task 1

**Acceptance Criteria**:
- ContentPart polymorphic validation works correctly
- TextPart `merge_in_place` works for streaming
- ToolCall `merge_in_place` works with ToolCallPart
- Message content coercion from string works
- `Message.extract_text()` returns correct text
- `Message.has_tool_calls()` returns correct boolean
- All edge cases covered (empty content, None values)

**Test Cases**:
- `test_content_part_registry` - Verify subclass registration
- `test_text_part_creation` - Basic TextPart instantiation
- `test_text_part_merge` - Streaming text merge
- `test_image_url_part` - ImageURLPart creation and serialization
- `test_audio_url_part` - AudioURLPart creation and serialization
- `test_tool_call_creation` - ToolCall instantiation
- `test_tool_call_merge` - ToolCall merging with ToolCallPart
- `test_message_string_content` - Message with string content coercion
- `test_message_list_content` - Message with a list of ContentParts
- `test_message_extract_text` - Text extraction from mixed content
- `test_message_has_tool_calls` - Tool call detection
- `test_message_serialization` - Pydantic `model_dump` works

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/unit/test_message.py`

**Commit**: YES
- Message: `test: add unit tests for message types`
- Files: `tests/unit/test_message.py`
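For illustration, a minimal sketch of `test_text_part_merge`, assuming `merge_in_place` appends the incoming chunk's text; the exact signature is not confirmed by this plan.

```python
# Sketch only: assumes TextPart(text=...) and an in-place merge that
# concatenates streamed text chunks.
from agentlite.message import TextPart


def test_text_part_merge():
    part = TextPart(text="Hel")
    part.merge_in_place(TextPart(text="lo!"))
    assert part.text == "Hello!"
```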
### Task 3: Tool System Unit Tests

**Description**: Test the tool system: Tool, CallableTool, CallableTool2, SimpleToolset, the `@tool` decorator, and the ToolResult types.

**Delegation Recommendation**:
- Category: `unspecified-low` - Moderate complexity with async patterns
- Skills: [`python-programmer`] - Python async testing

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for async tool testing
- OMITTED `frontend-ui-ux`: No UI involved

**Depends On**: Task 1

**Acceptance Criteria**:
- Tool JSON schema validation works
- CallableTool validates arguments against its schema
- CallableTool2 uses Pydantic for validation
- SimpleToolset manages tools correctly
- The `@tool` decorator creates valid tools
- Tool execution handles errors gracefully
- Async tool execution works correctly

**Test Cases**:
- `test_tool_schema_validation` - Invalid schema raises ValueError
- `test_tool_ok_result` - ToolOk creation and properties
- `test_tool_error_result` - ToolError creation and properties
- `test_callable_tool_validation` - Argument validation against schema
- `test_callable_tool_execution` - Successful tool execution
- `test_callable_tool_error_handling` - Exception handling in tools
- `test_callable_tool2_pydantic_validation` - Pydantic model validation
- `test_callable_tool2_execution` - Type-safe tool execution
- `test_simple_toolset_add_remove` - Tool management
- `test_simple_toolset_handle` - Tool call handling
- `test_simple_toolset_tool_not_found` - Missing tool error
- `test_tool_decorator_basic` - `@tool` creates a valid tool
- `test_tool_decorator_with_params` - `@tool` with custom name/description
- `test_tool_decorator_type_hints` - Type hint to schema conversion
- `test_tool_concurrent_execution` - Multiple tools execute concurrently

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/unit/test_tool.py`

**Commit**: YES
- Message: `test: add unit tests for tool system`
- Files: `tests/unit/test_tool.py`
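A sketch of the type-hint test, assuming `@tool()` derives a JSON schema from annotations; the `parameters` attribute holding that schema is a guessed name, not confirmed by this plan.

```python
# Sketch only: the `parameters` attribute and the exact schema shape
# are assumptions about the agentlite tool API.
from agentlite.tool import tool


def test_tool_decorator_type_hints():
    @tool()
    async def add(a: float, b: float) -> float:
        """Add two numbers."""
        return a + b

    schema = add.parameters  # assumed attribute for the generated schema
    assert set(schema["properties"]) == {"a", "b"}
    assert schema["properties"]["a"]["type"] == "number"
```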
### Task 4: Configuration Unit Tests

**Description**: Test the Pydantic configuration models: ProviderConfig, ModelConfig, ToolConfig, AgentConfig.

**Delegation Recommendation**:
- Category: `quick` - Pydantic model validation tests
- Skills: [`python-programmer`] - Pydantic testing

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for Pydantic validation tests

**Depends On**: Task 1

**Acceptance Criteria**:
- ProviderConfig validates the base_url format
- ProviderConfig stores api_key as SecretStr
- ModelConfig validates the temperature range
- ModelConfig validates that provider is not empty
- AgentConfig validates that default_model exists in models
- AgentConfig validates that all model providers exist
- `get_provider_config` and `get_model_config` work correctly

**Test Cases**:
- `test_provider_config_validation` - Valid config creation
- `test_provider_config_invalid_url` - Invalid base_url raises error
- `test_provider_config_secret_str` - API key is SecretStr
- `test_model_config_validation` - Valid model config
- `test_model_config_temperature_range` - Temperature bounds checking
- `test_model_config_empty_provider` - Empty provider raises error
- `test_agent_config_validation` - Valid agent config
- `test_agent_config_missing_default_model` - Missing default_model raises error
- `test_agent_config_unknown_provider` - Unknown provider raises error
- `test_agent_config_get_provider` - `get_provider_config` works
- `test_agent_config_get_model` - `get_model_config` works

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/unit/test_config.py`

**Commit**: YES
- Message: `test: add unit tests for configuration models`
- Files: `tests/unit/test_config.py`
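A sketch of the temperature bounds check, assuming the field is constrained so that out-of-range values raise Pydantic's `ValidationError`; the field names and valid range are assumptions based on this plan.

```python
# Sketch only: ModelConfig field names and the accepted temperature
# range are assumed, not confirmed by this plan.
import pytest
from pydantic import ValidationError

from agentlite.config import ModelConfig


def test_model_config_temperature_range():
    with pytest.raises(ValidationError):
        ModelConfig(provider="openai", model="gpt-4o", temperature=9.9)
```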
### Task 5: Provider Protocol Unit Tests

**Description**: Test the provider protocol and exception types: ChatProvider, StreamedMessage, TokenUsage, and the exception hierarchy.

**Delegation Recommendation**:
- Category: `quick` - Protocol and exception testing
- Skills: [`python-programmer`] - Python protocol testing

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for protocol testing

**Depends On**: Task 1

**Acceptance Criteria**:
- TokenUsage calculates the total correctly
- The exception hierarchy is correct
- APIStatusError stores status_code
- The ChatProvider protocol can be implemented

**Test Cases**:
- `test_token_usage_total` - Total token calculation
- `test_token_usage_defaults` - Default cached_tokens = 0
- `test_chat_provider_error_base` - Base exception class
- `test_api_connection_error` - APIConnectionError creation
- `test_api_timeout_error` - APITimeoutError creation
- `test_api_status_error` - APIStatusError with status_code
- `test_api_empty_response_error` - APIEmptyResponseError creation
- `test_chat_provider_protocol` - Protocol implementation check

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/unit/test_provider.py`

**Commit**: YES
- Message: `test: add unit tests for provider protocol`
- Files: `tests/unit/test_provider.py`
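A sketch of the TokenUsage tests: only the `cached_tokens` default of 0 is stated in this plan, so the counter names and the computed total are assumptions.

```python
# Sketch only: prompt_tokens/completion_tokens and the `total` property
# are assumed names; only cached_tokens' default is given in this plan.
from agentlite.provider import TokenUsage


def test_token_usage_total():
    usage = TokenUsage(prompt_tokens=10, completion_tokens=5)
    assert usage.cached_tokens == 0
    assert usage.total == 15
```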
### Task 6: Mock Provider Implementation

**Description**: Create a comprehensive mock provider for testing that simulates OpenAI API responses without real API calls.

**Delegation Recommendation**:
- Category: `unspecified-low` - Requires understanding of streaming and async patterns
- Skills: [`python-programmer`] - Async generator implementation

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for mock provider implementation

**Depends On**: Task 1

**Acceptance Criteria**:
- MockProvider implements the ChatProvider protocol
- Can simulate text responses
- Can simulate tool calls
- Can simulate streaming responses
- Can simulate errors
- Configurable response sequences
- Tracks calls for verification

**Implementation Details**:

```python
class MockProvider:
    """Mock provider for testing.

    Usage:
        provider = MockProvider()
        provider.add_text_response("Hello!")
        provider.add_tool_call("add", {"a": 1, "b": 2}, "3")

        agent = Agent(provider=provider)
        response = await agent.run("Hi")
        assert provider.calls == [...]
    """
```

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/mocks/provider.py`

**Commit**: YES
- Message: `test: add mock provider for testing`
- Files: `tests/mocks/provider.py`
### Task 7: Agent Integration Tests

**Description**: Test the Agent class with a mocked provider: initialization, `run()`, `generate()`, and history management.

**Delegation Recommendation**:
- Category: `unspecified-low` - Integration testing with async
- Skills: [`python-programmer`] - Async integration testing

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for agent integration testing

**Depends On**: Tasks 1, 2, 3, 6

**Acceptance Criteria**:
- Agent initializes correctly with a provider
- `Agent.run()` returns a string response
- `Agent.run(stream=True)` returns an async iterator
- `Agent.generate()` returns a Message
- Agent adds messages to history
- `Agent.clear_history()` clears history
- Agent respects max_iterations

**Test Cases**:
- `test_agent_initialization` - Basic agent creation
- `test_agent_with_tools` - Agent with a toolset
- `test_agent_run_simple` - Simple non-streaming run
- `test_agent_run_streaming` - Streaming response
- `test_agent_generate` - Generate without the tool loop
- `test_agent_history_tracking` - Messages added to history
- `test_agent_clear_history` - History cleared correctly
- `test_agent_max_iterations` - Respects iteration limit
- `test_agent_system_prompt` - System prompt used

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_agent.py`

**Commit**: YES
- Message: `test: add agent integration tests`
- Files: `tests/integration/test_agent.py`
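A sketch of the simplest integration test, wiring the Task 6 MockProvider into an agent; `Agent(provider=...)` and `run()` returning the response text follow the usage shown in Task 6, and the test relies on `asyncio_mode = auto` from pytest.ini so no explicit marker is needed.

```python
# Sketch only: import paths are assumed from the module layout and test
# file structure given in this plan.
from agentlite.agent import Agent
from tests.mocks.provider import MockProvider


async def test_agent_run_simple():
    provider = MockProvider()
    provider.add_text_response("Hello!")
    agent = Agent(provider=provider)

    response = await agent.run("Hi")

    assert response == "Hello!"
    assert len(provider.calls) == 1  # exactly one round-trip to the "LLM"
```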
### Task 8: Tool Calling Loop Tests

**Description**: Test the complete tool calling loop: the agent requests a tool, the tool executes, and the result is returned.

**Delegation Recommendation**:
- Category: `unspecified-low` - Complex async flow testing
- Skills: [`python-programmer`] - Async flow testing

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for tool loop testing

**Depends On**: Tasks 3, 6

**Acceptance Criteria**:
- Agent calls a tool when requested by the LLM
- Tool result is added to history
- Agent continues the conversation after the tool result
- Multiple tool calls in one response are handled
- Tool errors are handled gracefully
- Tool calls run concurrently

**Test Cases**:
- `test_single_tool_call` - One tool call in a conversation
- `test_multiple_tool_calls` - Multiple tools in one response
- `test_tool_call_chain` - Sequential tool calls
- `test_tool_error_handling` - Tool returns an error
- `test_tool_not_found` - Unknown tool requested
- `test_tool_concurrent_execution` - Tools execute concurrently
- `test_tool_result_in_history` - Tool results in conversation history
- `test_tool_call_with_arguments` - Arguments passed correctly

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_tool_loop.py`

**Commit**: YES
- Message: `test: add tool calling loop tests`
- Files: `tests/integration/test_tool_loop.py`
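A sketch of `test_single_tool_call` using the conftest.py fixtures defined later in this plan: the mock is scripted to emit a tool call first, then a closing text response once the tool result is back in history. The two-call assertion assumes the agent makes one provider round-trip per loop iteration.

```python
# Sketch only: fixture names come from conftest.py in this plan; the
# scripted response order assumes FIFO consumption by the mock.
from agentlite.agent import Agent


async def test_single_tool_call(mock_provider, add_tool):
    mock_provider.add_tool_call("add", {"a": 1, "b": 2}, "3")
    mock_provider.add_text_response("The sum is 3.")
    agent = Agent(provider=mock_provider, tools=[add_tool])

    response = await agent.run("What is 1 + 2?")

    assert response == "The sum is 3."
    assert len(mock_provider.calls) == 2  # tool round, then final answer
```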
### Task 9: Streaming Response Tests

**Description**: Test streaming responses: text streaming, tool call streaming, and mixed content.

**Delegation Recommendation**:
- Category: `unspecified-low` - Async streaming testing
- Skills: [`python-programmer`] - Async generator testing

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for streaming testing

**Depends On**: Tasks 2, 6

**Acceptance Criteria**:
- Text streams in chunks
- Tool calls stream correctly
- Mixed content (text + tool) streams correctly
- The complete response can be reconstructed
- Streaming works with the tool calling loop

**Test Cases**:
- `test_stream_text_only` - Simple text streaming
- `test_stream_tool_call` - Tool call streaming
- `test_stream_mixed_content` - Text then tool call
- `test_stream_reconstruction` - Rebuild the full response
- `test_stream_with_tool_loop` - Streaming in the tool loop
- `test_stream_empty_response` - Empty stream handling

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_streaming.py`

**Commit**: YES
- Message: `test: add streaming response tests`
- Files: `tests/integration/test_streaming.py`
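A sketch of `test_stream_reconstruction`. Task 7 says `run(stream=True)` returns an async iterator; whether it is directly async-iterable or must first be awaited is an assumption here, as is the chunk type being plain text.

```python
# Sketch only: assumes run(stream=True) yields text chunks that
# concatenate back into the complete response.
from agentlite.agent import Agent


async def test_stream_reconstruction(mock_provider_with_response):
    agent = Agent(provider=mock_provider_with_response)

    chunks = []
    async for chunk in agent.run("Hi", stream=True):
        chunks.append(chunk)

    assert "".join(chunks) == "Hello!"
```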
### Task 10: Conversation History Tests

**Description**: Test conversation history management: message ordering, role tracking, and history limits.

**Delegation Recommendation**:
- Category: `quick` - History management testing
- Skills: [`python-programmer`] - State management testing

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for history testing

**Depends On**: Task 7

**Acceptance Criteria**:
- Messages added in correct order
- Roles tracked correctly (user, assistant, tool)
- Tool call IDs preserved
- History can be inspected
- History can be cleared
- History persists across multiple runs

**Test Cases**:
- `test_history_message_order` - Messages in correct order
- `test_history_roles` - Correct role tracking
- `test_history_tool_responses` - Tool call IDs preserved
- `test_history_persistence` - History across multiple runs
- `test_history_clear` - Clearing history works
- `test_history_manual_add` - Manually add messages
- `test_history_copy` - The history property returns a copy

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_history.py`

**Commit**: YES
- Message: `test: add conversation history tests`
- Files: `tests/integration/test_history.py`
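A sketch of role tracking across a single run, assuming `agent.history` exposes the accumulated messages and that the system prompt is passed to the provider separately rather than stored in history (consistent with the `generate(system_prompt, tools, history)` signature shown later).

```python
# Sketch only: the history property and role literals are assumed from
# the acceptance criteria above.
from agentlite.agent import Agent


async def test_history_roles(mock_provider_with_response):
    agent = Agent(provider=mock_provider_with_response)
    await agent.run("Hi")

    assert [m.role for m in agent.history] == ["user", "assistant"]
```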
### Task 11: Real-World Scenario - Data Quality Agent

**Description**: Test a realistic data quality improvement agent that validates and cleans data.

**Delegation Recommendation**:
- Category: `unspecified-high` - Complex scenario testing
- Skills: [`python-programmer`] - Complex test scenario design

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for scenario implementation

**Depends On**: Tasks 7, 8

**Acceptance Criteria**:
- Agent validates data format
- Agent identifies data quality issues
- Agent suggests corrections
- Uses multiple tools (validate, clean, analyze)
- Handles edge cases (empty data, invalid format)

**Scenario**:

```python
# Data Quality Agent validates CSV data
# Tools: validate_csv, detect_anomalies, suggest_fixes
# Test with sample data containing errors
```

**Test Cases**:
- `test_data_quality_valid_data` - Clean data passes validation
- `test_data_quality_detects_errors` - Errors detected and reported
- `test_data_quality_suggests_fixes` - Corrections suggested
- `test_data_quality_empty_data` - Handles empty input
- `test_data_quality_invalid_format` - Handles format errors

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/scenarios/test_data_quality.py`

**Commit**: YES
- Message: `test: add data quality agent scenario tests`
- Files: `tests/scenarios/test_data_quality.py`
### Task 12: Real-World Scenario - Fact-Checking Agent

**Description**: Test a fact-checking agent that verifies claims using tools.

**Delegation Recommendation**:
- Category: `unspecified-high` - Complex scenario testing
- Skills: [`python-programmer`] - Complex test scenario design

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for scenario implementation

**Depends On**: Tasks 7, 8

**Acceptance Criteria**:
- Agent extracts claims from text
- Agent uses a search tool to verify
- Agent provides a verdict with evidence
- Handles uncertain claims appropriately
- Multiple claims in one text are handled

**Scenario**:

```python
# Fact-Checking Agent verifies statements
# Tools: search_facts, calculate_statistics, check_date
# Test with verifiable and unverifiable claims
```

**Test Cases**:
- `test_fact_check_true_claim` - Correctly identifies a true claim
- `test_fact_check_false_claim` - Correctly identifies a false claim
- `test_fact_check_multiple_claims` - Multiple claims in one text
- `test_fact_check_uncertain` - Handles uncertain claims
- `test_fact_check_with_evidence` - Provides supporting evidence

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/scenarios/test_fact_checking.py`

**Commit**: YES
- Message: `test: add fact-checking agent scenario tests`
- Files: `tests/scenarios/test_fact_checking.py`
### Task 13: Real-World Scenario - Multi-Agent Workflow

**Description**: Test multiple agents collaborating on a complex task.

**Delegation Recommendation**:
- Category: `unspecified-high` - Complex multi-agent testing
- Skills: [`python-programmer`] - Complex scenario design

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for multi-agent testing

**Depends On**: Tasks 7, 10

**Acceptance Criteria**:
- Multiple agents can share a provider
- Agents maintain separate histories
- Workflow stages execute in order
- Output from one agent feeds into the next
- Each agent has a specialized role

**Scenario**:

```python
# Research → Write → Edit workflow
# Researcher gathers facts
# Writer creates content
# Editor reviews and improves
```

**Test Cases**:
- `test_multi_agent_research_write` - Research-to-writer flow
- `test_multi_agent_with_editor` - Three-agent workflow
- `test_multi_agent_isolated_histories` - Histories don't leak
- `test_multi_agent_shared_provider` - Provider shared correctly
- `test_multi_agent_error_handling` - Errors don't break the workflow

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/scenarios/test_multi_agent.py`

**Commit**: YES
- Message: `test: add multi-agent workflow scenario tests`
- Files: `tests/scenarios/test_multi_agent.py`
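A sketch of the research-to-writer handoff, assuming `Agent` accepts a `system_prompt` keyword (implied by Task 7's `test_agent_system_prompt`) and that the shared mock pops scripted responses in FIFO order; the history-length assertion further assumes the system prompt is not stored in history.

```python
# Sketch only: the keyword arguments and FIFO response order are
# assumptions; the point is the handoff plus isolated histories.
from agentlite.agent import Agent


async def test_multi_agent_research_write(mock_provider):
    mock_provider.add_text_response("Fact: water boils at 100 degrees C.")
    mock_provider.add_text_response("Article: At sea level, water boils at 100 C.")

    researcher = Agent(provider=mock_provider, system_prompt="You research facts.")
    writer = Agent(provider=mock_provider, system_prompt="You write articles.")

    facts = await researcher.run("Collect facts about boiling water.")
    article = await writer.run(f"Write a short article using: {facts}")

    assert article.startswith("Article:")
    # Each agent keeps only its own user/assistant pair.
    assert len(researcher.history) == 2
    assert len(writer.history) == 2
```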
### Task 14: MCP Mock Tests

**Description**: Test MCP integration with a mocked MCP server.

**Delegation Recommendation**:
- Category: `unspecified-low` - MCP protocol mocking
- Skills: [`python-programmer`] - Protocol mocking

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for MCP mocking

**Depends On**: Tasks 3, 6

**Acceptance Criteria**:
- MCPClient connects to the mock server
- Tools load from the mock server
- MCP tools execute correctly
- MCP errors are handled gracefully
- Connection cleanup works

**Test Cases**:
- `test_mcp_connect_stdio` - STDIO connection mock
- `test_mcp_connect_sse` - SSE connection mock
- `test_mcp_load_tools` - Load tools from the mock
- `test_mcp_tool_execution` - Execute an MCP tool
- `test_mcp_error_handling` - MCP errors handled
- `test_mcp_context_manager` - Async context manager works

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/mocks/mcp_server.py`
- `/home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_mcp.py`

**Commit**: YES
- Message: `test: add MCP integration tests with mocks`
- Files: `tests/mocks/mcp_server.py`, `tests/integration/test_mcp.py`
### Task 15: Error Handling Tests

**Description**: Test error scenarios: provider errors, tool errors, timeouts, and connection issues.

**Delegation Recommendation**:
- Category: `unspecified-low` - Error scenario testing
- Skills: [`python-programmer`] - Error testing patterns

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for error testing

**Depends On**: Tasks 6, 7

**Acceptance Criteria**:
- APIConnectionError handled correctly
- APITimeoutError handled correctly
- APIStatusError handled correctly
- Tool execution errors don't crash the agent
- Invalid tool arguments are handled
- max_iterations prevents infinite loops

**Test Cases**:
- `test_provider_connection_error` - Connection failure
- `test_provider_timeout_error` - Request timeout
- `test_provider_status_error` - HTTP error status
- `test_provider_empty_response` - Empty response handling
- `test_tool_execution_error` - Tool raises an exception
- `test_tool_invalid_arguments` - Invalid args passed to a tool
- `test_tool_not_found_error` - Unknown tool called
- `test_max_iterations_reached` - Loop prevention
- `test_json_decode_error` - Invalid JSON in tool args

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/integration/test_errors.py`

**Commit**: YES
- Message: `test: add error handling tests`
- Files: `tests/integration/test_errors.py`
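A sketch of the connection-failure case, assuming provider exceptions propagate out of `Agent.run` unchanged; whether the agent re-raises or wraps such errors is precisely what this test should pin down.

```python
# Sketch only: exception propagation behavior is an assumption about
# the agentlite error-handling contract.
import pytest

from agentlite.agent import Agent
from agentlite.provider import APIConnectionError


async def test_provider_connection_error(mock_provider):
    mock_provider.add_error(APIConnectionError("connection refused"))
    agent = Agent(provider=mock_provider)

    with pytest.raises(APIConnectionError):
        await agent.run("Hi")
```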
### Task 16: Test Coverage Analysis

**Description**: Analyze test coverage and ensure targets are met.

**Delegation Recommendation**:
- Category: `quick` - Coverage analysis
- Skills: [`python-programmer`] - Coverage tooling

**Skills Evaluation**:
- INCLUDED `python-programmer`: Required for coverage analysis

**Depends On**: All previous tasks

**Acceptance Criteria**:
- Overall coverage >= 80%
- Core modules (message, tool, agent) >= 90%
- Provider modules >= 70%
- MCP module >= 60%
- Coverage report generated
- Missing coverage documented

**Coverage Targets**:

| Module | Target | Priority |
|---|---|---|
| agentlite.message | 95% | P0 |
| agentlite.tool | 95% | P0 |
| agentlite.agent | 90% | P0 |
| agentlite.config | 90% | P0 |
| agentlite.provider | 80% | P1 |
| agentlite.providers.openai | 70% | P1 |
| agentlite.mcp | 60% | P2 |

**Files to Create**:
- `/home/tcmofashi/proj/general_agent/agentlite/tests/.coveragerc`

**Commit**: YES
- Message: `test: add coverage configuration and analysis`
- Files: `tests/.coveragerc`
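One possible `tests/.coveragerc` matching the targets above; the `source` value and omit pattern are assumptions about the project layout, and per-module thresholds would still need to be checked by hand or in CI.

```ini
# Assumed layout: measure the agentlite package, skip the tests tree.
[run]
source = agentlite
branch = True
omit =
    tests/*

[report]
show_missing = True
fail_under = 80
```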
## Test File Structure

```
/home/tcmofashi/proj/general_agent/agentlite/tests/
├── conftest.py               # Shared fixtures and configuration
├── utils.py                  # Test utilities and helpers
├── .coveragerc               # Coverage configuration
├── unit/                     # Unit tests
│   ├── __init__.py
│   ├── test_message.py       # Message types tests
│   ├── test_tool.py          # Tool system tests
│   ├── test_config.py        # Configuration tests
│   └── test_provider.py      # Provider protocol tests
├── integration/              # Integration tests
│   ├── __init__.py
│   ├── test_agent.py         # Agent integration tests
│   ├── test_tool_loop.py     # Tool calling loop tests
│   ├── test_streaming.py     # Streaming tests
│   ├── test_history.py       # History management tests
│   ├── test_mcp.py           # MCP integration tests
│   └── test_errors.py        # Error handling tests
├── scenarios/                # Real-world scenario tests
│   ├── __init__.py
│   ├── test_data_quality.py  # Data quality agent
│   ├── test_fact_checking.py # Fact-checking agent
│   └── test_multi_agent.py   # Multi-agent workflow
└── mocks/                    # Mock implementations
    ├── __init__.py
    ├── provider.py           # Mock OpenAI provider
    └── mcp_server.py         # Mock MCP server
```
## Test Fixtures (conftest.py)

### Core Fixtures

```python
import pytest

# Import paths assumed from the module layout in the coverage table;
# adjust to match how the tests package is actually resolved.
from agentlite.agent import Agent
from agentlite.message import Message, ToolCall
from agentlite.tool import tool
from tests.mocks.provider import MockProvider


# Mock provider fixtures
@pytest.fixture
def mock_provider():
    """Create a mock provider with no responses configured."""
    return MockProvider()


@pytest.fixture
def mock_provider_with_response():
    """Create a mock provider that returns a simple text response."""
    provider = MockProvider()
    provider.add_text_response("Hello!")
    return provider


# Sample message fixtures
@pytest.fixture
def sample_text_message():
    """Create a sample text message."""
    return Message(role="user", content="Hello!")


@pytest.fixture
def sample_tool_call():
    """Create a sample tool call."""
    return ToolCall(
        id="call_123",
        function=ToolCall.FunctionBody(
            name="add",
            arguments='{"a": 1, "b": 2}',
        ),
    )


# Tool fixtures
@pytest.fixture
def add_tool():
    """Create a simple add tool."""
    @tool()
    async def add(a: float, b: float) -> float:
        """Add two numbers."""
        return a + b
    return add


@pytest.fixture
def error_tool():
    """Create a tool that raises an error."""
    @tool()
    async def error() -> str:
        """Always raises an error."""
        raise ValueError("Test error")
    return error


# Agent fixtures
@pytest.fixture
async def simple_agent(mock_provider):
    """Create a simple agent with a mocked provider."""
    return Agent(provider=mock_provider)


@pytest.fixture
async def agent_with_tools(mock_provider, add_tool):
    """Create an agent with tools."""
    return Agent(provider=mock_provider, tools=[add_tool])
```
## Mock Implementations

### MockProvider

```python
import json

# Import path assumed from the module layout in the coverage table.
from agentlite.message import TextPart, ToolCall


class MockProvider:
    """Mock provider for testing AgentLite without real API calls.

    This provider simulates OpenAI API responses and allows:
    - Configuring response sequences
    - Simulating tool calls
    - Simulating errors
    - Tracking all calls for verification

    Example:
        provider = MockProvider()
        provider.add_text_response("Hello!")
        provider.add_tool_call("add", {"a": 1, "b": 2}, "3")

        agent = Agent(provider=provider)
        response = await agent.run("Hi")

        # Verify calls
        assert len(provider.calls) == 1
        assert provider.calls[0].system_prompt == "You are helpful."
    """

    def __init__(self):
        self.responses = []
        self.calls = []
        self.model = "mock-model"

    def add_text_response(self, text: str):
        """Add a text response to the queue."""
        self.responses.append({"type": "text", "content": text})

    def add_tool_call(self, name: str, arguments: dict, result: str):
        """Add a tool call response to the queue."""
        self.responses.append({
            "type": "tool_call",
            "name": name,
            "arguments": arguments,
            "result": result,
        })

    def add_error(self, error: Exception):
        """Add an error response to the queue."""
        self.responses.append({"type": "error", "error": error})

    async def generate(self, system_prompt, tools, history):
        """Generate a mock response, recording the call for verification."""
        self.calls.append(MockCall(
            system_prompt=system_prompt,
            tools=tools,
            history=list(history),
        ))
        if not self.responses:
            return MockStreamedMessage([TextPart(text="Mock response")])
        response = self.responses.pop(0)
        if response["type"] == "error":
            raise response["error"]
        elif response["type"] == "text":
            return MockStreamedMessage([TextPart(text=response["content"])])
        elif response["type"] == "tool_call":
            return MockStreamedMessage([
                ToolCall(
                    id="call_123",
                    function=ToolCall.FunctionBody(
                        name=response["name"],
                        arguments=json.dumps(response["arguments"]),
                    ),
                )
            ])
```
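The MockProvider above references two helpers that this plan does not define. A minimal sketch of both follows, assuming the agent consumes a StreamedMessage by async-iterating its content parts; the real StreamedMessage interface may require more than this.

```python
# Sketch only: MockCall records generate() arguments for assertions;
# MockStreamedMessage replays pre-built parts as an async iterator.
from dataclasses import dataclass


@dataclass
class MockCall:
    """One recorded generate() invocation, kept for later assertions."""
    system_prompt: str
    tools: list
    history: list


class MockStreamedMessage:
    """Replays pre-built content parts as an async stream."""

    def __init__(self, parts):
        self.parts = parts

    def __aiter__(self):
        return self._stream()

    async def _stream(self):
        for part in self.parts:
            yield part
```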
## Test Configuration (pytest.ini)

```ini
[pytest]
testpaths = tests
asyncio_mode = auto
asyncio_default_fixture_loop_scope = function
pythonpath = src
addopts = -v --tb=short --strict-markers
markers =
    unit: Unit tests
    integration: Integration tests
    scenario: Real-world scenario tests
    slow: Slow tests
```
## Running Tests

```bash
# Run all tests
cd /home/tcmofashi/proj/general_agent/agentlite
pytest tests/

# Run with coverage
pytest tests/ --cov=agentlite --cov-report=html --cov-report=term

# Run specific test categories
pytest tests/unit/ -v
pytest tests/integration/ -v
pytest tests/scenarios/ -v

# Run with markers
pytest -m unit
pytest -m integration
pytest -m "not slow"

# Run a specific test file
pytest tests/unit/test_message.py -v

# Run with debugging
pytest tests/ -v --pdb
```
## Commit Strategy

| After Task | Commit Message | Files |
|---|---|---|
| Task 1 | `test: setup pytest configuration and shared fixtures` | `tests/conftest.py`, `tests/utils.py` |
| Task 2 | `test: add unit tests for message types` | `tests/unit/test_message.py` |
| Task 3 | `test: add unit tests for tool system` | `tests/unit/test_tool.py` |
| Task 4 | `test: add unit tests for configuration models` | `tests/unit/test_config.py` |
| Task 5 | `test: add unit tests for provider protocol` | `tests/unit/test_provider.py` |
| Task 6 | `test: add mock provider for testing` | `tests/mocks/provider.py` |
| Task 7 | `test: add agent integration tests` | `tests/integration/test_agent.py` |
| Task 8 | `test: add tool calling loop tests` | `tests/integration/test_tool_loop.py` |
| Task 9 | `test: add streaming response tests` | `tests/integration/test_streaming.py` |
| Task 10 | `test: add conversation history tests` | `tests/integration/test_history.py` |
| Task 11 | `test: add data quality agent scenario tests` | `tests/scenarios/test_data_quality.py` |
| Task 12 | `test: add fact-checking agent scenario tests` | `tests/scenarios/test_fact_checking.py` |
| Task 13 | `test: add multi-agent workflow scenario tests` | `tests/scenarios/test_multi_agent.py` |
| Task 14 | `test: add MCP integration tests with mocks` | `tests/mocks/mcp_server.py`, `tests/integration/test_mcp.py` |
| Task 15 | `test: add error handling tests` | `tests/integration/test_errors.py` |
| Task 16 | `test: add coverage configuration and analysis` | `tests/.coveragerc` |
## Success Criteria

### Verification Commands

```bash
# All tests pass
pytest tests/ -v

# Coverage meets targets
pytest tests/ --cov=agentlite --cov-report=term-missing

# No import errors
python -c "import agentlite; print('OK')"

# Type checking passes (if mypy is configured)
mypy src/agentlite/
```
### Final Checklist
- All unit tests pass
- All integration tests pass
- All scenario tests pass
- Coverage >= 80% overall
- Core modules >= 90% coverage
- All mocks work correctly
- Tests run without real API keys
- Tests are deterministic
- Tests are well-documented
- Test files follow naming convention
## Notes

- **No Real API Calls**: All tests must work without real API keys, using mocks throughout
- **Deterministic**: Tests should produce consistent results across runs
- **Fast**: Unit tests should each complete in under 1 second
- **Isolated**: Tests should not depend on each other
- **Documented**: Complex scenarios should have docstrings explaining the use case