Required API Keys
- Service: Groq (ultra-fast LLM inference)
- Website: groq.com
- Purpose: Access to GPT-OSS 20B and 120B models via Groq's Language Processing Units (LPUs)
Requirements:
- Register for a Groq account
- Obtain a paid API key
- Pass it to the system via the `groq_api_key` parameter
Pricing (as of October 2025):
| Model | Input (per M tokens) | Cached (per M tokens) | Output (per M tokens) |
|---|---|---|---|
| GPT-OSS 20B | $0.075 | $0.037 | $0.300 |
| GPT-OSS 120B | $0.150 | $0.075 | $0.600 |
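To make the table concrete, here is a quick back-of-the-envelope estimate; the token counts below are invented for illustration, not measured:

```python
# Back-of-the-envelope cost estimate for GPT-OSS 20B at the
# October 2025 rates above (dollars per million tokens).
INPUT_RATE = 0.075 / 1_000_000
OUTPUT_RATE = 0.300 / 1_000_000

input_tokens = 50_000   # illustrative values, not measured
output_tokens = 10_000

cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"Estimated cost: ${cost:.4f}")  # roughly $0.007 for this example
```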
Usage:

```python
result = await run_analysis(
    query="Your analysis query",
    groq_api_key="gsk_xxxxxxxxxxxx",
)
```
- Service: Tavily (AI-optimized search API)
- Website: tavily.com
- Purpose: Real-time news search capabilities for current-events analysis
Requirements:
- Register for a Tavily account
- Obtain an API key
- Pass it to the system via the `tavily_api_key` parameter
Usage:

```python
result = await run_analysis(
    query="Analyze recent developments in AI regulation",
    groq_api_key="gsk_xxxxxxxxxxxx",
    tavily_api_key="tvly-xxxxxxxxxxxx",
)
```
Benefit: When provided, the `news_api_client` tool becomes available, allowing the system to search for and incorporate recent news articles into its analysis.
- Service: Replicate (ML model hosting)
- Website: replicate.com
- Purpose: Generate AI images to accompany analysis reports
Requirements:
- Register for a Replicate account
- Obtain an API key
- Pass it to the system via the `replicate_api_key` parameter
Usage:

```python
result = await run_analysis(
    query="Analyze the future of sustainable architecture",
    groq_api_key="gsk_xxxxxxxxxxxx",
    replicate_api_key="r8_xxxxxxxxxxxx",
)
```
Benefit: When provided, the system can automatically generate relevant AI images to enhance the final HTML report output.
API Key Summary
| API Key | Required? | Purpose | Parameter Name |
|---|---|---|---|
| Groq | Required | LLM access | `groq_api_key` |
| Tavily | Optional | News search | `tavily_api_key` |
| Replicate | Optional | Image generation | `replicate_api_key` |
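In practice you will likely load these keys from the environment rather than hard-coding them. A minimal sketch, assuming `run_analysis` is importable from the package, that the environment variable names shown are your own convention, and that the optional parameters accept `None` when unset:

```python
import asyncio
import os

# from your_package import run_analysis  # import path depends on the install

async def main():
    result = await run_analysis(
        query="Analyze recent developments in AI regulation",
        groq_api_key=os.environ["GROQ_API_KEY"],               # required
        tavily_api_key=os.environ.get("TAVILY_API_KEY"),       # optional: news search
        replicate_api_key=os.environ.get("REPLICATE_API_KEY"), # optional: images
    )
    print(result)

asyncio.run(main())
```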
Available Models
GPT-OSS 20B

Identifier: `openai/gpt-oss-20b`

| Property | Details |
|---|---|
| Description | OpenAI's compact open-weight Mixture of Experts (MoE) model optimized for cost-efficient deployment |
| Size | 21 billion total parameters, 3.6 billion active per token (32 experts, Top-4 routing) |
| Architecture | MoE with 24 layers, Grouped Query Attention, RMSNorm |
| Context Window | 128K tokens |
| Speed | 1000+ tokens/second on Groq infrastructure |
| License | Apache 2.0 (fully open for commercial use) |
Hardware Requirements: Runs on high-end consumer GPUs with 16-20 GB of VRAM (e.g., NVIDIA RTX 4090/5090); MXFP4 quantization enables fast, efficient local inference. 24+ GB of VRAM is recommended for optimal performance.
Best For: Cost-efficient agentic workflows, tool calling, web browsing, code execution.
GPT-OSS 120B

Identifier: `openai/gpt-oss-120b`

| Property | Details |
|---|---|
| Description | OpenAI's larger open-weight MoE model for complex tasks |
| Size | 120 billion total parameters |
| Context Window | 128K tokens |
| License | Apache 2.0 (fully open for commercial use) |
Hardware Requirements: Requires a single 80GB H100 GPU (typically accessed via data center or cloud).
Best For: Complex reasoning, advanced code generation, research tasks.
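The Troubleshooting section below suggests switching between the two models for harder tasks. Assuming the system exposes a model-selection parameter on `run_analysis` (the name `model` here is a guess; check the actual signature), that might look like:

```python
# Hypothetical model selection -- the `model` parameter name is an
# assumption, not confirmed by this guide.
result = await run_analysis(
    query="Survey the research landscape for solid-state batteries",
    groq_api_key="gsk_xxxxxxxxxxxx",
    model="openai/gpt-oss-120b",  # larger model for complex reasoning
)
```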
Why These Models?
This program relies extensively on structured output: the ability of LLMs to return responses in precise, validated formats (Pydantic models). OpenAI specifically trained the GPT-OSS models to handle structured data, making them well suited to this application, where every response must conform to a specific schema.
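For readers new to the pattern, here is a minimal sketch of schema-validated LLM output with Pydantic; the schema is illustrative, not one of the system's actual response models:

```python
from pydantic import BaseModel, Field

# Illustrative schema -- not one of the system's actual response models.
class TaskEvaluation(BaseModel):
    accepted: bool = Field(description="Whether the task output passes review")
    reasoning: str = Field(description="Short justification for the verdict")
    confidence: float = Field(ge=0.0, le=1.0)

# The LLM is prompted to emit JSON matching the schema; validation
# raises on anything that does not conform, so malformed responses
# never flow downstream.
raw = '{"accepted": true, "reasoning": "Output answers the query.", "confidence": 0.9}'
evaluation = TaskEvaluation.model_validate_json(raw)
print(evaluation.accepted)  # True
```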
Error Handling
Cost Safeguards
The system tracks costs continuously and enforces limits:
```python
# Internal cost checking (from source)
current_cost = await cost_tracker.get_total_cost()
if current_cost > max_cost:
    raise CostFailsafeError(
        message=f"Cost limit exceeded: ${current_cost:.4f} > ${max_cost}",
        current_cost=current_cost,
        cost_limit=max_cost,
    )
```
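From the caller's side, the failsafe surfaces as an exception that can be caught. A sketch, assuming `CostFailsafeError` is importable from the package and that it exposes the `current_cost` and `cost_limit` values shown above as attributes:

```python
try:
    result = await run_analysis(
        query="Your analysis query",
        groq_api_key="gsk_xxxxxxxxxxxx",
        cost_limit=1.00,  # cap spend for this run (see Troubleshooting)
    )
except CostFailsafeError as exc:
    # Assumed: attributes mirror the keyword arguments in the raise above.
    print(f"Aborted at ${exc.current_cost:.4f} (limit: ${exc.cost_limit})")
```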
Task Evaluation Recovery
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart TD
    A["Task Completes"] --> B{"Status Check"}
    B -->|"Error"| C["Automatic Rejection<br/>(no LLM call)"]
    B -->|"Success"| D["LLM Evaluation"]
    D --> E{"Quality Assessment"}
    E -->|"Accepted"| F["Write to Analysis History"]
    E -->|"Rejected"| G["Log Reason,<br/>Don't Write"]
    C --> H(["Continue Pipeline"])
    F --> H
    G --> H

    style A fill:#16213e,stroke:#4a4a6a,color:#fff
    style B fill:#0f3460,stroke:#e94560,stroke-width:2px,color:#fff
    style C fill:#6b2737,stroke:#e94560,color:#fff
    style D fill:#16213e,stroke:#4a4a6a,color:#fff
    style E fill:#0f3460,stroke:#e94560,stroke-width:2px,color:#fff
    style F fill:#0d7377,stroke:#14ffec,stroke-width:2px,color:#fff
    style G fill:#6b2737,stroke:#e94560,color:#fff
    style H fill:#4a4a6a,stroke:#6c63ff,color:#fff
```
Only accepted tasks inform subsequent analysis, preventing error propagation.
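Expressed as code, the gate in the diagram reduces to roughly the following control flow (a sketch; the type names and the evaluator interface are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class TaskResult:
    status: str        # "success" or "error"
    output: str = ""

@dataclass
class AnalysisHistory:
    accepted: list = field(default_factory=list)

async def gate_task(result: TaskResult, evaluator, history: AnalysisHistory) -> None:
    """Mirror the flowchart: only accepted tasks enter the analysis history."""
    if result.status == "error":
        # Automatic rejection -- no LLM call is spent on failed tasks.
        print("rejected automatically: task errored")
        return
    evaluation = await evaluator.assess(result)  # LLM quality assessment
    if evaluation.accepted:
        history.accepted.append(result)          # informs subsequent analysis
    else:
        print(f"rejected: {evaluation.reasoning} (not written to history)")
```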
Troubleshooting
Analysis fails or errors immediately

Possible causes:
- Invalid or missing Groq API key
- Cost limit set too low
- Content filter blocking the query

Solutions:
- Verify the API key is valid and has credits
- Increase the `cost_limit` parameter
- Review the query for content policy issues
Analysis is slow or times out

Possible causes:
- Complex query with many iterations
- Network issues
- API rate limiting

Solutions (see the sketch below this list):
- Reduce `max_iterations`
- Use `quick` mode for testing
- Check Groq service status
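For instance, a trimmed-down test run might look like the following; the `mode` parameter name is an assumption based on this guide's mention of quick and thorough modes:

```python
# Faster, cheaper run for debugging -- fewer iterations, quick mode.
result = await run_analysis(
    query="Your analysis query",
    groq_api_key="gsk_xxxxxxxxxxxx",
    max_iterations=3,   # reduced for testing
    mode="quick",       # assumed parameter name for quick/thorough modes
)
```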
Analysis quality is poor

Possible causes:
- Using the 20B model for a complex task
- Insufficient context
- Too few iterations

Solutions:
- Upgrade to `openai/gpt-oss-120b`
- Provide more focused context
- Increase `max_iterations` and use `thorough` mode
Python tool execution fails

Possible causes:
- Safety violations in generated code
- Runtime errors in calculations
- Timeout during execution

Solutions:
- Check the error details in the output
- Simplify the computational request
- Increase `python_tool_timeout`
Concurrent runs perform poorly

Symptoms:
- Slow execution
- Timeout errors
- Partial results

Solutions (a throttling sketch follows this list):
- Use a shared state manager for concurrent runs
- Reduce the concurrent analysis count
- Add delays between requests
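As an illustration of the last two points, here is a sketch that throttles concurrent analyses with a semaphore and spaces out requests; the shared-state-manager API is not shown, since its interface isn't documented in this section:

```python
import asyncio

async def run_many(queries, groq_api_key, max_concurrent=2, delay=1.0):
    """Throttle concurrent analyses to avoid rate limits and timeouts."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(query):
        async with semaphore:
            result = await run_analysis(query=query, groq_api_key=groq_api_key)
            await asyncio.sleep(delay)  # space out successive requests
            return result

    return await asyncio.gather(*(run_one(q) for q in queries))
```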