```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'secondaryColor': '#16213e', 'tertiaryColor': '#0f3460', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart TD
subgraph INPUT["📥 INPUT"]
A[/"User Query"/]
end
subgraph ANALYSIS["🔍 ANALYSIS"]
B["1. Complexity Analysis"]
C["2. Plan Generation"]
D["3. Plan Evaluation & Selection"]
end
subgraph EXECUTION["⚡ EXECUTION"]
E["4. Task Execution"]
F{"5. Iteration
Decision"}
end
subgraph OUTPUT["📤 OUTPUT"]
G["6. Synthesis"]
H["7. Focused Answer"]
I[/"Final Result"/]
end
A --> B
B --> C
C --> D
D --> E
E --> F
F -->|"Continue"| B
F -->|"Done"| G
G --> H
H --> I
style A fill:#4a4a6a,stroke:#6c63ff,stroke-width:2px,color:#fff
style I fill:#4a4a6a,stroke:#6c63ff,stroke-width:2px,color:#fff
style F fill:#0f3460,stroke:#e94560,stroke-width:2px,color:#fff
style B fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style C fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style D fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style E fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style G fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style H fill:#1a1a2e,stroke:#4a4a6a,color:#fff
```
Pipeline Stage Overview
Stage 1: Complexity Analysis
Purpose: Analyze the incoming query to understand its complexity before generating plans.
The system identifies problem dimensions, determines overall complexity, and suggests an iteration strategy.
Complexity Levels
| Level | Typical Tasks | Description |
|---|---|---|
| `straightforward` | 2-3 | Direct questions with clear answers |
| `moderate` | 3-4 | Multi-step problems with some dependencies |
| `complex` | 4-5 | Problems requiring multiple perspectives |
| `highly_complex` | 5-6 | Deep analysis with many interrelated factors |
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart LR
subgraph COMPLEXITY["Complexity Levels"]
direction TB
C1["Straightforward"]
C2["Moderate"]
C3["Complex"]
C4["Highly Complex"]
end
subgraph TASKS["Task Allocation"]
direction TB
T1["2-3 tasks"]
T2["3-4 tasks"]
T3["4-5 tasks"]
T4["5-6 tasks"]
end
C1 --> T1
C2 --> T2
C3 --> T3
C4 --> T4
style C1 fill:#0d7377,stroke:#14ffec,color:#fff
style C2 fill:#16213e,stroke:#4a4a6a,color:#fff
style C3 fill:#0f3460,stroke:#6c63ff,color:#fff
style C4 fill:#6b2737,stroke:#e94560,color:#fff
```
Complexity Level to Task Count Mapping
What Gets Analyzed
- Problem dimensions: The different aspects that need to be addressed
- Key challenges: What makes this problem difficult
- Sequential dependencies: Which tasks must happen in order
- Iteration strategy: How many passes might be needed
- Errors from previous iterations: What went wrong before (in later iterations)
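The result of this stage can be pictured as a structured record. The sketch below is only illustrative; the field names are assumptions that mirror the items above, not the system's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a complexity-analysis result; field names are
# illustrative, not the system's actual schema.
@dataclass
class ComplexityAnalysis:
    complexity_level: str               # "straightforward" | "moderate" | "complex" | "highly_complex"
    problem_dimensions: list[str]       # aspects that need to be addressed
    key_challenges: list[str]           # what makes the problem difficult
    sequential_dependencies: list[str]  # tasks that must happen in order
    iteration_strategy: str             # how many passes might be needed
    prior_errors: list[str] = field(default_factory=list)  # populated only in later iterations
```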
Stage 2: Plan Generation
Purpose: Generate multiple candidate plans for solving the problem.
The system creates several different approaches to answering your query, each consisting of a sequence of tasks.
Plan Structure
Each plan contains 2-6 tasks based on the complexity analysis. Every task specifies:
- `resource_name`: Which tool to use (e.g., `logic_kernel`, `py_executor`)
- `task_query`: What the tool should accomplish
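For orientation, a plan can be pictured as a short list of task records. The dataclasses below are a hypothetical sketch of that shape, not the system's actual data model.

```python
from dataclasses import dataclass

# Illustrative sketch only: a task/plan shape consistent with the fields
# described above.
@dataclass
class Task:
    resource_name: str   # tool to use, e.g. "logic_kernel" or "py_executor"
    task_query: str      # what the tool should accomplish

@dataclass
class Plan:
    tasks: list[Task]    # 2-6 tasks, sized by the complexity analysis
```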
Example
For the query "What factors affect solar panel efficiency?", the system might generate:

- Plan A
  - Knowledge tool: Research photovoltaic cell physics
  - Knowledge tool: Identify environmental factors
  - Reasoning tool: Synthesize and rank factors
- Plan B
  - Knowledge tool: Gather efficiency data
  - Python tool: Calculate correlation between factors
  - Reasoning tool: Interpret results
- Plan C
  - Knowledge tool: Background on solar technology
  - Reasoning tool: Analyze cause-effect relationships
  - Python tool: Quantify impact ranges
Configuration
| Setting | Effect |
|---|---|
| `candidate_plans` | Number of plans generated (1-5, default 3) |
| Maximum tasks | 24 per plan (safety limit) |
Stage 3: Plan Evaluation & Selection
Purpose: When multiple plans exist, evaluate and select the best one.
Each candidate plan is scored on a 0-100 scale, and the highest-scoring plan is selected for execution.
Evaluation Criteria
- Appropriateness of tool selection for each task
- Logical sequencing of tasks
- Coverage of problem dimensions identified in complexity analysis
- Feasibility given available tools
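For intuition, these criteria can be imagined as contributions to the 0-100 overall score. The weights and names below are illustrative assumptions, not the system's actual rubric.

```python
# Hypothetical illustration of combining the four criteria into a 0-100 score.
# The criterion names mirror the list above; the weights are made up.
CRITERIA_WEIGHTS = {
    "tool_appropriateness": 0.30,
    "logical_sequencing": 0.25,
    "dimension_coverage": 0.25,
    "feasibility": 0.20,
}

def overall_score(criterion_scores: dict[str, float]) -> float:
    """Weighted average of per-criterion scores, each on a 0-100 scale."""
    return sum(CRITERIA_WEIGHTS[name] * criterion_scores.get(name, 0.0)
               for name in CRITERIA_WEIGHTS)
```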
Selection
```python
# The system selects the plan with the highest overall score
best_eval = max(evaluation.evaluations, key=lambda e: e.overall_score)
```
Stage 4: Task Execution
Purpose: Execute tasks sequentially with immediate quality evaluation.
This is where the actual analysis work happens. The system runs each task in the selected plan, one at a time.
Why Sequential (Not Parallel)
"Sequential execution sends requests one at a time to the same server, maximizing prompt cache hits and reducing costs."
The system deliberately executes tasks one after another because:
- Later tasks can use results from earlier tasks
- API calls to the same server benefit from prompt caching
- Quality evaluation between tasks prevents error propagation
Task Execution Flow
For each task in the plan:
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart LR
subgraph TASK["For Each Task in Plan"]
A["Convert Query
to Tool Format"] --> B["Execute
Tool"]
B --> C{"Evaluate
Quality"}
C -->|"ACCEPTED"| D["Write to
Analysis History"]
C -->|"REJECTED"| E["Log Reason
Skip Writing"]
end
D --> F(["Next Task"])
E --> F
style A fill:#16213e,stroke:#4a4a6a,color:#fff
style B fill:#16213e,stroke:#4a4a6a,color:#fff
style C fill:#0f3460,stroke:#e94560,stroke-width:2px,color:#fff
style D fill:#0d7377,stroke:#14ffec,stroke-width:2px,color:#fff
style E fill:#6b2737,stroke:#e94560,stroke-width:2px,color:#fff
style F fill:#4a4a6a,stroke:#6c63ff,color:#fff
```
Critical Design Decision
Only accepted tasks write their results to analysis history. This ensures subsequent tasks see only quality results, preventing bad intermediate work from contaminating later analysis.
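A minimal sketch of this gate is shown below; the helper callables (`convert_query`, `execute_tool`, `evaluate_quality`) are hypothetical stand-ins for the real conversion, execution, and evaluation steps.

```python
from typing import Any, Callable, NamedTuple

class Verdict(NamedTuple):
    accepted: bool
    reason: str = ""

def run_plan(tasks: list,                                    # tasks from the selected plan
             convert_query: Callable[[Any], str],
             execute_tool: Callable[[str, str], str],
             evaluate_quality: Callable[[Any, str], Verdict]) -> list[dict]:
    """Hypothetical sketch of Stage 4: sequential execution with a quality gate."""
    analysis_history: list[dict] = []
    for task in tasks:
        tool_query = convert_query(task)                       # convert to the tool's input format
        result = execute_tool(task.resource_name, tool_query)  # run the tool
        verdict = evaluate_quality(task, result)               # ACCEPTED or REJECTED
        if verdict.accepted:
            # Only accepted results become context for later tasks.
            analysis_history.append({"task": task.task_query, "result": result})
        # Rejected results are logged elsewhere and never written to history.
    return analysis_history
```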
Passthrough Tools
Some tools don't need query conversion:
```python
PASSTHROUGH_TOOLS = {"py_executor"}
```
The Python tool receives task queries directly because they already contain computational specifications.
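Sketched in code, the bypass might look like the following; `to_tool_query` and `convert_query` are hypothetical names, and `PASSTHROUGH_TOOLS` is the set shown above.

```python
# Illustrative only: skip query conversion for passthrough tools.
def to_tool_query(task, convert_query) -> str:
    if task.resource_name in PASSTHROUGH_TOOLS:
        return task.task_query      # already a computational specification
    return convert_query(task)      # otherwise rewrite the query for the target tool
```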
Stage 5: Iteration Decision
Purpose: Decide whether to continue with another iteration or proceed to synthesis.
After completing all tasks in a plan, the system evaluates overall progress and decides whether more work is needed.
Decision Factors
| Factor | Description |
|---|---|
| `continue_iterations` | Should we keep going? (Boolean) |
| `quality_summary` | Assessment of current work quality |
| `decision_rationale` | Why continue or stop |
| `recommendations_for_next_iteration` | What to focus on if continuing |
Controls
| Setting | Effect |
|---|---|
| `max_iterations` | Hard limit; never exceeded |
| `min_iterations` | Minimum before termination is allowed (if enforced) |
| `enforce_iterations` | Whether to enforce the minimum |
| `execution_retry_limit` | Max retries after failures before giving up |
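One plausible way these controls combine is sketched below; the function and argument names are assumptions, but the precedence (hard limit first, enforced minimum second, model decision last) follows the descriptions above.

```python
# Hypothetical illustration of how the iteration controls could interact.
def should_continue(iteration: int,
                    model_wants_more: bool,     # the continue_iterations flag
                    max_iterations: int,
                    min_iterations: int,
                    enforce_iterations: bool) -> bool:
    if iteration >= max_iterations:
        return False                            # hard limit always wins
    if enforce_iterations and iteration < min_iterations:
        return True                             # forced minimum number of passes
    return model_wants_more                     # otherwise defer to the decision stage
```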
Iteration Loop
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart TD
A["Task Execution
Complete"] --> B{"Continue
Iterating?"}
B -->|"Yes - More work needed"| C["Generate New Plan
with Insights"]
C --> D["Execute New Tasks"]
D --> A
B -->|"No - Quality sufficient"| E["Proceed to
Synthesis"]
style A fill:#16213e,stroke:#4a4a6a,color:#fff
style B fill:#0f3460,stroke:#e94560,stroke-width:2px,color:#fff
style C fill:#16213e,stroke:#4a4a6a,color:#fff
style D fill:#16213e,stroke:#4a4a6a,color:#fff
style E fill:#0d7377,stroke:#14ffec,stroke-width:2px,color:#fff
```
Stage 6: Synthesis
Purpose: Combine all accepted task results into a coherent final response.
The system takes everything learned across all iterations and produces unified outputs.
Outputs Generated
- Synthesis Response
  - Comprehensive answer with full reasoning
  - List of key findings discovered during analysis
- Final Answer
  - Refined, user-facing response
  - More polished and direct than the synthesis
- Metadata
  - Title for the analysis
  - Headline summary
  - Image caption (if image generation is enabled)
What Goes Into Synthesis
The synthesis stage has access to:
- Original query and context
- All accepted task results from all iterations
- Complexity analysis findings
- Evaluation feedback from each iteration
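Schematically, that context can be pictured as a single record; the field names below are illustrative only, not the system's actual schema.

```python
from dataclasses import dataclass

# Hypothetical shape of the context handed to synthesis.
@dataclass
class SynthesisContext:
    query: str                      # original query and any user context
    accepted_results: list[dict]    # accepted task results from all iterations
    complexity_findings: str        # complexity-analysis summary
    iteration_feedback: list[str]   # evaluation feedback from each iteration
```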
Stage 7: Focused Answer (Optional)
Purpose: Extract a specific, constrained answer type when requested.
This stage only runs if you set `focused_answer_type` to something other than `"none"`.
Supported Answer Types
| Type | Output |
|---|---|
| `number` | Numeric value |
| `yes/no` | "Yes" or "No" |
| `true/false` | "True" or "False" |
| `good/bad` | "Good" or "Bad" |
| `high/low` | "High" or "Low" |
| `accepted/rejected` | "Accepted" or "Rejected" |
| `yes/maybe/no` | "Yes", "Maybe", or "No" |
| `good/neutral/bad` | "Good", "Neutral", or "Bad" |
| `high/medium/low` | "High", "Medium", or "Low" |
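For downstream parsing, you can validate a focused answer against its type's allowed values. The mapping below simply mirrors the table above; the helper itself is a hypothetical sketch.

```python
# Hypothetical mapping from focused_answer_type to its allowed outputs.
ALLOWED_OUTPUTS = {
    "number": None,  # any numeric value
    "yes/no": {"Yes", "No"},
    "true/false": {"True", "False"},
    "good/bad": {"Good", "Bad"},
    "high/low": {"High", "Low"},
    "accepted/rejected": {"Accepted", "Rejected"},
    "yes/maybe/no": {"Yes", "Maybe", "No"},
    "good/neutral/bad": {"Good", "Neutral", "Bad"},
    "high/medium/low": {"High", "Medium", "Low"},
}

def is_valid(answer_type: str, answer: str) -> bool:
    """Check a focused answer against its declared type (sketch only)."""
    allowed = ALLOWED_OUTPUTS[answer_type]
    if allowed is None:                  # "number": must parse as a float
        try:
            float(answer)
            return True
        except ValueError:
            return False
    return answer in allowed
```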
Middleware Use Case
Focused answers enable using this system as middleware in automated pipelines:
| Scenario | Focused Answer Type | Downstream Action |
|---|---|---|
| Document review | `accepted/rejected` | Route to approval queue or revision |
| Risk assessment | `high/medium/low` | Adjust portfolio allocation |
| Content moderation | `good/bad` | Publish or flag for review |
| Price estimation | `number` | Set automated pricing |
| Compliance check | `yes/no` | Allow or block the transaction |
Benefits:
- Predictable output format for parsing
- Clear decision boundaries
- Easy integration with existing systems
- Reduced ambiguity compared to free-form text
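A minimal sketch of the middleware pattern, assuming a hypothetical `run_analysis` entry point that returns the focused answer string:

```python
from typing import Callable

# Illustrative middleware routing on a constrained answer. `run_analysis` is a
# hypothetical callable standing in for however the pipeline is invoked; it is
# expected to return the focused answer string.
def review_document(document_text: str,
                    run_analysis: Callable[..., str]) -> str:
    answer = run_analysis(
        query=f"Should this document be accepted or rejected?\n\n{document_text}",
        focused_answer_type="accepted/rejected",
    )
    # Clear decision boundary: exactly "Accepted" or "Rejected" comes back.
    return "approval_queue" if answer == "Accepted" else "revision_queue"
```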
Pipeline Flow Example
Here's a concrete example of how a query flows through the pipeline:
Query: "Is Python or JavaScript better for building a web scraper?"
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart TD
subgraph QUERY["Query"]
Q[/"Is Python or JavaScript
better for web scraping?"/]
end
subgraph S1["Stage 1: Complexity"]
C1["Complexity: moderate
Dimensions: 4 aspects
Suggested tasks: 3-4"]
end
subgraph S2["Stage 2-3: Planning"]
P1["Plan A: 78/100"]
P2["Plan B: 85/100 β"]
P3["Plan C: 72/100"]
end
subgraph S4["Stage 4: Execution"]
T1["Task 1: Python capabilities
β Accepted"]
T2["Task 2: JavaScript capabilities
β Accepted"]
T3["Task 3: Compare & recommend
β Accepted"]
end
subgraph S5["Stage 5: Decision"]
D1["Quality: Good coverage
Decision: Proceed"]
end
subgraph S6["Stage 6-7: Output"]
O1["Synthesis: Comprehensive comparison"]
O2["Final Answer: Recommendation"]
O3["Focused: 'Python' or 'JavaScript'"]
end
Q --> C1
C1 --> P1 & P2 & P3
P2 --> T1
T1 --> T2
T2 --> T3
T3 --> D1
D1 --> O1
O1 --> O2
O2 --> O3
style Q fill:#4a4a6a,stroke:#6c63ff,stroke-width:2px,color:#fff
style P2 fill:#0d7377,stroke:#14ffec,stroke-width:2px,color:#fff
style T1 fill:#0d7377,stroke:#14ffec,color:#fff
style T2 fill:#0d7377,stroke:#14ffec,color:#fff
style T3 fill:#0d7377,stroke:#14ffec,color:#fff
style O3 fill:#4a4a6a,stroke:#6c63ff,stroke-width:2px,color:#fff
```
Example Pipeline Flow
Stage 1: Complexity Analysis
- Complexity: `moderate`
- Dimensions: Language features, library ecosystem, performance, ease of use
- Suggested tasks: 3-4

Stage 2: Plan Generation
- Plan A: Compare libraries → Analyze syntax → Evaluate performance
- Plan B: Knowledge on both → Reasoning comparison → Recommendation
- Plan C: Deep dive Python → Deep dive JavaScript → Side-by-side

Stage 3: Plan Evaluation & Selection
- Plan A: 78/100
- Plan B: 85/100 ✓ Selected
- Plan C: 72/100

Stage 4: Task Execution
- Task 1 (Knowledge): Python web scraping capabilities ✓ Accepted
- Task 2 (Knowledge): JavaScript web scraping capabilities ✓ Accepted
- Task 3 (Reasoning): Compare and recommend ✓ Accepted

Stage 5: Iteration Decision
- Quality: Good coverage of both languages
- Decision: Proceed to synthesis

Stage 6: Synthesis
- Combines findings into a comprehensive comparison and generates the final answer with a recommendation

Stage 7: Focused Answer
- If `focused_answer_type` was set, would extract "Python" or "JavaScript" as the constrained response
Understanding Iterations
What Triggers Another Iteration?
The system may continue iterating when:
- Key aspects of the query weren't addressed
- Task results revealed new questions
- Quality evaluation found gaps in reasoning
- Errors in previous work need correction
What Changes in Subsequent Iterations?
In iteration 2+, the system sees:
- Results from all previously accepted tasks
- Evaluation feedback identifying gaps
- Detected errors from prior work
This allows it to:
- Fill in missing pieces
- Correct mistakes
- Explore alternative approaches
- Deepen analysis on important points
Cost Implications
Each iteration involves:
- New plan generation (or plan adjustment)
- Additional task executions
- More API calls
More iterations = more thorough analysis but higher cost. Use `max_iterations` and `cost_limit` to control this trade-off.
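As a sketch, the trade-off is usually pinned down in configuration. The setting names below come from this page; the values are arbitrary examples, and the currency/units of `cost_limit` are an assumption.

```python
# Illustrative configuration values only.
config = {
    "max_iterations": 3,         # hard ceiling on analysis passes
    "min_iterations": 1,         # floor, applied only if enforce_iterations is True
    "enforce_iterations": False,
    "candidate_plans": 3,        # plans generated per iteration (1-5)
    "cost_limit": 2.50,          # stop once estimated spend (e.g. USD) exceeds this
    "execution_retry_limit": 2,  # retries after task failures before giving up
}
```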