```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'secondaryColor': '#16213e', 'tertiaryColor': '#0f3460', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart TD
subgraph INPUT["📥 INPUT"]
A[/"User Query"/]
end
subgraph ANALYSIS["🔍 ANALYSIS"]
B["1. Complexity Analysis"]
C["2. Plan Generation"]
D["3. Plan Evaluation & Selection"]
end
subgraph EXECUTION["⚡ EXECUTION"]
E["4. Task Execution"]
F{"5. Iteration
Decision"}
end
subgraph OUTPUT["📤 OUTPUT"]
G["6. Synthesis"]
H["7. Focused Answer"]
I[/"Final Result"/]
end
A --> B
B --> C
C --> D
D --> E
E --> F
F -->|"Continue"| B
F -->|"Done"| G
G --> H
H --> I
style A fill:#4a4a6a,stroke:#6c63ff,stroke-width:2px,color:#fff
style I fill:#4a4a6a,stroke:#6c63ff,stroke-width:2px,color:#fff
style F fill:#0f3460,stroke:#e94560,stroke-width:2px,color:#fff
style B fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style C fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style D fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style E fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style G fill:#1a1a2e,stroke:#4a4a6a,color:#fff
style H fill:#1a1a2e,stroke:#4a4a6a,color:#fff
```
Pipeline Stage Overview
Stage 1: Complexity Analysis
Purpose: Analyze the incoming query to understand its complexity before generating plans.
The system identifies problem dimensions, determines overall complexity, and suggests an iteration strategy.
Complexity Levels
| Level | Typical Tasks | Description |
|---|---|---|
| `straightforward` | 2-3 | Direct questions with clear answers |
| `moderate` | 3-4 | Multi-step problems with some dependencies |
| `complex` | 4-5 | Problems requiring multiple perspectives |
| `highly_complex` | 5-6 | Deep analysis with many interrelated factors |
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart LR
subgraph COMPLEXITY["Complexity Levels"]
direction TB
C1["Straightforward"]
C2["Moderate"]
C3["Complex"]
C4["Highly Complex"]
end
subgraph TASKS["Task Allocation"]
direction TB
T1["2-3 tasks"]
T2["3-4 tasks"]
T3["4-5 tasks"]
T4["5-6 tasks"]
end
C1 --> T1
C2 --> T2
C3 --> T3
C4 --> T4
style C1 fill:#0d7377,stroke:#14ffec,color:#fff
style C2 fill:#16213e,stroke:#4a4a6a,color:#fff
style C3 fill:#0f3460,stroke:#6c63ff,color:#fff
style C4 fill:#6b2737,stroke:#e94560,color:#fff
```
Complexity Level to Task Count Mapping
What Gets Analyzed
- Problem dimensions: The different aspects that need to be addressed
- Key challenges: What makes this problem difficult
- Sequential dependencies: Which tasks must happen in order
- Iteration strategy: How many passes might be needed
- Errors from previous iterations: What went wrong before (in later iterations)
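The result of this stage can be pictured as a structured record. The sketch below is only illustrative; the field names are assumptions that mirror the items above, not the system's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a complexity-analysis result; field names are
# illustrative, not the system's actual schema.
@dataclass
class ComplexityAnalysis:
    complexity_level: str               # "straightforward" | "moderate" | "complex" | "highly_complex"
    problem_dimensions: list[str]       # aspects that need to be addressed
    key_challenges: list[str]           # what makes the problem difficult
    sequential_dependencies: list[str]  # tasks that must happen in order
    iteration_strategy: str             # how many passes might be needed
    prior_errors: list[str] = field(default_factory=list)  # populated only in later iterations
```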
Stage 2: Plan Generation
Purpose: Generate multiple candidate plans for solving the problem.
The system creates several different approaches to answering your query, each consisting of a sequence of tasks.
Plan Structure
Each plan contains 2-6 tasks based on the complexity analysis. Every task specifies:
- `resource_name`: Which tool to use (e.g., `logic_kernel`, `py_executor`)
- `task_query`: What the tool should accomplish
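For orientation, a plan can be pictured as a short list of task records. The dataclasses below are a hypothetical sketch of that shape, not the system's actual data model.

```python
from dataclasses import dataclass

# Illustrative sketch only: a task/plan shape consistent with the fields
# described above.
@dataclass
class Task:
    resource_name: str   # tool to use, e.g. "logic_kernel" or "py_executor"
    task_query: str      # what the tool should accomplish

@dataclass
class Plan:
    tasks: list[Task]    # 2-6 tasks, sized by the complexity analysis
```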
Example
For the query "What factors affect solar panel efficiency?", the system might generate:

- Plan A
  - Knowledge tool: Research photovoltaic cell physics
  - Knowledge tool: Identify environmental factors
  - Reasoning tool: Synthesize and rank factors
- Plan B
  - Knowledge tool: Gather efficiency data
  - Python tool: Calculate correlation between factors
  - Reasoning tool: Interpret results
- Plan C
  - Knowledge tool: Background on solar technology
  - Reasoning tool: Analyze cause-effect relationships
  - Python tool: Quantify impact ranges
Configuration
| Setting | Effect |
|---|---|
| `candidate_plans` | Number of plans generated (1-5, default 3) |
| Maximum tasks | 24 per plan (safety limit) |
Stage 3: Plan Evaluation & Selection
Purpose: When multiple plans exist, evaluate and select the best one.
Each candidate plan is scored on a 0-100 scale, and the highest-scoring plan is selected for execution.
Evaluation Criteria
- Appropriateness of tool selection for each task
- Logical sequencing of tasks
- Coverage of problem dimensions identified in complexity analysis
- Feasibility given available tools
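For intuition, these criteria can be imagined as contributions to the 0-100 overall score. The weights and names below are illustrative assumptions, not the system's actual rubric.

```python
# Hypothetical illustration of combining the four criteria into a 0-100 score.
# The criterion names mirror the list above; the weights are made up.
CRITERIA_WEIGHTS = {
    "tool_appropriateness": 0.30,
    "logical_sequencing": 0.25,
    "dimension_coverage": 0.25,
    "feasibility": 0.20,
}

def overall_score(criterion_scores: dict[str, float]) -> float:
    """Weighted average of per-criterion scores, each on a 0-100 scale."""
    return sum(CRITERIA_WEIGHTS[name] * criterion_scores.get(name, 0.0)
               for name in CRITERIA_WEIGHTS)
```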
Selection
```python
# The system selects the plan with the highest overall score
best_eval = max(evaluation.evaluations, key=lambda e: e.overall_score)
```
Stage 4: Task Execution
Purpose: Execute tasks sequentially with immediate quality evaluation.
This is where the actual analysis work happens. The system runs each task in the selected plan, one at a time.
Why Sequential (Not Parallel)
"Sequential execution sends requests one at a time to the same server, maximizing prompt cache hits and reducing costs."
The system deliberately executes tasks one after another because:
- Later tasks can use results from earlier tasks
- API calls to the same server benefit from prompt caching
- Quality evaluation between tasks prevents error propagation
Task Execution Flow
For each task in the plan:
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart LR
subgraph TASK["For Each Task in Plan"]
A["Convert Query
to Tool Format"] --> B["Execute
Tool"]
B --> C{"Evaluate
Quality"}
C -->|"ACCEPTED"| D["Write to
Analysis History"]
C -->|"REJECTED"| E["Log Reason
Skip Writing"]
end
D --> F(["Next Task"])
E --> F
style A fill:#16213e,stroke:#4a4a6a,color:#fff
style B fill:#16213e,stroke:#4a4a6a,color:#fff
style C fill:#0f3460,stroke:#e94560,stroke-width:2px,color:#fff
style D fill:#0d7377,stroke:#14ffec,stroke-width:2px,color:#fff
style E fill:#6b2737,stroke:#e94560,stroke-width:2px,color:#fff
style F fill:#4a4a6a,stroke:#6c63ff,color:#fff
```
Critical Design Decision
Only accepted tasks write their results to analysis history. This ensures subsequent tasks see only quality results, preventing bad intermediate work from contaminating later analysis.
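A minimal sketch of this gate is shown below; the helper callables (`convert_query`, `execute_tool`, `evaluate_quality`) are hypothetical stand-ins for the real conversion, execution, and evaluation steps.

```python
from typing import Any, Callable, NamedTuple

class Verdict(NamedTuple):
    accepted: bool
    reason: str = ""

def run_plan(tasks: list,                                    # tasks from the selected plan
             convert_query: Callable[[Any], str],
             execute_tool: Callable[[str, str], str],
             evaluate_quality: Callable[[Any, str], Verdict]) -> list[dict]:
    """Hypothetical sketch of Stage 4: sequential execution with a quality gate."""
    analysis_history: list[dict] = []
    for task in tasks:
        tool_query = convert_query(task)                       # convert to the tool's input format
        result = execute_tool(task.resource_name, tool_query)  # run the tool
        verdict = evaluate_quality(task, result)               # ACCEPTED or REJECTED
        if verdict.accepted:
            # Only accepted results become context for later tasks.
            analysis_history.append({"task": task.task_query, "result": result})
        # Rejected results are logged elsewhere and never written to history.
    return analysis_history
```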
Passthrough Tools
Some tools don't need query conversion:
```python
PASSTHROUGH_TOOLS = {"py_executor"}
```
The Python tool receives task queries directly because they already contain computational specifications.
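Sketched in code, the bypass might look like the following; `to_tool_query` and `convert_query` are hypothetical names, and `PASSTHROUGH_TOOLS` is the set shown above.

```python
# Illustrative only: skip query conversion for passthrough tools.
def to_tool_query(task, convert_query) -> str:
    if task.resource_name in PASSTHROUGH_TOOLS:
        return task.task_query      # already a computational specification
    return convert_query(task)      # otherwise rewrite the query for the target tool
```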
Stage 5: Iteration Decision
Purpose: Decide whether to continue with another iteration or proceed to synthesis.
After completing all tasks in a plan, the system evaluates overall progress and decides whether more work is needed.
Decision Factors
| Factor | Description |
|---|---|
| `continue_iterations` | Should we keep going? (Boolean) |
| `quality_summary` | Assessment of current work quality |
| `decision_rationale` | Why continue or stop |
| `recommendations_for_next_iteration` | What to focus on if continuing |
Controls
| Setting | Effect |
|---|---|
| `max_iterations` | Hard limit; never exceeded |
| `min_iterations` | Minimum before termination is allowed (if enforced) |
| `enforce_iterations` | Whether to enforce the minimum |
| `execution_retry_limit` | Max retries after failures before giving up |
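One plausible way these controls combine is sketched below; the function and argument names are assumptions, but the precedence (hard limit first, enforced minimum second, model decision last) follows the descriptions above.

```python
# Hypothetical illustration of how the iteration controls could interact.
def should_continue(iteration: int,
                    model_wants_more: bool,     # the continue_iterations flag
                    max_iterations: int,
                    min_iterations: int,
                    enforce_iterations: bool) -> bool:
    if iteration >= max_iterations:
        return False                            # hard limit always wins
    if enforce_iterations and iteration < min_iterations:
        return True                             # forced minimum number of passes
    return model_wants_more                     # otherwise defer to the decision stage
```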
Iteration Loop
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart TD
A["Task Execution
Complete"] --> B{"Continue
Iterating?"}
B -->|"Yes - More work needed"| C["Generate New Plan
with Insights"]
C --> D["Execute New Tasks"]
D --> A
B -->|"No - Quality sufficient"| E["Proceed to
Synthesis"]
style A fill:#16213e,stroke:#4a4a6a,color:#fff
style B fill:#0f3460,stroke:#e94560,stroke-width:2px,color:#fff
style C fill:#16213e,stroke:#4a4a6a,color:#fff
style D fill:#16213e,stroke:#4a4a6a,color:#fff
style E fill:#0d7377,stroke:#14ffec,stroke-width:2px,color:#fff
```
Stage 6: Synthesis
Purpose: Combine all accepted task results into a coherent final response.
The system takes everything learned across all iterations and produces unified outputs.
Outputs Generated
- Synthesis Response
  - Comprehensive answer with full reasoning
  - List of key findings discovered during analysis
- Final Answer
  - Refined, user-facing response
  - More polished and direct than the synthesis
- Metadata
  - Title for the analysis
  - Headline summary
  - Image caption (if image generation is enabled)
What Goes Into Synthesis
The synthesis stage has access to:
- Original query and context
- All accepted task results from all iterations
- Complexity analysis findings
- Evaluation feedback from each iteration
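Schematically, that context can be pictured as a single record; the field names below are illustrative only, not the system's actual schema.

```python
from dataclasses import dataclass

# Hypothetical shape of the context handed to synthesis.
@dataclass
class SynthesisContext:
    query: str                      # original query and any user context
    accepted_results: list[dict]    # accepted task results from all iterations
    complexity_findings: str        # complexity-analysis summary
    iteration_feedback: list[str]   # evaluation feedback from each iteration
```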
Stage 7: Focused Answer (Optional)
Purpose: Extract a specific, constrained answer type when requested.
This stage only runs if you set `focused_answer_type` to something other than `"none"`.
Supported Answer Types
| Type | Output |
|---|---|
| `number` | Numeric value |
| `yes/no` | "Yes" or "No" |
| `true/false` | "True" or "False" |
| `good/bad` | "Good" or "Bad" |
| `high/low` | "High" or "Low" |
| `accepted/rejected` | "Accepted" or "Rejected" |
| `yes/maybe/no` | "Yes", "Maybe", or "No" |
| `good/neutral/bad` | "Good", "Neutral", or "Bad" |
| `high/medium/low` | "High", "Medium", or "Low" |
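For downstream parsing, you can validate a focused answer against its type's allowed values. The mapping below simply mirrors the table above; the helper itself is a hypothetical sketch.

```python
# Hypothetical mapping from focused_answer_type to its allowed outputs.
ALLOWED_OUTPUTS = {
    "number": None,  # any numeric value
    "yes/no": {"Yes", "No"},
    "true/false": {"True", "False"},
    "good/bad": {"Good", "Bad"},
    "high/low": {"High", "Low"},
    "accepted/rejected": {"Accepted", "Rejected"},
    "yes/maybe/no": {"Yes", "Maybe", "No"},
    "good/neutral/bad": {"Good", "Neutral", "Bad"},
    "high/medium/low": {"High", "Medium", "Low"},
}

def is_valid(answer_type: str, answer: str) -> bool:
    """Check a focused answer against its declared type (sketch only)."""
    allowed = ALLOWED_OUTPUTS[answer_type]
    if allowed is None:                  # "number": must parse as a float
        try:
            float(answer)
            return True
        except ValueError:
            return False
    return answer in allowed
```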
Middleware Use Case
Focused answers enable using this system as middleware in automated pipelines:
| Scenario | Focused Answer Type | Downstream Action |
|---|---|---|
| Document review | `accepted/rejected` | Route to approval queue or revision |
| Risk assessment | `high/medium/low` | Adjust portfolio allocation |
| Content moderation | `good/bad` | Publish or flag for review |
| Price estimation | `number` | Set automated pricing |
| Compliance check | `yes/no` | Allow or block the transaction |
Benefits:
- Predictable output format for parsing
- Clear decision boundaries
- Easy integration with existing systems
- Reduced ambiguity compared to free-form text
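A minimal sketch of the middleware pattern, assuming a hypothetical `run_analysis` entry point that returns the focused answer string:

```python
from typing import Callable

# Illustrative middleware routing on a constrained answer. `run_analysis` is a
# hypothetical callable standing in for however the pipeline is invoked; it is
# expected to return the focused answer string.
def review_document(document_text: str,
                    run_analysis: Callable[..., str]) -> str:
    answer = run_analysis(
        query=f"Should this document be accepted or rejected?\n\n{document_text}",
        focused_answer_type="accepted/rejected",
    )
    # Clear decision boundary: exactly "Accepted" or "Rejected" comes back.
    return "approval_queue" if answer == "Accepted" else "revision_queue"
```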
Pipeline Flow Example
Here's a concrete example of how a query flows through the pipeline:
Query: "Is Python or JavaScript better for building a web scraper?"
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#1a1a2e', 'primaryTextColor': '#fff', 'primaryBorderColor': '#4a4a6a', 'lineColor': '#6c63ff', 'fontFamily': 'JetBrains Mono, monospace'}}}%%
flowchart TD
subgraph QUERY["Query"]
Q[/"Is Python or JavaScript
better for web scraping?"/]
end
subgraph S1["Stage 1: Complexity"]
C1["Complexity: moderate
Dimensions: 4 aspects
Suggested tasks: 3-4"]
end
subgraph S2["Stage 2-3: Planning"]
P1["Plan A: 78/100"]
P2["Plan B: 85/100 β"]
P3["Plan C: 72/100"]
end
subgraph S4["Stage 4: Execution"]
T1["Task 1: Python capabilities
β Accepted"]
T2["Task 2: JavaScript capabilities
β Accepted"]
T3["Task 3: Compare & recommend
β Accepted"]
end
subgraph S5["Stage 5: Decision"]
D1["Quality: Good coverage
Decision: Proceed"]
end
subgraph S6["Stage 6-7: Output"]
O1["Synthesis: Comprehensive comparison"]
O2["Final Answer: Recommendation"]
O3["Focused: 'Python' or 'JavaScript'"]
end
Q --> C1
C1 --> P1 & P2 & P3
P2 --> T1
T1 --> T2
T2 --> T3
T3 --> D1
D1 --> O1
O1 --> O2
O2 --> O3
style Q fill:#4a4a6a,stroke:#6c63ff,stroke-width:2px,color:#fff
style P2 fill:#0d7377,stroke:#14ffec,stroke-width:2px,color:#fff
style T1 fill:#0d7377,stroke:#14ffec,color:#fff
style T2 fill:#0d7377,stroke:#14ffec,color:#fff
style T3 fill:#0d7377,stroke:#14ffec,color:#fff
style O3 fill:#4a4a6a,stroke:#6c63ff,stroke-width:2px,color:#fff
```
Example Pipeline Flow
Stage 1: Complexity Analysis
- Complexity: `moderate`
- Dimensions: Language features, library ecosystem, performance, ease of use
- Suggested tasks: 3-4

Stage 2: Plan Generation
- Plan A: Compare libraries → Analyze syntax → Evaluate performance
- Plan B: Knowledge on both → Reasoning comparison → Recommendation
- Plan C: Deep dive Python → Deep dive JavaScript → Side-by-side

Stage 3: Plan Evaluation & Selection
- Plan A: 78/100
- Plan B: 85/100 ✓ Selected
- Plan C: 72/100

Stage 4: Task Execution
- Task 1 (Knowledge): Python web scraping capabilities ✓ Accepted
- Task 2 (Knowledge): JavaScript web scraping capabilities ✓ Accepted
- Task 3 (Reasoning): Compare and recommend ✓ Accepted

Stage 5: Iteration Decision
- Quality: Good coverage of both languages
- Decision: Proceed to synthesis

Stage 6: Synthesis
- Combines findings into a comprehensive comparison and generates the final answer with a recommendation

Stage 7: Focused Answer
- If `focused_answer_type` was set, would extract "Python" or "JavaScript" as the constrained response
Understanding Iterations
What Triggers Another Iteration?
The system may continue iterating when:
- Key aspects of the query weren't addressed
- Task results revealed new questions
- Quality evaluation found gaps in reasoning
- Errors in previous work need correction
What Changes in Subsequent Iterations?
In iteration 2+, the system sees:
- Results from all previously accepted tasks
- Evaluation feedback identifying gaps
- Detected errors from prior work
This allows it to:
- Fill in missing pieces
- Correct mistakes
- Explore alternative approaches
- Deepen analysis on important points
Cost Implications
Each iteration involves:
- New plan generation (or plan adjustment)
- Additional task executions
- More API calls
More iterations = more thorough analysis but higher cost. Use `max_iterations` and `cost_limit` to control this trade-off.
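As a sketch, the trade-off is usually pinned down in configuration. The setting names below come from this page; the values are arbitrary examples, and the currency/units of `cost_limit` are an assumption.

```python
# Illustrative configuration values only.
config = {
    "max_iterations": 3,         # hard ceiling on analysis passes
    "min_iterations": 1,         # floor, applied only if enforce_iterations is True
    "enforce_iterations": False,
    "candidate_plans": 3,        # plans generated per iteration (1-5)
    "cost_limit": 2.50,          # stop once estimated spend (e.g. USD) exceeds this
    "execution_retry_limit": 2,  # retries after task failures before giving up
}
```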