# Basic Workflow

Complete guide to the standard FluxLoop CLI workflow from setup to evaluation.

## Overview

The FluxLoop workflow consists of six main phases:

- **Initialize** - Create project structure and configuration
- **Configure** - Point FluxLoop at your agent and define personas and base inputs
- **Generate** - Create input variations for testing
- **Run** - Execute experiments and collect traces
- **Parse** - Convert raw outputs into readable formats
- **Evaluate** - Score results and generate reports

This guide walks through each phase with practical examples.
## Phase 1: Initialize Project

### Create New Project

```bash
# Initialize a new FluxLoop project
fluxloop init project --name my-agent

# Navigate to project directory
cd fluxloop/my-agent
```
**What Gets Created:**

```
fluxloop/my-agent/
├── configs/
│   ├── project.yaml      # Project metadata
│   ├── input.yaml        # Personas and input settings
│   ├── simulation.yaml   # Runner and experiment config
│   └── evaluation.yaml   # Evaluators and success criteria
├── .env                  # Environment variables (gitignored)
├── .gitignore
├── examples/
│   └── simple_agent.py   # Sample agent implementation
├── inputs/               # Generated inputs stored here
├── recordings/           # Recorded arguments (if using recording mode)
└── experiments/          # Experiment outputs
```
### Verify Installation

```bash
# Check system status
fluxloop doctor

# Verify configuration
fluxloop config validate
```

**Expected Output:**

```
╭──────────────────────────────────╮
│   FluxLoop Environment Doctor    │
╰──────────────────────────────────╯

Component        Status   Details
Python           ✓        3.11.5
FluxLoop CLI     ✓        0.2.30
FluxLoop SDK     ✓        0.1.6
FluxLoop MCP     ✓        /path/to/fluxloop-mcp
MCP Index        ✓        ~/.fluxloop/mcp/index/dev
Project Config   ✓        configs/project.yaml

╭──────────────────╮
│ Doctor completed │
╰──────────────────╯
```
## Phase 2: Configure Your Agent

### Set Up LLM Provider

```bash
# Configure OpenAI (for input generation)
fluxloop config set-llm openai sk-your-api-key --model gpt-4o

# Or Anthropic
fluxloop config set-llm anthropic sk-ant-your-key --model claude-3-5-sonnet-20241022
```
### Point to Your Agent Code

Edit `configs/simulation.yaml`:

```yaml
runner:
  target: "my_agent:run"   # Point to your agent entry point
  working_directory: .     # Project root
  timeout_seconds: 120
  max_retries: 3
```
**Agent Entry Point Example:**

```python
# my_agent.py
import fluxloop

@fluxloop.agent(name="MyAgent")
def run(input_text: str) -> str:
    """Main agent entry point."""
    # Your agent logic here
    return f"Response to: {input_text}"
```
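Before wiring the runner, it is worth calling the entry point directly to catch import or signature problems early. A minimal sketch (the file name `smoke_test.py` is illustrative):

```python
# smoke_test.py — call the agent entry point directly, bypassing the runner.
# Assumes my_agent.py is importable from the current directory.
from my_agent import run

if __name__ == "__main__":
    reply = run("How do I get started?")
    assert isinstance(reply, str) and reply, "agent should return a non-empty string"
    print(reply)
```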
### Configure Personas

Edit `configs/input.yaml` to define user personas:

```yaml
personas:
  - name: novice_user
    description: A user new to the system
    characteristics:
      - Asks basic questions
      - May use incorrect terminology
      - Needs detailed explanations
    language: en
    expertise_level: novice
    goals:
      - Understand system capabilities
      - Complete basic tasks

  - name: expert_user
    description: An experienced power user
    characteristics:
      - Uses technical terminology
      - Asks complex questions
      - Expects efficient responses
    language: en
    expertise_level: expert
    goals:
      - Optimize workflows
      - Access advanced features
```
### Define Base Inputs

Still in `configs/input.yaml`:

```yaml
base_inputs:
  - input: "How do I get started?"
    expected_intent: help
    metadata:
      category: onboarding

  - input: "What are the advanced features?"
    expected_intent: capabilities
    metadata:
      category: features

  - input: "Can you help me with error code 404?"
    expected_intent: troubleshooting
    metadata:
      category: support
```
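Before generating variations, you can sanity-check this config programmatically. A minimal sketch using PyYAML, assuming `personas` and `base_inputs` are top-level keys as in the snippets above:

```python
# check_input_config.py — illustrative sanity check for configs/input.yaml.
import yaml  # PyYAML

with open("configs/input.yaml") as f:
    config = yaml.safe_load(f)

personas = config.get("personas", [])
base_inputs = config.get("base_inputs", [])

for p in personas:
    assert p.get("name") and p.get("description"), f"incomplete persona: {p}"
for b in base_inputs:
    assert b.get("input"), f"base input missing 'input' field: {b}"

print(f"{len(personas)} personas x {len(base_inputs)} base inputs = "
      f"{len(personas) * len(base_inputs)} combinations before variation")
```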
## Phase 3: Generate Inputs

### Generate Input Variations

```bash
# Generate variations using LLM
fluxloop generate inputs --limit 50 --mode llm

# Or deterministic variations
fluxloop generate inputs --limit 30 --mode deterministic
```
**What Happens:**

1. FluxLoop reads `configs/input.yaml`
2. For each base input × persona combination, generates variations using the specified strategies
3. Saves results to `inputs/generated.yaml`
**Output:**

```
Generating inputs with LLM mode...

✓ Generated 50 input variations
  - Base inputs: 3
  - Personas: 2
  - Strategies: rephrase, verbose, error_prone
  - Saved to: inputs/generated.yaml

Total inputs ready: 50
```
**View Generated Inputs:**

```bash
# Check generated inputs
cat inputs/generated.yaml

# Or view in editor
code inputs/generated.yaml
```
**Example Generated Input:**

```yaml
- input: "Hey, how would I go about getting started with this thing?"
  persona: novice_user
  metadata:
    base_input: "How do I get started?"
    variation_strategy: rephrase
    variation_index: 0
```
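To inspect the generated set programmatically, a sketch like the following can help. It assumes `inputs/generated.yaml` holds entries shaped like the example above, either as a bare list or nested under a single top-level key:

```python
# inspect_generated.py — group generated inputs by persona and strategy.
import collections
import yaml  # PyYAML

with open("inputs/generated.yaml") as f:
    data = yaml.safe_load(f)

# Accept either a bare list or one top-level key wrapping the list.
entries = data if isinstance(data, list) else next(iter(data.values()))

by_persona = collections.Counter(e.get("persona", "unknown") for e in entries)
by_strategy = collections.Counter(
    e.get("metadata", {}).get("variation_strategy", "unknown") for e in entries
)

print(f"Total inputs: {len(entries)}")
print("By persona:", dict(by_persona))
print("By strategy:", dict(by_strategy))
```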
## Phase 4: Run Experiment

### Execute Simulation

```bash
# Run experiment with default settings
fluxloop run experiment

# Run with custom iterations
fluxloop run experiment --iterations 5

# Run with multi-turn enabled
fluxloop run experiment --multi-turn --max-turns 10
```
**What Happens:**

1. Loads configuration from `configs/simulation.yaml`
2. Loads inputs from `inputs/generated.yaml`
3. For each input × iteration:
   - Calls your agent function
   - Captures traces via the FluxLoop SDK
   - Records observations
4. Saves results to `experiments/exp_<timestamp>/`
**Output:**

```
╭─ Experiment: my_agent_experiment ─────────────────────╮
│ Iterations: 1                                         │
│ Input Source: inputs/generated.yaml                   │
│ Total Runs: 50                                        │
╰───────────────────────────────────────────────────────╯

Running experiments...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 50/50 [00:45]

✓ Experiment completed!
Results saved to: experiments/exp_20250117_143022/
  - traces.jsonl (50 traces)
  - observations.jsonl (150 observations)
  - summary.json
  - metadata.json
```
### Check Results Immediately

```bash
# View summary
cat experiments/exp_*/summary.json | jq

# List recent experiments
fluxloop status experiments
```
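For scripted checks, a small sketch can locate the newest experiment and count its records. It relies only on the file names shown above; the JSONL schemas are not assumed:

```python
# latest_experiment.py — find the newest experiment and count its records.
import json
from pathlib import Path

latest = max(Path("experiments").iterdir(), key=lambda p: p.stat().st_mtime)
print(f"Latest experiment: {latest}")

for name in ("traces.jsonl", "observations.jsonl"):
    path = latest / name
    if path.exists():
        records = sum(1 for line in path.open() if line.strip())
        print(f"  {name}: {records} records")

summary = latest / "summary.json"
if summary.exists():
    print(json.dumps(json.loads(summary.read_text()), indent=2))
```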
## Phase 5: Parse Results

### Convert to Human-Readable Format

```bash
# Parse experiment outputs
fluxloop parse experiment experiments/exp_20250117_143022/

# Or use glob pattern for latest
fluxloop parse experiment experiments/exp_*/
```
**What Happens:**

1. Reads `traces.jsonl` and `observations.jsonl`
2. Reconstructs the conversation flow for each trace
3. Generates a Markdown timeline for each trace
4. Creates `per_trace.jsonl` with structured data
**Output:**

```
Parsing experiment: experiments/exp_20250117_143022/

✓ Parsed 50 traces
  - Total observations: 150
  - Average trace length: 3 observations
  - Saved to: experiments/.../per_trace_analysis/
```

**Generated Files:**

- `per_trace_analysis/00_<trace_id>.md` (50 files)
- `per_trace_analysis/per_trace.jsonl`
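The structured file is handy for scripting. Below is a sketch that prints a one-line digest per trace; the field names (`status`, `persona`, `input`) mirror the Markdown headers shown in the example further down and are assumptions, so peek at the keys in your own file first:

```python
# digest_traces.py — one-line digest per parsed trace.
# Field names are assumptions based on the Markdown trace headers;
# inspect your per_trace.jsonl for the actual schema.
import json
from pathlib import Path

path = Path("experiments/exp_20250117_143022/per_trace_analysis/per_trace.jsonl")
for line in path.open():
    trace = json.loads(line)
    print(f"{trace.get('status', '?'):8} "
          f"{trace.get('persona', '?'):15} "
          f"{str(trace.get('input', ''))[:60]}")
```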
**View Parsed Trace:**

```bash
# Open first trace in editor
code experiments/exp_*/per_trace_analysis/00_*.md
```
**Example Parsed Trace (Markdown):**

````markdown
# Trace Analysis: 00_7f39c11-1eac-423b-96b9-09aa8a8f588a

**Experiment**: my_agent_experiment
**Persona**: novice_user
**Input**: "Hey, how would I go about getting started with this thing?"
**Status**: SUCCESS
**Duration**: 234ms

## Timeline

### 1. AGENT: SimpleAgent
**Start**: 14:30:22.123
**Duration**: 234ms

**Input:**
```json
{
  "input_text": "Hey, how would I go about getting started with this thing?"
}
```

**Output:**
```json
{
  "response": "Hello! To get started, first visit our documentation..."
}
```

### 2. TOOL: process_input
**Start**: 14:30:22.145
**Duration**: 12ms

**Input:**
```json
{
  "text": "Hey, how would I go about getting started with this thing?"
}
```

**Output:**
```json
{
  "intent": "help",
  "word_count": 11,
  "has_question": true
}
```

---
````
## Phase 6: Evaluate Results
### Generate the Interactive Report
```bash
# Run evaluation
fluxloop evaluate experiment experiments/exp_20250117_143022/

# With custom config
fluxloop evaluate experiment experiments/exp_*/ \
  --config configs/evaluation.yaml
```
**What Happens:**

1. Loads `per_trace_analysis/per_trace.jsonl` generated by `fluxloop parse`.
2. Runs the five-stage pipeline:
   - **LLM-PT**: LLM judges each trace on 7 metrics (task completion, hallucination, relevance, tool usage, satisfaction, clarity, persona)
   - **Rule aggregation**: statistics, per-metric pass rates, performance cards, case classification
   - **LLM-OV**: LLM writes overall insights and recommendations
   - **Data preparation + HTML rendering**: generates `evaluation_report/report.html`
3. Outputs progress logs plus the final path to the generated dashboard.
**Output:**

```
📊 Evaluating experiment at experiments/exp_20250117_143022
🧵 Per-trace data: experiments/.../per_trace_analysis/per_trace.jsonl
📁 Output: experiments/.../evaluation_report

Stage 1: LLM-PT (10 traces) ...
Stage 2: Aggregation ...
Stage 3: LLM-OV ...
Stage 4 & 5: Rendering HTML report...

✅ Report ready: experiments/.../evaluation_report/report.html
```
### View Evaluation Reports

```bash
# Open HTML report in browser
open experiments/exp_*/evaluation_report/report.html

# Check the structured traces (input to the pipeline)
cat experiments/exp_*/per_trace_analysis/per_trace.jsonl | jq
```
## Complete Workflow Script

Put it all together in one script:

```bash
#!/bin/bash
# run_full_workflow.sh

set -e  # Exit on error

PROJECT="my-agent"
ITERATIONS=5

echo "=== FluxLoop Complete Workflow ==="

# 1. Initialize (if needed)
if [ ! -d "fluxloop/$PROJECT" ]; then
    echo "1. Initializing project..."
    fluxloop init project --name "$PROJECT"
fi
cd "fluxloop/$PROJECT"

# 2. Verify setup
echo "2. Verifying setup..."
fluxloop doctor
fluxloop config validate

# 3. Generate inputs
echo "3. Generating inputs..."
fluxloop generate inputs --limit 50 --mode llm

# 4. Run experiment
echo "4. Running experiment..."
fluxloop run experiment --iterations "$ITERATIONS"

# 5. Find latest experiment
LATEST_EXP=$(ls -td experiments/*/ | head -1)

# 6. Parse results
echo "5. Parsing results..."
fluxloop parse experiment "$LATEST_EXP"

# 7. Evaluate
echo "6. Evaluating..."
fluxloop evaluate experiment "$LATEST_EXP"

# 8. Summary
echo ""
echo "=== Workflow Complete ==="
echo "Results in: $LATEST_EXP"
echo ""
echo "View reports:"
echo "  HTML: open ${LATEST_EXP}evaluation_report/report.html"
```

Make it executable and run:

```bash
chmod +x run_full_workflow.sh
./run_full_workflow.sh
```
## Tips and Best Practices

### Start Small

Begin with a minimal setup to verify everything works:

```bash
# Generate just 10 inputs
fluxloop generate inputs --limit 10

# Run 1 iteration
fluxloop run experiment --iterations 1

# Verify results
fluxloop status experiments
```
### Iterate on Configuration

Tune your setup incrementally:

```bash
# Try different variation strategies
fluxloop config set variation_strategies "[rephrase, verbose]" --file configs/input.yaml

# Adjust iteration count
fluxloop config set iterations 20 --file configs/simulation.yaml

# Re-run and compare
fluxloop run experiment
```
### Version Control Your Configs

```bash
# Track configuration changes
git add configs/
git commit -m "Tune evaluation thresholds"

# Compare experiment results across git branches
git checkout experiment-v1
fluxloop run experiment

git checkout experiment-v2
fluxloop run experiment
```
### Use Multi-Turn for Deeper Testing

```bash
# Enable multi-turn conversations
fluxloop config set multi_turn.enabled true --file configs/simulation.yaml
fluxloop config set multi_turn.max_turns 10 --file configs/simulation.yaml

# Run with supervisor
fluxloop run experiment --multi-turn --auto-approve
```
## Troubleshooting

### Agent Not Found

**Error**: `ModuleNotFoundError: No module named 'my_agent'`

**Solution:**

```bash
# Verify runner.target points to correct module
fluxloop config show --file configs/simulation.yaml | grep target

# Ensure module is in PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
```
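To confirm the target resolves the same way the runner would, a quick sketch (keep the `target` string in sync with `runner.target` in `configs/simulation.yaml`):

```python
# check_target.py — verify that runner.target resolves to a callable.
import importlib

target = "my_agent:run"  # keep in sync with runner.target
module_name, func_name = target.split(":")
func = getattr(importlib.import_module(module_name), func_name)
assert callable(func), f"{target} is not callable"
print(f"OK: {target} -> {func}")
```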
### No Inputs Generated

**Error**: `No inputs found in inputs/generated.yaml`

**Solution:**

```bash
# Check input config
fluxloop config show --file configs/input.yaml

# Verify LLM API key
fluxloop config env | grep API_KEY

# Re-generate
fluxloop generate inputs --limit 10 --mode deterministic
```
### Evaluation Fails

**Error**: `Missing per_trace.jsonl`

**Solution:**

```bash
# Must parse before evaluating
fluxloop parse experiment experiments/exp_*/
fluxloop evaluate experiment experiments/exp_*/
```
## Next Steps

- **Multi-Turn Workflow** - Dynamic conversations
- **Recording Workflow** - Capture and replay arguments
- **CI/CD Integration** - Automate in pipelines
- **Commands Reference** - Detailed command documentation
## Quick Reference

```bash
# Setup
fluxloop init project --name my-agent
fluxloop config set-llm openai sk-xxxxx
fluxloop doctor

# Workflow
fluxloop generate inputs --limit 50
fluxloop run experiment
fluxloop parse experiment experiments/exp_*/
fluxloop evaluate experiment experiments/exp_*/

# Monitoring
fluxloop status check
fluxloop status experiments
fluxloop config env
```