Basic Workflow

Complete guide to the standard FluxLoop CLI workflow from setup to evaluation.

Overview

The FluxLoop workflow consists of six main phases:

  1. Initialize - Create project structure and configuration
  2. Configure - Point FluxLoop at your agent and define personas
  3. Generate - Create input variations for testing
  4. Run - Execute experiments and collect traces
  5. Parse - Convert raw outputs into readable formats
  6. Evaluate - Score results and generate reports

This guide walks through each phase with practical examples.


Phase 1: Initialize Project

Create New Project

# Initialize a new FluxLoop project
fluxloop init project --name my-agent

# Navigate to project directory
cd fluxloop/my-agent

What Gets Created:

fluxloop/my-agent/
├── configs/
│   ├── project.yaml       # Project metadata
│   ├── input.yaml         # Personas and input settings
│   ├── simulation.yaml    # Runner and experiment config
│   └── evaluation.yaml    # Evaluators and success criteria
├── .env                   # Environment variables (gitignored)
├── .gitignore
├── examples/
│   └── simple_agent.py    # Sample agent implementation
├── inputs/                # Generated inputs stored here
├── recordings/            # Recorded arguments (if using recording mode)
└── experiments/           # Experiment outputs

Verify Installation

# Check system status
fluxloop doctor

# Verify configuration
fluxloop config validate

Expected Output:

╭──────────────────────────────────╮
│   FluxLoop Environment Doctor    │
╰──────────────────────────────────╯

Component        Status   Details
Python           ✓        3.11.5
FluxLoop CLI     ✓        0.2.30
FluxLoop SDK     ✓        0.1.6
FluxLoop MCP     ✓        /path/to/fluxloop-mcp
MCP Index        ✓        ~/.fluxloop/mcp/index/dev
Project Config   ✓        configs/project.yaml

╭──────────────────╮
│ Doctor completed │
╰──────────────────╯

Phase 2: Configure Your Agent

Set Up LLM Provider

# Configure OpenAI (for input generation)
fluxloop config set-llm openai sk-your-api-key --model gpt-4o

# Or Anthropic
fluxloop config set-llm anthropic sk-ant-your-key --model claude-3-5-sonnet-20241022

Point to Your Agent Code

Edit configs/simulation.yaml:

runner:
  target: "my_agent:run"    # Point to your agent entry point
  working_directory: .      # Project root
  timeout_seconds: 120
  max_retries: 3

Agent Entry Point Example:

# my_agent.py
import fluxloop

@fluxloop.agent(name="MyAgent")
def run(input_text: str) -> str:
    """Main agent entry point."""
    # Your agent logic here
    return f"Response to: {input_text}"
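
For a quick sanity check before wiring the agent into an experiment, you can call the entry point directly. A minimal smoke test, assuming the my_agent.py module above:

```python
# smoke_test.py -- call the entry point directly, outside FluxLoop.
from my_agent import run

if __name__ == "__main__":
    # One of the base inputs defined in configs/input.yaml
    print(run("How do I get started?"))
```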

Configure Personas

Edit configs/input.yaml to define user personas:

personas:
  - name: novice_user
    description: A user new to the system
    characteristics:
      - Asks basic questions
      - May use incorrect terminology
      - Needs detailed explanations
    language: en
    expertise_level: novice
    goals:
      - Understand system capabilities
      - Complete basic tasks

  - name: expert_user
    description: An experienced power user
    characteristics:
      - Uses technical terminology
      - Asks complex questions
      - Expects efficient responses
    language: en
    expertise_level: expert
    goals:
      - Optimize workflows
      - Access advanced features
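
If you validate configuration in code, the persona entries map naturally onto a small record type. The sketch below is purely illustrative and is not a FluxLoop API; it only restates the fields used above:

```python
# Illustrative mirror of a persona entry in configs/input.yaml.
# Not a FluxLoop type -- it just documents the expected fields.
from dataclasses import dataclass, field

@dataclass
class Persona:
    name: str
    description: str
    characteristics: list[str] = field(default_factory=list)
    language: str = "en"
    expertise_level: str = "novice"
    goals: list[str] = field(default_factory=list)
```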

Define Base Inputs

Still in configs/input.yaml:

base_inputs:
  - input: "How do I get started?"
    expected_intent: help
    metadata:
      category: onboarding

  - input: "What are the advanced features?"
    expected_intent: capabilities
    metadata:
      category: features

  - input: "Can you help me with error code 404?"
    expected_intent: troubleshooting
    metadata:
      category: support

Phase 3: Generate Inputs

Generate Input Variations

# Generate variations using LLM
fluxloop generate inputs --limit 50 --mode llm

# Or deterministic variations
fluxloop generate inputs --limit 30 --mode deterministic

What Happens:

  1. FluxLoop reads configs/input.yaml
  2. Crosses each base input with each persona
  3. Generates variations for each pair using the specified strategies
  4. Saves the results to inputs/generated.yaml

Output:

Generating inputs with LLM mode...

✓ Generated 50 input variations
- Base inputs: 3
- Personas: 2
- Strategies: rephrase, verbose, error_prone
- Saved to: inputs/generated.yaml

Total inputs ready: 50
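
The totals follow from the configuration: 3 base inputs × 2 personas × 3 strategies. How FluxLoop distributes the --limit cap across those combinations is an implementation detail, so treat this sketch as an approximation of the arithmetic, not of the generator itself:

```python
# Illustrative only: how base inputs, personas, and strategies multiply.
import math
from itertools import product

base_inputs = ["How do I get started?",
               "What are the advanced features?",
               "Can you help me with error code 404?"]
personas = ["novice_user", "expert_user"]
strategies = ["rephrase", "verbose", "error_prone"]

combos = list(product(base_inputs, personas, strategies))
print(len(combos))                   # 18 (input, persona, strategy) pairs
print(math.ceil(50 / len(combos)))   # ~3 variation_index values per pair
```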

View Generated Inputs:

# Check generated inputs
cat inputs/generated.yaml

# Or view in editor
code inputs/generated.yaml

Example Generated Input:

- input: "Hey, how would I go about getting started with this thing?"
persona: novice_user
metadata:
base_input: "How do I get started?"
variation_strategy: rephrase
variation_index: 0
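
Because the generated file is plain YAML, it is easy to slice programmatically. A sketch using PyYAML (pip install pyyaml), assuming the file is a top-level list of entries shaped like the example above:

```python
# Filter generated inputs by persona for a quick spot check.
import yaml

with open("inputs/generated.yaml") as f:
    entries = yaml.safe_load(f)

novice = [e for e in entries if e.get("persona") == "novice_user"]
print(f"{len(novice)} of {len(entries)} inputs target novice_user")
```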

Phase 4: Run Experiment

Execute Simulation

# Run experiment with default settings
fluxloop run experiment

# Run with custom iterations
fluxloop run experiment --iterations 5

# Run with multi-turn enabled
fluxloop run experiment --multi-turn --max-turns 10

What Happens:

  1. Loads configuration from configs/simulation.yaml
  2. Loads inputs from inputs/generated.yaml
  3. For each input × iteration pair (total runs = inputs × iterations):
    • Calls your agent function
    • Captures traces via FluxLoop SDK
    • Records observations
  4. Saves results to experiments/exp_<timestamp>/

Output:

╭─ Experiment: my_agent_experiment ─────────────────────╮
│ Iterations: 1                                         │
│ Input Source: inputs/generated.yaml                   │
│ Total Runs: 50                                        │
╰───────────────────────────────────────────────────────╯

Running experiments...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 50/50 [00:45]

✓ Experiment completed!

Results saved to: experiments/exp_20250117_143022/
- traces.jsonl (50 traces)
- observations.jsonl (150 observations)
- summary.json
- metadata.json

Check Results Immediately

# View summary
cat experiments/exp_*/summary.json | jq

# List recent experiments
fluxloop status experiments
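
For ad-hoc inspection beyond jq, the artifacts are plain JSON and JSONL, so the standard library is enough. A sketch; any field names inside traces.jsonl are assumptions to adapt to your files:

```python
# Peek at the latest experiment's artifacts with only the standard library.
import json
from pathlib import Path

exp = sorted(Path("experiments").glob("exp_*"))[-1]  # latest by timestamp

summary = json.loads((exp / "summary.json").read_text())
print(summary)

traces = [json.loads(line) for line in (exp / "traces.jsonl").open()]
print(f"{len(traces)} traces recorded")
```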

Phase 5: Parse Results

Convert to Human-Readable Format

# Parse experiment outputs
fluxloop parse experiment experiments/exp_20250117_143022/

# Or use glob pattern for latest
fluxloop parse experiment experiments/exp_*/

What Happens:

  1. Reads traces.jsonl and observations.jsonl
  2. Reconstructs conversation flow for each trace
  3. Generates Markdown timeline for each trace
  4. Creates per_trace.jsonl with structured data
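
Conceptually, the flow reconstruction is a group-and-sort over observations.jsonl, as the rough sketch below shows. The field names "trace_id" and "start_time" are assumptions for illustration; the parser's actual schema may differ:

```python
# Group observations by trace and order them to approximate the timeline.
import json
from collections import defaultdict
from pathlib import Path

exp = sorted(Path("experiments").glob("exp_*"))[-1]

by_trace = defaultdict(list)
for line in (exp / "observations.jsonl").open():
    obs = json.loads(line)
    by_trace[obs["trace_id"]].append(obs)

for trace_id, observations in by_trace.items():
    observations.sort(key=lambda o: o.get("start_time", ""))
    print(trace_id, len(observations), "observations")
```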

Output:

Parsing experiment: experiments/exp_20250117_143022/

✓ Parsed 50 traces
- Total observations: 150
- Average trace length: 3 observations
- Saved to: experiments/.../per_trace_analysis/

Generated Files:
- per_trace_analysis/00_<trace_id>.md (50 files)
- per_trace_analysis/per_trace.jsonl

View Parsed Trace:

# Open first trace in editor
code experiments/exp_*/per_trace_analysis/00_*.md

Example Parsed Trace (Markdown):

# Trace Analysis: 00_7f39c11-1eac-423b-96b9-09aa8a8f588a

**Experiment**: my_agent_experiment
**Persona**: novice_user
**Input**: "Hey, how would I go about getting started with this thing?"
**Status**: SUCCESS
**Duration**: 234ms

## Timeline

### 1. AGENT: SimpleAgent
**Start**: 14:30:22.123
**Duration**: 234ms

**Input:**
```json
{
  "input_text": "Hey, how would I go about getting started with this thing?"
}
```

**Output:**
```json
{
  "response": "Hello! To get started, first visit our documentation..."
}
```

### 2. TOOL: process_input
**Start**: 14:30:22.145
**Duration**: 12ms

**Input:**
```json
{
  "text": "Hey, how would I go about getting started with this thing?"
}
```

**Output:**
```json
{
  "intent": "help",
  "word_count": 11,
  "has_question": true
}
```

Phase 6: Evaluate Results

Generate the Interactive Report

# Run evaluation
fluxloop evaluate experiment experiments/exp_20250117_143022/

# With custom config
fluxloop evaluate experiment experiments/exp_*/ \
--config configs/evaluation.yaml

What Happens:

  1. Loads per_trace_analysis/per_trace.jsonl generated by fluxloop parse.
  2. Runs the five-stage pipeline:
    • LLM-PT: an LLM judges each trace on 7 metrics (task completion, hallucination, relevance, tool usage, satisfaction, clarity, persona)
    • Rule aggregation: statistics, per-metric pass rates, performance cards, case classification (see the sketch after this list)
    • LLM-OV: an LLM writes overall insights and recommendations
    • Data preparation + HTML rendering: produces evaluation_report/report.html
  3. Outputs progress logs plus a final path to the generated dashboard.
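
To make the aggregation idea concrete, here is a hand-rolled pass-rate computation over per_trace.jsonl. The record layout assumed here (a "metrics" mapping of metric name to a pass flag) is an assumption, so inspect the file before relying on it:

```python
# Rough stand-in for the rule-aggregation stage: per-metric pass rates.
import json
from collections import Counter
from pathlib import Path

exp = sorted(Path("experiments").glob("exp_*"))[-1]
per_trace = exp / "per_trace_analysis" / "per_trace.jsonl"

passes, totals = Counter(), Counter()
for line in per_trace.open():
    record = json.loads(line)
    # Assumed shape: {"metrics": {"task_completion": true, ...}, ...}
    for metric, passed in record.get("metrics", {}).items():
        totals[metric] += 1
        passes[metric] += bool(passed)

for metric in totals:
    print(f"{metric}: {passes[metric] / totals[metric]:.0%} pass rate")
```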

Output:

📊 Evaluating experiment at experiments/exp_20250117_143022
🧵 Per-trace data: experiments/.../per_trace_analysis/per_trace.jsonl
📁 Output: experiments/.../evaluation_report

Stage 1: LLM-PT (50 traces) ...
Stage 2: Aggregation ...
Stage 3: LLM-OV ...
Stage 4 & 5: Rendering HTML report...

✅ Report ready: experiments/.../evaluation_report/report.html

View Evaluation Reports

# Open HTML report in browser
open experiments/exp_*/evaluation_report/report.html

# Check the structured traces (input to the pipeline)
cat experiments/exp_*/per_trace_analysis/per_trace.jsonl | jq

Complete Workflow Script

Put it all together in one script:

#!/bin/bash
# run_full_workflow.sh

set -e # Exit on error

PROJECT="my-agent"
ITERATIONS=5

echo "=== FluxLoop Complete Workflow ==="

# 1. Initialize (if needed)
if [ ! -d "fluxloop/$PROJECT" ]; then
  echo "1. Initializing project..."
  fluxloop init project --name "$PROJECT"
fi

cd "fluxloop/$PROJECT"

# 2. Verify setup
echo "2. Verifying setup..."
fluxloop doctor
fluxloop config validate

# 3. Generate inputs
echo "3. Generating inputs..."
fluxloop generate inputs --limit 50 --mode llm

# 4. Run experiment
echo "4. Running experiment..."
fluxloop run experiment --iterations "$ITERATIONS"

# 5. Find latest experiment
LATEST_EXP=$(ls -td experiments/*/ | head -1)

# 6. Parse results
echo "5. Parsing results..."
fluxloop parse experiment "$LATEST_EXP"

# 7. Evaluate
echo "6. Evaluating..."
fluxloop evaluate experiment "$LATEST_EXP"

# 8. Summary
echo ""
echo "=== Workflow Complete ==="
echo "Results in: $LATEST_EXP"
echo ""
echo "View reports:"
echo " HTML: open ${LATEST_EXP}evaluation_report/report.html"

Make it executable and run:

chmod +x run_full_workflow.sh
./run_full_workflow.sh

Tips and Best Practices

Start Small

Begin with a minimal setup to verify everything works:

# Generate just 10 inputs
fluxloop generate inputs --limit 10

# Run 1 iteration
fluxloop run experiment --iterations 1

# Verify results
fluxloop status experiments

Iterate on Configuration

Tune your setup incrementally:

# Try different variation strategies
fluxloop config set variation_strategies "[rephrase, verbose]" --file configs/input.yaml

# Adjust iteration count
fluxloop config set iterations 20 --file configs/simulation.yaml

# Re-run and compare
fluxloop run experiment

Version Control Your Configs

# Track configuration changes
git add configs/
git commit -m "Tune evaluation thresholds"

# Compare experiment results across git branches
git checkout experiment-v1
fluxloop run experiment
git checkout experiment-v2
fluxloop run experiment

Use Multi-Turn for Deeper Testing

# Enable multi-turn conversations
fluxloop config set multi_turn.enabled true --file configs/simulation.yaml
fluxloop config set multi_turn.max_turns 10 --file configs/simulation.yaml

# Run with supervisor
fluxloop run experiment --multi-turn --auto-approve

Troubleshooting

Agent Not Found

Error: ModuleNotFoundError: No module named 'my_agent'

Solution:

# Verify runner.target points to correct module
fluxloop config show --file configs/simulation.yaml | grep target

# Ensure module is in PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
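
You can reproduce the module:function lookup in a few lines to confirm the fix. Whether the runner resolves targets exactly this way is an assumption, but the underlying import semantics are what Python checks either way:

```python
# verify_target.py -- resolve "my_agent:run" the way a module:function
# target string is conventionally resolved.
import importlib

module_name, func_name = "my_agent:run".split(":")
module = importlib.import_module(module_name)  # raises ModuleNotFoundError
entry = getattr(module, func_name)             # raises AttributeError
print(f"OK: {module_name}:{func_name} is callable: {callable(entry)}")
```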

No Inputs Generated

Error: No inputs found in inputs/generated.yaml

Solution:

# Check input config
fluxloop config show --file configs/input.yaml

# Verify LLM API key
fluxloop config env | grep API_KEY

# Re-generate
fluxloop generate inputs --limit 10 --mode deterministic

Evaluation Fails

Error: Missing per_trace.jsonl

Solution:

# Must parse before evaluating
fluxloop parse experiment experiments/exp_*/
fluxloop evaluate experiment experiments/exp_*/

Quick Reference

# Setup
fluxloop init project --name my-agent
fluxloop config set-llm openai sk-xxxxx
fluxloop doctor

# Workflow
fluxloop generate inputs --limit 50
fluxloop run experiment
fluxloop parse experiment experiments/exp_*/
fluxloop evaluate experiment experiments/exp_*/

# Monitoring
fluxloop status check
fluxloop status experiments
fluxloop config env