6. Compiler Design: From AST to Python Runtime Engine (Compiler Architecture)¶

This might be the most hardcore chapter in this documentation. We must emphasize a core fact: The Nexa language is absolutely not just a SDK library with a bunch of @decorator hastily glossing over various API request abstraction classes mounted on top of Python!

It is a real language (Domain Specific Language, DSL), containing a complete frontend lexical analysis (Lexer/Parser), semantic static inference engine, and complex abstract syntax tree (AST), finally connecting to a powerful target code generator (Target Code Generator) that transpiles to the underlying runtime.

To help geeks who want to explore further or developers who want to contribute PRs to the Nexa core better onboard, this section will comprehensively dissect Nexa's Transpiler compilation principles and Runtime high-concurrency execution model design philosophy.

🛠️ Compilation Pipeline Overview: What Happens When You Hit `nexa run`?¶

When you've written an elegant main.nx and hit the compile run command in the terminal, its lifecycle will go through the following four ultra-fast stages:

1. Lexical & Parsing Engine¶

Nexa does not reuse Python's built-in Parser. In src/nexa_parser.py, we have implemented our own state-machine-based grammar parser (early versions used Lark/PLY-based Parser, now refactored).

The parser will tokenize and assemble statements with Nexa-specific natural expressions containing agent, match intent, >>, etc., line by line into a strict tree structure — AST processing nodes. At this step, any low-level errors such as missing semicolons or misaligned brackets will be directly intercepted.

2. Graph Discovery & Static Type Check¶

Ordinary languages usually just execute after compiling to AST, but as an agent network orchestration expert, Nexa adds an extra graph construction step here.

When detecting syntax like join(AgentA, AgentB).Summarizer, the compiler doesn't treat it as linear call text. Instead, it draws a Directed Acyclic Graph (DAG) of all agent call hierarchies in memory. It uses underlying graph algorithms (such as Kahn's Algorithm) to find which paths don't have upstream-downstream strong data dependencies, and labels them as "Parallelizable". Moreover, using the protocol mentioned earlier, the engine also performs strict static protocol connectivity cross-validation on node data input and output. If there's a type mismatch in parameter passing, you'll immediately get a big red error at compile time (rather than the expensive runtime phase)!

3. Python Async Coroutine Transpilation Layer¶

Nexa's current main target platform is Python. The backend code_generator.py traverses the just-optimized AST frontend tree, using the Visitor pattern to "translate" node by node:

Converts regular methods to async/await.
Automatically outputs protocol as class MyProtocol(BaseModel): (importing Pydantic).
Automatically builds powerful state machine engine call code.

4. Landing on Nexa Runtime Execution Engine¶

The transpiled xxx.py internally actually highly depends on and invokes the Nexa dedicated execution layer library src/runtime/ installed locally.

🆕 v1.0 AVM Architecture: Agent Virtual Machine¶

Nexa v1.0-alpha introduces the revolutionary Agent Virtual Machine (AVM) — a high-performance, securely isolated agent execution engine written in Rust.

Architecture Overview¶

┌─────────────────────────────────────────────────────────────┐
│                    Nexa AVM Architecture                     │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  Nexa       │  │  AST        │  │  Bytecode           │  │
│  │  Compiler   │→ │  Generator  │→ │  Generator          │  │
│  │  (Rust)     │  │  (Rust)     │  │  (Rust)             │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
│                              ↓                               │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                    AVM Runtime (Rust)                    │ │
│  │  ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌─────────┐ │ │
│  │  │ Scheduler │ │ Memory    │ │ Tool      │ │ Sandbox │ │ │
│  │  │           │ │ Manager   │ │ Registry  │ │ (WASM)  │ │ │
│  │  └───────────┘ └───────────┘ └───────────┘ └─────────┘ │ │
│  └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Compiler Frontend (Lexer + Parser)¶

Technology Choices¶

Language: Rust
Lexical Analysis: Using logos crate for high-performance lexical analysis
Syntax Parsing: Recursive descent parser
AST: Type-safe abstract syntax tree representation

Lexer Features¶

// Nexa Token type definition (simplified version)
#[derive(Logos, Debug, PartialEq)]
enum Token {
    #[token("agent")]
    Agent,

    #[token("flow")]
    Flow,

    #[token("protocol")]
    Protocol,

    #[token("implements")]
    Implements,

    #[token(">>")]
    Pipeline,

    #[token("|>>")]
    FanOut,

    #[token("&>>")]
    Merge,

    // ... more Tokens
}

Bytecode Compiler¶

Bytecode Instruction Set Design¶

// Nexa bytecode instruction set
enum NexaOpcode {
    // Agent operations
    AgentCreate { name: StringId },
    AgentRun { agent_id: u32 },
    AgentClone { agent_id: u32 },

    // Flow control - DAG operators
    Pipeline { stages: Vec<u32> },      // >> pipeline
    Fork { agents: Vec<u32> },          // |>> fan-out (parallel wait)
    Merge { strategy: MergeStrategy },  // &>> merge/fan-in
    FireForget { agents: Vec<u32> },    // || fire-and-forget (v0.9.7+)
    Consensus { agents: Vec<u32> },     // && consensus voting (v0.9.7+)
    ConditionalBranch {
        condition: ConditionType,
        true_branch: u32,
        false_branch: u32
    },                                   // ?? conditional branch

    // Memory operations
    MemStore { key: StringId, scope: MemoryScope },
    MemLoad { key: StringId },

    // Tool calls
    ToolCall { tool_id: u32, args: Vec<Value> },

    // Control flow
    Jump { offset: i32 },
    JumpIf { condition: bool, offset: i32 },
    SemanticIf { pattern: StringId },

    // LLM calls
    LLMCall { model: StringId },
    LLMStream { model: StringId },

    // New primitives (v0.9.7+)
    Reason { context: StringId },       // Type-aware reasoning
    WaitForHuman { prompt: StringId },  // Human approval wait
    Break,                              // Loop break
}

AVM Runtime¶

Core Components¶

Scheduler
Tokio-based async task scheduling
Priority queue
Load balancing strategies
Memory Manager
Zero-copy message passing
Arena allocator
Vector virtual memory paging
Tool Registry
Dynamic WASM module loading
Python FFI bindings
Sandbox
WASM runtime (wasmtime)
Resource limits
Permission control

🔒 WASM Security Sandbox¶

Introducing WebAssembly in AVM to provide strong isolation and cross-platform compatibility for external tool execution.

WASM Sandbox Architecture¶

┌─────────────────────────────────────────────────────────────┐
│                    WASM Sandbox Architecture                 │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                    Host Interface                      │  │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐  │  │
│  │  │ stdin/stdout│ │ filesystem  │ │ network (HTTP)  │  │  │
│  │  └─────────────┘ └─────────────┘ └─────────────────┘  │  │
│  └───────────────────────────────────────────────────────┘  │
│                              ↓ WASI                         │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                    WASM Runtime                        │  │
│  │  ┌─────────────────────────────────────────────────┐  │  │
│  │  │  Tool 1 (WASM)  │  Tool 2 (WASM)  │  Tool N     │  │  │
│  │  └─────────────────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                              │
│  Resource Limits: CPU Time | Memory | Filesystem | Network  │
└─────────────────────────────────────────────────────────────┘

Permission Model¶

// Four-level permission model
enum PermissionLevel {
    None,       // No permissions
    Standard,   // Standard permissions (read, basic network)
    Elevated,   // Elevated permissions (write, more network)
    Full,       // Full permissions
}

// WASM resource limit configuration
struct WasmResourceLimits {
    max_memory_pages: u32,      // Maximum memory pages
    max_table_elements: u32,    // Maximum table elements
    max_cpu_time_ms: u64,       // Maximum CPU time
    max_file_size: u64,         // Maximum file size
    allowed_paths: Vec<PathBuf>, // Allowed access paths
    allowed_hosts: Vec<String>,  // Allowed access hosts
    network_enabled: bool,       // Whether network is allowed
}

impl Default for WasmResourceLimits {
    fn default() -> Self {
        Self {
            max_memory_pages: 256,    // 16MB
            max_table_elements: 1024,
            max_cpu_time_ms: 5000,    // 5 seconds
            max_file_size: 10 * 1024 * 1024, // 10MB
            allowed_paths: vec![],
            allowed_hosts: vec![],
            network_enabled: false,
        }
    }
}

⚡ Smart Scheduler¶

Dynamically allocating concurrent resources based on system load at the AVM layer.

Scheduling Strategies¶

Strategy	Description	Use Case
RoundRobin	Round-robin allocation	Even load
LeastLoaded	Least loaded first	Uneven tasks
Adaptive	Adaptive adjustment	Complex scenarios

Scheduler Features¶

Priority Queue: Task scheduling based on Agent priority
Load Balancing: RoundRobin/LeastLoaded/Adaptive strategies
DAG Topological Sort: Automatic dependency resolution and parallelism analysis
Resource Allocation: Memory, CPU core allocation optimization

🐄 COW Memory: Copy-On-Write State Branching (v1.0.3+)¶

Nexa v1.0.3-beta introduces COW (Copy-On-Write) memory mechanism, achieving O(1) complexity state branching, enabling efficient implementation of Tree-of-Thoughts and other advanced reasoning patterns.

Core Concept¶

COW Memory allows creating state branches without copying the entire state:

Original State (S0)
    │
    ├── Branch S1 (only records differences, not full copy)
    │   └── Branch S1' (continues recording differences)
    │
    └── Branch S2 (only records differences)
        └── Branch S2' (continues recording differences)

Working Principle¶

// COW Memory structure (simplified)
struct CowMemory {
    base_state: Arc<State>,          // Shared base state
    delta_changes: Vec<ChangeRecord>, // Only incremental changes
    parent_ref: Option<Arc<CowMemory>>, // Parent node reference
}

// Create branch: O(1) complexity
fn branch_state(current: &CowMemory) -> CowMemory {
    CowMemory {
        base_state: current.base_state.clone(),  // Shared, not copied
        delta_changes: Vec::new(),                // New delta list
        parent_ref: Some(Arc::new(current.clone())),
    }
}

Performance Advantages¶

Operation	Traditional Copy	COW Memory
Create Branch	O(n)	O(1)
Merge State	O(n)	O(k) where k = changes
Memory Usage	n × branches	n + k × branches
Backtrack	O(n)	O(1)

Use Cases¶

Tree-of-Thoughts Exploration¶

// Tree-of-Thoughts exploration example
agent Thinker {
    prompt: "Explore multiple solution approaches"
    memory: "cow"  // Enable COW memory
}

flow explore_solutions {
    problem = "How to optimize system performance?";

    // Create multiple thinking branches
    branch1 = Thinker.run(problem) |>> {
        "Approach A: Database optimization"
    };

    branch2 = Thinker.run(problem) |>> {
        "Approach B: Caching strategy"
    };

    branch3 = Thinker.run(problem) |>> {
        "Approach C: Architecture redesign"
    };

    // Consensus to find best solution
    best_solution = branch1 && branch2 && branch3;
    print(best_solution);
}

Multi-Path Reasoning¶

// Multi-path reasoning example
agent Reasoner {
    prompt: "Analyze problems from multiple perspectives"
    memory: "cow"
}

flow multi_path_reasoning {
    question = "Is this design decision correct?";

    // Create reasoning branches
    technical = Reasoner.run(question + " from technical perspective");
    business = Reasoner.run(question + " from business perspective");
    user = Reasoner.run(question + " from user experience perspective");

    // Combine decisions
    decision = technical && business && user;
    print(decision);
}

🔄 Work-Stealing Scheduler (v1.0.3+)¶

v1.0.3-beta also introduces Work-Stealing scheduler, implementing efficient concurrent task scheduling based on Actor model.

Core Mechanism¶

Work-Stealing scheduler allows idle Workers to "steal" tasks from busy Workers:

┌─────────────────────────────────────────────────┐
│              Global Task Queue                   │
└─────────────────────────────────────────────────┘
         │                │                │
         ▼                ▼                ▼
    ┌─────────┐      ┌─────────┐      ┌─────────┐
    │ Worker1 │      │ Worker2 │      │ Worker3 │
    │ [busy]  │      │ [idle]  │      │ [busy]  │
    └─────────┘      └─────────┘      └─────────┘
         │                │
         │   ← steal task │
         └────────────────┘

Scheduling Algorithm¶

// Work-Stealing scheduler core logic
struct WorkStealingScheduler {
    workers: Vec<Worker>,
    global_queue: Arc<Mutex<Vec<Task>>>,
}

impl Worker {
    fn run(&self) {
        loop {
            // 1. First get task from local queue
            if let Some(task) = self.local_queue.pop() {
                self.execute(task);
                continue;
            }

            // 2. Local empty, get from global queue
            if let Some(task) = global_queue.lock().pop() {
                self.local_queue.push(task);
                continue;
            }

            // 3. Global empty, steal from other workers
            self.steal_from_others();
        }
    }

    fn steal_from_others(&self) {
        for other in &other_workers {
            if other.local_queue.len() > THRESHOLD {
                // Steal half of tasks
                let stolen = other.local_queue.steal_half();
                self.local_queue.extend(stolen);
                break;
            }
        }
    }
}

Performance Advantages¶

Feature	Traditional Scheduler	Work-Stealing
Load Balancing	Requires explicit allocation	Automatic balance
Idle Workers	Wait for tasks	Proactively steal
Task Migration	Requires extra overhead	Zero-cost migration
Concurrency	Limited by allocation strategy	Maximize utilization

Usage Example¶

// Concurrent task processing example
@timeout(seconds=120)
agent ParallelProcessor {
    prompt: "Process multiple tasks concurrently"
}

flow parallel_workflow {
    tasks = ["task1", "task2", "task3", "task4", "task5"];

    // Work-Stealing automatically schedules all tasks
    results = tasks |>> ParallelProcessor;

    // Aggregate results
    final = results &>> Aggregator;
    print(final);
}

Configuration Options¶

// Enable Work-Stealing in project config
// nexa.config.toml
[scheduler]
type = "work_stealing"
workers = 8           # Number of workers
steal_threshold = 2   # Steal threshold
balance_interval = 100  # Load balance check interval (ms)

📄 Vector Virtual Memory Paging¶

AVM takes over memory and automatically performs vectorized swapping of conversation history.

Core Features¶

Feature	Description
LRU/LFU/Hybrid eviction strategy	Intelligent page replacement
Embedding vector similarity search	Semantic relevance loading
Transparent page loading	Imperceptible memory management
Automatic compression	Old page summary compression

Memory Page Management¶

struct MemoryPage {
    id: PageId,
    content: Vec<Message>,
    embedding: Option<Vec<f32>>,
    last_accessed: Instant,
    access_count: u32,
    compressed: bool,
}

struct ContextPager {
    pages: HashMap<PageId, MemoryPage>,
    active_pages: Vec<PageId>,
    max_active_pages: usize,
    eviction_strategy: EvictionStrategy,
}

🌐 Core AST Node Design: A Perspective on the Codification of "Intent System"¶

To give you a more intuitive feel, let's take the match intent magic we saw in Chapter 2 as an example:

// Your .nx source code:
match req {
    intent("Check weather") => WeatherBot.run(req)
}

In the underlying parser, it will be encapsulated into the following Python dataclass-based tree node hierarchy (internal simplified version shown):

MatchIntentNode(
    target_variable=VarQueryNode("req"),
    cases=[
        IntentCase(
            condition_literal="Check weather",
            action=RunAgentNode(
                agent_name="WeatherBot", 
                arguments=[VarQueryNode("req")]
            )
        )
    ]
)

Subsequently, the code generator will intelligently transpile this into the following astonishing low-level async call mechanism — directly injected into the runtime scheduling engine, accomplishing in one line what would take you a hundred lines to write manually:

# Generated target Python code overview (extremely simplified version)
_intent_router = runtime.IntentEngine()
_match_res = await _intent_router.classify(
    query=req, 
    candidates=["Check weather", "Other"]
)
if _match_res == "Check weather":
    await runtime.Orchestrator.dispatch("WeatherBot", req)

This dimensional reduction transpilation successfully keeps developers' mental model at the upper-level "natural semantic field", while the underlying layer is taken over by cold and efficient matrix operations and coroutines.

⚙️ Runtime Black Box Revealed: High Concurrency and Defensive Systems¶

All generated pure .py files ultimately run on the black technologies we preset in the runtime/ namespace. This microkernel contains four guardians:

runtime.memory (Explicit Long-Short Term Memory Scheduling Module)¶

Mounts and prunes Context Window Management. You can implant sliding window protocols or RAG-based external summaries here, ensuring that long-running Agents won't overflow the context window (Token Limits).

runtime.orchestrator (Pipeline and Parallel Scheduling Center)¶

Nexa's most hardcore part. It controls the precise blocking and waking of >> pipeline flow, manages concurrent pulling of multiple models under join() actions, and implements the event loop that suspends state to yield to human approval.

runtime.evaluator (Defensive Loop Engine and Strong Validation Layer)¶

Makes natural language judgments on exit thresholds for loop until, performs implicit retries with reasons for malformed outputs that don't conform to protocol format. This Evaluator can wrap system errors as Context and forcefully throw them back to the large model, demanding it to recognize and correct mistakes!

runtime.rbac (Permission Access Control)¶

Defines security roles for different Agents or flows, ensuring the principle of least privilege for tool calls.

📊 Performance Comparison¶

Metric	Python Transpiler	Rust AVM
Compilation Time	~100ms	~5ms
Startup Time	~500ms	~10ms
Memory Footprint	~100MB	~10MB
Concurrent Agents	~100	~10000

🔧 Debugging and Transparency¶

Reject Black Box Frameworks, Embrace Transparency

We deeply understand the extreme pain of troubleshooting deadlocks inside platforms like AutoGPT. Therefore, when you use the pure build mode nexa build xxx.nx, Nexa doesn't hide its transpiler logic. You can see readable source code named out_xxx.py in the same-level directory. You can debug it like a normal Python backend server, directly setting hard breakpoints on this transpiled generated code for single-step debugging! This highly transparent design has won unanimous praise from geeks.

Debug Commands¶

# Generate Python code
nexa build script.nx

# View generated bytecode (AVM mode)
nexa build script.nx --bytecode

# Runtime debug mode
nexa run script.nx --debug

# Performance profiling
nexa run script.nx --profile

📝 Chapter Summary¶

In this chapter, we gained deep understanding of:

Compilation Pipeline: The complete process from lexical analysis to code generation
AVM Architecture: High-performance execution engine written in Rust
WASM Sandbox: Four-level permission model and resource limits
Smart Scheduler: Priority queue and load balancing
Vector Virtual Memory: LRU/LFU eviction strategies
Runtime Components: memory, orchestrator, evaluator, rbac

Architecture Evolution and Future - Rust/WASM technology roadmap
Best Practices - Enterprise-level development experience
Nexa Syntax Reference - Complete syntax definition

快来问问agent吧！

我是Nexa文档AI助手，可以问我有关文档的一切！

6. Compiler Design: From AST to Python Runtime Engine (Compiler Architecture)¶

🛠️ Compilation Pipeline Overview: What Happens When You Hit nexa run?¶

1. Lexical & Parsing Engine¶

2. Graph Discovery & Static Type Check¶

3. Python Async Coroutine Transpilation Layer¶

4. Landing on Nexa Runtime Execution Engine¶

🆕 v1.0 AVM Architecture: Agent Virtual Machine¶

Architecture Overview¶

Compiler Frontend (Lexer + Parser)¶

Technology Choices¶

Lexer Features¶

Bytecode Compiler¶

Bytecode Instruction Set Design¶

AVM Runtime¶

Core Components¶

🔒 WASM Security Sandbox¶

WASM Sandbox Architecture¶

Permission Model¶

⚡ Smart Scheduler¶

Scheduling Strategies¶

Scheduler Features¶

🐄 COW Memory: Copy-On-Write State Branching (v1.0.3+)¶

Core Concept¶

Working Principle¶

Performance Advantages¶

Use Cases¶

Tree-of-Thoughts Exploration¶

Multi-Path Reasoning¶

🔄 Work-Stealing Scheduler (v1.0.3+)¶

Core Mechanism¶

Scheduling Algorithm¶

Performance Advantages¶

Usage Example¶

Configuration Options¶

📄 Vector Virtual Memory Paging¶

Core Features¶

Memory Page Management¶

🌐 Core AST Node Design: A Perspective on the Codification of "Intent System"¶

⚙️ Runtime Black Box Revealed: High Concurrency and Defensive Systems¶

runtime.memory (Explicit Long-Short Term Memory Scheduling Module)¶

runtime.orchestrator (Pipeline and Parallel Scheduling Center)¶

runtime.evaluator (Defensive Loop Engine and Strong Validation Layer)¶

runtime.rbac (Permission Access Control)¶

📊 Performance Comparison¶

🔧 Debugging and Transparency¶

Debug Commands¶

📝 Chapter Summary¶

🔗 Related Resources¶

🛠️ Compilation Pipeline Overview: What Happens When You Hit `nexa run`?¶