
EAI — Embedded AI Runtime
EAI is the EoS embedded AI runtime — twelve curated LLM and small-model variants, ReAct agent orchestration, on-device LoRA fine-tuning, and federated-learning rounds, all engineered to run inside an edge-class power and memory envelope.
What EAI is
EAI ships with twelve quantized model variants spanning 50 M to 7 B parameters, each tuned for embedded inference: 4-bit / 8-bit weight formats, fused operators, and zero-copy tensor handoff to EoS HAL accelerators (NPU, GPU, DSP).
Beyond raw inference, EAI provides a ReAct-style agent loop with tool use, an on-device LoRA trainer for personalisation, and a federated-learning client so devices can collectively improve a shared model without ever uploading raw data.
Features
The shape of EAI at a glance.
12 Bundled Models
Family includes 50 M, 350 M, 1.3 B, 3 B, and 7 B variants — 4-bit and 8-bit quantized.
ReAct Agent Loop
Reasoning + acting framework with structured tool-call grammar and trace logging.
On-Device LoRA
Fine-tune adapter weights from local data; merge or hot-swap at runtime.
Federated Learning
Secure-aggregation client; participates in cross-device training rounds with differential privacy.
Accelerator HAL
Zero-copy dispatch to NPU, GPU, DSP, and CPU SIMD via the EoS HAL.
Streaming I/O
Token-by-token streaming responses; integrates with eApps and eBowser.
Memory-Mapped Weights
Demand-paged weight access for SoCs with limited RAM.
Safe-Output Filters
Built-in content / PII filters, jailbreak-resistant system-prompt vault.
Bench & Profiler
Per-op latency / power profiler; export to flame graphs and CSV.
Open source on GitHub
EAI is Apache-2.0 licensed and developed in the open. Issues, discussions, and pull requests welcome.
In the EoS stack
EAI is the highlighted layer below.
Pairs well with
Sibling components that EAI commonly works alongside.
Ready to build with EAI?
Start with the docs, browse the source, or join the community.