
Meet EverOS: An Open Source Markdown-First Agent Memory Runtime With Hybrid BM25 + Vector Retrieval and Self-Evolving Skills


EverMind has released EverOS, an open-source memory runtime for AI agents. It ships under an Apache 2.0 license. It targets a problem agent builders hit early: large language models are stateless. The conversation ends, and the context is gone.

EverOS proposes a different substrate. Instead of locking memory inside a vector database, it writes memory as plain Markdown files. Those files become the source of truth that agents read, edit, and search across sessions.

TL;DR

  • EverOS stores agent memory as editable Markdown, indexed by SQLite and LanceDB.
  • Hybrid retrieval blends BM25, vector search, and scalar filtering in one query.
  • Cases distill into reusable Skills, giving agents procedural, self-evolving memory.
  • Benchmark scores are strong but EverMind-reported; verify on your own workload.
  • It is open source under Apache 2.0, with cloud and self-hosted parity.

What is EverOS?

EverOS is a Python library and a local-first memory runtime. It runs as a server with a CLI and a FastAPI HTTP API, async-first throughout. You drop it into an existing agent loop rather than rebuilding your stack.

The design separates two memory tracks. User-side memory holds Profiles, Episodes, Facts, and Foresights. Agent-side memory holds Cases and Skills. Keeping them separate is unusual; most libraries center on chat history alone.

Every record lands as a .md file. You can open, edit, grep, and Git-version it, or view it in Obsidian. EverAlgo, a separate stateless library, handles the extraction algorithms. EverOS orchestrates and persists the results.
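To make the Markdown-first idea concrete, here is a minimal sketch of one memory record written as a .md file and read back. The header keys, the directory layout, and the helper names write_record and read_record are assumptions for illustration; the article does not document the EverOS on-disk format.

```python
from pathlib import Path

def write_record(root: Path, kind: str, record_id: str, meta: dict, body: str) -> Path:
    """Write one memory record as Markdown with a simple 'key: value' header (hypothetical layout)."""
    path = root / kind / f"{record_id}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    header = "\n".join(f"{key}: {value}" for key, value in meta.items())
    path.write_text(f"---\n{header}\n---\n\n{body}\n", encoding="utf-8")
    return path

def read_record(path: Path) -> tuple[dict, str]:
    """Parse the header and the body back out of a record file."""
    text = path.read_text(encoding="utf-8")
    # maxsplit=2 keeps any later '---' lines inside the body intact
    _, header, body = text.split("---\n", 2)
    meta = dict(line.split(": ", 1) for line in header.splitlines() if line)
    return meta, body.strip()
```

Because the record is an ordinary text file, the same content stays reachable with grep, a text editor, or Git, which is the property the article highlights.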

The endpoint stack is OpenAI-protocol compatible. It connects to OpenAI, OpenRouter, vLLM, Ollama, or DeepInfra by changing a base URL. That keeps integration close to a single configuration change.
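The sketch below shows what "changing a base URL" means on the OpenAI-protocol side: the request body stays the same and only the URL prefix differs. The base URLs are the providers' publicly documented OpenAI-compatible endpoints (local ports are each tool's default). The helper name chat_request and the placeholder API key are assumptions, and how EverOS itself reads the base URL from its configuration is not shown in the article.

```python
import json
import urllib.request

# OpenAI-compatible base URLs; the local entries use each tool's default port.
BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "openrouter": "https://openrouter.ai/api/v1",
    "deepinfra": "https://api.deepinfra.com/v1/openai",
    "ollama": "http://localhost:11434/v1",
    "vllm": "http://localhost:8000/v1",
}

def chat_request(provider: str, model: str, prompt: str, api_key: str = "placeholder-key"):
    """Build (but do not send) a chat-completions request for the chosen provider."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{BASE_URLS[provider]}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
```

Swapping "ollama" for "openrouter" changes only the URL and the key; the payload shape is untouched.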

The runtime is local-first by default. Data never has to leave your environment, and every layer is inspectable. A managed EverOS Cloud option exists for teams that prefer not to self-host. Both share the same SDK, retrieval engine, and memory format.

The Architecture — Markdown, SQLite, and LanceDB

EverOS uses a three-piece storage stack. Markdown is the source of truth. SQLite manages state and queues. LanceDB manages vectors, BM25, and scalar filters.

This is deliberately lighter than a typical production memory setup. There is no required MongoDB, Elasticsearch, Milvus, Redis, or Kafka. For solo developers and small teams, that lowers operational cost.

Retrieval is hybrid. A single LanceDB query combines BM25 keyword matching, dense vector search, and scalar filtering. EverMind markets this multimodal retrieval path as mRAG.
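A toy version of the three ingredients helps show how they interact. The sketch below applies a scalar filter, scores the survivors with Okapi BM25 and with cosine similarity, then merges the two rankings with reciprocal rank fusion. It is pure Python over three made-up records with two-dimensional embeddings. The article does not say how LanceDB or EverOS fuses the scores, so the fusion step here is an assumption.

```python
import math
from collections import Counter

DOCS = [  # hypothetical records: text, a toy 2-d embedding, and one scalar field
    {"id": "m1", "user_id": "alice", "text": "I love climbing in Yosemite every spring", "vec": [0.9, 0.1]},
    {"id": "m2", "user_id": "alice", "text": "My favourite food is ramen", "vec": [0.1, 0.9]},
    {"id": "m3", "user_id": "bob", "text": "Climbing gym membership renews in March", "vec": [0.8, 0.3]},
]

def tokenize(text):
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 over a small in-memory corpus."""
    tokenized = [tokenize(d["text"]) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for t in tokenized if term in t)  # document frequency
            idf = math.log((len(docs) - df + 0.5) / (df + 0.5) + 1)
            tf = counts[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        scores.append(score)
    return scores

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(query, query_vec, docs, user_id, top_k=5, rrf_k=60):
    """Scalar filter first, then fuse the BM25 ranking and the dense ranking."""
    pool = [d for d in docs if d["user_id"] == user_id]  # scalar filter
    if not pool:
        return []
    bm25 = bm25_scores(query, pool)
    dense = [cosine(query_vec, d["vec"]) for d in pool]
    fused = Counter()
    for scores in (bm25, dense):
        order = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
        for rank, i in enumerate(order, start=1):
            fused[pool[i]["id"]] += 1.0 / (rrf_k + rank)  # reciprocal rank fusion
    return [doc_id for doc_id, _ in fused.most_common(top_k)]
```

Calling hybrid_search("climbing yosemite", [1.0, 0.0], DOCS, user_id="alice") ranks m1 ahead of m2 and never considers m3, because the scalar filter removes Bob's record before any scoring happens.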

A cascade index sync keeps files and indexes aligned. Editing a .md file triggers a file-watcher that re-syncs the index. Memory stays inspectable without going stale.
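The sync idea can be sketched without a real file-watcher. The function below does one polling pass: it hashes every .md file under a directory, re-indexes the ones whose content changed, and drops index rows for files that were deleted. The table schema and the names open_index and sync_once are assumptions; a watcher callback would call the same logic instead of polling.

```python
import hashlib
import sqlite3
from pathlib import Path

def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Create a tiny index table (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS idx (path TEXT PRIMARY KEY, sha256 TEXT, body TEXT)")
    return conn

def sync_once(root: Path, conn: sqlite3.Connection) -> list[str]:
    """Re-index changed .md files and remove rows for deleted ones; return the touched paths."""
    changed, seen = [], set()
    for path in sorted(root.rglob("*.md")):
        key = path.relative_to(root).as_posix()
        seen.add(key)
        body = path.read_text(encoding="utf-8")
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        row = conn.execute("SELECT sha256 FROM idx WHERE path = ?", (key,)).fetchone()
        if row is None or row[0] != digest:  # new file or edited content
            conn.execute(
                "INSERT OR REPLACE INTO idx (path, sha256, body) VALUES (?, ?, ?)",
                (key, digest, body),
            )
            changed.append(key)
    for (key,) in conn.execute("SELECT path FROM idx").fetchall():
        if key not in seen:  # file was deleted on disk
            conn.execute("DELETE FROM idx WHERE path = ?", (key,))
            changed.append(key)
    conn.commit()
    return changed
```

Hashing content rather than comparing modification times means a file that is saved without changes does not trigger a re-index.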

Retrieval can also be scoped independently along several identifiers: user_id, agent_id, app_id, project_id, and session_id. Any combination of them narrows a search. That scoping matters in multi-agent and multi-user deployments where data isolation is required.
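One way to picture independent scoping is a filter built from whichever identifiers the caller supplies. The sketch below turns them into a parameterized SQL WHERE clause and runs it against an in-memory SQLite table. The table, the build_scope helper, and the sample rows are assumptions for illustration, not the EverOS query path.

```python
import sqlite3

SCOPE_KEYS = ("user_id", "agent_id", "app_id", "project_id", "session_id")

def build_scope(**scope):
    """Return a parameterized WHERE clause covering only the identifiers that were given."""
    unknown = set(scope) - set(SCOPE_KEYS)
    if unknown:
        raise ValueError(f"unknown scope keys: {sorted(unknown)}")
    active = [(key, scope[key]) for key in SCOPE_KEYS if scope.get(key) is not None]
    clause = " AND ".join(f"{key} = ?" for key, _ in active) or "1 = 1"
    return clause, [value for _, value in active]

def demo_rows(**scope):
    """Run a scoped lookup against a small in-memory table of made-up memories."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memory (user_id TEXT, agent_id TEXT, app_id TEXT, "
        "project_id TEXT, session_id TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO memory VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("alice", "coder", "default", "default", "demo-001", "Climbs in Yosemite"),
            ("bob", "coder", "default", "default", "demo-002", "Prefers dark mode"),
        ],
    )
    clause, params = build_scope(**scope)
    rows = conn.execute(f"SELECT text FROM memory WHERE {clause} ORDER BY text", params).fetchall()
    conn.close()
    return [text for (text,) in rows]
```

Only the column names come from a fixed allow-list; the values always travel as bound parameters, so a scope value can never alter the query.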

How Memory Self-Evolves — Cases Become Skills

A distinctive feature is procedural memory. EverOS records each completed agent task as a Case. Repeated successful patterns are distilled offline into reusable Skills.

This is the ‘self-evolving’ claim, stated plainly. Skills are shared across an agent team, with no manual curation and no hardcoding. The goal is agents that improve with use instead of restarting each session.
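The Case-to-Skill step can be illustrated with a deliberately simple stand-in. The sketch below counts how often a step sequence succeeded for a task type and promotes the most frequent one to a skill once it clears a threshold. Frequency counting is an assumption chosen for clarity; the article does not describe the extraction that EverAlgo actually performs, and the Case fields and min_support parameter are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Case:
    task_type: str
    steps: tuple[str, ...]
    success: bool

def distill_skills(cases, min_support=2):
    """Promote a step sequence to a skill once it has succeeded min_support times."""
    support = defaultdict(int)
    for case in cases:
        if case.success:  # failed runs never count toward a skill
            support[(case.task_type, case.steps)] += 1
    skills = {}
    for (task_type, steps), count in support.items():
        best = skills.get(task_type)
        if count >= min_support and (best is None or count > best["support"]):
            skills[task_type] = {"steps": list(steps), "support": count}
    return skills
```

Running this offline over the accumulated Cases, rather than inside the agent loop, mirrors the "distilled offline" wording above.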

Version 1.1.0 added more lifecycle machinery. It introduced Knowledge APIs for source-backed Markdown pages with taxonomy and topic search. It also added Reflection, an offline process that merges episode clusters and refines profiles and skills between sessions.

The memory model is simple. Episodic memory answers ‘what happened.’ Profile memory answers ‘who is this user.’ Procedural memory answers ‘how is this task done.’

Benchmark

The EverMind team reports 93.05% on LoCoMo, 83.00% on LongMemEval, and 93.04% on HaluMem. It also cites sub-500 ms p95 retrieval latency. LoCoMo and LongMemEval measure long-term conversational memory; HaluMem targets memory hallucination. These numbers come from EverMind's own posts, so treat them as vendor-reported.
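Checking a p95 latency claim on your own workload takes only a few lines. The sketch below times a callable over a list of queries and computes the 95th percentile with the nearest-rank method. The function names and the idea of passing your own search callable are assumptions; nothing here reproduces EverMind's measurement setup.

```python
import time

def p95(samples_ms):
    """Nearest-rank 95th percentile: the smallest sample with at least 95% of samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]

def time_calls(search_fn, queries):
    """Return one wall-clock latency in milliseconds per query."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies
```

With 100 samples the result is the 95th smallest value, so a p95 under 500 ms means at most 5 of those 100 calls were slower than that.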

The table below compares EverOS against common alternatives on concrete design dimensions:

| Dimension | EverOS | Naive RAG | Full context window | Other memory libraries |
|---|---|---|---|---|
| Source of truth | Plain Markdown .md files | Vector DB records | Prompt only | API or database state |
| Local stack | Markdown + SQLite + LanceDB | Vector DB + app code | None | Often managed services |
| Retrieval | Hybrid BM25 + vector + scalar | Dense vector only | None (no retrieval) | Varies |
| Procedural memory | Cases distilled into Skills | None | None | Rare |
| Multimodal ingest | PDF, image, Office, URL in one call | Manual pipeline | Via context only | Partial |
| LoCoMo accuracy | 93.05% (EverMind-reported) | | N/A (context limit) | Varies |
| License | Apache 2.0 | Varies | N/A | Varies / proprietary |

Use Cases, With Real Examples

The library links to working integrations. They show what persistent memory enables in real products.

Hive Orchestrator is a browser-native hive-mind for CLI coding agents. Claude Code, Codex, Gemini, and OpenCode collaborate as real PTY processes through a shared team protocol.

Reunite uses semantic memory for public-value search. Parents describe what they remember, children describe what they recall, and the system surfaces connections.

Other examples span healthcare and hardware. They include an Alzheimer’s memory assistant and an AI wearable. The wearable listens to everyday life and converts it into memory. A study buddy with self-evolving memory is also among the examples. The wider ecosystem adds a Claude Code plugin and an MCP-based memory layer for coding assistants.

A Five-Minute Code Walkthrough

Installation uses standard Python tooling. EverOS requires Python 3.12 or newer. The local demo needs no API keys.

# Requires Python 3.12+
uv pip install everos        # or: pip install everos
everos demo                  # local educational visualizer, no keys
everos init                  # paste OpenRouter + DeepInfra keys into .env
everos server start          # starts the FastAPI server
curl http://127.0.0.1:8000/health   # -> {"status":"ok"}

Adding and searching memory are plain HTTP calls. The example below stores a fact, forces extraction, then retrieves it.

# 1) Add a short conversation
curl -X POST http://127.0.0.1:8000/api/v1/memory/add \
  -H 'Content-Type: application/json' \
  -d '{"session_id":"demo-001","app_id":"default","project_id":"default",
       "messages":[{"sender_id":"alice","role":"user","timestamp":1750000000000,
                    "content":"I love climbing in Yosemite every spring."}]}'

# 2) Flush to force extraction (local demo)
curl -X POST http://127.0.0.1:8000/api/v1/memory/flush \
  -H 'Content-Type: application/json' \
  -d '{"session_id":"demo-001","app_id":"default","project_id":"default"}'

# 3) Search it back
curl -X POST http://127.0.0.1:8000/api/v1/memory/search \
  -H 'Content-Type: application/json' \
  -d '{"user_id":"alice","app_id":"default","project_id":"default",
       "query":"Where do I like to climb?","top_k":5}'

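The same three calls can be made from Python with only the standard library. The sketch below reuses the endpoints and payloads from the curl commands above. The helper names build, send, and run_walkthrough are assumptions, and the response bodies are printed as-is because the article does not document their shape.

```python
import json
import urllib.request

BASE = "http://127.0.0.1:8000/api/v1/memory"  # same host and prefix as the curl commands
SCOPE = {"app_id": "default", "project_id": "default"}

def build(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one memory endpoint."""
    return urllib.request.Request(
        f"{BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request: urllib.request.Request, timeout: float = 10.0):
    """Send a request to a running server and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

add_req = build("add", {
    "session_id": "demo-001", **SCOPE,
    "messages": [{
        "sender_id": "alice", "role": "user", "timestamp": 1750000000000,
        "content": "I love climbing in Yosemite every spring.",
    }],
})
flush_req = build("flush", {"session_id": "demo-001", **SCOPE})
search_req = build("search", {
    "user_id": "alice", **SCOPE,
    "query": "Where do I like to climb?", "top_k": 5,
})

def run_walkthrough():
    """Add, flush, then search; requires `everos server start` to be running."""
    for request in (add_req, flush_req, search_req):
        print(request.full_url, send(request))
```

Building the requests is separate from sending them, so the payloads can be inspected or logged before the server is up; call run_walkthrough() once the health check succeeds.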
Multimodal ingestion is an optional extra. Installing everos[multimodal] adds parsing for images, PDFs, and audio. Office documents additionally require LibreOffice, which converts files to PDF before parsing.
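For readers who want to see the Office-to-PDF step in isolation, the sketch below drives LibreOffice in headless mode. The flags are LibreOffice's standard command-line options for conversion. Whether EverOS invokes LibreOffice this way is not stated in the article, and the function names are assumptions.

```python
import shutil
import subprocess
from pathlib import Path

def libreoffice_pdf_command(source: Path, out_dir: Path, binary: str = "soffice") -> list[str]:
    """Return the headless LibreOffice command that converts one document to PDF."""
    return [binary, "--headless", "--convert-to", "pdf", "--outdir", str(out_dir), str(source)]

def convert_to_pdf(source: Path, out_dir: Path) -> Path:
    """Run the conversion and return the expected output path."""
    if shutil.which("soffice") is None:
        raise RuntimeError("LibreOffice (soffice) was not found on PATH")
    subprocess.run(libreoffice_pdf_command(source, out_dir), check=True)
    return out_dir / (source.stem + ".pdf")
```

The explicit PATH check gives a clear error on machines where LibreOffice is missing, which is the likely failure mode when Office ingestion does not work.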

Try It: Interactive Memory Demo

The embedded demo below simulates the EverOS loop in your browser. Add a snippet, watch it get extracted and tagged, then search it back through hybrid retrieval. It is illustrative and does not connect to a live server.



The post Meet EverOS: An Open Source Markdown-First Agent Memory Runtime With Hybrid BM25 + Vector Retrieval and Self-Evolving Skills appeared first on MarkTechPost.