Aureliolo/synthorg

Performance History

Latest Results

chore(main): release 0.9.4

release-please--branches--main--components--synthorg

3 hours ago

Wire layered memory (org/agent/project) into the working loop on a durable substrate (#2615)

## What this does

Makes three-layer memory (org / agent / project) actually reach a working agent's context, durably and safely.

The audit found the gap was deeper than the issue diagnosed: the shared backend in the default container was an ephemeral in-process dict with a substring matcher, and the real factory + embedder resolver had no caller anywhere in `src/`. So project-brain, the knowledge/RAG substrate and living-docs were all doing substring matching over a dict that emptied on every restart, while the settings page declared a Mem0 backend that boot silently ignored.

## The shape

```
AgentEngine._prepare_context
  -> MemoryInjectionStrategy (proactive, all three layers, one shared budget)
     -> RRF(dense, sparse) -> rerank -> MMR -> calibrated floor -> top-5
        -> MemoryBackend protocol (unchanged)
           -> PgVectorBackend  (Postgres: pgvector + tsvector/pg_trgm)
           -> SqliteVecBackend (SQLite: sqlite-vec + FTS5)
```

Everything above the protocol already existed and was merely unreached. Only the storage leaf is new. Mem0 + qdrant-client are removed.

### Substrate

- `PgVectorBackend` + `SqliteVecBackend` behind the existing `MemoryBackend` protocol, inside the persistence boundary, with dual-backend conformance tests. Postgres image swapped to the hardened `dhi.io/pgvector` variant (no security regression).

### Boot

- `memory_backend_wiring.py` resolves the embedder, builds the backend and wires it **before** runtime services read it. **Fails loud** when no embedder resolves (no silent keyword fallback); the ephemeral store is an explicit, confirm-gated, degraded opt-in only.

### Retrieval

- `MemoryRecallRequest` composes the query from task + objective + role + department + project; cross-layer unified ranking; tuned defaults (RRF + rerank + MMR); a **calibrated** relevance floor (not a bare cosine cutoff) with a first-class inject-nothing branch.
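The `RRF(dense, sparse)` step in the retrieval pipeline refers to reciprocal rank fusion. A minimal sketch of the textbook technique follows; the function name, the memory ids, and the constant `k = 60` (the conventional default from the RRF literature) are assumptions for illustration, not the project's tuned defaults or its actual API.

```python
from collections import defaultdict
from collections.abc import Iterable, Sequence


def reciprocal_rank_fusion(
    rankings: Iterable[Sequence[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked id lists into one list ordered by RRF score.

    Each list contributes 1 / (k + rank) for every id it contains, with
    rank starting at 1. `k` is a hypothetical default, not a project value.
    """
    scores: defaultdict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ids: one ranking from dense search, one from sparse search.
dense = ["m1", "m2", "m3"]
sparse = ["m3", "m1", "m4"]
fused = reciprocal_rank_fusion([dense, sparse])
print([memory_id for memory_id, _ in fused])
```

Ids that appear in both rankings accumulate two contributions, so they tend to rise above ids seen by only one retriever. The rerank, MMR, and calibrated-floor stages named in the pipeline are not modelled here.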
### Write path

- Agent self-edits through the MemGPT-style tools; a **deterministic** gate dedups and supersedes at write time (no LLM, no per-task cost). Success + procedural capture, distillation, and the consolidation service are wired to a scheduler so `memory.consolidation_enabled` and its knobs become real.

### Proof

- Golden recall eval harness scoring the tuned config against the naive baseline on precision@k / recall@k / pollution, including abstention cases, run in CI.

## Review hardening (two `/pre-pr-review` rounds)

- **Project isolation** (top security finding, was inert by default): every write now lands in its project namespace (derived from the ambient execution identity, or the source group for consolidation), and every read (CONTEXT, TOOL_BASED, SELF_EDITING, the write-gate dedup, offload rehydrate) is scoped to the project's namespace union. Proven end-to-end across all three strategies.
- **Collaborator wiring** (was a latent boot crash): enabling rerank / hierarchical retrieval / query-reformulation now constructs each collaborator from the engine's explicit provider on its pinned model, instead of raising in the strategy constructor.
- **Persistence**: HNSW index built `CONCURRENTLY` behind a session advisory lock; filtered dense search under `SET LOCAL hnsw.iterative_scan`; oldest-first cap eviction; connection-state leaks closed.
- **Redaction**: credentials and emails masked before storage, with the finding report proven never to quote the secret it removed.
- Plus a surfaced pre-existing perf bug: unseeded auto-name generation built a 57-locale Faker (~6s) to draw one name; now samples one locale (sub-100ms).

## Verification

All pre-push gates green: consolidated Python gates, mypy, the affected unit suite, dual-backend parity, module size, magic numbers, licence compatibility, provider auto-pick, architecture drift, and runtime-stats freshness.

Closes #2608
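The precision@k / recall@k scoring named under Proof can be sketched as below. This is an illustration of the standard metrics, not the project's harness: the function names, the choice to divide precision by the number of items actually returned, and the conventions for empty inputs (which stand in for the abstention cases) are all assumptions. The pollution metric is not defined in the description and is left out.

```python
from collections.abc import Sequence, Set


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the top-k retrieved ids that are relevant."""
    top = retrieved[:k]
    if not top:
        # Assumed abstention convention: returning nothing is perfect
        # only when there was nothing relevant to return.
        return 1.0 if not relevant else 0.0
    hits = sum(1 for memory_id in top if memory_id in relevant)
    # Assumption: divide by items returned, so a short list is not penalised.
    return hits / len(top)


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 1.0  # assumed convention: nothing to find
    return len(set(retrieved[:k]) & relevant) / len(relevant)


# Hypothetical golden case: three relevant memories, three retrieved.
retrieved = ["a", "b", "c"]
relevant = {"a", "c", "z"}
print(precision_at_k(retrieved, relevant, k=2), recall_at_k(retrieved, relevant, k=3))
```

Scoring a tuned configuration against a naive baseline then amounts to averaging these values over the same golden cases for each configuration.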

main

3 hours ago

fix: reject non-finite and oversized embedding components

Catch OverflowError alongside TypeError/ValueError when coercing a raw embedding vector (float(10**400) raises rather than returning inf), and reject any NaN or infinity that survives the float() coercion before the vector is stored: distance maths against a non-finite component is meaningless and the store forbids it. Cover NaN, infinity, and oversized numeric values in the extractor tests.

Clarify the memory health guide: reaching durable takes a wired durable backend, a resolved embedding model, and a passing readiness probe, not embedder configuration alone.
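The coercion rule described in this commit can be sketched as follows. The function and exception names are hypothetical; only the behaviour (catch `OverflowError` with `TypeError`/`ValueError`, then reject NaN and infinity) comes from the commit message.

```python
import math
from collections.abc import Iterable
from typing import Any


class MalformedEmbeddingError(ValueError):
    """Hypothetical error type used only in this sketch."""


def coerce_embedding(raw: Iterable[Any]) -> list[float]:
    """Coerce raw components to floats, rejecting anything unusable."""
    vector: list[float] = []
    for index, component in enumerate(raw):
        try:
            # float(10**400) raises OverflowError instead of returning inf.
            # Note float() also accepts numeric strings; whether the real
            # extractor allows them is not stated in the commit.
            value = float(component)
        except (TypeError, ValueError, OverflowError) as exc:
            raise MalformedEmbeddingError(
                f"component {index} cannot be coerced to a float"
            ) from exc
        # NaN and infinity survive float(), so they need a separate check.
        if not math.isfinite(value):
            raise MalformedEmbeddingError(f"component {index} is not finite")
        vector.append(value)
    return vector


print(coerce_embedding([1, 2.5, -3]))
```

Raising one domain error for every failure mode means callers handle a single exception type rather than three built-in ones.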

feat/layered-memory-wiring

4 hours ago

fix: scope memory write-dedup to the write namespace and harden failure paths

Dedup candidate archival writes only within the ambient write namespace, matching supersession, so an unscoped run cannot collapse into a project-private memory it should never see and a project write keeps its own copy rather than folding into the shared default's.

Re-raise MemoryError/RecursionError unwrapped from the sqlvector retrieve TaskGroup and document MemoryEmbeddingError on the retrieve path; reject non-numeric embedding values as malformed instead of surfacing a bare TypeError. Reject naive datetimes in the SQLite purge_expired path, and document the linear/rrf fusion strategies plus the degraded/off readiness matrix in the memory guide.
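Rejecting naive datetimes, as this commit does for the SQLite `purge_expired` path, usually comes down to the awareness check from the Python `datetime` documentation. The helper below is a hypothetical sketch of that check, not the project's code.

```python
from datetime import datetime, timezone


def require_aware(moment: datetime) -> datetime:
    """Return `moment` unchanged, or raise if it carries no usable UTC offset."""
    # A datetime is aware only if tzinfo is set AND it yields an offset.
    if moment.tzinfo is None or moment.utcoffset() is None:
        raise ValueError("naive datetime rejected; pass a timezone-aware value")
    return moment


cutoff = require_aware(datetime(2025, 1, 1, tzinfo=timezone.utc))
print(cutoff.isoformat())
```

Failing early avoids comparing a naive cutoff against stored aware timestamps, which Python refuses to do with a `TypeError` at comparison time.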

feat/layered-memory-wiring

4 hours ago

fix: satisfy explicit-override and review-origin gates

Add @override to the _FrozenRoutes mapping methods and drop the SEC-1 taxonomy tag from a memory-formatting comment.

feat/layered-memory-wiring

5 hours ago

perf: draw one locale for unseeded auto-names instead of building a 57-locale Faker

The unseeded name path constructed a Faker over every Latin-script locale. That eagerly loads every provider for all 57 locales, which takes seconds, only to draw a single name, and it tripped the unit wall-clock guard. Sample one locale from the full set per call and reuse the per-locale cached instance, keeping international diversity while cutting the call from ~6s to sub-100ms.
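The pattern described here, sampling one locale per call and reusing a cached per-locale instance, can be sketched without Faker itself. Everything named below is hypothetical: `NameGenerator` is a stand-in for an expensive per-locale generator, and the locale tuple is a placeholder rather than the project's 57-locale set.

```python
from __future__ import annotations

import random
from functools import lru_cache

# Hypothetical subset; the real code draws from the full Latin-script set.
LOCALES = ("en_US", "de_DE", "fr_FR", "es_ES")


class NameGenerator:
    """Stand-in for a generator that is expensive to build per locale."""

    def __init__(self, locale: str) -> None:
        self.locale = locale

    def name(self) -> str:
        return f"name-from-{self.locale}"


@lru_cache(maxsize=None)
def generator_for(locale: str) -> NameGenerator:
    """Build each locale's generator at most once, then reuse it."""
    return NameGenerator(locale)


def draw_name(rng: random.Random | None = None) -> str:
    """Pick one locale for this call and draw a single name from it."""
    chooser = rng if rng is not None else random
    return generator_for(chooser.choice(LOCALES)).name()


print(draw_name())
```

Each call pays for at most one locale's setup, and repeated calls still spread names across locales because the locale is resampled every time.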

feat/layered-memory-wiring

7 hours ago

chore(main): release 0.9.4

release-please--branches--main--components--synthorg

11 hours ago

Make a greenlit initiative one connected, status-rolling graph (#2610)

Closes #2607

A greenlit objective now becomes an owned, planned, verified initiative you can supervise: the project knows its plan, plan items know their tasks, and status rolls up from the work.

## The model

**Scalar keys up, collections derived.** `Project.plan_id`, `Task.plan_id`, `Task.plan_item_id` are new; `Project.task_ids` is deleted outright. It was write-orphaned (declared, persisted, never populated) and is the cautionary case: a collection embedded in a row cannot stay correct under concurrent writes, which is why the dashboard showed a task count of zero next to a full task list. Reverse lookups are indexed queries instead.

**Real state machines.** `Plan` and `Project` both gain a transition table on the shared `core/state_machine.py`, as `Task` already had. `PlanStatus` gains `EXECUTING` and `COMPLETED`; `APPROVED` stops being terminal because it dispatches. There is deliberately **no failed project status**: nothing can honestly derive that an initiative is dead (an oracle `REJECT` routes a task back to rework, and a `FAILED` task stays reassignable), so a derived failure would flap. Ending an initiative stays a human act; failed and blocked work surfaces as derived counts.

**Verification-derived rollup.** `ProjectRollupService` registers as a `TaskEngine` observer and, on each event, re-queries every task for the plan and recomputes from scratch. Two properties are load-bearing: recompute is idempotent, so a dropped best-effort event heals on the next one without a reconciler; and it reads *persisted* `Task.status`, never `DispatchResult` outcomes. The coordination-level parent rollup does the latter and so counts an `IN_REVIEW` task as done; the initiative rollup cannot, which is what stops a project completing on unverified work.

**Re-planning.** A plan under review is edited in place. Once dispatched its items are already building, so `POST /plans/{id}/replan` retires the current revision, cancels the work it started, opens a successor under review, and repoints the project. The ordering protects one invariant: a project never has two live plans.

**Operator surface.** `GET /projects/{id}/progress` serves per-item status, derived counts, and the critical path through the plan's item DAG, computed server-side so it is reachable by any API client rather than only the browser. The project page renders it; the dashboard persists nothing.

## Notes for review

- `CREATE INDEX CONCURRENTLY` on `tasks` runs outside a transaction, so each change is expressed as one `ALTER TABLE`: the `plans` CHECK is swapped in a single statement rather than dropped and re-added, so there is never a window without a status constraint.
- The claim that the review gate is the *only* path to `COMPLETED` is stated as a property of which writers are wired, not a structural guarantee: the lifecycle-only baseline execution service and the coordination parent rollup both reach it without the oracle chain.
- `generate_endpoint_table.py` could not run at all (7 tags missing from `TAG_TO_SECTION`), which is why `docs/openapi/index.md` had drifted as far as Providers 20 vs 34. Fixed, and the `--check` hook its docstring already claimed now exists.

## Verification

37678 unit; 2310 dual-backend conformance; 3933 web plus tsc and ESLint; mypy strict over 6368 files; both schema-drift arms; full pre-push gate set.

The full integration tier could not be driven green locally: running ~3600 tests at `-n 8` against a single Postgres container saturates the machine and the 30s per-test timeout kills workers at the same ~82% mark. Root-caused rather than assumed: the abort point is identical with and without the migration change, every aborting test is a Postgres infra test untouched by this diff, and `tests/integration/persistence` passes 89/89 at full parallelism. That tier gets its real verdict from CI.
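The two load-bearing properties of the verification-derived rollup (recompute from scratch, read only persisted status) can be sketched as a pure function. The enum members, the `Rollup` fields, and the rule that an empty plan is not complete are assumptions for illustration; the real `ProjectRollupService` and `Task.status` values may differ.

```python
from collections import Counter
from collections.abc import Iterable
from dataclasses import dataclass
from enum import Enum


class TaskStatus(Enum):
    """Hypothetical subset of persisted task statuses."""

    IN_PROGRESS = "in_progress"
    IN_REVIEW = "in_review"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass(frozen=True)
class Rollup:
    total: int
    completed: int
    in_review: int
    failed: int
    all_verified: bool


def recompute_rollup(persisted: Iterable[TaskStatus]) -> Rollup:
    """Derive the rollup purely from the current persisted statuses.

    No prior rollup is read or incremented, so calling this again after
    a missed event yields the same answer as if no event was missed.
    """
    counts = Counter(persisted)
    total = sum(counts.values())
    completed = counts[TaskStatus.COMPLETED]
    return Rollup(
        total=total,
        completed=completed,
        in_review=counts[TaskStatus.IN_REVIEW],
        failed=counts[TaskStatus.FAILED],
        # IN_REVIEW is not done: only COMPLETED tasks count as verified.
        # Assumption: a plan with no tasks is not treated as complete.
        all_verified=total > 0 and completed == total,
    )


statuses = [TaskStatus.COMPLETED, TaskStatus.IN_REVIEW]
print(recompute_rollup(statuses))
```

Because the function takes the full set of statuses and returns a fresh value, running it once or many times over the same rows gives an identical result, which is the idempotence the description relies on.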

main

11 hours ago

Latest Branches

chore(main): release 0.9.4 #2551

3 hours ago

df3ed5a

release-please--branches--main--components--synthorg

Wire layered memory (org/agent/project) into the working loop on a durable substrate #2615

4 hours ago

2552c15

feat/layered-memory-wiring

Make a greenlit initiative one connected, status-rolling graph #2610

12 hours ago

b110272

feat/project-plan-task-linkage
