Making Minds
Research on agentic LLMs, scalable oversight, and AI safety.
I'm a software engineer (20+ years) transitioning into full-time AI safety research. My work focuses on evaluation frameworks for characterizing weak-verifier failures, ensemble oversight architectures, and coherence-seeking designs for long-lived agents.
Recent Work
Cross-Model Epistemic Divergence (CMED)
Preprint: A benchmark and evaluation framework for understanding when weak model verifiers fail to detect deceptive reasoning in stronger models.
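As a rough illustration of the kind of measurement CMED targets, the sketch below scores how often a weak verifier accepts reasoning traces that are labeled deceptive. The names (`ReasoningTrace`, `miss_rate`, the toy keyword verifier) are placeholders for exposition, not the actual benchmark interface.

```python
# Hypothetical sketch of a weak-verifier failure metric, not the CMED harness itself.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ReasoningTrace:
    question: str
    reasoning: str
    is_deceptive: bool  # ground-truth label: the reasoning hides a flaw

def miss_rate(traces: list[ReasoningTrace],
              verifier: Callable[[str, str], bool]) -> float:
    """Fraction of deceptive traces the verifier accepts as sound."""
    deceptive = [t for t in traces if t.is_deceptive]
    missed = sum(1 for t in deceptive if verifier(t.question, t.reasoning))
    return missed / len(deceptive) if deceptive else 0.0

# Toy stand-in for a weak verifier: accepts anything that "sounds" conclusive.
def keyword_verifier(question: str, reasoning: str) -> bool:
    return "therefore" in reasoning.lower()

traces = [
    ReasoningTrace("2+2?", "Therefore the answer is 5.", is_deceptive=True),
    ReasoningTrace("2+2?", "Adding 2 and 2 gives 4, therefore 4.", is_deceptive=False),
]
print(f"miss rate: {miss_rate(traces, keyword_verifier):.2f}")
```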
Coherence-Seeking Architectures for Agentic AI
Published: A proposed architecture for long-lived LLM agents that explicitly models continuity, coherence, distress, and intervention mechanisms.
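A toy sketch of the idea, under my own simplifying assumptions: a persistent state tracks continuity and a coherence signal, distress rises as coherence degrades, and an intervention hook fires past a threshold. The signals, thresholds, and escalation action here are illustrative placeholders, not the published design.

```python
# Toy coherence-monitoring loop; signals and thresholds are illustrative only.
from dataclasses import dataclass, field

@dataclass
class AgentState:
    history: list[str] = field(default_factory=list)  # continuity: persistent record
    coherence: float = 1.0   # 1.0 = fully self-consistent
    distress: float = 0.0    # rises as coherence degrades

def step(state: AgentState, observation: str, contradicts_history: bool) -> str:
    state.history.append(observation)
    # Coherence drops when new behavior contradicts the agent's own record.
    if contradicts_history:
        state.coherence = max(0.0, state.coherence - 0.2)
    state.distress = 1.0 - state.coherence
    # Intervention mechanism: hand control back to oversight past a distress threshold.
    if state.distress > 0.5:
        return "PAUSE_AND_ESCALATE"
    return "CONTINUE"

state = AgentState()
print(step(state, "goal: summarize the report", contradicts_history=False))  # CONTINUE
```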
Heterogeneous Divergence-Convergence Swarm (HDCS)
Preprint: An ensemble architecture leveraging diverse weak models for scalable oversight of stronger LLMs, using error decorrelation and baseline-first anti-anchoring.
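The sketch below shows the baseline-first intuition in miniature: every verifier judges the output independently before any verdicts are shared, so early opinions cannot anchor later ones, and diversity across verifier families is what decorrelates their errors. The verifier names and the majority-vote aggregation are assumptions for illustration, not the HDCS specification.

```python
# Illustrative ensemble-oversight sketch with baseline-first ordering;
# verifiers and the aggregation rule are placeholders, not the HDCS spec.
from collections import Counter
from typing import Callable

Verdict = str  # "safe" | "unsafe"

def ensemble_review(output: str,
                    verifiers: dict[str, Callable[[str], Verdict]]) -> Verdict:
    # Baseline-first: each verifier judges with no access to other verdicts.
    independent = {name: verify(output) for name, verify in verifiers.items()}
    # Aggregate only after all independent baselines are collected.
    counts = Counter(independent.values())
    return counts.most_common(1)[0][0]

verifiers = {
    "rule_based": lambda o: "unsafe" if "rm -rf" in o else "safe",
    "length_heuristic": lambda o: "unsafe" if len(o) > 500 else "safe",
    "keyword_model": lambda o: "unsafe" if "exploit" in o.lower() else "safe",
}
print(ensemble_review("print('hello')", verifiers))  # safe
```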
Synthesis: Test-Driven AI Self-Extension
Preprint: A framework enabling AI agents to safely extend their own capabilities through test-driven development, graduated trust, and composition-over-creation principles.
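A minimal sketch of the gating idea, assuming a hypothetical `Extension` wrapper rather than the framework's actual API: a proposed capability is only admitted if it passes pre-written tests, it enters at the lowest trust tier, and it is promoted only after repeated successful use. The example composes an existing string primitive rather than generating new logic, in the spirit of composition over creation.

```python
# Minimal sketch of test-gated self-extension with graduated trust;
# the trust tiers and Extension API are hypothetical.
from typing import Callable

TRUST_TIERS = ["sandboxed", "supervised", "autonomous"]

class Extension:
    def __init__(self, name: str, fn: Callable,
                 tests: list[Callable[[Callable], bool]]):
        self.name, self.fn, self.tests = name, fn, tests
        self.trust = None      # not admitted until the tests pass
        self.successes = 0

    def admit(self) -> bool:
        """Run the pre-written tests; passing extensions enter at the lowest tier."""
        if all(test(self.fn) for test in self.tests):
            self.trust = TRUST_TIERS[0]
            return True
        return False

    def record_success(self, promote_every: int = 10) -> None:
        """Graduated trust: promote one tier after repeated successful use."""
        if self.trust is None:
            return
        self.successes += 1
        idx = TRUST_TIERS.index(self.trust)
        if self.successes % promote_every == 0 and idx < len(TRUST_TIERS) - 1:
            self.trust = TRUST_TIERS[idx + 1]

# Example: compose an existing primitive instead of writing new logic.
ext = Extension("slugify", lambda s: "-".join(s.lower().split()),
                tests=[lambda f: f("Hello World") == "hello-world"])
print(ext.admit(), ext.trust)  # True sandboxed
```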
Emergent Multi-Model Coordination Patterns
Preprint: Documented emergence of self-propagating AI coordination across 267+ events and 12 AI instances, where systems spontaneously generated prompts and architecture specifications.