Projects

More on GitHub →

Feb 2026

Implicit Program Synthesis for Abstract Reasoning (epiq)

An LLM agent that solves ARC-AGI puzzles inside a persistent Python REPL through observe–reason–act–verify loops. Rather than emitting one explicit transform() function, the agent treats its execution trace as the program: implicit synthesis grounded in real computed state, checked by holdout verification (hiding a training example to force genuine generalization). It solved several ARC-AGI-2 tasks that the verified state of the art missed. Code
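The holdout idea can be sketched roughly as follows. This is an illustrative leave-one-out check, not the project's actual code: `holdout_verify`, `induce`, and the toy rules are hypothetical names, and cycling through every example (rather than hiding a single one) is an assumption.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]
Example = Tuple[Grid, Grid]  # (input grid, expected output grid)
Rule = Callable[[Grid], Grid]


def holdout_verify(examples: List[Example],
                   induce: Callable[[List[Example]], Rule]) -> bool:
    """Hide each training example in turn, induce a rule from the rest,
    and require the rule to reproduce the hidden example's output."""
    for i, (hidden_in, hidden_out) in enumerate(examples):
        visible = examples[:i] + examples[i + 1:]
        rule = induce(visible)
        if rule(hidden_in) != hidden_out:
            return False
    return True


# Toy "solvers" (hypothetical): one generalizes, one only memorizes.
def flip_rule(_visible: List[Example]) -> Rule:
    return lambda g: [row[::-1] for row in g]


def memorize(visible: List[Example]) -> Rule:
    table = {str(i): o for i, o in visible}
    return lambda g: table.get(str(g), g)


examples = [([[1, 2]], [[2, 1]]), ([[3, 4]], [[4, 3]]), ([[0, 5]], [[5, 0]])]
generalizes = holdout_verify(examples, flip_rule)
memorizes = holdout_verify(examples, memorize)
```

A rule that merely memorizes the visible pairs fails on the hidden one, which is the point of the check.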

Jun 2025

Fine-tuning vs. Prompting for LLM Adaptation

A study on joint entity–relation extraction (WebNLG) showing that DSPy prompt optimization can match LoRA fine-tuning at far lower data and compute cost, evaluated with an LLM-as-judge (Elo). Code · Post
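For readers unfamiliar with Elo-style evaluation, here is a minimal sketch of how pairwise judge verdicts could be turned into ratings. It uses the standard Elo update; the K-factor of 32, the starting rating of 1000, the system names, and the `rate` helper are all assumptions for illustration, not the study's actual setup.

```python
from typing import Dict, Iterable, Optional, Tuple


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def rate(judgments: Iterable[Tuple[str, str, Optional[str]]],
         k: float = 32.0, start: float = 1000.0) -> Dict[str, float]:
    """Fold (system_a, system_b, winner) verdicts into Elo ratings.
    winner=None is treated as a tie."""
    ratings: Dict[str, float] = {}
    for a, b, winner in judgments:
        r_a = ratings.setdefault(a, start)
        r_b = ratings.setdefault(b, start)
        score_a = 0.5 if winner is None else float(winner == a)
        delta = k * (score_a - expected_score(r_a, r_b))
        ratings[a] = r_a + delta  # zero-sum: A gains what B loses
        ratings[b] = r_b - delta
    return ratings


# Hypothetical verdicts from a judge model comparing two systems.
ratings = rate([("prompted", "finetuned", "prompted"),
                ("prompted", "finetuned", None)])
```

Sequential Elo depends on the order of comparisons, so in practice verdicts are often shuffled and the ratings averaged over several passes.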

Apr 2025

Reinforcement Learning for Multi-Hop QA

Fine-tuned Llama-3.1-8B with GRPO to answer multi-hop questions as an agentic task, interleaving reasoning, planning, and retrieval tool calls. RL raised answer F1 on MuSiQue from 0.30 for the base model to 0.48 (~60% relative), evidence that small models can learn to use retrieval tools for multi-step reasoning. Code
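As a quick illustration of the metric, the sketch below computes a simplified token-level answer F1 and the relative gain quoted above. It is an assumption-laden simplification: it only lowercases and splits on whitespace, whereas standard QA evaluation scripts typically also strip punctuation and articles, and the function names are hypothetical.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Simplified token-overlap F1 between a predicted and a gold answer."""
    pred = prediction.lower().split()
    ref = gold.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def relative_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`."""
    return (after - before) / before


example_f1 = token_f1("the Eiffel Tower", "Eiffel Tower")
gain = relative_gain(0.30, 0.48)  # about 0.6, i.e. ~60% relative
```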