huecki

Software, AI agents, messy notes and the occasional useful idea.

Currently on the bench

Currently building: Agent Buildprint

Agent Buildprint is my current main project: executable contracts for coding agents. It replaces the bare prompt-plus-spec handoff with phase-flow packets, evidence ledgers, review loops, and replay gates.

ACTIVE BUILD · phase-flow replay + evidence honesty

Agent Buildprint

Agents no longer start from a vague assignment. They bootstrap a selected-buildprint packet, read the phase-flow constitution, write schema-valid runtime evidence, and cannot sell blockers as success.

$ agb start
→ phase before code
→ evidence before trust
→ replay before done
PHASE-FLOW · EVIDENCE · REVIEWS · REPLAY
Open Buildprint registry →

AI Native Engineering

From prompt writer to AI system builder.

A self-paced learning path for developers who want to operate AI features, not just demo them — covering context budgets, Task Contracts, decomposition, evals, and fallbacks.

01

Tokens & Attention

Context windows, position effects, and lost-in-the-middle as real architecture constraints.

02

Context Engineering

Task Contracts, schemas, and source boundaries instead of longer prompts.

03

Agentic Delivery

Evals, traces, tool gates, and incident playbooks for operable AI features.

68 slides · self-paced · interactive

Becoming LLM-Native

Open the full learning path with the interactive slide deck, context models, Task Contracts, and operable AI-feature patterns.

Open AI Native Engineering →
· AI-first Engineering

Your AI-Built UI Needs a Playtester, Not a Screenshot Review

AI-generated interfaces often look finished before they behave correctly. A GUI playtester loop uses a separate browser agent to interact with the artifact, record screenshots and action logs, turn broken flows into reproducible bug reports, and rerun the same script after repairs.

Read article →
· AI-first Engineering

Stop Judging AI Code by the Diff

Better AI coding is not mainly about better prompts. It is about the harness around the model: explicit contracts, separate builder and reviewer roles, evidence requirements, and a loop that turns failures into better specifications.

Read article →
· AI Agent Workflows

Give Your Agent Seatbelts, Not a Longer Prompt

When an agent keeps jumping from planning to editing to testing at the wrong time, the fix is not usually another paragraph of system prompt. Put the workflow into explicit states, give each state a tiny tool policy, and make phase changes visible.
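As a rough illustration of the idea, here is a minimal Python sketch of explicit workflow states with a tiny tool policy per state and logged phase changes. The phase names, tool names, and allowed transitions are all assumptions made up for this example, not taken from the article or from any particular agent framework.

```python
from enum import Enum


class Phase(Enum):
    PLAN = "plan"
    EDIT = "edit"
    TEST = "test"


# Hypothetical tool policy: each phase permits only a small set of tools.
TOOL_POLICY = {
    Phase.PLAN: {"read_file", "search"},
    Phase.EDIT: {"read_file", "write_file"},
    Phase.TEST: {"read_file", "run_tests"},
}

# Hypothetical allowed transitions between phases.
TRANSITIONS = {
    Phase.PLAN: {Phase.EDIT},
    Phase.EDIT: {Phase.TEST},
    Phase.TEST: {Phase.EDIT, Phase.PLAN},
}


class Workflow:
    """Tracks the current phase and records every decision in a visible log."""

    def __init__(self) -> None:
        self.phase = Phase.PLAN
        self.log: list[str] = []

    def allow(self, tool: str) -> bool:
        # A tool call is allowed only if the current phase's policy lists it.
        ok = tool in TOOL_POLICY[self.phase]
        verdict = "allowed" if ok else "denied"
        self.log.append(f"{self.phase.value}: {tool} -> {verdict}")
        return ok

    def advance(self, target: Phase) -> None:
        # Phase changes are explicit, validated, and logged.
        if target not in TRANSITIONS[self.phase]:
            raise ValueError(
                f"illegal transition {self.phase.value} -> {target.value}"
            )
        self.log.append(f"phase change: {self.phase.value} -> {target.value}")
        self.phase = target
```

In this sketch the agent cannot edit while planning: `Workflow().allow("write_file")` is denied until `advance(Phase.EDIT)` has been called, and both the denial and the phase change show up in `log`.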

Read article →
· AI Agent Workflows

AI Agents Need Evidence Before They Click

When an agent clicks, sends, pays, deletes, or extracts data, the critical truth cannot live only in model prose. Put a small evidence gate before risky tool calls: predicate, evidence type, source, decision.
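One way to picture such a gate is the small Python sketch below. It uses the four fields named above (predicate, evidence type, source, decision); the class name, the set of trusted evidence types, and the example values are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    predicate: str      # the claim that must hold, e.g. "invoice total matches order"
    evidence_type: str  # how the claim was established, e.g. "api_response"
    source: str         # where it came from, e.g. a URL or a tool-call id


# Hypothetical: evidence types considered strong enough for a risky action.
TRUSTED_TYPES = {"api_response", "dom_snapshot", "user_confirmation"}


def gate(required_predicate: str, evidence: Optional[Evidence]) -> str:
    """Return the decision, "allow" or "block", for one risky tool call."""
    if evidence is None:
        return "block"  # model prose alone is not evidence
    if evidence.predicate != required_predicate:
        return "block"  # evidence answers a different question
    if evidence.evidence_type not in TRUSTED_TYPES:
        return "block"  # e.g. "model_assertion" is not accepted
    if not evidence.source:
        return "block"  # evidence must be traceable to a source
    return "allow"
```

The design choice in this sketch is that the gate defaults to "block": a risky call goes through only when every field is present and acceptable.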

Read article →
· AI Agent Workflows

Stop Asking AI to Critically Self-Check

Open-ended instructions like “critically self-check this” accidentally reward the model for producing criticism. The fix is not less review. It is calibrated review: explicit criteria, PASS_NO_CHANGE, evidence per finding, severity thresholds, and a tiny change budget.
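A calibrated review of this kind could be sketched in Python roughly as follows. Only `PASS_NO_CHANGE` comes from the text above; the `CHANGES_REQUESTED` verdict, the severity scale, and the default threshold and budget are assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical severity scale; higher means more serious.
SEVERITY = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Finding:
    criterion: str  # which explicit review criterion was violated
    severity: str   # "low", "medium", or "high"
    evidence: str   # quote, line reference, or failing test; empty if none


def review_verdict(findings, min_severity="medium", change_budget=2):
    """Return (verdict, findings_to_act_on) for one review pass."""
    threshold = SEVERITY[min_severity]
    # Keep only findings that carry evidence and meet the severity threshold.
    kept = [
        f for f in findings
        if f.evidence.strip() and SEVERITY[f.severity] >= threshold
    ]
    if not kept:
        # Finding nothing actionable is a valid, first-class outcome.
        return "PASS_NO_CHANGE", []
    # Most severe first, then cap the list at the change budget.
    kept.sort(key=lambda f: SEVERITY[f.severity], reverse=True)
    return "CHANGES_REQUESTED", kept[:change_budget]
```

Because an empty result maps to `PASS_NO_CHANGE`, the reviewer in this sketch is not rewarded for producing criticism that lacks evidence or falls below the threshold.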

Read article →
· AI-first Engineering

Your Onboarding Is Why Your Team Is Vibe Coding

Teams do not usually start vibe coding because developers became careless. They start because onboarding is broken: docs are stale, harnesses are undocumented, system knowledge lives in people’s heads, and AI turns missing context into plausible code and Markdown.

Read article →
· AI-first Engineering

The LLM-native developer needs more than prompts

The next developer skill is not writing clever prompts. It is building the operating system around LLMs: data quality, model versioning, evals, guardrails, incident response, review UX, and repo instructions agents can actually follow.

Read article →
· AI-first Engineering

Prompting Is Dead. Context Wins.

In 2026, good prompting is not about one magic sentence. The better approach is to curate context, define tools and schemas, set agent rules, and verify behavior with evals.

Read article →