Developer Productivity

Garry Tan's gstack Is the Same Play We've Been Running

YC's CEO open-sourced six Claude Code skills built on the Director/Manager/Engineer model. We've been running that exact workflow internally for weeks. Here's what that validates — and where we differ.

Nicolas
Founder & CEO
March 12, 2026 · 7 min read

We Wrote This Post Three Weeks Ago

We've been running the Director/Manager/Team model internally for weeks — and I wrote about it in “How I Use Claude Code to Ship Features in One Session”. The thesis: stop prompting like a developer, start directing like an executive. You're the Director of Engineering. Claude Code is your Engineering Manager. Skills and plugins are the development team.

Then Garry Tan — the CEO of Y Combinator — drops gstack: six Claude Code skills built on the exact same mental model. CEO review, Eng Manager review, Staff Engineer review, Release Engineer shipping. Same role hierarchy. Same idea that AI works best when you give it a clear organizational lane.

This isn't a coincidence. It's convergent evolution. The best developers are arriving at the same conclusion independently: role-based AI development works because it mirrors how real engineering organizations already operate. When the person running the world's most influential startup accelerator builds the same system you've been shipping with internally, that's not validation of a technique — it's confirmation of a pattern.

The gstack Breakdown

gstack ships six skills, each mapped to a role in an engineering org. Here's what each one does:

| Skill | Role | What It Does |
| --- | --- | --- |
| /plan-ceo-review | Founder / CEO | Rethink the problem. Find the 10-star product inside a boring ticket. |
| /plan-eng-review | Eng Manager | Architecture review, data flow analysis, edge cases, ASCII diagrams. |
| /review | Staff Engineer | Paranoid bug hunt. Not nitpicks — real production risks. |
| /ship | Release Engineer | Sync main, run tests, push, open PR. Fully automated. |
| /browse | QA Engineer | Browser automation with compiled Playwright. Visual QA with screenshots. |
| /retro | Eng Manager | Weekly retrospectives with commit analysis and improvement tracking. |
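For readers who haven't written a Claude Code skill before, here's a minimal sketch of what one of these might look like on disk. It assumes Claude Code's standard skill layout (a SKILL.md file with YAML frontmatter); the prompt wording is illustrative, not gstack's actual source:

```markdown
---
name: plan-ceo-review
description: Rethink a ticket at the product level before any technical planning.
---

Act as a founder reviewing this ticket. Before proposing an implementation:

1. Restate the user problem in one sentence.
2. Describe the 10-star version of this feature.
3. List what would make users tell other people about it.
```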

The standout is /plan-ceo-review. It's what Tan calls “Brian Chesky mode” — the idea that every ticket, no matter how mundane, has a 10-star product hiding inside it. Before you write a line of code, this skill forces you to rethink the problem at the product level. What would make this feature so good that users tell other people about it? That's a genuinely different framing than most developers bring to a Jira ticket.

/browse is clever too. It ships a compiled Playwright binary for browser automation, so Claude can take screenshots, click through flows, and visually verify that what it built actually renders correctly. It's QA without leaving the terminal.
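To make that concrete, here's a minimal sketch of the kind of visual QA pass a /browse-style skill performs, using Playwright's Python sync API. The flow name, URL, and selector are hypothetical, and this is my reconstruction of the idea, not gstack's actual implementation:

```python
def screenshot_name(flow: str, step: int) -> str:
    """Deterministic filenames so reruns overwrite stale screenshots."""
    return f"qa-{flow}-{step:02d}.png"

def verify_flow(url: str, flow: str = "signup") -> None:
    """Click through a flow and capture screenshots for visual review.
    Requires `pip install playwright` and `playwright install chromium`."""
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=screenshot_name(flow, 1))
        page.click("text=Sign up")  # hypothetical selector for the flow under test
        page.screenshot(path=screenshot_name(flow, 2))
        browser.close()
```

The screenshots are what close the loop: the agent can look at them and verify the feature actually renders, instead of trusting that the code it wrote is correct.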

And /ship does what it says — syncs with main, runs the full test suite, pushes, and opens a PR. No manual git dance. That's a workflow step I still do by hand, and I respect the automation.
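The "git dance" it replaces is a fixed sequence, which is exactly why it automates well. A sketch of the steps, as a plain ordered command list (the exact sequence and test runner gstack uses are assumptions; gh is GitHub's official CLI):

```python
def ship_commands(branch: str) -> list[str]:
    """The command sequence a /ship-style skill would run, in order.
    Assumed steps: sync with main, test, push, open a PR."""
    return [
        "git fetch origin main",
        f"git rebase origin/main {branch}",
        "npm test",  # assumed test runner; swap in your project's own
        f"git push -u origin {branch}",
        "gh pr create --fill",  # title and body filled from commits
    ]
```

Because each step can fail (rebase conflicts, red tests), running them in order with an abort-on-failure policy is the whole value: the PR only opens if everything before it passed.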

Where We Overlap — and Where We Don't

The mental model is identical. The implementation is different. Here's where our workflow and our toolkit line up with gstack, and where they diverge:

| Phase | Our Workflow | gstack |
| --- | --- | --- |
| Planning | You write the brief (Director mode) | /plan-ceo-review + /plan-eng-review |
| Building | /batch + feature-dev + frontend-design | (no dedicated build skill) |
| Review | /simplify (3 agents) | /review (paranoid staff engineer) |
| QA | Human testing + Chrome DevTools MCP | /browse (compiled Playwright binary) |
| Final gate | /code-review (4 agents, confidence scoring) | (covered by /review) |
| Shipping | Manual PR | /ship (automated) |
| Retro | (none) | /retro |

Planning: gstack splits planning into two modes — CEO-level product thinking and eng manager-level architecture review. We keep planning as one human-driven step: you write the brief, you define the acceptance criteria. gstack's approach is interesting because it forces a product-quality rethink before the technical plan. Ours is faster but requires you to bring that product thinking yourself.

Building: This is our biggest differentiation. We have dedicated build skills — feature-dev for architecture, frontend-design for UI — that guide Claude through structured multi-phase development. gstack doesn't have a dedicated build phase. The assumption is that Claude Code handles building natively, and gstack focuses on the planning and review that surrounds it.

Review: We separate review into two passes — /simplify for code quality (3 parallel agents) and /code-review for security and standards compliance (4 agents with confidence scoring). gstack bundles everything into one /review pass with a “paranoid staff engineer” persona. Different trade-off: we optimize for thoroughness, gstack optimizes for speed.
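To illustrate what confidence scoring buys you in a multi-agent pass, here's a minimal sketch of one way to aggregate findings across agents. The threshold and scoring scheme are illustrative assumptions, not our toolkit's actual logic:

```python
from collections import defaultdict

def surface(agent_reports: list[dict], threshold: float = 0.7) -> dict:
    """Average each finding's confidence across agents and report only
    findings the agents are collectively confident about. This filters
    out one agent's nitpick while keeping issues several agents flag."""
    scores = defaultdict(list)
    for report in agent_reports:
        for issue, confidence in report.items():
            scores[issue].append(confidence)
    return {
        issue: sum(confs) / len(confs)
        for issue, confs in scores.items()
        if sum(confs) / len(confs) >= threshold
    }
```

For example, if two of four agents flag a potential SQL injection at high confidence, it surfaces; a single agent's low-confidence naming complaint does not. That's the thoroughness-for-speed trade the extra pass pays for.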

QA and shipping: gstack wins here. /browse with a compiled Playwright binary is smarter than relying on manual testing or a separate Chrome DevTools MCP. And /ship automates the entire PR workflow that I still do by hand. Credit where it's due.

Retro: gstack has /retro for weekly retrospectives with commit analysis. We don't have an equivalent. It's a good idea — especially for teams shipping fast enough that the “what did we learn” step gets skipped.

Why This Pattern Is Winning

Here's the meta-point that matters more than any feature comparison: role-based AI development is emerging as the dominant workflow pattern. Not because anyone copied anyone. Because it maps to how real engineering organizations already work.

Directors direct. Managers manage. Engineers engineer. That organizational structure exists for a reason — it was refined over decades of building software at scale. The best AI workflows mirror that structure because the underlying problem hasn't changed: you need clear ownership, separation of concerns, and specialists who each own their lane.

When you prompt Claude Code like a peer — “hey, write me a function that does X” — you're fighting the tool. When you give it a role, a scope, and clear acceptance criteria, you're working with its strengths. gstack codifies this. Our workflow codifies this. The CLAUDE.md best practices we wrote about codify this. The pattern is everywhere because it works.
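What "a role, a scope, and clear acceptance criteria" looks like in practice is just a structured brief. A minimal sketch of the shape (the template wording is illustrative, not a fixed format from either toolkit):

```python
def brief(role: str, scope: str, criteria: list[str]) -> str:
    """Assemble a role-scoped brief to hand an AI coding agent.
    Role sets the lane, scope bounds the blast radius, and the
    checklist gives the agent something concrete to verify against."""
    checks = "\n".join(f"- [ ] {c}" for c in criteria)
    return (
        f"You are acting as {role}.\n"
        f"Scope: {scope}\n"
        f"Acceptance criteria:\n{checks}"
    )
```

Compare "write me a login function" against brief("a staff engineer", "the auth module only", ["existing tests pass", "rate limiting on failed attempts"]): the second gives the agent a lane and a definition of done.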

“The best AI workflows mirror organizational structure because that structure exists for a reason.”

The developers who figure this out first — whether through gstack, our toolkit, or their own homebrew system — are shipping at a fundamentally different velocity than those still treating Claude Code like autocomplete. The gap is only going to widen.

Try Both

These tools aren't competing. They're complementary. gstack excels at the planning and shipping phases — the CEO-level product rethink, the automated PR workflow, the browser-based QA. Our toolkit excels at the building and multi-pass review phases — structured feature development, dedicated UI skills, and layered code review with confidence scoring.

My recommendation: install gstack for /plan-ceo-review, /ship, and /browse. Use our build and review toolkit for the implementation phase. Run both through the Director/Manager/Team workflow. And make sure your CLAUDE.md is solid — both gstack and our tools are only as good as the context you give them.


Build Faster with the Right Workflow

Whether you use gstack, our toolkit, or both — the pattern is the same. See how document automation and e-signatures can accelerate your team.
