We Wrote This Post Three Weeks Ago
We've been running the Director/Manager/Team model internally for weeks — and I wrote about it in “How I Use Claude Code to Ship Features in One Session”. The thesis: stop prompting like a developer, start directing like an executive. You're the Director of Engineering. Claude Code is your Engineering Manager. Skills and plugins are the development team.
Then Garry Tan — the CEO of Y Combinator — drops gstack: six Claude Code skills built on the exact same mental model. CEO review, Eng Manager review, Staff Engineer review, Release Engineer shipping. Same role hierarchy. Same idea that AI works best when you give it a clear organizational lane.
This isn't a coincidence. It's convergent evolution. The best developers are arriving at the same conclusion independently: role-based AI development works because it mirrors how real engineering organizations already operate. When the person running the world's most influential startup accelerator builds the same system you've been shipping with internally, that's not validation of a technique — it's confirmation of a pattern.
The gstack Breakdown
gstack ships six skills, each mapped to a role in an engineering org. Here's what each one does:
| Skill | Role | What It Does |
|---|---|---|
| /plan-ceo-review | Founder / CEO | Rethink the problem. Find the 10-star product inside a boring ticket. |
| /plan-eng-review | Eng Manager | Architecture review, data flow analysis, edge cases, ASCII diagrams. |
| /review | Staff Engineer | Paranoid bug hunt. Not nitpicks — real production risks. |
| /ship | Release Engineer | Sync main, run tests, push, open PR. Fully automated. |
| /browse | QA Engineer | Browser automation with compiled Playwright. Visual QA with screenshots. |
| /retro | Eng Manager | Weekly retrospectives with commit analysis and improvement tracking. |
The standout is /plan-ceo-review. It's what Tan calls “Brian Chesky mode” — the idea that every ticket, no matter how mundane, has a 10-star product hiding inside it. Before you write a line of code, this skill forces you to rethink the problem at the product level. What would make this feature so good that users tell other people about it? That's a genuinely different framing from the one most developers bring to a Jira ticket.
/browse is clever too. It ships a compiled Playwright binary for browser automation, so Claude can take screenshots, click through flows, and visually verify that what it built actually renders correctly. It's QA without leaving the terminal.
And /ship does what it says — syncs with main, runs the full test suite, pushes, and opens a PR. No manual git dance. That's a workflow step I still do by hand, and I respect the automation.
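The post doesn't show gstack's internals, but the workflow it describes reduces to a short command pipeline. Here's a minimal Python sketch of what a /ship-style step might run — the `ship_pipeline` helper, the `npm test` step, and the use of `gh pr create` are all assumptions for illustration, not gstack's actual code:

```python
import subprocess

def ship_pipeline(branch: str) -> list[list[str]]:
    """Return the command sequence a /ship-style skill might run, in order."""
    return [
        ["git", "fetch", "origin", "main"],        # sync the local view of main
        ["git", "rebase", "origin/main"],          # replay the branch on top of main
        ["npm", "test"],                           # run the full test suite (project-specific)
        ["git", "push", "-u", "origin", branch],   # push the feature branch
        ["gh", "pr", "create", "--fill"],          # open a PR via the GitHub CLI
    ]

def ship(branch: str) -> None:
    # check=True aborts at the first failing step, so a red test suite never ships.
    for cmd in ship_pipeline(branch):
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Dry-run: print the pipeline instead of executing it.
    for cmd in ship_pipeline("feature/example"):
        print(" ".join(cmd))
```

The point of the sketch is the ordering: tests run after the rebase but before the push, so what you test is what lands on main.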
Where We Overlap — and Where We Don't
The mental model is identical. The implementation is different. Here's where our workflow and our toolkit line up with gstack, and where they diverge:
| Phase | Our Workflow | gstack |
|---|---|---|
| Planning | You write the brief (Director mode) | /plan-ceo-review + /plan-eng-review |
| Building | /batch + feature-dev + frontend-design | (no dedicated build skill) |
| Review | /simplify (3 agents) | /review (paranoid staff engineer) |
| QA | Human testing + Chrome DevTools MCP | /browse (compiled Playwright binary) |
| Final gate | /code-review (4 agents, confidence scoring) | (covered by /review) |
| Shipping | Manual PR | /ship (automated) |
| Retro | (none) | /retro |
Planning: gstack splits planning into two modes — CEO-level product thinking and eng manager-level architecture review. We keep planning as one human-driven step: you write the brief, you define the acceptance criteria. gstack's approach is interesting because it forces a product-quality rethink before the technical plan. Ours is faster but requires you to bring that product thinking yourself.
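For readers who haven't seen the earlier post, a Director-mode brief (the ticket and criteria below are invented for illustration) is just role, scope, and acceptance criteria:

```markdown
## Brief: Add CSV export to the reports page

**Role:** You are the engineering manager. Delegate design and implementation
to the appropriate skills; come back to me only with blocking questions.

**Scope:** Reports page only. Do not touch the billing or auth modules.

**Acceptance criteria:**
- Export respects the currently applied filters
- Exports over 10k rows stream rather than buffer in memory
- A failed export surfaces an error toast, not a silent no-op
```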
Building: This is where we differ most. We have dedicated build skills — feature-dev for architecture, frontend-design for UI — that guide Claude through structured multi-phase development. gstack doesn't have a dedicated build phase. The assumption is that Claude Code handles building natively, and gstack focuses on the planning and review that surround it.
Review: We separate review into two passes — /simplify for code quality (3 parallel agents) and /code-review for security and standards compliance (4 agents with confidence scoring). gstack bundles everything into one /review pass with a “paranoid staff engineer” persona. Different trade-off: we optimize for thoroughness; gstack optimizes for speed.
QA and shipping: gstack wins here. /browse with a compiled Playwright binary is smarter than relying on manual testing or a separate Chrome DevTools MCP. And /ship automates the entire PR workflow that I still do by hand. Credit where it's due.
Retro: gstack has /retro for weekly retrospectives with commit analysis. We don't have an equivalent. It's a good idea — especially for teams shipping fast enough that the “what did we learn” step gets skipped.
Why This Pattern Is Winning
Here's the meta-point that matters more than any feature comparison: role-based AI development is emerging as the dominant workflow pattern. Not because anyone copied anyone. Because it maps to how real engineering organizations already work.
Directors direct. Managers manage. Engineers engineer. That organizational structure exists for a reason — it was refined over decades of building software at scale. The best AI workflows mirror that structure because the underlying problem hasn't changed: you need clear ownership, separation of concerns, and specialists who each own their lane.
When you prompt Claude Code like a peer — “hey, write me a function that does X” — you're fighting the tool. When you give it a role, a scope, and clear acceptance criteria, you're working with its strengths. gstack codifies this. Our workflow codifies this. The CLAUDE.md best practices we wrote about codify this. The pattern is everywhere because it works.
“The best AI workflows mirror organizational structure because that structure exists for a reason.”
The developers who figure this out first — whether through gstack, our toolkit, or their own homebrew system — are shipping at a fundamentally different velocity than those still treating Claude Code like autocomplete. The gap is only going to widen.
Try Both
These tools aren't competing. They're complementary. gstack excels at the planning and shipping phases — the CEO-level product rethink, the automated PR workflow, the browser-based QA. Our toolkit excels at the building and multi-pass review phases — structured feature development, dedicated UI skills, and layered code review with confidence scoring.
My recommendation: install gstack for /plan-ceo-review, /ship, and /browse. Use our build and review toolkit for the implementation phase. Run both through the Director/Manager/Team workflow. And make sure your CLAUDE.md is solid — both gstack and our tools are only as good as the context you give them.
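As a reminder of what “solid” means here, a CLAUDE.md doesn't need to be long — clear context plus hard constraints goes a long way. A minimal sketch (the project details are invented):

```markdown
# Project context
- Next.js 14 app, TypeScript strict mode, pnpm for all package commands
- Tests: `pnpm test` (Vitest); run before any commit

# Conventions
- Server components by default; mark client components explicitly
- Never edit files under `src/generated/`
```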
Related Resources
Ship Features in One Session with Claude Code
The Director/Manager/Team workflow that gstack validates. Brief, delegate, review, ship — in 45-90 minutes.
Best Claude Code Plugins, Skills & MCP Servers
The 7 tools that power our build and review workflow. How they compare to gstack's approach.
How to Write a CLAUDE.md That Actually Works
The foundation that makes both gstack and our toolkit effective. Structure, progressive disclosure, and best practices.
The Developer's Guide to CLAUDE.md
Comprehensive reference covering file types, @imports system, structure patterns, and comparisons with other AI config files.
Build Faster with the Right Workflow
Whether you use gstack, our toolkit, or both — the pattern is the same. See how document automation and e-signatures can accelerate your team.
