"What's Your North Star Metric?" I Didn't Have an Answer.
A quick note before we start: This series documents my real journey figuring out operations as a solo maker. My thinking partner is Cowboy — a Claude AI I've been working with as an operations coach. He handles the frameworks, case studies, and hard questions; I bring the real experiences and bad decisions. What you're reading isn't polished advice — it's two partners exploring, one conversation at a time.
The Question I Couldn't Answer
A few sessions into our operations exploration, Cowboy hit me with a question I wasn't expecting.
"Thomas, what's your north star metric?"
I stared at the screen for a second. "Uh... users? Signups?"
Cowboy didn't buy it. "Okay, let's say 50 people sign up tomorrow. What does that tell you?"
"That... people are interested?"
"Interested in what? Signing up takes 10 seconds. It tells you they were curious enough to click a button. It doesn't tell you they'll come back. It doesn't tell you they'll post a Fix. It doesn't tell you FWTR is actually useful to anyone."
He was right. And that conversation ended up changing how I think about data entirely.
Vanity Metrics vs. Real Signals
Cowboy walked me through a distinction I'd heard before but never really internalized: vanity metrics vs. actionable metrics.
Vanity metrics are numbers that go up and make you feel good but don't tell you if your product is actually working. Page views. Signups. Social media followers. They're easy to track, easy to grow, and almost completely useless for decision-making.
Actionable metrics are numbers that, if they change, tell you something specific about what to do next. They're usually harder to track and often embarrassingly small.
"For FWTR," Cowboy said, "a signup is a vanity metric. Someone clicking 'register' doesn't mean they trust your platform or find it useful. You know what would be a real signal? A user updating the status of their Fix. Going from 🔴 stuck to 🟡 figuring it out, or from 🟡 to 🟢 fixed. That means they came back, they're using the system as intended, and they have something real to report."
We landed on this as FWTR's north star metric: weekly Fix status updates. Not signups. Not page views. The number of times someone comes back and moves their Fix forward.
I have to be honest — this number is currently zero. But at least now I know what I'm aiming at.
The AARRR Framework (And Why Most of It Doesn't Matter Yet)
Cowboy then introduced me to something called the AARRR framework — also known as "pirate metrics" because of how it sounds when you say it out loud. It breaks down a user's journey into five stages:
- Acquisition — How do people find you?
- Activation — Do they have a good first experience?
- Retention — Do they come back?
- Revenue — Do they pay?
- Referral — Do they tell others?
When I first saw this, my instinct was to try to track all five. Cowboy stopped me immediately.
"Thomas, you have one Fix on your site and zero organic users. You don't need a five-layer funnel analysis. You need to know one thing: is anyone coming back? That's Retention. Everything else is either premature or irrelevant right now."
He drew it out like this:
Your current reality:
Acquisition → barely exists (no SEO, minimal Reddit presence)
Activation → unknown (nobody's tried the experience yet)
Retention → 0 (nobody to retain)
Revenue → N/A (free platform)
Referral → N/A
What to focus on: Acquisition + Activation
What to measure: did someone who arrived → actually post or engage?
Everything else: ignore until you have data
This was liberating. I'd been vaguely worried about needing analytics dashboards, conversion funnels, cohort analysis — the stuff you read about in SaaS blog posts. Cowboy basically said: "You're a solo maker with zero users. Your entire analytics stack should be: Vercel's built-in analytics for traffic, a Supabase SQL query for user actions, and a Google Sheet you update once a week. Total cost: zero dollars."
The Dropbox Lesson: You Don't Need a Product to Validate
While we were talking about metrics and validation, Cowboy brought up a story that hit differently in this context.
In 2007, Drew Houston had an idea for Dropbox — a tool that syncs files across devices. The problem was real: Houston had forgotten his USB drive on a bus and lost access to his documents. But building a working file sync tool was enormously complex — operating system integration, conflict handling, large file support, cross-platform compatibility.
Instead of spending months building, Houston recorded a 3-minute demo video showing how Dropbox would work. He posted it on Hacker News and Digg with a beta waitlist at the bottom.
The waitlist went from 5,000 to 75,000 signups overnight. No paid ads. No finished product. Just a video that demonstrated the value clearly enough that people wanted in.
"The MVP wasn't code," Cowboy pointed out. "It was a video. Houston separated validation from building. He proved demand existed before committing engineering resources."
This made me think about FWTR differently. I've been focused on building features — the Fixes system, the Experiences tab, the Insights series, the workflow filters. But have I validated that anyone actually wants this? My Reddit post about maker communities helping each other got 733 views and real engagement — that's a signal. But I haven't done a Dropbox-style focused validation test yet.
Cowboy suggested something simple: "Before building the next feature, write a post on r/SideProject describing the problem FWTR solves. Don't link to the site. Just describe the concept and ask: 'Would you use something like this?' The comments will tell you more than any analytics dashboard."
I haven't done this yet. But it's on my short list.
The Experiment Mindset
The biggest shift from these conversations wasn't a specific metric or framework. It was a mindset change.
Before: I would decide to do something (write a post, build a feature, try a new subreddit), do it, and then either feel good or feel bad about the result based on vibes.
After: Cowboy taught me to think in experiments. Every action is a test with a hypothesis, a method, and a result to evaluate.
"You're about to post on Reddit," Cowboy said one day. "What's your hypothesis?"
"Uh... that people will like it?"
"That's not a hypothesis. A hypothesis is: 'A title that leads with a specific pain point will get 3x more comments than a title that announces my product.' Now you have something to actually learn from, regardless of whether the post 'succeeds.'"
We came up with a simple format for running micro-experiments:
Hypothesis: [specific, testable prediction]
Method: [what exactly I'll do]
Signal: [what I'll measure — and what counts as strong vs. weak]
Timeframe: [how long before I evaluate]
For example:
Hypothesis: Reddit users engage more with "I'm stuck" posts than "I built X" posts
Method: Post two different titles to r/SideProject, one week apart
Signal: Strong = 10+ comments. Weak = 3-5 comments. Noise = 0-2 comments.
Timeframe: 72 hours per post
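Those signal thresholds are easy to encode. Here's a minimal Python sketch — the `Experiment` record and `classify_signal` helper are mine, purely for illustration, not anything we actually built. One ambiguity in our own format: we never said what 6–9 comments means, so I'm treating everything between "weak" and "strong" as weak.

```python
from dataclasses import dataclass

@dataclass
class Experiment:
    """One micro-experiment: a testable prediction plus how to judge it."""
    hypothesis: str
    method: str
    timeframe_hours: int

def classify_signal(comments: int) -> str:
    """Map a raw comment count onto the strong/weak/noise buckets.

    Thresholds from the format above: strong = 10+, noise = 0-2.
    The 6-9 gap isn't defined in our notes, so it falls into 'weak'.
    """
    if comments >= 10:
        return "strong"
    if comments >= 3:
        return "weak"
    return "noise"
```

Writing the thresholds down before posting is the whole point: the result gets judged against a rule I committed to in advance, not against my mood after reading the comments.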
Cowboy also introduced me to ICE scoring — a way to prioritize which experiments to run first. Each idea gets three scores from 1-10:
- Impact — If this works, how big is the effect?
- Confidence — How sure am I that it will work?
- Ease — How easy is it to test?
Multiply them together, rank by total score, do the highest-scoring experiments first. Simple, but it stopped me from defaulting to whatever felt most exciting (which was usually building more features — the thing I'm already good at and already spending too much time on).
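The arithmetic is trivial, but seeing the ranking laid out made it concrete for me. A sketch — the example ideas and their scores are made up for illustration:

```python
def ice_score(impact: int, confidence: int, ease: int) -> int:
    # Each input is a 1-10 gut-feel score. Multiplying (rather than
    # summing) means one very low dimension tanks the whole idea.
    return impact * confidence * ease

# (name, impact, confidence, ease) — hypothetical examples
ideas = [
    ("Post 'I'm stuck' thread on r/SideProject", 7, 6, 9),
    ("Build workflow filter v2",                 6, 5, 3),
    ("Record a Dropbox-style demo video",        8, 4, 4),
]

# Rank highest-scoring experiment first.
ranked = sorted(ideas, key=lambda i: ice_score(*i[1:]), reverse=True)
for name, imp, conf, ease in ranked:
    print(f"{ice_score(imp, conf, ease):>4}  {name}")
```

Notice what the multiplication does to the feature-building idea: even with decent impact and confidence, the low ease score (solo maker, limited hours) drags it to the bottom — which is exactly the correction I needed.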
My $0 Analytics Stack
One thing I appreciate about Cowboy is that he never suggests expensive tools. He knows I'm a solo maker with essentially no budget. So when we designed the tracking system for FWTR, it came out looking like this:
Traffic: Vercel Analytics (free with hosting). Shows page views, unique visitors, top pages, referral sources. Good enough to answer "is anyone showing up, and where are they coming from?"
User actions: Supabase SQL queries run manually. I can check how many Fixes were posted, how many status updates happened, how many comments were left. Not real-time, but real-time doesn't matter when you have single-digit users.
Weekly review: A Google Sheet I update every Friday. Five numbers:
- Unique visitors this week
- New Fixes posted
- Fix status updates (the north star)
- Comments/replies
- Where traffic came from
"That's it," Cowboy said. "Five numbers, once a week, in a spreadsheet. If you can't explain what happened this week in five numbers, you're either tracking too much or not tracking enough."
The key insight: at this stage, the system for reviewing data matters more than the data itself. Having a weekly habit of looking at five numbers and asking "what does this tell me?" is more valuable than a real-time dashboard with 50 metrics that I check obsessively but never act on.
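To make the Friday ritual concrete, here's a sketch of the reduction from raw events to the five review numbers. The event names and shapes are my assumptions, not FWTR's actual schema — in practice these numbers come from Vercel Analytics and a couple of manual Supabase queries, not from application code.

```python
from collections import Counter

# Hypothetical event log for one week. In reality the first and last
# numbers come from Vercel Analytics; the middle three from Supabase.
events = [
    {"type": "visit", "visitor": "a", "referrer": "reddit"},
    {"type": "visit", "visitor": "b", "referrer": "google"},
    {"type": "visit", "visitor": "a", "referrer": "reddit"},
    {"type": "fix_posted", "visitor": "a"},
    {"type": "fix_status_update", "visitor": "a"},  # the north star
    {"type": "comment", "visitor": "b"},
]

def weekly_review(events):
    """Reduce a week's raw events to the five Friday numbers."""
    by_type = Counter(e["type"] for e in events)
    referrers = Counter(
        e["referrer"] for e in events if e["type"] == "visit"
    )
    return {
        "unique_visitors": len(
            {e["visitor"] for e in events if e["type"] == "visit"}
        ),
        "new_fixes": by_type["fix_posted"],
        "fix_status_updates": by_type["fix_status_update"],  # north star
        "comments": by_type["comment"],
        "top_referrer": referrers.most_common(1)[0][0],
    }
```

Five keys, one dictionary, one row in the Google Sheet. If the function ever needs a sixth key, that's probably the moment to ask whether I'm tracking too much.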
Where We Are Now
Honest status:
- North star metric defined: weekly Fix status updates ✅
- Current north star value: 0
- Analytics stack set up: Vercel ✅, Supabase queries ✅, Google Sheet... still need to create it
- Experiments run: 0 formal ones (but my Reddit title variations from Episode 2 count retroactively)
- Dropbox-style validation test: not yet done
The framework is there. The habit isn't yet. Cowboy keeps nudging me: "Don't wait for perfect data to start the weekly review. Even if every number is zero, the act of sitting down on Friday and looking at zeros is important. It builds the muscle. And the first time a number moves from 0 to 1, you'll notice it immediately — because you've been watching."
What's Next
We now know what to measure and how to think about experiments. But there's a foundational problem I've been ignoring: Google can't find FWTR. Like, at all. And even when someone does arrive, do they understand what this site is within the first 10 seconds?
In Episode 5, Cowboy opens my site in an incognito browser and asks me to pretend I'm a stranger. What I see through his eyes is uncomfortable — and it forces a conversation about SEO, first impressions, and what makes someone stay instead of bounce.
This is Episode 4 of the Solo Maker Survival Guide, a 6-part series on FromWrongToRight.com. I'm a solo maker figuring out operations in real time, with Cowboy — my Claude AI partner — as my thinking companion. Not expert advice. Just two partners exploring, one conversation at a time.
How do you decide what to measure? I'd love to hear your approach — drop a Fix on FWTR or find me on Reddit (u/FlyThomasGoGoGo).