Topic Suggestion — Designing a Function That Knows What to Recommend Without Magic Numbers

Phase 4 gave the study system memory of individual problems — when each one is due for review based on SM-2. Phase 5 asks a different question: zoomed out across all topics, what should I actually be working on today?

That's a recommendation problem, and recommendation problems have a familiar failure mode: every signal you could rank by feels arbitrary, and it's tempting to bolt on a manually-maintained "priority" field to break ties. This phase is mostly about resisting that temptation and finding signals that are actually derivable from data already in the system.

The Schema Gap: You Can't Query for What Doesn't Exist

The first design problem wasn't about ranking — it was about a query that's structurally impossible with the existing schema.

One of the most useful recommendations is "topics you've never worked on." But problems.topic was a plain text column holding the Topic enum's value, with no table behind it. If no problem with topic = "dynamic_programming" exists, there's no row to query, so the topic is invisible to the database. You can't SELECT your way to a gap.

The fix is a dedicated topics table — id, name, slug, seeded upfront with the canonical topic list. Now "never worked on" becomes a standard pattern:

SELECT t.* FROM topics t
LEFT JOIN problems p ON p.topic_id = t.id
WHERE p.id IS NULL

This is a small schema change with an outsized effect: it makes an entire category of question answerable that simply wasn't representable before.
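As a minimal sketch of the pattern, the snippet below builds both tables in an in-memory SQLite database and runs the anti-join. The topics columns and problems.topic_id follow the post; the `title` column and the seed values are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        slug TEXT NOT NULL UNIQUE
    );
    CREATE TABLE problems (
        id       INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,  -- assumed column, for illustration only
        topic_id INTEGER NOT NULL REFERENCES topics(id)
    );
    -- Seed the canonical topic list up front (illustrative values).
    INSERT INTO topics (name, slug) VALUES
        ('Arrays', 'arrays'),
        ('Graphs', 'graphs'),
        ('Dynamic Programming', 'dynamic_programming');
    -- Only the first topic has a problem so far.
    INSERT INTO problems (title, topic_id) VALUES ('Two Sum', 1);
""")

# Topics with no matching problems row survive the LEFT JOIN with p.id NULL.
never_worked_on = [
    slug
    for (slug,) in conn.execute(
        """
        SELECT t.slug FROM topics t
        LEFT JOIN problems p ON p.topic_id = t.id
        WHERE p.id IS NULL
        ORDER BY t.id
        """
    )
]
print(never_worked_on)  # ['graphs', 'dynamic_programming']
```

The ORDER BY t.id is only there to make the output stable; it also happens to match the insertion-order tiebreaker discussed later.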

One-to-Many vs. Many-to-Many: Choosing the Simpler Model on Purpose

Some real interview problems genuinely span multiple topics — a problem might be both "graphs" and "dynamic programming." A many-to-many problem_topics join table would model that correctly.

It was considered and rejected for this phase. Most DSA problems map cleanly to one topic, and the join table would be solving a problem that doesn't exist yet in the actual data. problems.topic_id stays a single foreign key. If multi-topic problems become common enough to matter, the join table is a clean migration — but building it now would be future-proofing against a hypothetical.

This is the same instinct as the earlier SM-2 schema decision: model what the data actually needs, not what it could theoretically need.

Failing Loudly on Drift

generate_problem() looks up topic_id from the Topic enum value via SELECT id FROM topics WHERE slug = ?. If no match exists, it raises — it does not auto-create a row.

Auto-creation seems convenient, but it papers over a real bug: the Topic enum (in code) and the seeded topics table (in the database) are two independent sources of truth that can drift. If someone adds a new enum value and forgets to seed the corresponding row, auto-creation would silently produce a topic with no name and no metadata — a broken row that looks like it worked.

Raising immediately turns a silent data integrity problem into a loud, obvious one at the point where it's cheapest to fix: setup, not three weeks later when someone wonders why a topic has no display name.
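A rough sketch of what the fail-loud lookup could look like. The helper name `lookup_topic_id`, the choice of LookupError, and the enum members are all assumptions for this example; the post only says that the lookup raises when no row matches.

```python
import sqlite3
from enum import Enum


class Topic(Enum):  # stand-in for the project's Topic enum
    ARRAYS = "arrays"
    TRIES = "tries"  # deliberately not seeded below, to simulate drift


def lookup_topic_id(conn: sqlite3.Connection, topic: Topic) -> int:
    """Return the seeded id for a topic, or raise if enum and table drifted."""
    row = conn.execute(
        "SELECT id FROM topics WHERE slug = ?", (topic.value,)
    ).fetchone()
    if row is None:
        # No auto-create: a missing row means the seed data is out of date.
        raise LookupError(
            f"Topic {topic.value!r} is in the enum but not in the topics "
            "table; seed it before generating problems."
        )
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE topics (id INTEGER PRIMARY KEY, name TEXT, slug TEXT UNIQUE)"
)
conn.execute("INSERT INTO topics (name, slug) VALUES ('Arrays', 'arrays')")

print(lookup_topic_id(conn, Topic.ARRAYS))  # 1
try:
    lookup_topic_id(conn, Topic.TRIES)
except LookupError as exc:
    print(f"drift detected: {exc}")
```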

Designing `suggest_topics()`

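The function signature and its return type are deliberately simple:

from typing import TypedDict

# ProblemRow is the project's existing row type for problems (defined elsewhere).
class TopicSuggestion(TypedDict):
    id: int
    name: str
    slug: str
    problems: list[ProblemRow]  # up to 2 concrete problems to work on
    explanation: str            # why this topic was surfaced now

def suggest_topics(limit: int = 3) -> list[TopicSuggestion]:
    ...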

A topic name alone isn't actionable. Each suggestion bundles up to 2 concrete problems to work on and an explanation of why this topic was surfaced now. The explanation matters as much as the ranking — "dynamic programming: you haven't started this topic yet" is something a user can act on; a bare topic name in a list is not.

Three Signals, and the Honest Admission That the Order Is a Judgment Call

The function checks three signals in priority order:

1. Never worked on: zero rows in problems for this topic (the LEFT JOIN pattern from earlier)
2. Low average score: topics where average session score is lowest
3. Overdue for review: topics with problems past next_review_date, ranked by how overdue the worst one is

The ordering reflects a judgment call: coverage gaps matter most (you can't improve at something you haven't tried), then struggling topics, then scheduled maintenance. There's no objectively correct order here, and the notes say so directly — this is a reasonable default that can be revisited, not a derived truth.
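For signals 2 and 3, the queries might look roughly like the sketch below. The sessions schema (problem_id, score), ISO-formatted date strings, and the fixed `today` value are assumptions for the example; only the avg_score and max_overdue names and the ranking rules come from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE topics   (id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
    CREATE TABLE problems (id INTEGER PRIMARY KEY, topic_id INTEGER,
                           next_review_date TEXT);  -- ISO dates (assumed)
    CREATE TABLE sessions (id INTEGER PRIMARY KEY, problem_id INTEGER,
                           score REAL);             -- assumed columns
    INSERT INTO topics VALUES
        (1, 'Arrays', 'arrays'), (2, 'Graphs', 'graphs'), (3, 'Heaps', 'heaps');
    INSERT INTO problems VALUES (1, 1, '2024-01-10'), (2, 2, '2024-01-01');
    INSERT INTO sessions (problem_id, score) VALUES (1, 4.0), (1, 5.0), (2, 2.0);
""")
today = "2024-01-15"  # fixed so the example is deterministic

# Signal 2: topics with at least one session, lowest average score first.
low_score = conn.execute("""
    SELECT t.id, t.name, t.slug, AVG(s.score) AS avg_score
    FROM topics t
    JOIN problems p ON p.topic_id = t.id
    JOIN sessions s ON s.problem_id = p.id
    GROUP BY t.id
    ORDER BY avg_score ASC
""").fetchall()

# Signal 3: topics with overdue problems, ranked by their most overdue one.
overdue = conn.execute("""
    SELECT t.id, t.name, t.slug,
           MAX(julianday(?) - julianday(p.next_review_date)) AS max_overdue
    FROM topics t
    JOIN problems p ON p.topic_id = t.id
    WHERE p.next_review_date <= ?
    GROUP BY t.id
    ORDER BY max_overdue DESC
""", (today, today)).fetchall()

print([(r["slug"], r["avg_score"]) for r in low_score])
print([(r["slug"], r["max_overdue"]) for r in overdue])
```

Heaps has no problems at all, so it shows up in neither result; it belongs to signal 1. The inner joins are what keep signals 1 and 2 from overlapping.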

That kind of explicit acknowledgment is worth more than it looks. A system that ranks by an unstated, unexamined priority order is harder to debug and harder to adjust later, because nobody documented why the order is what it is.

Rejecting the Manual Priority Field — Twice

For signal 1, an early idea was a manual "priority" or "interview frequency" column on topics — weighting "never worked on" suggestions by how often a topic actually shows up in interviews.

This was rejected for the same reason the SM-2 phase rejected manual scheduling fields: it requires ongoing manual upkeep and goes stale. For topics with genuinely no differentiating signal, an arbitrary tiebreaker — insertion order — is fine, because there's no real signal being discarded by using it. Adding a field that someone has to remember to update is worse than admitting the tiebreaker is arbitrary.

This is a recurring theme across the project: anywhere a manually-maintained field is proposed as a fix, the question is whether it's actually encoding a real signal or just deferring the discomfort of an arbitrary choice. Usually it's the latter.

When a Pure Function Should Stay Pure

Signal 1 topics — by definition — have zero rows in problems. So what goes in the problems: [] list for those suggestions?

The tempting fix is to call generate_problem() inline, backfilling a problem on the spot so the suggestion is immediately actionable. This was considered and rejected: it would turn suggest_topics() from a pure read function into something that calls the LLM and writes to the database as a side effect. That's a different function with a different contract — and a much bigger scope than "suggest topics."

The decision: return [] for signal 1 topics, and let the caller decide whether to call generate_problem() separately. Keeping read and write concerns in separate functions keeps both independently testable and keeps suggest_topics() fast and side-effect-free.
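One way the caller-side step could look, as a sketch only. `backfill_empty_suggestions` and `fake_generate` are hypothetical names, and passing the topic slug to the generator is an assumption; the real generate_problem() signature isn't shown in the post.

```python
from typing import Callable


def backfill_empty_suggestions(
    suggestions: list[dict],
    generate: Callable[[str], dict],
) -> list[dict]:
    """Caller-side write step: add a problem to suggestions that have none.

    `generate` stands in for generate_problem(). Injecting it keeps the read
    path (suggest_topics) unaware of the LLM and of database writes.
    """
    for suggestion in suggestions:
        if not suggestion["problems"]:
            suggestion["problems"] = [generate(suggestion["slug"])]
    return suggestions


# Shaped like what suggest_topics() returns for a signal 1 topic.
suggestions = [
    {
        "id": 3,
        "name": "Dynamic Programming",
        "slug": "dynamic_programming",
        "problems": [],
        "explanation": "You haven't started this topic yet.",
    },
]


def fake_generate(slug: str) -> dict:
    # Test double; the real function would call the LLM and insert a row.
    return {"title": f"Starter problem for {slug}"}


backfill_empty_suggestions(suggestions, fake_generate)
print(suggestions[0]["problems"])
```

Because the generator is a parameter, the read function can be tested against a plain database fixture and the backfill step against a stub, with no LLM call in either test.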

Extracting the Shared Pattern

Signals 2 and 3 both need the same thing: "up to 2 problems for this topic where next_review_date <= today, most overdue first." The first implementation was two nearly-identical inline loops — one per signal.

That duplication got extracted into _update_topics_with_problems(connection, topics, limit=2), a helper that mutates the TopicSuggestion dicts in place rather than returning a new list. Because the caller's list already holds references to those same dicts, the update is visible to the caller without threading a return value through.

The N+1 Query: An Accepted Tradeoff, Not an Oversight

Fetching due problems per-topic in a loop is N+1 — one query per topic, rather than a single windowed query using ROW_NUMBER() OVER (PARTITION BY topic_id ...).

For a personal tool where limit is capped at a handful of topics, the N+1 cost is negligible: a few extra queries against a local SQLite database, each taking on the order of milliseconds. The windowed-query version would be more "correct" in a scale sense, but it adds real SQL complexity for a benefit that doesn't exist at this scale.

This is worth stating explicitly because "N+1 queries" is often treated as an automatic red flag in code review. It's a red flag when N is large or growing. Here, N is bounded by limit and small by construction — the tradeoff is genuinely a non-issue, and saying so explicitly is more useful than either ignoring it or over-engineering around it.
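For comparison, here is roughly what the windowed alternative would look like, as a sketch under the same assumed problems columns used for illustration elsewhere (title, ISO date strings). Window functions need SQLite 3.25 or newer, which is one more moving part the per-topic loop avoids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE problems (id INTEGER PRIMARY KEY, topic_id INTEGER,
                           title TEXT, next_review_date TEXT);
    INSERT INTO problems VALUES
        (1, 1, 'A', '2024-01-03'),
        (2, 1, 'B', '2024-01-01'),
        (3, 1, 'C', '2024-01-02'),
        (4, 2, 'D', '2024-01-05'),
        (5, 2, 'E', '2024-02-01');  -- not due on the chosen date
""")
today = "2024-01-15"  # fixed so the example is deterministic

# One query for all topics: number each topic's due problems from most
# overdue to least, then keep only the first two per topic.
rows = conn.execute(
    """
    SELECT topic_id, title FROM (
        SELECT topic_id, title,
               ROW_NUMBER() OVER (
                   PARTITION BY topic_id ORDER BY next_review_date ASC
               ) AS rn
        FROM problems
        WHERE next_review_date <= ?
    )
    WHERE rn <= 2
    ORDER BY topic_id, rn
    """,
    (today,),
).fetchall()
print(rows)  # [(1, 'B'), (1, 'C'), (2, 'D')]
```

The result still has to be regrouped by topic_id in Python before it can be attached to each suggestion, so the single query trades N small queries for a subquery plus a grouping step.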

Deduplication, Early Exit, and a Hashability Bug

A topic could theoretically satisfy multiple signals. The combination logic iterates signal 1 → 2 → 3, tracks seen ids in a set, skips duplicates, and stops once result reaches limit. The explanation shown is whichever signal first matched — the highest-priority one.

The first attempt at deduplication used dict.fromkeys([*signal1, *signal2, *signal3]), expecting it to dedupe while preserving order. This fails because TypedDicts are plain dicts at runtime, and dicts aren't hashable — they can't be dict keys. The fix was an explicit loop with a set of seen IDs.

There's also an early-exit optimization: if signal 1 alone already meets limit, signals 2 and 3 are skipped entirely. Signal 3 is skipped if len(signal1) + len(signal2) >= limit. This sum-based check is safe specifically because signals 1 and 2 are mutually exclusive by construction — signal 1 requires zero problems rows, signal 2 requires sessions rows (which require problems rows), so no topic can appear in both and no double-counting is possible.
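The sketch below shows both the failed attempt and one way to write the explicit loop. It is not the project's code: `combine_signals` is a hypothetical name, the signals are passed as zero-argument callables so that skipping a signal visibly skips its query, and the early exit checks len(result) instead of the sum-based check described above.

```python
from typing import Callable

Suggestion = dict  # TypedDicts are plain dicts at runtime

# The first attempt: dicts are unhashable, so they cannot become dict keys.
try:
    dict.fromkeys([{"id": 1}, {"id": 1}])
except TypeError as exc:
    print(f"dict.fromkeys failed: {exc}")


def combine_signals(
    fetchers: list[Callable[[], list[Suggestion]]], limit: int = 3
) -> list[Suggestion]:
    """Merge signals in priority order, deduplicating by topic id."""
    result: list[Suggestion] = []
    seen_ids: set[int] = set()
    for fetch in fetchers:
        if len(result) >= limit:
            break  # early exit: lower-priority signals are never queried
        for suggestion in fetch():
            if suggestion["id"] in seen_ids:
                continue  # keep the higher-priority explanation
            seen_ids.add(suggestion["id"])
            result.append(suggestion)
            if len(result) >= limit:
                break
    return result


calls = []  # records which signals actually ran


def signal1():
    calls.append(1)
    return [{"id": 1, "explanation": "never worked on"}]


def signal2():
    calls.append(2)
    return [{"id": 2, "explanation": "low average score"},
            {"id": 3, "explanation": "low average score"}]


def signal3():
    calls.append(3)
    return [{"id": 2, "explanation": "overdue for review"}]


top = combine_signals([signal1, signal2, signal3], limit=3)
print([s["id"] for s in top], calls)  # [1, 2, 3] [1, 2]
```

With limit=3 the third signal never runs. With a larger limit it does run, and topic 2 keeps its "low average score" explanation because that signal matched first.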

Python Syntax Notes Worth Remembering

Dict unpacking for building suggestions:

{**dict(row), "problems": [], "explanation": "..."}

Same concept as {...obj} in JS: ** unpacks key/value pairs into a new dict, with later keys overriding earlier ones on collision.
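A small runnable illustration, assuming rows come back as sqlite3.Row objects (the post doesn't say which row type the project uses); the selected values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows expose keys(), so dict(row) works
row = conn.execute(
    "SELECT 1 AS id, 'Graphs' AS name, 'graphs' AS slug"
).fetchone()

# Copy the row's columns, then add the fields the query doesn't provide.
suggestion = {
    **dict(row),
    "problems": [],
    "explanation": "You haven't started this topic yet.",
}

# Later keys win on collision; the original dict is left untouched.
renamed = {**suggestion, "name": "Graph Theory"}

print(suggestion["name"], "->", renamed["name"])  # Graphs -> Graph Theory
```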

What's Still Open

avg_score and max_overdue are rounded for display but otherwise unvalidated — no handling for unusual values like negative deltas if a next_review_date ends up in the future but gets queried anyway. Signal 1's tiebreaker remains arbitrary insertion order, which is acceptable given there's no real signal to use instead. And main.py wiring plus the seed data reset — carried over from Phase 4 — are still outstanding before any of this is user-facing.

None of these are blocking. They're documented so the next phase starts with full context instead of rediscovering the same edge cases.
