The Best AI Tutor Refuses to Answer Your Question
A Wisconsin experiment found that an AI chatbot designed to ask questions instead of giving answers produced the highest exam scores — but only when paired with peer discussion.
Three economists at the University of Wisconsin-La Crosse built an AI chatbot that refuses to answer students' questions. When a student asks why lower prices increase spending, it doesn't explain. It asks: "What happens to your purchasing power when prices fall?" Then it waits.
The chatbot is called Macro Buddy. And it just produced the highest exam scores in the class.
The experiment nobody expected
Ninety percent of U.S. college students already use generative AI — for drafting assignments, clarifying concepts, and sometimes just getting answers fast. The debate's been stuck on a binary: ban AI or allow it. Macro Buddy sidesteps that entirely. What if AI's job isn't to know things, but to make you know things?
The researchers — Saharnaz Nedjat-Haiem, Shishir Shakya, and their colleague at UW-La Crosse — took 140 undergraduates in a macroeconomics course and split them into four groups after their first exam. One group studied alone. One studied in groups. One studied alone with Macro Buddy. One studied in groups with Macro Buddy.
Same course materials. Same exams. In-person, closed-book, no AI allowed during testing. Whatever students learned, they had to prove on their own.
The Socratic chatbot
Macro Buddy runs on ChatGPT's custom GPT feature with one critical modification: web access is turned off. It can only draw from the instructor's lecture transcripts, slides, and homework questions. No Wikipedia rabbit holes. No outside sources. Just the course material, delivered through questions instead of answers.
This is the Socratic method, automated. A student types a question. Macro Buddy responds with a better question. The student has to connect concepts, explain their reasoning, build the answer themselves — step by step, in their own words.
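The setup described above, a chat model restricted to course materials and instructed to respond with questions rather than answers, amounts to a system prompt plus a closed corpus. Here is a minimal sketch of that idea, not Macro Buddy's actual configuration: the prompt wording and the `socratic_messages` helper are illustrative assumptions, shaped as the message payload a standard chat-completion API would receive.

```python
# Sketch of a Socratic tutor configuration: the model is told to draw only
# on supplied course material and to reply with a guiding question rather
# than a direct answer. The prompt text here is hypothetical.
SOCRATIC_PROMPT = (
    "You are a Socratic tutor for an undergraduate macroeconomics course. "
    "Use ONLY the course material provided below; do not draw on outside "
    "sources. Never state the answer directly. Instead, reply with one "
    "short guiding question that pushes the student to reason it out.\n\n"
    "COURSE MATERIAL:\n{notes}"
)

def socratic_messages(course_notes: str, student_question: str) -> list[dict]:
    """Build the message list a chat-completion API call would receive."""
    return [
        {"role": "system", "content": SOCRATIC_PROMPT.format(notes=course_notes)},
        {"role": "user", "content": student_question},
    ]

if __name__ == "__main__":
    msgs = socratic_messages(
        course_notes="Lecture 4: real balances, purchasing power, aggregate demand.",
        student_question="Why do lower prices increase spending?",
    )
    print(msgs[0]["role"], "->", msgs[1]["content"])
```

The design choice doing the work is in the prompt, not the model: the same underlying system becomes an answer machine or a reasoning partner depending on what it is instructed to refuse.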
It sounds annoying. It works.
What the scores showed
By the third exam, the pattern was clear. Students who used Macro Buddy plus group discussion earned the highest average scores. Students who used Macro Buddy alone also outperformed those who studied without it.
The combination mattered. AI alone helped. Peers alone helped. Both together helped most.
This tracks with a larger finding. A 2025 Harvard study published in Scientific Reports, a Nature Portfolio journal, tested an AI tutor against traditional active-learning classrooms, the gold standard in pedagogy. Students using the AI tutor scored a median of 4.5 on post-tests versus 3.5 for the classroom group. Learning gains more than doubled. Students also reported feeling more engaged.
The other side of the coin
Here's the part that makes this complicated. A separate study, reported in Psychology Today in February 2026, found that students who used AI freely, as an answer machine rather than a reasoning partner, scored 17% lower on quizzes than students who didn't use AI at all. They weren't slower. They just learned less.
The culprit: cognitive offloading. When a machine does your thinking, your brain doesn't bother encoding the information. You feel productive. You're actually retaining less.
So the same technology produces opposite results depending on one design choice: does it give you answers, or does it make you find them?
The $17 million question
This distinction matters because institutions are making enormous bets right now. The California State University system — the largest public university system in the U.S. with 500,000 students — signed a $17 million deal with OpenAI to give every student access to ChatGPT Edu through July 2026. It's the single largest ChatGPT deployment in the world.
But ChatGPT Edu isn't Macro Buddy. It answers questions. It drafts essays. It summarizes readings. Whether that helps students learn or just helps them finish assignments faster is the open question Cal State is running at industrial scale.
A Columbia University freshman named Maximilian Milovidov, writing for NPR this week, described taking a course called "Writing AI" — the only class on campus where AI wasn't banned but required. Students brought their own outlines, fed drafts into chatbots, documented every suggestion, and explained why they accepted or rejected each one.
His professor called it the "friend test": you'd ask a friend for feedback on a paper, but you wouldn't make them write it.
The design gap
The evidence is piling up in one direction. AI that does the work for you makes you worse at the work. AI that makes you do the work yourself — by questioning, prodding, refusing to just hand over the answer — makes you better.
This is obvious in hindsight. It's how every good human tutor works. The best teachers don't lecture — they ask questions until the student arrives at understanding on their own. The breakthrough isn't that AI can teach. It's that AI can be annoying in exactly the right way.
But most AI tools aren't built like this. ChatGPT, Claude, Gemini — they're designed to be helpful, which means they answer your question as completely and quickly as possible. That's great for productivity. It's potentially terrible for learning.
The gap between "AI that helps you work" and "AI that helps you learn" is the most important design problem in education right now. And almost nobody is talking about it.
What this means
Macro Buddy cost essentially nothing to build. Three professors, one custom GPT, a set of lecture transcripts. No $17 million contract. No institutional deal. Just a decision about what the AI should refuse to do.
That's the insight buried in a macroeconomics class in La Crosse, Wisconsin: the most effective AI tutor is the one that won't give you what you asked for.
Every university deploying AI right now faces the same fork. One path leads to students who finish assignments faster. The other leads to students who actually understand what they're studying. The technology is identical. The difference is a single design choice — and right now, the default setting is wrong.
Sources & Verification
Based on 5 sources from 2 regions
- The Conversation (North America)
- NPR (North America)
- Scientific Reports, Nature Portfolio (Harvard study; International)
- Psychology Today (North America)
- Fortune (North America)