The Best AI Tutor Refuses to Answer Your Question
A Wisconsin experiment found that an AI chatbot designed to ask questions instead of giving answers produced the highest exam scores — but only when paired with peer discussion.
Three economists at the University of Wisconsin-La Crosse built an AI chatbot that refuses to answer students' questions. When a student asks why lower prices increase spending, it doesn't explain. It asks: "What happens to your purchasing power when prices fall?" Then it waits.
The chatbot is called Macro Buddy. And it just produced the highest exam scores in the class.
The experiment nobody expected
Ninety percent of U.S. college students already use generative AI — for drafting assignments, clarifying concepts, and sometimes just getting answers fast. The debate's been stuck on a binary: ban AI or allow it. Macro Buddy sidesteps that entirely. What if AI's job isn't to know things, but to make you know things?
The researchers — Saharnaz Nedjat-Haiem, Shishir Shakya, and their colleague at UW-La Crosse — took 140 undergraduates in a macroeconomics course and split them into four groups after their first exam. One group studied alone. One studied in groups. One studied alone with Macro Buddy. One studied in groups with Macro Buddy.
Same course materials. Same exams. In-person, closed-book, no AI allowed during testing. Whatever students learned, they had to prove on their own.
The Socratic chatbot
Macro Buddy runs on ChatGPT's custom GPT feature with one critical modification: web access is turned off. It can only draw from the instructor's lecture transcripts, slides, and homework questions. No Wikipedia rabbit holes. No outside sources. Just the course material, delivered through questions instead of answers.
This is the Socratic method, automated. A student types a question. Macro Buddy responds with a better question. The student has to connect concepts, explain their reasoning, build the answer themselves — step by step, in their own words.
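The setup described above, a chat model restricted to course materials and instructed to respond with questions rather than answers, amounts to a system prompt plus a closed corpus. Here is a minimal sketch of that idea, not Macro Buddy's actual configuration: the prompt wording and the `socratic_messages` helper are illustrative assumptions, shaped as the message payload a standard chat-completion API would receive.

```python
# Sketch of a Socratic tutor configuration: the model is told to draw only
# on supplied course material and to reply with a guiding question rather
# than a direct answer. The prompt text here is hypothetical.
SOCRATIC_PROMPT = (
    "You are a Socratic tutor for an undergraduate macroeconomics course. "
    "Use ONLY the course material provided below; do not draw on outside "
    "sources. Never state the answer directly. Instead, reply with one "
    "short guiding question that pushes the student to reason it out.\n\n"
    "COURSE MATERIAL:\n{notes}"
)

def socratic_messages(course_notes: str, student_question: str) -> list[dict]:
    """Build the message list a chat-completion API call would receive."""
    return [
        {"role": "system", "content": SOCRATIC_PROMPT.format(notes=course_notes)},
        {"role": "user", "content": student_question},
    ]

if __name__ == "__main__":
    msgs = socratic_messages(
        course_notes="Lecture 4: real balances, purchasing power, aggregate demand.",
        student_question="Why do lower prices increase spending?",
    )
    print(msgs[0]["role"], "->", msgs[1]["content"])
```

The design choice doing the work is in the prompt, not the model: the same underlying system becomes an answer machine or a reasoning partner depending on what it is instructed to refuse.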
It sounds annoying. It works.
What the scores showed
By the third exam, the pattern was clear. Students who used Macro Buddy plus group discussion earned the highest average scores. Students who used Macro Buddy alone also outperformed those who studied without it.
The combination mattered. AI alone helped. Peers alone helped. Both together helped most.
This tracks with a larger finding. A 2025 Harvard study published in Scientific Reports, a Nature Portfolio journal, tested an AI tutor against traditional active-learning classrooms, the gold standard in pedagogy. Students using the AI tutor scored a median of 4.5 on post-tests versus 3.5 for the classroom group. Learning gains more than doubled. Students also reported feeling more engaged.
The other side of the coin
Here's the part that makes this complicated. A separate study, reported in Psychology Today in February 2026, found that students who used AI freely, as an answer machine rather than a reasoning partner, scored 17% lower on quizzes than students who didn't use AI at all. They weren't slower. They just learned less.
The culprit: cognitive offloading. When a machine does your thinking, your brain doesn't bother encoding the information. You feel productive. You're actually retaining less.
So the same technology produces opposite results depending on one design choice: does it give you answers, or does it make you find them?
The $17 million question
This distinction matters because institutions are making enormous bets right now. The California State University system — the largest public university system in the U.S. with 500,000 students — signed a $17 million deal with OpenAI to give every student access to ChatGPT Edu through July 2026. It's the single largest ChatGPT deployment in the world.
But ChatGPT Edu isn't Macro Buddy. It answers questions. It drafts essays. It summarizes readings. Whether that helps students learn or just helps them finish assignments faster is the open question Cal State is running at industrial scale.
A Columbia University freshman named Maximilian Milovidov, writing for NPR this week, described taking a course called "Writing AI" — the only class on campus where AI wasn't banned but required. Students brought their own outlines, fed drafts into chatbots, documented every suggestion, and explained why they accepted or rejected each one.
His professor called it the "friend test": you'd ask a friend for feedback on a paper, but you wouldn't make them write it.
The design gap
The evidence is piling up in one direction. AI that does the work for you makes you worse at the work. AI that makes you do the work yourself — by questioning, prodding, refusing to just hand over the answer — makes you better.
This is obvious in hindsight. It's how every good human tutor works. The best teachers don't lecture — they ask questions until the student arrives at understanding on their own. The breakthrough isn't that AI can teach. It's that AI can be annoying in exactly the right way.
But most AI tools aren't built like this. ChatGPT, Claude, Gemini — they're designed to be helpful, which means they answer your question as completely and quickly as possible. That's great for productivity. It's potentially terrible for learning.
The gap between "AI that helps you work" and "AI that helps you learn" is the most important design problem in education right now. And almost nobody is talking about it.
What this means
Macro Buddy cost essentially nothing to build. Three professors, one custom GPT, a set of lecture transcripts. No $17 million contract. No institutional deal. Just a decision about what the AI should refuse to do.
That's the insight buried in a macroeconomics class in La Crosse, Wisconsin: the most effective AI tutor is the one that won't give you what you asked for.
Every university deploying AI right now faces the same fork. One path leads to students who finish assignments faster. The other leads to students who actually understand what they're studying. The technology is identical. The difference is a single design choice — and right now, the default setting is wrong.
Sources & Verification
Based on 5 sources from 2 regions
- The Conversation (North America)
- NPR (North America)
- Scientific Reports, Nature Portfolio (Harvard study; International)
- Psychology Today (North America)
- Fortune (North America)