AI in the classroom three years in: what’s working, and what isn’t

The first wave of stories about ChatGPT in schools was about cheating. The second wave was about bans. The third, finally, has been about practice. Three years in, the picture from working classrooms looks different from the picture in the policy debates.

The short version: the teachers getting traction with AI are the ones treating it as a thinking partner, not a content faucet. The ones banning it altogether are losing students who use it anyway, and losing the chance to teach them how to use it well.

Why prohibition is fragile

The argument for blanket prohibition was always a teaching argument: writing is thinking, and outsourcing the writing means outsourcing the thinking. We agree with that as a starting point. But the policy that follows from it — pretend the tool doesn’t exist, run the lesson as if it’s still 2021 — has a few practical problems.

Students use AI anyway. Surveys from groups like the Pew Research Center and Common Sense Media have shown that majorities of older students have used AI for schoolwork, and the share grows each year. A classroom rule that depends on students not having access to a tool that lives in their pocket is not really a rule. It’s a fiction the room agrees to maintain.

The detection arms race is mostly a loss. The major detectors flag a meaningful share of legitimate human writing as AI-generated, particularly for English-language learners and for students who write in clean, plain styles. OpenAI pulled its own classifier in 2023, citing low accuracy. Penalizing the wrong students is worse than the original problem.

The ban also forfeits the curriculum. AI literacy — knowing what these tools are good and bad at, how to prompt them, how to fact-check their output, when to refuse them — is going to matter to every student we send into the workforce. Teachers who ban use are skipping the lesson their students will most need.

What seems to be working in classrooms

A few patterns we keep seeing in the practitioner writing and conferences we follow.

Treat the AI as the first reader, not the last writer. Students draft on paper or in a locked editor, then paste into ChatGPT or Claude with a specific prompt: identify weak arguments, missing evidence, unclear sentences. The student then revises by hand. The cognitive work stays with the student; the AI plays the role a peer or instructor used to play in a workshop, except every student gets one. This works because the AI is genuinely good at the diagnosis-of-weakness task, less reliable at the production-of-final-text task.
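To make the pattern concrete, here is a minimal sketch of what a "first reader" diagnostic pass could look like if a teacher scripted it rather than pasting into a chat window. The model name, the exact prompt wording, and the script itself are our assumptions for illustration, not a recipe drawn from any particular classroom.

```python
# Sketch of a "first reader" diagnostic pass (model name and prompt wording are
# illustrative assumptions). The point is that the prompt asks for diagnosis only,
# so the student still does the revision by hand.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

FIRST_READER_PROMPT = (
    "You are a writing-workshop reader, not an editor. Do NOT rewrite or rephrase "
    "any sentences. Instead, list: (1) the weakest arguments and why, (2) claims "
    "that are missing evidence, (3) sentences whose meaning is unclear. Quote each "
    "briefly, then ask one question the writer should answer in revision."
)

def first_reader_feedback(draft: str) -> str:
    """Return diagnostic feedback on a student draft without producing new prose."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical choice; any capable model would do
        messages=[
            {"role": "system", "content": FIRST_READER_PROMPT},
            {"role": "user", "content": draft},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(first_reader_feedback(open("draft.txt").read()))
```

The design choice that matters is in the prompt, not the code: the model is told explicitly not to produce replacement text, which keeps the production work with the student.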

Use AI for differentiation. Generating five versions of a reading at different reading levels used to take an English teacher their entire prep period. It now takes minutes. Same for math word problems written around different student interests, or vocabulary lists adjusted for English-language learners. Tools like Diffit, Brisk Teaching, and Magic School AI were built specifically around this use case, and they’re getting traction in districts because the time savings are real.

Use AI for feedback at scale. In a 25-student writing class, giving individualized written feedback on every draft is impossible. With careful prompting, AI can produce a useful first round of rubric-aligned feedback on clarity, evidence, and structure, and the teacher’s role shifts to triaging which students most need the human follow-up. Teachers who do this carefully report better feedback per student, not worse.
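As a hedged sketch of what that triage can look like in practice, the snippet below runs one rubric over a folder of drafts and flags which students to see first. The folder layout, rubric wording, scoring scale, and "NEEDS TEACHER" flag are all illustrative assumptions.

```python
# Sketch of rubric-aligned first-pass feedback for a whole class. The rubric text,
# file layout, and flag format are assumptions made for the example.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

RUBRIC_PROMPT = (
    "Score this draft 1-4 on each of: clarity, use of evidence, structure. "
    "For each score, quote one sentence from the draft that justifies it. "
    "End with exactly one line: 'NEEDS TEACHER: yes' if any score is 1 or 2, "
    "otherwise 'NEEDS TEACHER: no'."
)

def review_class(folder: str = "drafts") -> None:
    """Print first-pass feedback per draft, then a follow-up list for the teacher."""
    needs_follow_up = []
    for draft in sorted(Path(folder).glob("*.txt")):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # hypothetical choice
            messages=[
                {"role": "system", "content": RUBRIC_PROMPT},
                {"role": "user", "content": draft.read_text()},
            ],
        )
        feedback = response.choices[0].message.content
        print(f"--- {draft.stem} ---\n{feedback}\n")
        if "NEEDS TEACHER: yes" in feedback:
            needs_follow_up.append(draft.stem)
    print("Follow up first with:", ", ".join(needs_follow_up) or "nobody flagged")

if __name__ == "__main__":
    review_class()
```

The output is a starting point, not a grade: the teacher still reads the flagged drafts and decides what the feedback misses.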

Teach prompting as a thinking discipline. The students who get the most out of these tools are the ones who think clearly about what they’re asking. Some teachers have started grading prompts and revisions, not just final outputs, making the structure of the inquiry visible.

What’s not working

A few common misuses worth naming.

AI as the entire lesson. Some districts have rolled out an “AI tutor” platform and treated it as a substitute for instruction. Students sit at Chromebooks and have whatever conversation the tool wants to have. The engagement looks fine on a screen for a week and falls apart by the second month. AI is a tool inside a teacher’s lesson, not a replacement for one.

Detector-based discipline policies. Sending kids to the dean because Turnitin or GPTZero flagged a paper above some threshold is a bad system in 2026. The false positive rate is too high. Districts that have leaned heavily on detection are starting to walk it back.

Treating it as a search engine. The tools hallucinate, especially on specific facts and citations. Teachers who’ve moved past prohibition still need to teach the verification habit, every time.

The honest part

We’re three years in, not three decades. There’s some early evidence that heavy AI use during the formative period of writing instruction can hurt skill development. There’s also evidence that students with strong baseline writing skills get more out of AI than weaker writers do, which is a familiar equity story in any educational technology rollout.

The right posture in our view is engaged, careful, and honest. Use it where it helps. Don’t pretend it isn’t there. Teach students that the tool is fallible and bounded, and that the thinking is still theirs to do.