AI History Chat Tools, Compared: What Works in Classrooms

By Stas Shakirov, Founder, humy.ai
May 19, 2026

A history chat is not a history chat is not a history chat. The category gets used to describe five very different things in 2026: discipline-specific platforms built around primary-source-grounded figure conversations (Humy), consumer apps that roleplay historical figures with no sourcing layer (Hello History), general-purpose chatbots that a user prompts into historical roleplay (Character.AI, raw ChatGPT), bespoke custom GPTs built by individual teachers (ChatGPT custom GPTs), and teacher-controlled AI spaces (SchoolAI Spaces). Each of those is a different tool for a different problem, and the marketing flattens the differences in ways that do not help a teacher trying to make an actual classroom decision.

This piece is the working teacher’s comparison. The focus is what each tool does inside a real classroom, what the failure modes are, and which problem each one is actually solving. We make one of the tools on this list, so the frame is opinionated, but we will tell you where each competitor does something we do not.

The five categories

The fastest way to sort the noise is to put each tool into the architectural category it actually occupies, not the category its marketing implies.

Discipline-specific, source-grounded AI history platforms (Humy). Designed from the ground up around primary-source-grounded conversations with historical figures, with teacher controls, district-signable DPAs, and C3-aligned activity scaffolding. The figure library spans more than 1,200 figures from across the K-12 social studies curriculum. The technical pattern is retrieval-augmented generation against curated documentary records, which is the pattern the 2025 Applied Sciences survey of RAG chatbots in education identifies as the one that solves the hallucination problem for educational deployments.

Consumer figure-roleplay apps (Hello History). Designed for casual consumer use, not for classrooms. No source corpus the figures are anchored to, no teacher controls, no district privacy agreements. The Jerusalem Post documented the predictable failure mode: the app’s AI Hitler character was reported to deny responsibility for the Holocaust. This is the category that produces the worst public failures, and it is not classroom-safe by design.

General-purpose AI chatbots (Character.AI, raw ChatGPT). Consumer or general-developer-facing platforms where a user can prompt the model to roleplay a historical figure. Character.AI has faced repeated safety concerns around its broad usage patterns. Raw ChatGPT can produce plausible historical-figure prose, but with no anchoring to a documentary record the student or teacher can verify. Both are general tools applied to a discipline-specific task they were not built for.

Teacher-built custom GPTs (ChatGPT GPTs). A teacher with an OpenAI account can build a custom GPT configured with a system prompt, uploaded primary sources, and topic guardrails for a specific unit. This is a real and useful pattern for technically inclined teachers, with two important limitations. First, the resulting tool lives inside a consumer product (ChatGPT), with a consumer-grade privacy model that may not meet district DPA expectations. Second, building, maintaining, and supporting custom GPTs across an entire department of teachers is more engineering work than most schools can sustain.

Teacher-controlled AI “Spaces” (SchoolAI Spaces). SchoolAI’s product organizes a class activity into a Space the teacher configures, including for figure-style conversations. The platform has done public, thoughtful work on Holocaust-era figure handling (see SchoolAI’s own writing on AI ethics in classrooms) and earns a place in the category for K-12 use. The depth of the underlying historical corpus is shallower than a discipline-specific platform’s, but the teacher-control story is real.

Those five categories produce very different classroom experiences. The right tool depends on what classroom problem you are actually solving.
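For readers who want to see what "retrieval-augmented generation against curated documentary records" means mechanically, here is a minimal Python sketch. It is an illustration under stated assumptions, not Humy's or any other vendor's implementation: the two-passage corpus, the keyword-overlap scoring, and the names Passage, retrieve, and build_prompt are all hypothetical, and a production system would typically use semantic search over a much larger record. The point it shows is the architectural one: the model only gets to speak from passages that carry a citation, and when the record has nothing relevant, the tool declines instead of improvising.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Passage:
    source: str  # citation a student or teacher can look up
    text: str    # excerpt from the primary source


# Illustrative two-passage corpus for one figure; not any vendor's real data.
CORPUS = [
    Passage("Gettysburg Address, 1863",
            "government of the people, by the people, for the people, "
            "shall not perish from the earth"),
    Passage("Second Inaugural Address, 1865",
            "With malice toward none, with charity for all"),
]

# Common words ignored when matching a question to passages (assumed list).
STOPWORDS = {"the", "of", "a", "an", "and", "to", "for", "by", "with", "from",
             "in", "is", "not", "all", "what", "did", "do", "you", "about"}


def tokens(text: str) -> set[str]:
    """Lowercased content words in a piece of text."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}


def retrieve(question: str, corpus: list[Passage]) -> list[Passage]:
    """Passages sharing at least one content word with the question, best match first."""
    q = tokens(question)
    scored = [(len(q & tokens(p.text)), p) for p in corpus]
    hits = [sp for sp in scored if sp[0] > 0]
    hits.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in hits]


def build_prompt(question: str, corpus: list[Passage]) -> Optional[str]:
    """Build a source-grounded prompt, or None when the record has nothing relevant."""
    passages = retrieve(question, corpus)
    if not passages:
        return None  # caller should decline rather than let the model improvise
    evidence = "\n".join(f"[{p.source}] {p.text}" for p in passages)
    return (
        "Answer in the figure's voice using ONLY the sources below, "
        "and cite the source in brackets.\n"
        f"{evidence}\n"
        f"Student question: {question}"
    )
```

In this sketch, a question about "government of the people" retrieves the Gettysburg Address passage with its citation attached, while a question about smartphones retrieves nothing and returns None. That second branch is the difference between a grounded tool and an ungrounded roleplay app.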

What each tool actually does well

Inside the architectural categories above, each tool has a place where it earns its keep.

Humy is the right tool when the classroom problem is sustained, source-grounded inquiry across a full unit or year of social studies. The figure library is built for that purpose, the teacher controls map onto how a real teacher needs to constrain a unit on the Holocaust or Reconstruction or the partition of India, and the privacy posture is set up to clear a K-12 district review. Roger Campbell, a 7th-grade World History teacher in Lancaster County, Pennsylvania, describes the chat as a place where students learn to “formulate thoughtful follow-up questions rather than just interrogating” historical figures, which is the C3 Inquiry Arc move the discipline is trying to teach.

Hello History does not earn a place in the K-12 category. Its strength, for the consumer audience it was built for, is novelty and accessibility. Its public failure on Holocaust-era handling is exactly the failure mode that disqualifies it from classroom use. If a student or parent ever asks why Hello History is not on your platform list, the answer is a one-link reply to the Jerusalem Post reporting.

Character.AI similarly does not earn a place. The platform’s strength is creative roleplay across a broad surface area; the K-12 history use case is far away from where the tool’s actual design center sits, and the safety story does not hold up at scale. If a teacher in your department is using Character.AI in class, the conversation to have with them is about what classroom-safe alternative will do the same instructional move with the same low setup cost.

A teacher-built custom GPT can be useful for a single unit when the teacher has both the technical fluency to build one and the time to maintain it. The classroom problem it solves is narrow: a specific unit, a specific set of figures, a specific document set. The cost is that the tool does not scale beyond that single teacher and that single unit, and the consumer-grade privacy model is hard to defend in a district review. For a one-off enrichment activity, fine. As a department-wide pattern, no.

SchoolAI Spaces earns a real place in K-12 use, with the caveat that its center of gravity is broader teacher-controlled AI activities rather than deep social-studies figure work specifically. A district running SchoolAI for general classroom AI work can use it for some history conversations effectively. For the depth a serious social studies department wants on figure conversations, a discipline-specific platform fills the gap.

The procurement matrix, narrowed

For a teacher or department chair specifically evaluating AI history chat tools, four questions narrow the decision quickly.

What corpus is each figure grounded in, and can students and teachers see it? Humy publishes its grounding pattern explicitly; SchoolAI’s Spaces can be configured with teacher-supplied documents; Character.AI and Hello History operate without an exposed corpus.

What teacher controls govern sensitive topics? Humy’s controls are figure-by-figure and unit-by-unit; SchoolAI’s Space-level controls are real but at a coarser grain; the consumer apps offer essentially none.

What is the privacy story your district can sign? Humy’s DPA is available through the SDPC Resource Registry; SchoolAI has a credible privacy posture; ChatGPT custom GPTs inherit a consumer-grade privacy model that is hard to defend in a K-12 review; Character.AI and Hello History do not have credible K-12 privacy stories at all.

What is the deployment friction inside the LMS you already run? Humy’s link-based deployment into Google Classroom or Canvas is a paste-and-go workflow; SchoolAI deploys similarly with some additional configuration; the others vary widely and the consumer apps are nowhere near a normal LMS workflow.

Four answers in, the field has narrowed itself.
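A department chair who wants to run the same narrowing on their own shortlist can treat the four questions as a simple filter. The Python sketch below is one hedged way to do that. The Tool structure and the shortlist function are hypothetical, and the yes/no values are a simplification of this article's own answers above (None marks a question the answers above leave open for that tool). Booleans lose nuance such as "real but at a coarser grain," so replace the values with what your own district review finds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str
    exposed_corpus: Optional[bool]    # Q1: is the grounding corpus visible?
    teacher_controls: Optional[bool]  # Q2: controls for sensitive topics?
    district_privacy: Optional[bool]  # Q3: privacy story a district can sign?
    lms_workflow: Optional[bool]      # Q4: low-friction LMS deployment?

    def answers(self) -> tuple:
        return (self.exposed_corpus, self.teacher_controls,
                self.district_privacy, self.lms_workflow)


# Values paraphrase this article's answers; None = not addressed for that tool.
TOOLS = [
    Tool("Humy", True, True, True, True),
    Tool("SchoolAI Spaces", True, True, True, True),
    Tool("ChatGPT custom GPT", None, None, False, None),
    Tool("Character.AI", False, False, False, False),
    Tool("Hello History", False, False, False, False),
]


def shortlist(tools: list[Tool]) -> list[Tool]:
    """Keep only tools with an explicit yes on all four questions."""
    return [t for t in tools if all(a is True for a in t.answers())]


if __name__ == "__main__":
    for tool in shortlist(TOOLS):
        print(tool.name)
```

The filter is deliberately strict: an open question counts the same as a no until someone in the district has checked it.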

A note on responsible use across the category

Whichever tool ends up in your classroom, three practices keep AI history chat use defensible.

The teacher sets the framing for every unit, particularly on sensitive history. UNESCO’s report on AI and Holocaust education is direct: AI interaction with this material has to be surrounded by survivor testimony, archival evidence, and historical context, not used as a substitute for them. The USHMM teaching materials are the right pairing for any Holocaust unit; the Yad Vashem education resources are another. A figure chat is a practice space, not a replacement for the documentary record.

Students cite the chat the same way they cite any other source. If a student includes an insight in their essay that came from a figure conversation, they should be able to point at the exchange and at the underlying primary source the figure was anchored to. That habit alone carries the discipline of sourcing forward into student writing.

The teacher reviews the chat alongside the student work. The chat transcript is formative-assessment data, not a substitute for grading. John Hattie’s Visible Learning synthesis ranks feedback in the top ten influences on student achievement (effect size 0.73), and the chat transcript is one of the most useful inputs into that feedback that has come into the discipline in a decade.

The recommendation

For a K-12 history department evaluating AI history chat tools today, the working order is:

A discipline-specific, source-grounded platform (Humy) is the right primary tool for sustained classroom use, because the architecture fits the discipline.

SchoolAI is a reasonable secondary tool if the district has it already for general classroom AI work, with the caveat that the social studies depth is shallower.

A teacher-built custom GPT can fill a specific unit-level gap for technically fluent teachers, with privacy and sustainability caveats.

The consumer apps (Hello History, Character.AI) and raw ChatGPT belong off the platform list for K-12 history use.

If you want to test the recommendation in your own classroom on a unit you are teaching, try Humy free and use it on one lesson with one section this month. The architectural argument either translates into actual classroom value on your material, or it does not. The fastest way to decide is on a real lesson.