Academics Be Aware: Why AI Writes so Badly

As professors return from winter break to face another semester of grading, a new anxiety pervades faculty lounges: How do we handle the tsunami of AI-generated essays? The question assumes AI writes well enough to fool experienced academics. At Unemployed Professors, where we work with actual scholars every day, we need to share an important truth that might ease your minds: AI writes badly. Not just detectably badly—fundamentally, structurally, intellectually badly in ways that should reassure rather than alarm thoughtful educators.

This isn’t a temporary problem that will disappear with GPT-5 or GPT-6. The issues we’re about to explore are baked into how large language models work. Understanding why AI writes so badly will help you evaluate student work more confidently, design better assignments, and maintain academic standards without paranoia. Let’s examine what AI writing quality actually looks like when you know what to look for.

The Fundamental Problem: AI Doesn’t Understand

Here’s the core issue with AI writing quality that many discussions miss: large language models don’t understand anything they write about. They predict word sequences based on statistical patterns in training data. This fundamental limitation manifests in specific, identifiable ways throughout AI-generated text.

When ChatGPT writes about Foucault’s concept of biopower, it’s not engaging with Foucault’s argument. It’s arranging words that frequently appear together in texts about Foucault and biopower. The result sounds superficially knowledgeable but lacks genuine engagement with ideas.
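To make "arranging words" concrete, here is a minimal sketch of next-token prediction. It assumes the open-source Hugging Face transformers library and the small public GPT-2 model, so it illustrates the general mechanism rather than the internals of any commercial chatbot: given a prompt, the model does nothing but rank probable continuations.

```python
# Minimal sketch of next-token prediction, assuming the open-source
# `torch` and `transformers` packages and the small public "gpt2" model.
# Illustrates the general mechanism, not any specific commercial system.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Foucault's concept of biopower refers to"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]   # scores for the single next token

probs = torch.softmax(logits, dim=-1)
top = torch.topk(probs, k=5)

# The model holds no position on Foucault; it only ranks likely continuations.
for token_id, p in zip(top.indices, top.values):
    print(f"{tokenizer.decode([token_id.item()])!r}  p={p.item():.3f}")
```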

Academic writing requires more than arranging appropriate vocabulary. It requires understanding claims well enough to evaluate them, synthesizing contradictory sources, applying theories to novel situations, and developing original arguments. These cognitive tasks require actual comprehension, which AI writing limitations prevent.

Consider what happens when you ask ChatGPT to analyze a specific passage. It will identify themes, note literary devices, and produce text that resembles analysis. But watch what happens when you push deeper. Ask it to explain how a metaphor functions within the work’s larger argument about a specific philosophical problem. The AI academic writing weaknesses become apparent: it either produces generic observations that could apply to any metaphor, or it generates plausible-sounding but ultimately empty claims about connections it cannot actually trace.

Experienced academics recognize this immediately. You’ve spent years learning to distinguish between students who genuinely understand material and students who’ve memorized enough to fake it on exams. AI-generated essays are the ultimate performance of memorization without comprehension. The words are right, but the thinking isn’t there.

[Infographic: The 5 Fundamental Flaws of AI Writing]

Why AI Essays Are Bad: The Hallucination Problem

One of the most spectacular AI writing flaws is “hallucination”—the generation of false information presented with confidence. AI hallucinations in academic contexts take several forms, all of them problematic.

First, there’s citation hallucination. Ask AI to write an essay with academic sources, and it will happily cite articles that don’t exist. It generates plausible-looking citations—real author names, appropriate journal titles, believable publication years—that lead nowhere. The sources are fabricated from patterns in real citations, creating believable-looking bibliography entries for nonexistent scholarship.

This happens because AI doesn’t actually access and read sources. It predicts what citations should look like based on training data. When it needs to cite something about, say, postcolonial theory’s application to contemporary media, it generates author names and titles that sound right without checking if they exist.

Second, there’s factual hallucination. AI will confidently state incorrect dates, attribute quotes to wrong authors, misidentify theories, and fabricate historical events. These aren’t occasional errors—they’re systematic problems arising from the probabilistic nature of language generation.

We’ve seen AI-generated essays claim that Foucault wrote about social media (he died in 1984), that certain Supreme Court cases were decided in impossible years, and that historical figures met who never could have met. The errors are often subtle enough that they might slip past quick reading but glaring to anyone who actually knows the subject matter.

Third, there’s conceptual hallucination—misrepresenting theories, arguments, or positions in ways that reveal lack of understanding. AI might claim a theorist argued the opposite of their actual position, or attribute ideas to the wrong school of thought, or conflate distinct concepts that happen to use similar terminology.

These AI writing quality issues should be deeply concerning to anyone considering using AI-generated content in academic contexts. They also provide clear signals for academics evaluating student work: citations that don’t check out, factual errors that reveal no real engagement with material, and conceptual confusions that genuine students rarely make.

The Absence of Original Argument

Perhaps the clearest sign that AI writes badly is its inability to develop original arguments. This is worth examining in detail because it goes to the heart of what academic writing is supposed to accomplish.

Academic essays require thesis statements that make non-obvious claims, then support those claims through careful reasoning and evidence. The thesis should be debatable—something a knowledgeable person could disagree with—and the argument should advance understanding rather than rehashing common knowledge.

AI cannot do this. By design, large language models generate the most statistically probable next words. This means they default to common positions, widely-held interpretations, and safe claims that appear frequently in training data. They cannot take genuinely original stances because originality is, by definition, unpredictable and therefore improbable from the model’s perspective.

When you prompt ChatGPT to argue for a specific interpretation of a text, it will comply. But examine the argument carefully and you’ll notice it lacks conviction. The reasoning is always hedged, qualified, and reversible. Why? Because the training data contains multiple perspectives, and the model can’t commit to one—it can only synthesize what others have said.

This produces what we call “thesis simulacra”—statements that look like thesis statements but actually make no real claims. “Shakespeare’s Hamlet explores the complexity of human nature through the protagonist’s internal struggles” sounds like a thesis, but it’s completely empty. Every play explores human nature through character struggles. The statement is so obvious it’s meaningless.

Compare this to actual student writing, even from mediocre students. Real students commit to positions, even wrong ones. They take stances, even indefensible ones. They have actual thoughts, even confused ones. This intellectual risk-taking, this willingness to be wrong in pursuit of being right, is fundamentally human and fundamentally absent from AI-generated text.

Academics spend their careers teaching students to develop arguments. You can instantly recognize when a student has actually thought about their topic versus when they’re just going through the motions. For professors, this is one of the defining problems with AI writing: it’s all motion, no thought. Every sentence moves; none of it thinks.

"The Professor's Detection Toolkit" - A professional academic style with a folded corner detail, using deep blues and gold accents. It provides 6 concrete ways professors can spot AI-generated work with red flag warnings for each.

Why AI Cannot Write Well: The Synthesis Problem

Academic writing requires synthesizing multiple sources into coherent arguments that advance understanding. This is extraordinarily difficult for AI, and the failure modes are distinctive.

When students struggle with synthesis, they typically list sources sequentially: “Smith argues X. Jones argues Y. Brown argues Z.” It’s basic, but it shows they’ve read the sources. They just haven’t figured out how to put them in conversation yet.

AI-generated essays often exhibit superficially better synthesis—sources appear integrated into flowing paragraphs. But the integration is hollow. The AI hasn’t read these sources, doesn’t understand their arguments, and cannot identify genuine relationships between them.

What emerges is synthesis theater: the performance of integration without actual intellectual work. Sources are cited in ways that sound connected but reveal no understanding of how their arguments actually relate. Contradictions go unaddressed. Tensions aren’t explored. The hard work of real synthesis—identifying where scholars disagree and why, tracing how arguments build on or challenge each other, positioning one’s own argument within scholarly conversation—simply doesn’t happen.

Experienced academics can spot this instantly. When you’ve spent years doing real synthesis, you recognize its absence. The AI essay touches all the right sources but doesn’t make them speak to each other. It’s like watching someone mime a conversation—the gestures are right, but there’s no actual communication.

This is one of the clearest AI academic writing weaknesses. The technology can simulate the appearance of scholarly engagement without the substance. And once you know this, student work that demonstrates real engagement—however imperfect—becomes easily distinguishable from AI’s polished but empty performance.

ChatGPT’s Poor Quality: The Depth Problem

Academic writing requires depth—the ability to develop ideas beyond surface-level observations. This is where ChatGPT’s poor quality becomes most apparent to specialists in any field.

When AI writes about a topic, it accesses surface-level information readily available across its training data. It can produce competent introductory-level content on most subjects. But ask it to engage deeply with complex theoretical debates, trace subtle distinctions between thinkers, or apply abstract concepts to specific cases, and the limitations emerge.

The problem is hierarchical knowledge. Experts organize information in rich conceptual structures with multiple levels of abstraction. They understand not just facts but relationships between facts, not just theories but tensions between theories, not just arguments but the intellectual contexts that make arguments meaningful.

AI flattens this hierarchy. Everything sits at roughly the same level of abstraction. The writing can mention Derrida’s différance and cite deconstruction’s relationship to structuralism, but it cannot actually work with these concepts the way someone who understands them would. It’s like describing a 3D object using only 2D representations—the information is there, but the dimensionality isn’t.

Faculty members recognize this immediately in their areas of expertise. The AI essay on quantum mechanics gets the terminology right but uses concepts in ways that reveal no understanding of the physics. The philosophy paper cites appropriate sources but misses subtle distinctions that any student who’d done the reading would grasp. The literature analysis identifies themes without understanding how they function in the work’s larger architecture.

This depth problem also manifests in AI’s inability to handle complexity and ambiguity. Academic topics are rarely simple, and honest engagement often requires holding multiple perspectives simultaneously, acknowledging tensions without resolving them, or recognizing that evidence points in conflicting directions.

AI defaults to false clarity. It smooths over complications, resolves ambiguities prematurely, and presents complex issues as more straightforward than they are. This makes problems with AI-generated essays obvious to experts: the writing is too certain, too neat, too resolved. It lacks the productive confusion that characterizes genuine intellectual engagement with difficult material.

The Style Problem: Generic Academic Voice

One of the most reliable indicators of AI writing quality issues is the distinctive generic academic voice that large language models produce. If you’ve read ChatGPT essays, you’ve encountered this: formally correct, superficially sophisticated, and utterly soulless.

This voice has specific characteristics. It favors passive constructions and nominalizations that create distance from claims. It uses hedge phrases constantly—“it could be argued,” “this might suggest,” “potentially indicates”—because the model is predicting probable word sequences rather than making actual arguments.
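If you want a rough, purely illustrative heuristic, those hedge phrases can simply be counted. The sketch below assumes a hypothetical essay file; the phrase list is a demonstration only, and the result quantifies hedging rather than "detecting AI."

```python
# Rough illustrative heuristic: count the hedge phrases named above per
# 1,000 words. The phrase list is an assumption for demonstration only;
# this quantifies hedging, it does not detect AI authorship.
import re

HEDGE_PHRASES = [
    "it could be argued",
    "this might suggest",
    "potentially indicates",
]

def hedge_density(text: str) -> float:
    """Hedge phrases per 1,000 words."""
    words = max(len(text.split()), 1)
    hits = sum(
        len(re.findall(re.escape(phrase), text, flags=re.IGNORECASE))
        for phrase in HEDGE_PHRASES
    )
    return 1000 * hits / words

with open("essay.txt", encoding="utf-8") as f:  # hypothetical essay file
    print(f"Hedge phrases per 1,000 words: {hedge_density(f.read()):.1f}")
```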

It employs what we call “false sophistication”—unnecessarily complex sentence structures and vocabulary that sound academic but don’t add meaning. Real academic writing uses complex structures when complexity is necessary for precision. AI writing uses complexity for its own sake, having learned that academic text tends toward formal register without understanding when formality serves purpose.

The voice is also remarkably consistent across topics and disciplines. Real academics write differently in different fields—philosophy sounds different from sociology, literary criticism from political science. These disciplinary voices reflect different intellectual traditions, methodological approaches, and rhetorical conventions.

AI produces one voice: generic academic. It lacks the specific markers that characterize writing in particular fields because it’s averaging across all academic writing in its training data. The result sounds academic in a general way but not like actual writing from specific disciplines.

Experienced faculty can hear this generic quality immediately. You’ve been reading student work and scholarship in your field for years. You know how arguments are constructed, how evidence is presented, how sources are engaged in your discipline. AI writing sounds wrong because it’s not writing in your field—it’s writing in a simulation of academic writing generally.

Why Professors Can Spot AI Writing: The Pattern Recognition Advantage

Here’s something that should reassure anxious academics: you’re much better at detecting AI writing quality problems than you think you are. Your expertise makes you sensitive to exactly the issues we’ve been discussing.

You’ve read thousands of student essays. You’ve developed intuition about what genuine intellectual engagement looks like versus what performance looks like. You can tell when students have actually read the material versus when they’re bullshitting. You recognize the difference between confused-but-trying and empty-but-polished.

These same skills work for detecting AI writing flaws. The hallucinations, the absence of original argument, the synthesis theater, the lack of depth, the generic voice—these all register as “something’s wrong” in ways you may not immediately articulate but can definitely sense.

Moreover, you know your students. You’ve seen their previous work, heard them speak in class, worked with them during office hours. You have context that makes sudden dramatic changes in writing quality suspicious. When a student who’s struggled all semester submits a polished essay, your detector should go off—not because the essay is too good, but because it’s inconsistent with everything else you know about this student’s capabilities.

The key is trusting your expertise. Don’t let anxiety about AI make you second-guess your professional judgment. If an essay feels wrong—if it’s technically correct but intellectually empty, superficially sophisticated but actually shallow, well-cited but clearly unread—trust that feeling. Your years of experience reading academic writing have trained you to recognize the real thing.

AI Writing Limitations in Practice: What Academics Should Look For

Given everything we’ve discussed about why AI writes so badly, what should academics actually look for when evaluating potentially AI-generated work?

Start with citations. Check them. AI hallucinations in sources are common and easily verified. When citations don’t exist or don’t say what the essay claims, you’ve got strong evidence of AI use or other serious problems.
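For instructors comfortable with a little scripting, a citation’s existence can also be checked programmatically. The sketch below queries the public CrossRef REST API (api.crossref.org); the cited title and author in the example are made up, and a missing match is a prompt for a closer look rather than proof of fabrication, since not every legitimate source is indexed there.

```python
# Sketch of a quick citation sanity check against the public CrossRef API.
# The endpoint and response fields are real; the example citation is made up.
# A missing match is a reason to look closer, not proof of fabrication.
import requests

def crossref_matches(title: str, author: str, rows: int = 3):
    """Return CrossRef's closest bibliographic matches for a cited work."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": f"{title} {author}", "rows": rows},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    return [
        {
            "title": (item.get("title") or ["(untitled)"])[0],
            "doi": item.get("DOI"),
            "year": (item.get("issued", {}).get("date-parts") or [[None]])[0][0],
        }
        for item in items
    ]

# Hypothetical suspicious citation from a student bibliography.
for match in crossref_matches("Biopower and the Algorithmic Feed", "Smith"):
    print(match)
```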

Examine the argument. Does the essay actually take a position, or does it perform taking a position? Can you identify a genuine thesis that makes a debatable claim? If the central argument is so obvious or vague that it’s meaningless, that’s a red flag.

Look at synthesis. When sources are cited, do they genuinely relate to each other and the argument? Or are they just name-checked in ways that sound connected but demonstrate no actual engagement? Real synthesis is hard and shows; fake synthesis is easier and also shows, just differently.

Test for depth. Identify a claim in the essay that requires understanding of complex concepts or theories. Does the essay demonstrate that understanding, or does it just mention the right terms? If you could replace sophisticated concepts with simpler ones without changing the argument’s substance, there’s no real depth.

Listen to the voice. Does it sound like writing in your field, or does it sound like generic academic writing? Does it have any personality or perspective, or is it uniformly bland and hedged?

Check for consistency with previous work. Has this student written like this before? If not, what explains the change? Improvement is possible and should be encouraged, but sudden transformation deserves conversation.

Finally, talk to students. Ask them to explain their argument, discuss their sources, or elaborate on specific points. Real engagement reveals itself in conversation. Students who wrote their own work can discuss it even when they’re nervous. Students who used AI often cannot.

The Unemployed Professors Solution

This brings us to why Unemployed Professors remains relevant and valuable despite (or because of) the AI revolution. We employ actual academics—people with the expertise, understanding, and intellectual capacity that AI fundamentally lacks.

When our writers produce model essays, they’re engaging in genuine scholarly work. They read sources, understand arguments, develop original positions, synthesize complex material, and write with the depth and voice that characterize real academic writing.

Students who use these models as learning tools receive examples of what genuine intellectual work looks like—not AI’s simulation of it. They can study how real scholars construct arguments, engage sources, and develop sophisticated analyses. They can learn from work that demonstrates actual understanding rather than pattern-matching.

For academics, this distinction matters. Students using Unemployed Professors to support their learning are accessing genuine expertise. Students using ChatGPT are accessing sophisticated autocomplete. The educational value is categorically different.

Moreover, our work is transparently human. It has personality, perspective, and voice. It engages deeply with material in ways that reveal understanding. It makes bold claims rather than hedging everything. It handles complexity and ambiguity with the intellectual honesty that characterizes real scholarship.

This is why detecting AI writing quality issues is ultimately about recognizing the presence or absence of genuine thought. AI can simulate the appearance of academic writing. It cannot simulate actual intellectual engagement. And academics, with your years of training and experience, are exquisitely equipped to tell the difference.

Implications for Teaching and Assessment

Understanding why AI writes badly has important implications for how academics should approach teaching and assessment in the AI era.

First, stop panicking. AI is not going to replace genuine learning or genuine thinking. It produces work that looks superficially acceptable but is fundamentally inadequate for real academic purposes. Your expertise in recognizing good work from bad work remains valid and valuable.

Second, design assignments that require what AI cannot provide: original argument, deep engagement with specific texts or contexts, synthesis that demonstrates genuine understanding, application of theories to novel situations. These tasks are AI-proof not because they use tricks to evade detection, but because they require actual thinking.

Third, build relationships with students. Know their voices, their strengths, their struggles. Context makes sudden changes visible and conversations about work productive. Students are less likely to cheat when they know you know them, and you’re more confident in your evaluations when you have baseline information.

Fourth, trust your judgment. If work seems wrong to you—too generic, too shallow, inconsistent with what you know about the student—investigate. Your expertise is the best detector available. Don’t let technology make you doubt what your experience tells you.

Fifth, use the AI revolution as a teaching opportunity. Help students understand why AI writes badly and what makes human intellectual work valuable. Teach them to recognize the difference between genuine engagement and sophisticated simulation. These meta-cognitive skills will serve them far better than any specific content knowledge.

Conclusion: The Enduring Value of Human Expertise

The panic about AI-generated essays assumes they’re good enough to fool academics. This assumption is wrong. AI writes badly—not in superficial ways that will improve with better models, but in fundamental ways that reflect its inability to understand, think, or engage genuinely with ideas.

Academics should feel reassured, not threatened. Your expertise in recognizing genuine intellectual work remains the gold standard. Your ability to distinguish between real engagement and performance is more valuable than ever. And your role in teaching students to think rather than simulate thinking is crucial in an era of sophisticated text generation.

At Unemployed Professors, we understand this because we work with real scholars every day. We see what genuine expertise produces versus what AI produces. The difference is stark, recognizable, and not going away.

Students need access to real thinking, real engagement, and real expertise—not AI’s simulation of these things. Academics need to trust their judgment and recognize that their years of training have equipped them to identify the real thing when they see it (and to spot the fake).

AI writes badly. Once you understand why and what this looks like, you’ll wonder why you ever worried. Your expertise is irreplaceable. Trust it.
