Beyond Spell Check: 15 Automatable Writing Quality Checks
I've been developing RoastMyPost (currently in beta) and wrestling with how to systematically analyze documents. The space of possible document checks is vast: easily thousands of potential analyses.
Building on familiar concepts like "spell check" and "fact check," I've made a taxonomy of automated document checks. These are designed primarily for detail-heavy blog posts (particularly EA Forum and LessWrong content).
This framework organizes checks into three categories:
- Language & Format: Basic writing quality and presentation
- External References: Links, images, and source validation
- Content Accuracy: Factual correctness and logical consistency
We've implemented a few of these in RoastMyPost already, with more under consideration. The checks listed here are deliberately practical: straightforward to implement, relatively uncontroversial, and technically feasible with current tools.
A caveat: these checks are necessary but not sufficient for good writing. They catch mechanical and factual errors but can't evaluate argumentation quality, insight, or persuasiveness. Think of them as generalized automated proofreading, not a substitute for thoughtful writing and editing.
Language & Format
Spell Check
Well understood, so a basic job isn't hard, and LLMs can do a somewhat more advanced one. One challenge is choosing between UK and US English, since some authors mix the two.
Importance: Low | Challenge: Low | Subjectivity: Low | Prevalence: High
Grammar Check
Similar to spell check, but can be more subjective.
Importance: Low | Challenge: Low | Subjectivity: Medium | Prevalence: High
Markdown Check
Is the document's Markdown formatted correctly?
This can be messy, as different websites render Markdown differently. It isn't a major concern for content written by humans, but LLMs often get Markdown subtly wrong, so it's worth checking for LLM-generated text.
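A minimal sketch of the cheap version of this check, in TypeScript. It only catches two failure modes I'd expect from LLM output (unclosed code fences and broken link syntax); a real implementation would use a proper Markdown parser such as markdownlint or remark.

```typescript
function checkMarkdown(text: string): string[] {
  const problems: string[] = [];

  // An odd number of fence markers means a code block was never closed.
  const fences = text.match(/^`{3}/gm) ?? [];
  if (fences.length % 2 !== 0) {
    problems.push("Unclosed code fence (odd number of fence markers).");
  }

  // "[text]" not followed by "(" or "[" is often a broken [text](url)
  // link. This also flags shortcut reference links, so treat it as a
  // heuristic to surface for review rather than a hard error.
  for (const m of text.matchAll(/\[[^\]]+\](?![(\[])/g)) {
    problems.push(`Possible broken link syntax: ${m[0]}`);
  }

  return problems;
}
```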
Importance: Low | Challenge: Low | Subjectivity: Low | Prevalence: Medium
Proper Noun Check
Are all person, place, and organization names in the doc correct? Are they spelled correctly and consistently?
This will often require some searching. Bonus points if the automation can return a relevant link for each name. Wikipedia is the gold standard where it applies, but other pages can also work.
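A minimal sketch of the lookup side, assuming Node 18+ with a global fetch. Wikipedia's public REST summary endpoint 404s when no matching article exists, and resolves many minor title variants via redirects:

```typescript
// Return a canonical Wikipedia link for a name, or null if no article
// matches. The endpoint is Wikipedia's public REST API
// (api/rest_v1/page/summary).
async function lookupProperNoun(name: string): Promise<string | null> {
  const title = encodeURIComponent(name.trim().replace(/ /g, "_"));
  const res = await fetch(
    `https://en.wikipedia.org/api/rest_v1/page/summary/${title}`,
  );
  if (!res.ok) return null; // 404: no matching article
  const data = await res.json();
  return data.content_urls?.desktop?.page ?? null;
}
```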
Importance: Low | Challenge: Medium | Subjectivity: Medium | Prevalence: Medium
External References
Link Status Check
Does the link exist? (It could be hallucinated, a mistake, or a dead link)
Extra: If a link check fails, it would be nice to have a simple AI agent try to search for the intended page.
Challenge: Many websites block bots, so automated checking can be surprisingly difficult.
Note that hallucination errors should be treated differently from dead links: a hallucinated link is an authoring error, while dead links can appear over time and so require ongoing monitoring.
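A minimal sketch, assuming Node 18+ with a global fetch. The fall-back-to-GET and status heuristics are my assumptions about common server behavior, not tested rules:

```typescript
// Classify a link as ok, dead, or blocked (bot-protected). HEAD is
// cheap but some servers reject it, so we retry once with GET.
async function checkLink(url: string): Promise<"ok" | "dead" | "blocked"> {
  const headers = { "User-Agent": "Mozilla/5.0 (compatible; link-checker)" };
  try {
    let res = await fetch(url, { method: "HEAD", headers, redirect: "follow" });
    if (res.status === 405 || res.status === 403) {
      res = await fetch(url, { method: "GET", headers, redirect: "follow" });
    }
    if (res.ok) return "ok";
    // 403/429 usually mean bot blocking rather than a truly dead page.
    return res.status === 403 || res.status === 429 ? "blocked" : "dead";
  } catch {
    return "dead"; // DNS failure, TLS error, timeout, etc.
  }
}
```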
Importance: Low | Challenge: Low | Subjectivity: Low | Prevalence: High
Link Permissions Check
Are linked documents accessible to readers?
Checks Google Docs, Notion, and other collaborative platform links to verify they have appropriate public/view permissions.
Common failures:
- Google Docs/Sheets/Slides set to private or "anyone with link can request access"
- Notion pages that are workspace-private
- Dropbox/OneDrive links with expired sharing
- GitHub links to private repos
Ideally would test from an incognito/logged-out browser to verify true public access.
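A minimal sketch for the Google Docs case. It leans on the (assumed, and possibly brittle) behavior that a logged-out fetch of a private doc gets redirected to a Google sign-in page; Notion, Dropbox, and the rest would each need their own heuristic:

```typescript
// True if a Google Docs/Sheets/Slides URL loads without requiring
// sign-in. A server-side fetch carries no cookies, so it approximates
// an incognito browser.
async function isGoogleDocPublic(url: string): Promise<boolean> {
  const res = await fetch(url, { redirect: "follow" });
  const landedOn = new URL(res.url); // final URL after redirects
  return res.ok && landedOn.hostname !== "accounts.google.com";
}
```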
Importance: Low | Challenge: Low-Medium | Subjectivity: Low | Prevalence: Medium
Link Relevancy Check
Does the link have the basic content it is implied to have?
This can be quite tricky to validate. Many links don't point at the direct source referenced, but at a related website from which the reader is expected to find the source. Ideally we'd have a simple agent run a few steps to investigate, as sketched below.
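A minimal sketch of the first step such an agent would take. `askModel` is a hypothetical helper standing in for whatever LLM API is in use:

```typescript
// Hypothetical LLM helper: takes a prompt, returns the model's reply.
type AskModel = (prompt: string) => Promise<string>;

// Fetch the linked page and ask a model whether it supports the claim
// the document attaches to it. Expected replies: "supports",
// "related", or "unrelated".
async function checkLinkRelevancy(
  url: string,
  claim: string,
  askModel: AskModel,
): Promise<string> {
  const html = await (await fetch(url)).text();
  // Crude text extraction; a real version would strip boilerplate with
  // a readability library and might follow one or two onward links.
  const text = html.replace(/<[^>]+>/g, " ").slice(0, 8000);
  return askModel(
    `A document cites this page in support of the claim: "${claim}".\n` +
      `Page text: ${text}\n` +
      `Answer with one word: supports, related, or unrelated.`,
  );
}
```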
Importance: Medium | Challenge: Medium | Subjectivity: Medium | Prevalence: Medium
Image Hosting Quality Check
Are images hosted on reliable services?
Flag images hosted on:
- Discord/Slack CDNs (will likely break)
- Free image hosts without paid accounts
- Personal servers with non-professional domains
- Direct social media links
- Any URL with obvious session tokens
This is mainly relevant to the author; most readers won't notice a problem as long as the images still load when they read the post.
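A minimal sketch; the domain list is illustrative, not exhaustive:

```typescript
// Hosts whose image URLs tend to expire or break. Placeholder list.
const FRAGILE_IMAGE_HOSTS = [
  "cdn.discordapp.com", // Discord CDN links expire
  "files.slack.com",
  "pbs.twimg.com", // direct social media links
];

// Return a warning for a fragile image URL, or null if it looks fine.
// Assumes absolute URLs; new URL() throws on relative ones.
function flagFragileImage(src: string): string | null {
  const { hostname, search } = new URL(src);
  if (FRAGILE_IMAGE_HOSTS.some((h) => hostname.endsWith(h))) {
    return `Image hosted on a fragile host: ${hostname}`;
  }
  // Token/signature query params usually mean a session-scoped URL
  // that will stop working later.
  if (/[?&](token|sig|signature|expires|ex)=/i.test(search)) {
    return `Image URL contains a session token: ${src}`;
  }
  return null;
}
```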
Importance: Low | Challenge: Low | Subjectivity: Low | Prevalence: Medium
Credibility Checks
Are sources cited in the piece actually as credible as the text implies?
This will likely involve some digging. It also applies to cases where it's claimed that a credible source said X, but they only technically said it. It would probably be good to maintain a long-lived list of sources and their general credibility ratings; a more advanced version would have audience-dependent standards.
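A minimal sketch of the persistent ratings-list idea. The domains and scores are placeholders, and the implied-credibility input would itself have to come from an LLM reading how the citation is framed:

```typescript
// Placeholder domain -> credibility scores (0 to 1); a real list would
// be curated over time and possibly audience-dependent.
const SOURCE_CREDIBILITY: Record<string, number> = {
  "nature.com": 0.9,
  "example-blog.substack.com": 0.4,
};

// Positive gap: the post implies more credibility than we'd grant the
// source. Null: unknown source, needs manual digging.
function credibilityGap(url: string, implied: number): number | null {
  const host = new URL(url).hostname.replace(/^www\./, "");
  const rating = SOURCE_CREDIBILITY[host];
  return rating === undefined ? null : implied - rating;
}
```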
Importance: Medium | Challenge: Medium | Subjectivity: High | Prevalence: Medium
Content Accuracy
Math Check: Arithmetic
Are all simple (i.e. non-advanced) equations in the doc correct?
These can ideally be verified by re-evaluating them programmatically. Math.js is useful in JavaScript ecosystems, Python otherwise.
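A minimal sketch using Math.js, assuming an earlier step has already extracted (expression, claimed result) pairs from the text; that extraction is the genuinely hard part:

```typescript
import { evaluate } from "mathjs";

// Re-evaluate an extracted expression and compare it to the number the
// author states, with a 1% relative tolerance to absorb rounding.
function checkArithmetic(expression: string, claimed: number): boolean {
  const actual = Number(evaluate(expression)); // e.g. "120 * 365" -> 43800
  return Math.abs(actual - claimed) <= Math.abs(claimed) * 0.01;
}

// checkArithmetic("120 * 365", 43800) -> true
// checkArithmetic("120 * 365", 43000) -> false (off by ~2%)
```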
Importance: Medium | Challenge: Low | Subjectivity: Low | Prevalence: Medium
Math Check: Advanced
Are all examples of advanced mathematics technically accurate?
One major challenge here is context: descriptions of math often reference key earlier parts of the document, and there can be awkward branching across several strands of thought. Ideally this could be formally checked with Python or similar, though that is often fairly slow.
Importance: Medium (Used on LessWrong a fair bit) | Challenge: Medium | Subjectivity: Low-Medium | Prevalence: Medium
Forecast Check
Are all forecasts made by the author (or cited by the author) broadly reasonable?
Validates predictions by converting them to specific, binary forecasting questions and having AI forecasters evaluate them independently. In RoastMyPost, this works by extracting claims, reformulating them as binary prediction questions, then getting assessments from AIs without the original post's framing.
Key challenge: Correlated forecasts that share underlying assumptions. For example, EA Forum posts assuming short AI timelines make multiple predictions that all depend on that core assumption. Basic forecast checkers may evaluate each claim independently and reject them all if they disagree with the underlying premise, missing that they're internally consistent given their assumptions.
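A minimal sketch of that pipeline. `askModel` is again a hypothetical LLM helper, and the prompts are illustrative rather than tuned:

```typescript
// Hypothetical LLM helper: takes a prompt, returns the model's reply.
type AskModel = (prompt: string) => Promise<string>;

// Rewrite a claim as a standalone binary question, then get an
// independent probability from a model call that never saw the post.
async function checkForecast(
  claim: string,
  askModel: AskModel,
): Promise<number> {
  const question = await askModel(
    `Rewrite this claim as a resolvable yes/no forecasting question:\n${claim}`,
  );
  const reply = await askModel(
    `${question}\nReply with only your probability of YES, from 0 to 1.`,
  );
  return parseFloat(reply); // compare against the author's stated confidence
}
```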
Importance: Medium (Used in strategy posts occasionally) | Challenge: Medium | Subjectivity: Medium-High | Prevalence: Medium
Estimation Check
Are all Fermi estimations and back-of-the-envelope calculations broadly reasonable?
Validates rough calculations and order-of-magnitude estimates. SquiggleAI exists but is likely overkill: it generates 100-200 line models when most blog posts only need validation of 2-10 line calculations.
The check should verify: order of magnitude correctness, reasonable assumptions, proper unit handling, and uncertainty acknowledgment.
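A minimal sketch of the comparison step; producing the independent `recomputed` value is the hard part and would fall to an LLM or a small Squiggle model:

```typescript
// True if two estimates agree to within one order of magnitude.
function withinOrderOfMagnitude(claimed: number, recomputed: number): boolean {
  if (claimed <= 0 || recomputed <= 0) return claimed === recomputed;
  return Math.abs(Math.log10(claimed / recomputed)) <= 1;
}

// withinOrderOfMagnitude(5e6, 2e6) -> true  (same rough magnitude)
// withinOrderOfMagnitude(5e6, 2e8) -> false (off by ~100x)
```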
Importance: Medium (Used in strategy posts occasionally) | Challenge: Medium | Subjectivity: Medium-High | Prevalence: Medium
Editorial Consistency Check(s)
Does the document maintain consistent standards throughout?
Includes multiple sub-checks:
- Redundancy - Same points unnecessarily repeated
- Terminology - Same concepts called different things
- Completeness - Missing sections or "see Section X" errors
- Data consistency - Same statistics reported differently
- Style/Format - Inconsistent tone, spelling, or formatting
Importance: Low | Challenge: Medium-Hard | Subjectivity: Medium | Prevalence: Low (Most blog posts are consistent, longer docs less so)
Plagiarism Check
Does the document contain unattributed copied content?
Check for text that appears elsewhere without proper citation. This could range from exact matches to paraphrased content that's too close to the original.
There are a bunch of existing online services that do this, so hopefully one of those can be used.
This probably isn't very useful for EA Forum and LessWrong writing, but is more useful for the broader blogosphere.
Importance: Medium | Challenge: Medium | Subjectivity: Low-Medium | Prevalence: Low