A schema markup audit is the process of identifying every piece of structured data on your website, verifying that it is technically valid, and determining whether it matches what search engines and AI systems actually need to cite your content. Most websites have a patchwork of schema: some pages have it, others do not, and many that do have it are carrying errors that silently suppress rich results and reduce AI citability. A systematic audit surfaces all of that in one pass and gives you a clear, prioritized list of what to fix and what to add first.
This guide walks through the complete audit process, from inventory to validation to gap analysis, using Google's Rich Results Test, the Schema Markup Validator, and Google Search Console as your primary tools.
Step 1: Inventory Your Existing Schema
Before you can validate anything, you need to know what schema is currently deployed across your site and where it lives. Most teams have less visibility into this than they think.
1a. Crawl Your Site With a Schema-Aware Tool
Use a site crawler that surfaces structured data at the page level. Screaming Frog SEO Spider (with the "Structured Data" report enabled) and Sitebulb both do this well. Run a full crawl and export a list of every URL alongside the schema types detected on each page.
What to capture in your inventory:
- URL
- Schema types present (e.g., Article, Product, FAQPage, Organization)
- Implementation format (JSON-LD, Microdata, or RDFa)
- Number of schema blocks per page
JSON-LD is Google's recommended format for all schema implementations. If your crawl reveals Microdata or RDFa on significant portions of the site, note those pages for eventual migration – the differences between JSON-LD, Microdata, and RDFa have real implications for how reliably crawlers parse your markup.
1b. Spot-Check Schema in Page Source
For any page where the crawl result looks incomplete or unexpected, inspect the source directly. In Chrome, right-click on the page and choose "View Page Source," then search (Ctrl+F or Cmd+F) for application/ld+json or itemtype. Each match is a schema block. Copy the raw JSON to a text editor so you can compare it against what your validator later reports.
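If you have many pages to spot-check, the manual search can be approximated with a short script. The following is a minimal sketch using only Python's standard library; the class and function names (JsonLdExtractor, extract_jsonld) and the sample HTML are illustrative assumptions, and the script covers JSON-LD only, not Microdata or RDFa.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return parsed JSON-LD blocks from an HTML string, skipping invalid JSON."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parsed = []
    for raw in parser.blocks:
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError:
            continue  # a malformed block is itself worth noting in the audit sheet
    return parsed


# Made-up page source for illustration.
sample_html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Example"}
</script>
</head><body></body></html>
"""
blocks = extract_jsonld(sample_html)
types = [block.get("@type") for block in blocks if isinstance(block, dict)]
```

Because this reads static HTML, it will miss schema injected by JavaScript after page load, so treat an empty result as a prompt to check the rendered page rather than as proof that no schema exists.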
1c. Document the Schema Landscape
By the end of this step, you should have a spreadsheet with every crawled URL and the schema types deployed on each. This becomes your audit working document. Add columns now for "Validation Status" and "Priority Action" – you will fill those in as you move through the remaining steps.
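One way to seed that working document is to write the crawl output to CSV with the two extra columns left blank. This is a sketch under assumptions: the input row keys (url, types, format, block_count) are hypothetical and will differ depending on what your crawler exports.

```python
import csv
import io

COLUMNS = [
    "URL",
    "Schema Types",
    "Format",
    "Block Count",
    "Validation Status",
    "Priority Action",
]


def inventory_to_csv(rows):
    """Turn crawl rows into CSV text with empty audit columns to fill in later."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "URL": row["url"],
            "Schema Types": ", ".join(row["types"]),
            "Format": row["format"],
            "Block Count": row["block_count"],
            "Validation Status": "",  # filled in during Steps 2 to 4
            "Priority Action": "",    # filled in during Step 6
        })
    return buffer.getvalue()


# Hypothetical crawl output for two pages.
example_rows = [
    {"url": "https://example.com/pricing", "types": ["Product", "Offer"],
     "format": "JSON-LD", "block_count": 1},
    {"url": "https://example.com/blog/post", "types": [],
     "format": "", "block_count": 0},
]
csv_text = inventory_to_csv(example_rows)
```

Pages with no schema are kept in the file with an empty type list, since those rows feed the gap analysis later in the audit.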
Step 2: Validate Schema With Google's Rich Results Test
The Rich Results Test tells you whether a specific page's schema is eligible to generate rich results in Google Search. It is the right starting point for any page where you are targeting a featured appearance (star ratings, FAQ dropdowns, product panels, and so on).
2a. Run the Test on Your Highest-Priority Pages
Navigate to Google's Rich Results Test and enter each URL one at a time, beginning with your most commercially important pages. For SaaS companies, that typically means pricing pages, feature pages, and high-traffic blog posts; for ecommerce, product and category pages; for local businesses, the homepage and service pages.
The tool renders the page as Googlebot would, then reports:
- Which rich result types are detected
- Whether each schema block passes, has warnings, or has errors
- Which specific fields are missing or malformed
2b. Interpret the Results
The Rich Results Test distinguishes between errors and warnings. Errors prevent the rich result from appearing. Warnings indicate missing recommended properties that would improve the result but do not block it entirely. Both matter, but errors are the first priority.
Common errors to watch for:
- Missing required fields (e.g., a Product schema without a name property)
- Invalid data types (e.g., a price that includes a currency symbol, such as "$19.99", when the schema expects a plain numeric value)
- Empty fields (properties are declared but the value is blank)
- Schema type mismatches (e.g., a BlogPosting schema on a page that is structured like an Article)
The distinction between required and recommended schema.org properties determines whether you can earn a rich result at all. Required properties are non-negotiable; recommended ones add depth and help AI systems extract more complete information.
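The error-versus-warning split can be illustrated with a small pre-check you run before submitting pages to the Rich Results Test. This is a sketch, not a substitute for the tool: the REQUIRED and RECOMMENDED sets below are illustrative assumptions rather than Google's official lists, and the function assumes each block has a single string @type.

```python
# Illustrative assumption: replace these with the properties listed in
# Google's documentation for each rich result type you target.
REQUIRED = {
    "Product": {"name"},
    "FAQPage": {"mainEntity"},
}
RECOMMENDED = {
    "Product": {"offers", "aggregateRating"},
    "Article": {"dateModified", "author"},
}


def check_block(block):
    """Return (errors, warnings) for one JSON-LD block given as a dict."""
    schema_type = block.get("@type")
    errors, warnings = [], []

    for prop in sorted(REQUIRED.get(schema_type, set())):
        if prop not in block:
            errors.append(f"missing required field: {prop}")

    for prop in sorted(RECOMMENDED.get(schema_type, set())):
        if prop not in block:
            warnings.append(f"missing recommended field: {prop}")

    # Declared-but-blank properties are reported as errors.
    for prop, value in block.items():
        if value in ("", None, [], {}):
            errors.append(f"empty field: {prop}")

    return errors, warnings


# A Product block with no name and a blank description.
errors, warnings = check_block({"@type": "Product", "description": ""})
```

Keeping errors and warnings in separate lists mirrors how the findings are prioritized later: errors first, recommended-property gaps after.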
2c. Record Errors by Page and Schema Type
Back in your audit spreadsheet, update the "Validation Status" column for each page tested. Note the specific error type rather than just marking it "broken." A FAQPage with empty acceptedAnswer values is a different problem than an Article missing its dateModified field, and they require different fixes.
Step 3: Run the Schema Markup Validator
The Schema Markup Validator (at validator.schema.org) is a second validation layer that tests compliance against the schema.org specification itself, independently of Google's rich result requirements. Using both tools is important because they catch different classes of problems.
3a. Validate the Full JSON-LD Block
Paste the raw JSON-LD from each page directly into the validator (use the "Code Snippet" input tab rather than the URL tab, since the URL tab may not render JavaScript-injected schema). The validator will flag:
- Invalid property names not recognized by schema.org
- Property types that conflict with the schema.org specification
- Nested entity errors (e.g., a Person nested inside an Organization with incorrect properties)
- Typos in property names, which search engines silently ignore rather than reporting as an error when the page loads
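Property-name typos are easy to miss by eye, so a quick local check can help before you paste anything into the validator. The sketch below compares property names against a small allow-list and suggests the closest known name; KNOWN_PROPERTIES is a hypothetical, deliberately partial list for illustration, not the schema.org vocabulary.

```python
import difflib

# Hypothetical partial allow-list; the real vocabulary is far larger.
KNOWN_PROPERTIES = {
    "Article": {
        "headline", "author", "datePublished",
        "dateModified", "publisher", "image",
    },
}


def find_unknown_properties(block):
    """Map each unrecognized property name to its closest known name, if any."""
    schema_type = block.get("@type")
    if schema_type not in KNOWN_PROPERTIES:
        return {}  # no allow-list for this type, so nothing to compare against
    known = KNOWN_PROPERTIES[schema_type]
    findings = {}
    for prop in block:
        if prop.startswith("@") or prop in known:
            continue  # JSON-LD keywords and known properties are fine
        findings[prop] = difflib.get_close_matches(prop, sorted(known), n=1)
    return findings


# "dateModifed" is missing an "i".
suspect = find_unknown_properties({
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example",
    "dateModifed": "2024-01-01",
})
```

A property flagged here is not necessarily wrong, since the allow-list is incomplete; the validator remains the authority.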
3b. Check for Deprecated or Unofficial Properties
Schema.org evolves. Properties that were valid two years ago may now be deprecated or replaced. The validator flags these. Deprecated properties do not necessarily break anything immediately, but consumers of your markup may stop recognizing them over time, and their presence is a sign that the schema is not being maintained.
3c. Cross-Reference Both Validation Reports
Some errors appear in the Rich Results Test but not the Schema Markup Validator, and vice versa. A page with clean validator output may still fail the Rich Results Test because it is using a schema type that Google has not enabled for rich result generation. A page that passes the Rich Results Test may still carry schema.org specification violations the validator catches. Your audit is complete only when you have run both.
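To keep track of which pages have been through both tools, you can merge the two sets of results into one view. This sketch assumes you have recorded each tool's outcome per URL as "pass", "warning", or "error"; the function name and the "not run" placeholder are illustrative choices.

```python
def cross_reference(rich_results, validator):
    """Combine per-URL results from both tools into one report.

    Each argument maps a URL to "pass", "warning", or "error".
    """
    report = {}
    for url in sorted(set(rich_results) | set(validator)):
        rrt = rich_results.get(url, "not run")
        smv = validator.get(url, "not run")
        report[url] = {
            "rich_results_test": rrt,
            "schema_validator": smv,
            "both_run": "not run" not in (rrt, smv),
            "needs_review": rrt != "pass" or smv != "pass",
        }
    return report


# Hypothetical recorded results.
rich_results = {"https://example.com/a": "pass", "https://example.com/b": "error"}
validator = {"https://example.com/a": "error"}
report = cross_reference(rich_results, validator)
```

In this example, page /a passes the Rich Results Test but fails the validator, and page /b has not yet been through the validator at all; both stay on the review list.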
Step 4: Audit Schema Coverage in Google Search Console
Google Search Console's Enhancements reports give you aggregated data across your entire site, not just individual pages. This is where you identify systemic problems: schema types that are deployed at scale but failing at scale.
4a. Review the Enhancements Section
In Search Console, navigate to the left sidebar and scroll to the Enhancements section. Each schema type detected across your site appears as its own report (e.g., FAQ, Breadcrumbs, Product, Article). Open each one.
Each report shows:
- Total items detected
- Items with errors (blocking rich results)
- Items with warnings (eligible but incomplete)
- Affected URLs for each error or warning type
This view immediately tells you whether a problem is isolated (affecting a handful of pages) or structural (affecting hundreds of pages using the same template).
4b. Identify Template-Level Errors
When the same error type appears across a large number of URLs, the problem is almost always in the template, not in individual pages. For example, if 300 product pages all share the error "missing field: offers," the product page template is not passing price data into the schema block. One template fix resolves all 300 pages simultaneously. Note these template-level errors explicitly in your audit document because they represent the highest-leverage fixes available.
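If you export the affected URLs from Search Console, grouping them by error type makes template-level problems stand out. This is a minimal sketch: the (url, error) pair format and the min_urls threshold of 50 are assumptions to adjust for your site's size.

```python
from collections import defaultdict


def find_template_level_errors(findings, min_urls=50):
    """Return {error: url_count} for errors affecting at least min_urls URLs.

    findings is an iterable of (url, error) pairs.
    """
    urls_by_error = defaultdict(set)
    for url, error in findings:
        urls_by_error[error].add(url)
    return {
        error: len(urls)
        for error, urls in urls_by_error.items()
        if len(urls) >= min_urls
    }


# 300 product pages sharing one error, plus a single isolated error.
findings = [
    (f"https://example.com/products/{i}", "missing field: offers")
    for i in range(300)
]
findings.append(("https://example.com/blog/one-off", "missing field: author"))
template_errors = find_template_level_errors(findings)
```

Here only the shared "missing field: offers" error is returned, which matches the 300-page example above; the isolated blog error is left for page-level handling.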
4c. Check for Schema Types Google Has Removed Eligibility For
Google periodically discontinues rich result support for specific schema types. If Search Console shows valid schema for a type Google no longer renders (such as the HowTo or FAQ rich results that Google restricted in 2023), those pages are not broken technically, but the rich result business case for keeping that schema in its current form should be revisited. The schema may still serve AI citability purposes, since schema markup's value extends beyond rich results, but it should be documented as a strategic decision rather than an assumed win.
The relationship between schema markup and AI search citations is distinct from its relationship to Google rich results. A type that no longer earns a rich result panel may still materially improve how AI systems extract and attribute your content.
Step 5: Identify Schema Gaps by Page Type
Validation tells you what is broken. Gap analysis tells you what is missing entirely. The most significant schema opportunities on most sites are on pages that have no structured data at all.
5a. Map Schema Types to Page Templates
Every major page template on your site should have a corresponding schema strategy. Build a matrix in your audit spreadsheet:
| Page Template | Expected Schema Type(s) | Currently Deployed? |
|---|---|---|
| Homepage | Organization, WebSite | Yes / No / Partial |
| Blog post | Article or BlogPosting, BreadcrumbList | Yes / No / Partial |
| Product page | Product, Offer, AggregateRating | Yes / No / Partial |
| Service page | Service, LocalBusiness (if applicable) | Yes / No / Partial |
| FAQ page | FAQPage | Yes / No / Partial |
| Author bio page | Person | Yes / No / Partial |
| Pricing page | Product, Offer | Yes / No / Partial |
Fill in the "Currently Deployed?" column using your crawl data from Step 1. Every row marked "No" or "Partial" is a gap.
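Filling in that column can be sketched as a small comparison between expected and detected types. The function below is an illustration, not a fixed rule: it treats each expectation as a group of acceptable alternatives (so "Article or BlogPosting" is one group), and the name coverage_status is hypothetical.

```python
def coverage_status(expected_groups, deployed):
    """Return "Yes", "Partial", or "No" for one page template.

    expected_groups is a list of sets; a group counts as covered when
    any one of its types is deployed.
    """
    deployed = set(deployed)
    satisfied = sum(1 for group in expected_groups if deployed & set(group))
    if satisfied == len(expected_groups):
        return "Yes"
    if satisfied:
        return "Partial"
    return "No"


# Blog post template: Article or BlogPosting, plus BreadcrumbList.
blog_expected = [{"Article", "BlogPosting"}, {"BreadcrumbList"}]

status_full = coverage_status(blog_expected, ["BlogPosting", "BreadcrumbList"])
status_partial = coverage_status(blog_expected, ["BlogPosting"])
status_none = coverage_status(blog_expected, [])
```

Modeling alternatives as groups avoids marking a blog template "Partial" just because it uses BlogPosting instead of Article.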
5b. Prioritize Gaps by Traffic and Commercial Value
Not all gaps are equal. A missing Product schema on a high-converting ecommerce page outweighs a missing BlogPosting schema on a low-traffic post. Sort your gaps list by estimated traffic and commercial intent. For SaaS companies, feature and pricing pages carry the highest strategic weight. For local businesses, the LocalBusiness schema on location pages is the most impactful gap to close. For agencies managing multiple clients, schema management across multiple client sites requires a template-first approach that handles gap filling systematically rather than page by page.
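The sort itself is simple once each gap carries a traffic figure and a commercial score. In this sketch both fields are assumptions: monthly_visits would come from your analytics, and commercial_weight is a hypothetical 1-to-3 score you assign by page type.

```python
def prioritize_gaps(gaps):
    """Sort gaps by commercial weight first, then by traffic, highest first."""
    return sorted(
        gaps,
        key=lambda gap: (gap["commercial_weight"], gap["monthly_visits"]),
        reverse=True,
    )


# Made-up gaps for illustration.
gaps = [
    {"url": "/blog/old-post", "missing": "BlogPosting",
     "monthly_visits": 40, "commercial_weight": 1},
    {"url": "/products/widget", "missing": "Product",
     "monthly_visits": 900, "commercial_weight": 3},
    {"url": "/pricing", "missing": "Product",
     "monthly_visits": 2500, "commercial_weight": 3},
]
ordered = [gap["url"] for gap in prioritize_gaps(gaps)]
```

Sorting on commercial weight before traffic is a design choice; if raw traffic matters more for your site, swap the two fields in the sort key.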
5c. Flag AI Citability Gaps Specifically
Beyond rich results, certain schema types directly improve how AI systems like ChatGPT, Perplexity, Gemini, and Claude extract and attribute your content. FAQPage, DefinedTerm, HowTo, Article with full author and publisher attribution, and Organization with complete entity information are all high-value for AI citability. Pages that have no structured data at all are less likely to be cited by AI systems even when the prose content is strong. Mark these pages explicitly in your gap column.
The AuthorityStack.ai AI-Powered Schema Markup Generator addresses this gap efficiently: enter any URL and the tool reads the full page content to select the correct schema types and populate only fields that are actually present, rather than pattern-matching on keywords as rule-based generators do. That distinction matters for complex page types where the wrong schema type is worse than no schema at all.
Step 6: Prioritize What to Fix and What to Add First
A completed audit typically surfaces more issues than any team can address in a single sprint. The following prioritization framework keeps effort focused on the highest-leverage actions.
Tier 1: Fix Errors That Block Rich Results on High-Traffic Pages
These are errors flagged by the Rich Results Test on pages that already have schema deployed and receive meaningful organic traffic. The schema investment has already been made; the fix simply recovers the return on that investment. Template-level errors affecting many URLs belong here, addressed through a single template change.
Tier 2: Add Missing Schema to High-Value Pages With No Coverage
These are the "No" rows in your page template matrix, filtered to high-traffic or high-converting URLs. A product page with no schema is leaving structured data value entirely on the table. Priority order: revenue-critical pages first, then top organic traffic pages, then pages you are actively trying to rank for competitive terms.
Tier 3: Resolve Warnings and Recommended Property Gaps
Warnings on the Rich Results Test indicate missing recommended properties. These do not block rich results but they do reduce the richness of the result and limit the information AI systems can extract. Article schema missing dateModified, Product schema missing aggregateRating, Organization schema missing sameAs links to social profiles – all of these are Tier 3 work. High-value in aggregate, lower urgency than Tier 1 and 2.
Tier 4: Address Deprecated Properties and Specification Violations
These are the validator-only findings: properties that are technically outdated or non-standard. They rarely affect current performance but represent debt that accumulates. Batch these into a schema hygiene sprint rather than treating them as urgent.
Some of the most persistent schema markup myths involve believing that any schema is always better than no schema, or that deprecated properties are harmless indefinitely. Neither is true. Incorrect schema can mislead crawlers, and deprecated properties may eventually be ignored as specifications and search engine requirements tighten.
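The four tiers can be expressed as a small rule so that every row in the spreadsheet is tiered consistently. This is a sketch with assumptions: the kind labels and the high_value flag are hypothetical fields, and placing errors or missing schema on lower-value pages in Tier 3 is a choice made here because the framework above does not cover that case.

```python
def assign_tier(finding):
    """Return the priority tier (1 to 4) for one audit finding.

    finding["kind"] is one of "error", "missing_schema", "warning",
    or "deprecated"; finding["high_value"] marks high-traffic or
    high-converting pages.
    """
    kind = finding["kind"]
    high_value = finding.get("high_value", False)

    if kind == "error" and high_value:
        return 1
    if kind == "missing_schema" and high_value:
        return 2
    if kind in ("error", "missing_schema", "warning"):
        return 3  # assumption: lower-value errors and gaps wait with warnings
    if kind == "deprecated":
        return 4
    raise ValueError(f"unknown finding kind: {kind!r}")


tiers = [
    assign_tier({"kind": "error", "high_value": True}),
    assign_tier({"kind": "missing_schema", "high_value": True}),
    assign_tier({"kind": "warning"}),
    assign_tier({"kind": "deprecated"}),
]
```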
Step 7: Document and Hand Off the Audit
An audit that lives only in the auditor's memory or a disorganized folder does not get actioned. Structure the output so any developer, content manager, or agency partner can execute without needing to re-investigate.
7a. Finalize the Audit Spreadsheet
Your working document should now contain, for each URL:
- Schema types currently present
- Validation status (pass, error, warning) from both the Rich Results Test and Schema Markup Validator
- Search Console error type (if applicable)
- Gap type (missing schema / missing required property / deprecated property)
- Priority tier (1 through 4)
- Recommended action (specific, e.g., "Add dateModified to Article schema on all blog post templates")
7b. Write Fix-Specific Implementation Notes
For every Tier 1 and Tier 2 item, write a one-paragraph implementation note. Specify the schema type to add, the required properties, the correct JSON-LD format, and the exact location in the template where the block should be inserted. Developers should not need to interpret the finding – only execute it.
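An implementation note is easier to act on when it includes the exact block to insert. The sketch below builds a JSON-LD script tag from a Python dict; the function name is illustrative, and every value in the Article example (headline, dates, names) is a placeholder rather than a recommendation of which properties your pages need.

```python
import json


def jsonld_script_tag(data):
    """Serialize a dict as a JSON-LD <script> block ready for a template."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values for illustration only.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    "datePublished": "2024-01-15",
    "dateModified": "2024-03-02",
    "author": {"@type": "Person", "name": "Example Author"},
    "publisher": {"@type": "Organization", "name": "Example Publisher"},
}
tag = jsonld_script_tag(article)
```

Generating the block from structured data rather than hand-editing JSON reduces the chance of the syntax errors and blank fields that the validation steps are designed to catch.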
For complex page types or healthcare-specific schema, the implementation notes should reference the relevant specification details. The schema.org types and properties reference covers the full hierarchy of types and their property requirements, and the process of validating healthcare schema before publishing adds an additional compliance layer that standard audits often omit.
7c. Set a Re-Audit Schedule
Schema audits are not one-time events. New pages are published, templates are updated, schema.org releases new versions, and Google adjusts its rich result policies. For most sites, a full crawl-based audit every six months, with lighter monthly checks through Search Console's Enhancements reports, is an appropriate cadence. Large ecommerce sites or content-heavy SaaS platforms with frequent publishing schedules may warrant monthly crawl-based audits.
FAQ
What Is a Schema Markup Audit?
A schema markup audit is a systematic review of all structured data deployed on a website. The audit identifies which pages have schema, which schema types are in use, whether those schemas are valid and error-free, and where gaps exist that are costing the site rich results or AI citation opportunities. It produces a prioritized action list covering fixes, additions, and deprecation cleanup.
Which Tools Should I Use for a Schema Markup Audit?
The three primary tools for a schema markup audit are Google's Rich Results Test, the Schema Markup Validator at validator.schema.org, and Google Search Console's Enhancements reports. A site crawler such as Screaming Frog or Sitebulb completes the toolkit by inventorying schema types across the entire site simultaneously rather than page by page.
What Is the Difference Between the Rich Results Test and the Schema Markup Validator?
The Rich Results Test checks whether a page's schema qualifies it for rich result features in Google Search – star ratings, FAQ dropdowns, product panels, and similar enhancements. The Schema Markup Validator checks whether the schema conforms to the schema.org specification itself, independently of Google's requirements. They catch different error types, so a complete audit uses both tools.
How Do I Find Schema Errors Across My Entire Site Without Testing Every Page Individually?
Google Search Console's Enhancements reports aggregate schema validation data across the whole site automatically. Each schema type detected site-wide appears as its own report, showing error counts, warning counts, and the specific URLs affected. This is where template-level errors become visible because the same error type appears across hundreds of pages simultaneously.
Does Fixing Schema Markup Actually Improve SEO Rankings?
Schema markup does not directly improve rankings in the traditional sense, but it does influence how content is displayed in search results and how thoroughly AI systems can extract and cite it. Rich result features like star ratings and FAQ panels demonstrably improve click-through rates from search results. The evidence on whether structured data improves SEO rankings points to indirect effects: enhanced visibility, better click-through rates, and improved AI citability – all of which compound over time.
Can Incorrect Schema Markup Trigger a Google Penalty?
Google can manually penalize sites for schema that is intentionally misleading or that misrepresents page content, such as marking up content with AggregateRating schema when there are no real reviews, or applying Product schema to a page that is not a product page. Accidental errors – missing properties, wrong data types – do not trigger penalties; they simply result in the rich result not being shown. The specifics of when Google penalizes incorrect schema follow a clear pattern: manipulation is penalized, honest mistakes are not.
How Often Should I Re-audit Schema Markup?
Most sites should run a full schema audit every six months, with lighter monthly checks via Search Console's Enhancements reports to catch new errors introduced by template changes or content publishing. Sites with high publishing volume or frequent CMS updates may need monthly crawl-based audits. Schema.org specification updates and changes to Google's rich result policies are the primary external triggers for an unscheduled audit.
What Schema Types Have the Biggest Impact on AI Citation in Addition to Rich Results?
FAQPage, Article with complete author and publisher attribution, Organization with sameAs links, DefinedTerm, and HowTo are the schema types most consistently associated with improved AI citation rates. These types provide the structured, extractable information that AI systems like ChatGPT, Perplexity, and Gemini use when synthesizing answers. Pages with no schema at all are at a meaningful disadvantage for AI citability even when their prose content is well-written and authoritative.
What to Do Now
- Run your site through a schema-aware crawler and build the URL-level inventory spreadsheet described in Step 1.
- Test your ten highest-traffic pages in Google's Rich Results Test and record every error by type.
- Paste the raw JSON-LD from those same pages into the Schema Markup Validator to catch specification-level issues the Rich Results Test misses.
- Open Search Console's Enhancements reports and identify any error types affecting large numbers of URLs – these are your template-level fixes and your highest-leverage actions.
- Complete the page template matrix in Step 5 to surface schema types missing from entire sections of your site.
- Assign every finding a priority tier using the framework in Step 6, then hand the finalized spreadsheet to whoever is responsible for implementation.
Use the AuthorityStack.ai free schema generator to generate accurate JSON-LD for any page during implementation – enter the URL, and the tool scans the content to produce correctly structured markup ready to paste into your page's head section.
