Sitemap URL Inspector

Inspect and validate a sitemap.xml (or sitemap index), including .xml.gz sitemaps. Follow redirects, parse up to a configurable number of URLs, highlight common SEO/crawler issues, and export JSON/PDF reports.

About Sitemap URL Inspector

A clean sitemap helps search engines discover, crawl, and understand your URLs efficiently. This tool fetches a sitemap URL, supports redirects and gzipped sitemaps, parses entries (including sitemap indexes), and surfaces common problems such as invalid structure, missing <loc>, suspicious <lastmod>, and other crawler pitfalls. Export the results as JSON/PDF to track fixes over time.

Features

  • Parse standard sitemaps and sitemap indexes (sitemap-of-sitemaps).
  • Supports gzipped sitemaps (.xml.gz) for real-world large sites.
  • Optional redirect following to audit the final fetched sitemap URL.
  • Configurable parsing limit (max URLs to parse) to keep audits fast and predictable.
  • Validates core sitemap fields and highlights missing/invalid tags (especially <loc>).
  • Extracts and reviews <lastmod> usage for consistency and crawler friendliness.
  • Helps spot sitemap patterns relevant to multi-locale SEO (e.g., URL grouping and hints for hreflang strategies).
  • Copyable findings and summaries for SEO tickets and debugging.
  • Export reports as JSON or PDF for documentation, sharing, and regression tracking.

🧭 How to use Sitemap URL Inspector

1. Paste your sitemap URL

Enter the full sitemap URL. This can be a regular XML sitemap or a gzipped sitemap ending with .xml.gz.

2. Enable “Follow Redirects” if needed

If your sitemap URL redirects (http→https, non-www→www, CDN rewrites), enabling redirects ensures the tool fetches the final sitemap location.

3. Set “Max URLs to parse”

Choose how many URL entries to parse. Use smaller limits for quick checks, larger limits for deeper audits (up to the tool's cap).

4. Review validation results and URL stats

Look for structural issues (missing <loc>, invalid dates, unexpected formats) and any warnings that could affect crawling and indexing.

5. Export the report (JSON/PDF)

Download a JSON or PDF report to attach to SEO tasks, share with teammates, or compare before/after changes.

Technical specs

Supported inputs

The tool is designed to fetch and parse sitemaps served over HTTP(S), including compressed variants.

Input type         Example                                   Notes
XML sitemap        https://example.com/sitemap.xml           Parses <urlset> entries.
Sitemap index      https://example.com/sitemap_index.xml     Parses <sitemapindex> and nested sitemap URLs.
Gzipped sitemap    https://example.com/sitemap.xml.gz        Fetches and parses compressed sitemaps.
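
If you are unsure whether a URL serves a plain sitemap or a sitemap index, a quick command-line check of the root element is a rough approximation of what the parser looks at first (example.com is a placeholder URL):

curl -s https://example.com/sitemap.xml | grep -o -m 1 -E '<(urlset|sitemapindex)'

The first match tells you whether the file opens with <urlset> (URL entries) or <sitemapindex> (nested sitemaps).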

Fetch behavior and limits

Request behavior is tuned for predictable performance and crawler-like constraints.

Setting              Behavior                                                       Default
Follow Redirects     Follows redirects when fetching the sitemap URL                Enabled
Max Redirects        Maximum redirects followed when enabled                        10
Timeout              Request timeout budget                                         20000 ms
Max URLs to parse    Limits how many entries are parsed from the sitemap content    500 (range 10–5000)
User-Agent           Request identification header                                  Encode64Bot/1.0 (+https://encode64.com)
Private networks     Blocks private-network targets                                 Not allowed
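
As a rough command-line equivalent of these defaults (the tool's internal HTTP client may behave differently), you can reproduce the redirect cap, timeout, and User-Agent with curl:

curl -sL --max-redirs 10 --max-time 20 -A 'Encode64Bot/1.0 (+https://encode64.com)' -o /dev/null -w '%{http_code} %{url_effective}\n' https://example.com/sitemap.xml

This prints the final status code and URL after redirects, using the same redirect limit, 20-second timeout, and User-Agent listed above.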

What validation focuses on

The inspector prioritizes issues that commonly break sitemap ingestion or reduce crawl efficiency: missing/invalid <loc>, malformed XML structures, suspicious or inconsistent <lastmod>, and patterns that can confuse crawlers when sitemaps are generated incorrectly.
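
For quick manual spot checks of the same issue areas (placeholder URL; assumes curl, xmllint, and standard Unix tools are available):

curl -s https://example.com/sitemap.xml | xmllint --noout -

xmllint prints nothing and exits 0 if the XML is well-formed; otherwise it reports the parse errors.

curl -s https://example.com/sitemap.xml | grep -o '<lastmod>[^<]*</lastmod>' | sort | uniq -c | sort -rn | head

This tallies the distinct <lastmod> values, which makes identical or obviously bogus dates stand out.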

A sitemap can be valid XML but still low-quality for SEO. Use findings to improve clarity, consistency, and maintainability.

Command line

Use curl (or PowerShell) to debug sitemap fetching and redirects the same way crawlers do.

macOS / Linux

Fetch sitemap headers (no redirect)

curl -I https://example.com/sitemap.xml

Check status code, content-type, and caching headers.

Follow redirects and fetch headers

curl -IL https://example.com/sitemap.xml

Useful when a sitemap URL is redirected by CDN or HTTPS canonicalization.

Download sitemap content (preview)

curl -s https://example.com/sitemap.xml | head -n 40

Quickly inspect the XML prolog and root tags.
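
Count URL entries (rough)

curl -s https://example.com/sitemap.xml | grep -o '<loc>' | wc -l

A quick, approximate count of <loc> entries; exact figures come from the inspector's parsed stats.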

Inspect a gzipped sitemap (preview)

curl -s https://example.com/sitemap.xml.gz | gzip -dc | head -n 40

Decompress and preview the beginning of a .xml.gz sitemap.
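
Check response headers for a gzipped sitemap

curl -sI https://example.com/sitemap.xml.gz | grep -i '^content-'

Helps spot CDN/proxy issues where Content-Type or Content-Encoding is set incorrectly for .xml.gz files.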

Windows (PowerShell)

Download sitemap content

Invoke-WebRequest -Uri https://example.com/sitemap.xml | Select-Object -ExpandProperty Content

Fetches the XML body for quick inspection.

If your sitemap is huge, validate a representative subset first, then run larger parses to spot systemic generation issues.

Use cases

Validate a newly generated sitemap

Quickly verify that sitemap.xml is fetchable, well-formed, and contains correct URL entries.

  • Confirm your generator outputs valid XML structure
  • Catch missing <loc> values early

Audit gzipped sitemaps for crawler compatibility

Ensure compressed sitemaps are served correctly and parse cleanly.

  • Check .xml.gz content is readable and consistent
  • Spot CDN/proxy content-type issues

Debug redirect and canonicalization problems

Find unexpected redirects or non-200 responses that can block sitemap consumption.

  • http→https redirect chains
  • www vs non-www canonicalization

Track sitemap quality over time

Export reports and compare after releases, CMS migrations, or multi-locale expansions.

  • Before/after deploy regression checks
  • Monitor <lastmod> consistency after content updates
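
For example, two exported JSON reports (before.json and after.json are hypothetical file names here) can be normalized and compared with standard tools such as jq and diff:

diff <(jq -S . before.json) <(jq -S . after.json)

Sorting keys with jq -S keeps the diff focused on real changes between the two audits.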

❓ Frequently Asked Questions

What's the difference between a sitemap and a sitemap index?

A sitemap lists URLs directly (usually under <urlset>). A sitemap index lists multiple sitemap files (under <sitemapindex>), which is common for large sites.
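
For reference, minimal (illustrative) examples of each look like this:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/page-1</loc></url>
</urlset>

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>
</sitemapindex>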

Should my sitemap include <lastmod>?

It's optional, but useful if it's accurate and consistently formatted (W3C Datetime, e.g. 2024-05-01 or 2024-05-01T12:30:00+00:00). Incorrect or constantly changing values can reduce trust and may not help crawling.

Why would a sitemap be ignored by crawlers?

Common reasons include fetch errors (non-200), blocked access, invalid XML structure, missing required tags such as <loc>, incorrect content type, or redirect loops.

Is it OK if my sitemap redirects?

Usually yes, but it's better to submit and publish the final canonical sitemap URL to reduce crawler overhead and avoid accidental breakage.

Can this tool check every URL in the sitemap for status codes?

This inspector focuses on parsing and validating the sitemap and extracting stats. Use a dedicated URL status checker or crawler if you want to fetch and validate every listed URL.

Does this tool support multi-locale / hreflang sitemaps?

It's designed to help spot patterns relevant to multi-locale SEO. If you publish alternate-language URLs, ensure your sitemap structure and URL grouping are consistent with your hreflang strategy.

Pro Tips

Best Practice

Submit the final canonical sitemap URL in Search Console (avoid relying on redirects).

Best Practice

For very large sites, split sitemaps and use a sitemap index. Keep each sitemap within protocol limits (50,000 URLs and 50 MB uncompressed per file) and at an operationally manageable size.

Best Practice

Use <lastmod> only if it's accurate. Don't update it for every deploy if the page content didn't change.

Best Practice

If you have multi-locale URLs (like /fr/, /en/), ensure your sitemap generation is consistent across locales so crawlers don't see partial coverage.

Best Practice

Export JSON/PDF after major releases so you have evidence for debugging Search Console indexing swings.
