
Modern XML sitemaps: priority, lastmod, and what to skip

By Lucas

Sitemaps changed. Google ignores priority and changefreq but treats lastmod as a crawl signal. Here is what still carries weight in 2026.

In February 2026 we audited 41 enterprise sitemaps and found the same pattern: priority pinned at 0.8 everywhere, changefreq claiming 'daily' on static pages, and lastmod stamped with the deploy date even when the content never changed. Google said years ago that it ignores priority and changefreq, but lastmod has become a crawl-scheduling signal. Misreport lastmod once and you lose credibility for several crawl cycles. This post shows what actually matters, backed by log and Search Console data from four tenants ranging from 12k to 4.2M indexable URLs.

Start with the basics: a sitemap should list only canonical, 200 OK, indexable URLs that you want in the index. Sounds obvious, yet 73% of the audits we ran in 2026 still had 301 URLs, noindex pages, or UTM-tagged variants listed. Crawlers read that as noise and lower the domain's priority. Use Lighthouse plus Screaming Frog to cross-check the sitemap against real status codes. If you do not yet have that workflow, read 'How to audit on-page SEO without falling into guesswork' before tuning the XML further.
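The filter described above can be sketched in Python. This is a minimal sketch, assuming the crawl results have already been exported into records with the hypothetical fields `url`, `status`, `canonical`, and `noindex`; the field names and the `utm_` prefix check are illustrative assumptions, not the schema of any particular tool.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlsplit


@dataclass
class CrawlRecord:
    # Hypothetical shape of one row from a crawl export.
    url: str
    status: int
    canonical: str  # canonical URL declared by the page
    noindex: bool


def has_tracking_params(url: str) -> bool:
    """Return True if the query string carries utm_* parameters."""
    query = urlsplit(url).query
    pairs = parse_qsl(query, keep_blank_values=True)
    return any(key.lower().startswith("utm_") for key, _ in pairs)


def sitemap_eligible(rec: CrawlRecord) -> bool:
    """Keep only canonical, 200 OK, indexable URLs without tracking params."""
    return (
        rec.status == 200
        and not rec.noindex
        and rec.canonical == rec.url
        and not has_tracking_params(rec.url)
    )


def split_records(records):
    """Partition records into (keep, drop) lists, preserving order."""
    keep = [r for r in records if sitemap_eligible(r)]
    drop = [r for r in records if not sitemap_eligible(r)]
    return keep, drop
```

Anything that lands in `drop` is a candidate for removal from the sitemap; the share of dropped records gives a rough noise figure for the file.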

About lastmod: it should reflect meaningful content change, not a CMS timestamp bump. Editing an image alt does not move lastmod. Rewriting 40% of the body, updating tables, or swapping the H1 does. A Gary Illyes statement reconfirmed in January 2026 says Google uses lastmod as a hint and adjusts crawl frequency if you stay consistent for 4-6 weeks. Lie once and you drop off the schedule for months. To decide what is worth rewriting, see 'Rewrite or rebuild: making the call with SERP data' and 'Content decay: spotting the posts quietly losing traffic'.
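One way to encode that rule is to compare the old and new body text and bump lastmod only when the change is large enough or the H1 changed. The sketch below is an assumption-laden illustration: the function names are hypothetical, the 0.4 threshold simply mirrors the "40% of the body" example above, and a word-level diff is only one possible way to measure change.

```python
import difflib
from datetime import datetime, timezone


def change_ratio(old_text: str, new_text: str) -> float:
    """Approximate share of the text that changed (0.0 to 1.0), word by word."""
    matcher = difflib.SequenceMatcher(
        None, old_text.split(), new_text.split(), autojunk=False
    )
    return 1.0 - matcher.ratio()


def next_lastmod(old_body, new_body, old_h1, new_h1,
                 current_lastmod, now=None, threshold=0.4):
    """Return the lastmod to publish after an edit.

    Bumps to `now` only if the H1 changed or the body changed by at
    least `threshold` (illustrative value); otherwise keeps the old stamp.
    """
    now = now or datetime.now(timezone.utc)
    h1_changed = old_h1.strip() != new_h1.strip()
    if h1_changed or change_ratio(old_body, new_body) >= threshold:
        return now
    return current_lastmod
```

A small edit, such as changing one word in ten, leaves `current_lastmod` untouched, which is the behavior the paragraph above asks for. Table updates would need their own check if tables are stored outside the body text.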

Segment sitemaps by type and by rate of change. PDPs change daily in e-commerce; PLPs move when merchandising rotates; the blog moves when you publish. Cramming everything into a 50MB monolithic sitemap.xml is the classic mistake. The protocol cap still stands: 50,000 URLs or 50MB uncompressed per file. For large sites use a sitemap index, split by language and page type, and cross-reference with 'Hreflang without pain: implementation for multilingual sites'. In e-commerce specifically, separate PLP from PDP and attack each with the playbook in 'E-commerce on-page: PLP vs PDP without cannibalization'.
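The segmentation can be sketched as a planning step that splits each segment into files of at most 50,000 URLs and then lists those files in a sitemap index. This is a minimal sketch: the segment keys (such as `en-pdp`) and the `sitemap-<segment>-<n>.xml` naming are hypothetical, and the 50MB size cap is not checked here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # protocol cap per sitemap file


def chunk(items, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_sitemap_files(urls_by_segment, base_url, size=MAX_URLS_PER_FILE):
    """Map each sitemap file URL to the list of page URLs it should hold.

    `urls_by_segment` maps a segment key (hypothetical, e.g. "en-pdp")
    to the URLs of that language and page type.
    """
    plan = {}
    for segment, urls in sorted(urls_by_segment.items()):
        for n, part in enumerate(chunk(urls, size), start=1):
            plan[f"{base_url}/sitemap-{segment}-{n}.xml"] = part
    return plan


def build_index(sitemap_urls):
    """Serialize a sitemap index that points at the given sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        node = ET.SubElement(root, "sitemap")
        ET.SubElement(node, "loc").text = loc
    return ET.tostring(root, encoding="unicode")
```

Calling `build_index(plan_sitemap_files(segments, "https://example.com"))` yields the index XML; iterating a dict passes its keys, which are the file URLs.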

Skip with confidence: priority, changefreq, image:image when your images are already well-marked for lazy loading, and video:video if you already ship Schema VideoObject. Google deprecated news tags in the general sitemap back in 2023; news needs its own file. Do not include paginated URLs (?page=2) if you already use rel=prev/next or a view-all canonical. Do not list tag pages with zero organic traffic in the last 90 days; they eat crawl budget. If you have not measured that yet, 'Crawl budget: when to worry and how to measure it' has the BigQuery walkthrough.

Generate the sitemap from the database, not from a crawler. Crawl-based sitemaps are slow, inherit routing bugs, and produce wrong lastmod values. In PostgreSQL a materialized view on content.updated_at (not row.updated_at) solves it. Submit the sitemap through the Search Console API instead of the legacy ping endpoint, whose deprecation Google announced in June 2023. Confirm via log file analysis that Googlebot is actually fetching the sitemap daily. If it is not, you have a deeper crawl issue, described in 'Log file analysis: what Googlebot is actually doing'.
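As a rough illustration of the database-driven approach, the sketch below turns rows of `(loc, content_updated_at)` into a urlset document. It assumes the rows have already been read from the materialized view with whatever database driver you use; the function names are hypothetical, and treating naive timestamps as UTC is an assumption you should check against your own schema.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def format_lastmod(ts: datetime) -> str:
    """Format a timestamp as W3C Datetime in UTC."""
    if ts.tzinfo is None:
        # Assumption: naive timestamps from the database are stored in UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc).isoformat(timespec="seconds")


def build_urlset(rows):
    """Serialize (loc, content_updated_at) rows into sitemap XML.

    Only <loc> and <lastmod> are written; priority and changefreq
    are deliberately left out.
    """
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, content_updated_at in rows:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = format_lastmod(content_updated_at)
    return ET.tostring(root, encoding="unicode")
```

Because lastmod comes straight from the content timestamp, a deploy that touches no content leaves every value unchanged. ElementTree also escapes characters such as `&` in URLs, which hand-built string templates often miss.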

Practical takeaway: tonight, download your sitemap.xml, count URLs with status != 200 and URLs whose lastmod matches 100 other pages exactly. If either is above 5%, you are burning crawl budget. Trim everything that is not canonical 200 indexable, segment by type, and let lastmod tell the truth. Within four weeks you will see movement in 'Pages crawled per day' in GSC. It is boring, and it works.
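The two counts in the takeaway can be computed with a short script. This is a sketch under stated assumptions: the sitemap XML is already downloaded, and `status_by_url` is a hypothetical mapping of URL to HTTP status that you collected separately (for example from a crawl export). URLs missing from that mapping are counted as non-200.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return a list of (loc, lastmod) pairs from a urlset document."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        entries.append((loc, lastmod))
    return entries


def audit(entries, status_by_url, cluster_size=100):
    """Compute the share of non-200 URLs and of clustered lastmod values.

    A URL counts as clustered when at least `cluster_size` other URLs
    carry exactly the same lastmod string.
    """
    total = len(entries)
    if total == 0:
        return {"non_200_share": 0.0, "clustered_lastmod_share": 0.0}
    non_200 = sum(1 for loc, _ in entries if status_by_url.get(loc) != 200)
    counts = Counter(lastmod for _, lastmod in entries if lastmod)
    clustered = sum(
        1 for _, lastmod in entries
        if lastmod and counts[lastmod] - 1 >= cluster_size
    )
    return {
        "non_200_share": non_200 / total,
        "clustered_lastmod_share": clustered / total,
    }
```

If either share in the result exceeds 0.05, the sitemap fails the 5% check described above.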
