
Search engines don’t rank what they can’t crawl, render, or trust, and users won’t convert on pages that feel slow or broken. Over the last few years, the bar has moved: INP replaced FID as a Core Web Vital, JavaScript‑heavy front ends became the norm, and faceted URLs can spiral into infinite spaces if left unchecked. Add in regular platform changes and the occasional redesign or replatform, and it’s easy for technical issues to quietly cap your visibility.

This guide is a practical playbook for modern technical SEO. You’ll find checklists, decision trees, and copy‑paste snippets to help you diagnose indexing problems, tame rendering quirks, clean up canonicals and sitemaps, and ship measurable improvements to Core Web Vitals without guesswork. It’s written for in‑house marketers, SEOs, and developers who want a clear, prioritised plan.

If you’d prefer expert, hands‑on support, consider partnering with a technical SEO agency. And if a redesign, domain change, or platform switch is on the horizon, make sure your plan includes a proven website migration service so you avoid traffic dips and redirect headaches.

Below, we’ll show you how to protect crawl budget, spot rendering pitfalls before they hit production, keep your canonical signals consistent, and use real user data (RUM + CrUX) to prove impact. Let’s get into it.

1) Core Web Vitals 2025 (INP replaces FID)

What changed? In March 2024, INP (Interaction to Next Paint) replaced FID (First Input Delay) as a Core Web Vital. Target thresholds: INP ≤ 200 ms, LCP ≤ 2.5 s, CLS ≤ 0.1.

Fix fast:

  • Break long main‑thread tasks (>50 ms) via code‑splitting, requestIdleCallback, or setTimeout (see the sketch after this list).
  • Preload the LCP image and avoid lazy‑loading the LCP element.
  • Reserve space for images/ads; add font fallbacks to stabilise CLS.
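A minimal sketch of the first two fixes; the image path and processItem are hypothetical:

<!-- Preload the LCP hero image so the browser fetches it early -->
<link rel="preload" as="image" href="/img/hero.webp" fetchpriority="high">

<script>
  // Break one long task into chunks, yielding to the main thread between
  // items so input handlers can run promptly (this is what keeps INP low).
  async function processAll(items) {
    for (const item of items) {
      processItem(item); // hypothetical per-item work
      await new Promise(resolve => setTimeout(resolve, 0)); // yield
    }
  }
</script>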

Tools:

  • PageSpeed Insights
  • Lighthouse
  • CrUX (field data)

2) JavaScript rendering pitfalls (CSR/SSR/hydration)

Dynamic rendering is a short‑term workaround at best. Prefer SSR, static generation, or hybrid approaches.

Fix fast:

  • Render critical routes on the server; ship meaningful HTML in the first response.
  • Ensure links are real <a> elements with href, not JS click handlers (example below).
  • Avoid client‑only content gates that hide indexable text behind JS.
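For instance, the first link below is crawlable; the second may never be discovered (the route and router call are hypothetical):

<!-- Crawlable: a real anchor with an href Googlebot can follow -->
<a href="/category/shoes">Shoes</a>

<!-- Risky: no href, navigation happens only in JavaScript -->
<span onclick="router.push('/category/shoes')">Shoes</span>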

3) Crawl budget & Crawl Stats (at scale)

Large sites must protect crawl capacity. Use Search Console → Settings → Crawl stats to monitor host health, 5xx/429 spikes, and average response times.

Quick checklist:

  • Keep HTML response times under 500 ms on average.
  • Compress assets (Brotli), leverage caching, and reduce third‑party bloat.
  • Remove low‑value URLs from internal links (endless filters, internal search, duplicates).

4) Faceted navigation & infinite URL spaces

Filters (size, colour, sort, price) can explode URL counts and waste crawl budget.

Fix fast:

  • Choose one canonical version per category (e.g., default sort). Canonical all variants to it.
  • Block crawl of useless combinations in robots.txt if you don’t need them crawled (see the sketch after this list); otherwise, leave them crawlable but canonical to the primary.
  • Consider a “view‑all” page for small categories or maintain clean pagination.
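A robots.txt sketch for combinations you never want fetched (the parameter names are hypothetical). Remember that a robots‑blocked URL can’t pass canonical signals, so only block what you genuinely don’t need crawled:

User-agent: *
# Never crawl sort/price facet combinations
Disallow: /*?*sort=
Disallow: /*?*price=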

5) Mobile experience (and the retired Mobile‑Friendly Test)

The Mobile‑Friendly Test and the Mobile Usability report are retired, so use Lighthouse and PageSpeed Insights instead, plus real‑device testing.

Fix fast:

  • Use a responsive layout, tap targets ≥ 48 px, no intrusive interstitials, and full content parity with desktop.

6) Indexing diagnostics decoded (GSC states)

Understand these Page Indexing states and what to do:

  • Discovered – currently not indexed: Google hasn’t crawled yet (server limits/crawl budget). Improve internal linking, performance, and reduce duplication.
  • Crawled – currently not indexed: Content seen but not selected (quality/duplication issues). Strengthen uniqueness, intent match, and canonical signals.
  • Duplicate without user‑selected canonical: Conflicting signals. Align canonicals, sitemaps, and internal linking.

7) Robots.txt vs noindex (which directive & when)

  • robots.txt: blocks crawling, not necessarily indexing. A disallowed URL can still be indexed if it’s linked to, though without its content.
  • noindex (meta/X‑Robots‑Tag): removes from the index (requires the URL to be crawlable).

Rule of thumb:

  • To remove a URL from search results, use noindex and keep the URL crawlable.
  • To save crawl budget, disallow low‑value spaces in robots.txt.

Copy‑paste (HTTP header for PDFs and other non‑HTML files):

X-Robots-Tag: noindex, nofollow
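In nginx, for example, a sketch that applies the header to every PDF (adjust the match to your setup):

location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex, nofollow";
}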

8) Status codes, soft 404s & empty templates

Soft 404s occur when thin/empty pages return 200 with “not found” messaging. Return a 404/410 or add real content.

Use 301 for permanent redirects; 302/307 only for truly temporary moves.

9) Redirect hygiene (chains, loops, hops)

Every hop slows users and wastes crawl.

Fix fast:

  • Map legacy URLs to the final destination in one hop. Sweep post‑migration chains quarterly.

Nginx example (a single hop inside the relevant server block; the paths are placeholders):

location = /old-page { return 301 https://example.com/new-page; }

10) XML sitemaps that scale

  • Include only indexable pages that return 200 and that you want to rank.
  • Keep lastmod accurate; split into ≤50k URLs; gzip; submit a sitemap index.
  • Maintain image/video sitemaps where relevant; exclude parameters, soft‑404, and noindex URLs.
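A minimal sitemap index, following the sitemaps.org schema (filenames hypothetical):

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemaps/products-1.xml.gz</loc>
    <lastmod>2025-01-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemaps/categories.xml.gz</loc>
    <lastmod>2025-01-10</lastmod>
  </sitemap>
</sitemapindex>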

11) Canonicals that actually consolidate

  • Self‑canonical on canonical pages; variants point to the preferred.
  • Align canonicals, sitemaps, and internal links, since contradictory signals cause the duplicate indexing states covered earlier; the tag itself is shown below.
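On a variant page, the tag points at the preferred URL (URLs hypothetical):

<!-- In the <head> of /shoes?sort=price -->
<link rel="canonical" href="https://example.com/shoes">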

12) Duplicate & thin content patterns

  • Merge near‑duplicates (PDP variants, tag archives) or add unique value.
  • Enrich with specs, FAQs, UGC (moderated), comparisons, and unique media.
  • For multi‑location pages: unique NAP (name, address, phone), team, reviews, inventory, and local images.

13) Orphan pages & internal link architecture

  • Crawl + export from CMS to find orphans; compare against sitemaps.
  • Add contextual links from hubs; ensure key pages are linked in nav/footers.
  • Use related‑content modules to create crawl paths.

14) Broken links & crawl waste

  • Quarterly audits of internal/external links.
  • Replace or 301 to a relevant alternative (not always the homepage).

15) URL parameters in 2025 (post‑GSC deprecation)

The old URL Parameters tool is gone. Control facets with information architecture + canonicals + robots + UI, not Search Console.

Fix fast:

  • Ensure internal links point to clean URLs.
  • Canonical back from parameter variants to the primary page.
  • Disallow clearly useless combos to cut bloat.

16) International SEO & hreflang traps

  • Map language–region pairs correctly (en-gb, en-us).
  • Use bidirectional hreflang with self‑references; keep canonical and hreflang aligned.
  • Use sitemaps for large setups; consider an x-default for language selectors.
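For example, the same three lines in the <head> of both the UK and US pages keep the set bidirectional and self‑referencing (URLs hypothetical):

<link rel="alternate" hreflang="en-gb" href="https://example.com/uk/">
<link rel="alternate" hreflang="en-us" href="https://example.com/us/">
<link rel="alternate" hreflang="x-default" href="https://example.com/">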

17) Pagination, filters & variants (ecommerce)

  • Keep one canonical per category (default sort/filter).
  • rel="next"/"prev" is no longer a Google signal, but consistent pagination still aids UX and crawl.
  • Out‑of‑stock: keep indexable if it will return (with schema and alternatives). Permanently gone → 410 or canonical to the successor.

18) Media & assets SEO (LCP images, fonts, video)

  • Mark up products/articles/FAQ/video with appropriate schema; validate regularly.
  • Images: responsive srcset/sizes, preload hero/LCP image, always set width/height.
  • Fonts: preload + font-display: swap to avoid CLS.
  • Video: supply thumbnail, transcript, and VideoObject schema; ensure the video file isn’t blocked by robots.
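A sketch of the image and font fixes (file names hypothetical):

<!-- Responsive image with explicit dimensions to avoid CLS -->
<img src="/img/product-800.webp"
     srcset="/img/product-400.webp 400w, /img/product-800.webp 800w"
     sizes="(max-width: 600px) 400px, 800px"
     width="800" height="600" alt="Product name">

<!-- Preload the brand font; font-display: swap lives in the @font-face rule -->
<link rel="preload" as="font" type="font/woff2" href="/fonts/brand.woff2" crossorigin>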

19) Headers, caching & CDN behaviour

  • Use Cache-Control and ETag for static assets; keep HTML short‑cache with revalidation.
  • Enable Brotli; preconnect critical third‑parties.
  • Prefer CDN edge redirects and origin shields to cut round trips.
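A sketch in nginx (Brotli assumes the ngx_brotli module is installed; paths are hypothetical):

brotli on;
brotli_types text/css application/javascript application/json image/svg+xml;

# Fingerprinted static assets: safe to cache for a year
location /assets/ {
    add_header Cache-Control "public, max-age=31536000, immutable";
}

# HTML: always revalidate so fixes ship immediately
location / {
    add_header Cache-Control "no-cache";
}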

20) Security, HTTPS, mixed content & HSTS

  • Site‑wide HTTPS with a 301 from HTTP and zero mixed content.
  • Consider HSTS (with preload only when fully ready).
  • Keep dependency inventory lean; update libraries promptly.
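A sketch in nginx; only add preload once every subdomain is served over HTTPS:

# One-hop redirect from HTTP to HTTPS
server {
    listen 80;
    server_name example.com www.example.com;
    return 301 https://example.com$request_uri;
}

# HSTS, set inside the HTTPS server block
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;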

21) Manual actions & hacked‑site recovery

If traffic drops or GSC flags issues:

1) Take compromised content offline and clean CMS/plugins.
2) Fix root cause; patch vulnerabilities.
3) Use Removals for spammy URLs if needed.
4) Submit a reconsideration request with clear evidence.

22) Server & protocol optimisations (HTTP/2)

Googlebot supports HTTP/2 (it can improve crawl efficiency). Keep HTTP/1.1 fallback. HTTP/3 benefits users primarily.

23) Migrations & go‑live safeguards

  • Password‑protect or noindex staging (don’t rely on robots alone).
  • Launch with mapped 301s (single hop), updated canonicals, sitemaps, and hreflang.
  • Verify analytics/ads tags, conversions, and monitoring post‑launch.

24) Log file analysis

Logs show what Googlebot actually fetches. Use them to:

  • Confirm crawl of priority templates.
  • Spot spoofed bots and crawl waste (parameters, 404s, 5xx).
  • Quantify impact after fixes (e.g., fewer 5xx, more HTML hits on key categories).
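For a first pass, a one‑liner like this surfaces the paths Googlebot requests most (assumes a combined‑format access.log; verify hits against Google’s published IP ranges, since user agents are easily spoofed):

grep 'Googlebot' access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -20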

25) Prioritisation matrix + decision trees

Impact × Effort matrix (quarterly planning)

  • Quick Wins (High impact / Low effort): Fix redirect chains, preload LCP, remove soft‑404 templates, resolve 5xx spikes.
  • Big Bets (High / High): Rendering strategy (SSR/static), IA & internal link refresh, pagination/facets overhaul.
  • Tidy‑ups (Low / Low): Metadata duplicates, stray noindex, sitemap hygiene.
  • Time sinks (Low / High): Micro‑optimisations on non‑commercial pages.

Decision tree: “Why isn’t this URL indexing?”

Step 1. Is it in the XML sitemap? If not, add it.

Step 2. GSC status:

  • Discovered – not indexed → Improve internal links/performance; reduce crawl waste.
  • Crawled – not indexed → Strengthen uniqueness; fix duplication/canonical conflicts.

Step 3. Blocked? Robots/canonical/noindex contradicting? Align signals.

Step 4. Quality: Add substance (FAQs, specs, unique images/UX), match intent.

Tooling stack (what to actually use)

  • Diagnostics: Google Search Console (Page Indexing, Crawl Stats), Lighthouse, PageSpeed Insights, CrUX field data.
  • Crawling: Screaming Frog or Sitebulb for inventories and audits.
  • Monitoring: Server log analysis, uptime/error alerts, synthetic monitoring.
  • Validation: Rich Results Test, schema linters.

Make technical SEO a habit, not a one-off

If there’s a single takeaway, it’s this: technical SEO isn’t a project you “finish”; it’s an operating system for your site. Prioritise the big wins (INP/LCP, canonical and sitemap hygiene, removing crawl waste), ship changes safely, and measure impact with real user data and crawl stats. Keep a simple release checklist, run quarterly crawl comparisons, and watch your logs; small technical leaks can add up fast.

If you’d like expert support, Circulate, a Manchester SEO agency, can help with technical SEO audits, Core Web Vitals tuning, JS rendering/SSR strategies, faceted navigation clean‑ups, and migrations.

Ready to level up your site’s technical foundations? Get in touch & book a discovery call to see how we can help.