Indexed pages have fallen from 63 to 27 over the past three months. Here's the trend, the underlying reasons, the most likely cause, and what's already been fixed.
A steady decline rather than a catastrophic cliff — which already tells us a lot about the cause.
The decline is steady rather than catastrophic, with the steepest drop in late February and again in late April. There is no single event-shaped cliff in the data, which rules out a one-off technical regression — including the on-page SEO work delivered recently. Those changes touched the homepage template and a handful of schema snippets; they could not have removed 36 pages from the index.
Eight active reasons. The three flagged critical/high at the top together account for ~90% of the volume — that's where the work needs to happen.
| Reason | Severity | Pages |
|---|---|---|
| Crawled - currently not indexed: Google crawled the page but chose not to index it. Quality signal. | Critical | 399 |
| Alternative page with proper canonical tag: Page canonical points elsewhere, and Google is respecting it. | High | 74 |
| Page with redirect: Old URLs that 301 to new ones. Should drop out naturally. | Medium | 41 |
| Not found (404): Dead pages Google still knows about. | High | 30 |
| Discovered - currently not indexed: Google knows the URL exists but hasn't crawled it yet. Crawl budget signal. | Medium | 20 |
| Blocked by robots.txt: Explicitly disallowed. Needs review to confirm intent. | Medium | 3 |
| Duplicate, Google chose different canonical than user: Page declared one canonical, Google preferred another. | High | 3 |
| Excluded by 'noindex' tag: Tagged noindex in the template or meta. Confirm intent. | Medium | 2 |
| Blocked due to other 4xx issue: No pages affected. | None | 0 |
| Soft 404: No pages affected. | None | 0 |
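As a quick check on where the volume sits, the shares can be recomputed from the table. This is a minimal sketch using the counts reported above; the variable names are illustrative only.

```python
# Page counts per exclusion reason, copied from the table above.
counts = {
    "Crawled - currently not indexed": 399,
    "Alternative page with proper canonical tag": 74,
    "Page with redirect": 41,
    "Not found (404)": 30,
    "Discovered - currently not indexed": 20,
    "Blocked by robots.txt": 3,
    "Duplicate, Google chose different canonical than user": 3,
    "Excluded by 'noindex' tag": 2,
}

total = sum(counts.values())

# Share of each reason, largest first.
shares = {
    reason: pages / total
    for reason, pages in sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
}

# Combined share of the three largest rows.
top_three_share = sum(list(shares.values())[:3])

for reason, share in shares.items():
    print(f"{share:6.1%}  {reason}")
print(f"Top three rows combined: {top_three_share:.1%} of {total} pages")
```

On these numbers the three largest rows come to just under 90% of the 572 affected pages.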
The single largest entry in the report — and the most informative one.
Google found these URLs, fetched them, and then decided they didn't deserve a slot in the index. For a fine jewellery storefront on Shopify, three patterns reliably produce this signal:
Shopify generates a separate URL for every collection-filter combination — ?filter.v.option.metal=18k-gold, ?filter.p.tag=infinity, and so on. These pages display a thin slice of the parent collection's content, with no unique copy of their own. Google crawls them, finds them near-duplicate, and drops them. The same applies to in-collection product URLs (/collections/earrings/products/x vs the canonical /products/x).
Pages 2, 3, and 4 of a collection contain a different set of product thumbnails but share the same intro copy, headings, and metadata as page 1. Without strong differentiation, Google treats them as low-value and skips indexing.
Any product page whose description is short, template-driven, or near-identical to other PDPs (especially across colour or material variants sold as separate products) can land here.
The headline finding: this is a content and architecture issue, not a technical regression. Recent SEO work on schema and headings did not cause it. The fix lives in canonical hygiene, collection differentiation, and addressing the Shopify URL sprawl directly.
The combination of 399 "crawled — not indexed" with 74 "alternative page with proper canonical" points squarely at Shopify URL sprawl: filter parameters, in-collection product URLs, and thin variant pages. The drop from 63 to 27 reflects Google quietly tightening what it considers worth keeping. Fixing this means addressing canonical hygiene, collection content depth, and the noindex/robots configuration — in that order.
In rough priority order. Each is an investigation, not a code change yet — the goal is to confirm cause before touching the live theme.
robots.txt.liquid: 3 pages are currently blocked by robots.txt. Review the file to confirm what is being disallowed and whether the intent matches reality. Shopify's default is well-behaved, but if the theme overrides it, anything could be in there. Quick win if a useful section has been accidentally disallowed.
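One way to run this review offline is to paste the rendered robots.txt into a script and test the affected URLs against it. The sketch below uses Python's standard-library parser; the rules, the `example.com` URLs, and the `blocked_urls` helper are placeholders, not the store's real configuration. The standard-library parser does simple prefix matching, so rules that rely on wildcards may be evaluated differently from how Googlebot reads them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; replace with the rendered output of the
# live robots.txt.liquid template.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /collections/sale
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Placeholder URLs; in practice, use the 3 URLs GSC reports as blocked.
urls_to_check = [
    "https://example.com/products/gold-ring",
    "https://example.com/collections/sale",
    "https://example.com/cart",
]
print(blocked_urls(ROBOTS_TXT, urls_to_check))
```

Any URL in the output that should be earning traffic is a candidate for the quick win described above.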
74 pages flagged "alternative with proper canonical" and 3 "Google chose different canonical" suggest the canonical logic is doing something Google isn't happy with. Confirm every collection and PDP self-references its canonical URL, and that /collections/X/products/Y canonicalises to the bare /products/Y.
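The canonical check can be expressed as a rule and run against a crawl export. The sketch below assumes two rules that should be confirmed before use: the expected canonical drops the query string and fragment, and `/collections/X/products/Y` maps to the bare `/products/Y`. Dropping the query string also folds paginated URLs into page 1, which may or may not be the intended policy. The function names and `example.com` URLs are hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches in-collection product paths: /collections/<handle>/products/<handle>
IN_COLLECTION_PDP = re.compile(r"^/collections/[^/]+(/products/[^/]+)/?$")


def expected_canonical(url):
    """Return the canonical URL this audit expects a page to declare.

    Assumed rules: drop the query string and fragment, and map
    /collections/X/products/Y to the bare /products/Y.
    """
    parts = urlsplit(url)
    path = parts.path
    match = IN_COLLECTION_PDP.match(path)
    if match:
        path = match.group(1)
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_mismatches(pages):
    """pages maps crawled URL -> declared canonical; return the mismatches."""
    return {
        url: declared
        for url, declared in pages.items()
        if declared != expected_canonical(url)
    }


# Placeholder crawl data: the second page wrongly self-references.
crawl = {
    "https://example.com/products/gold-ring":
        "https://example.com/products/gold-ring",
    "https://example.com/collections/rings/products/gold-ring":
        "https://example.com/collections/rings/products/gold-ring",
}
print(canonical_mismatches(crawl))
```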
noindex directive: Only 2 pages are currently noindexed, but it's worth knowing where the directive lives in the theme and what rules trigger it. Useful as a sanity check, and we'll want this when we set up index-coverage monitoring.
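For the monitoring side, a small check on rendered HTML is enough to flag pages carrying the directive. This is a sketch using only the standard library; the class and function names are made up for illustration. It only inspects `<meta name="robots">` tags, so a noindex sent via the `X-Robots-Tag` HTTP header would need a separate check.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_noindex(html):
    """Return True if the page's robots meta tag includes noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


print(is_noindex('<head><meta name="robots" content="noindex, nofollow"></head>'))
print(is_noindex('<head><meta name="description" content="Gold rings"></head>'))
```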
Pull the 399 affected URLs from GSC and group by pattern: filter URLs (?filter.), paginated (?page=), in-collection PDPs (/collections/X/products/), or genuine standalone product pages. Each needs a different fix, and the proportions show where to put effort.
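The grouping can be done in a few lines once the URLs are exported from GSC. The sketch below follows the four patterns named above; the bucket labels, the `classify` helper, and the sample URLs are illustrative, and the real export would replace the `sample` list.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def classify(url):
    """Assign a URL to one of the audit buckets based on its shape."""
    parts = urlsplit(url)
    query_keys = [key for key, _ in parse_qsl(parts.query, keep_blank_values=True)]
    if any(key.startswith("filter.") for key in query_keys):
        return "filter"
    if "page" in query_keys:
        return "paginated"
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) >= 4 and segments[0] == "collections" and segments[2] == "products":
        return "in-collection PDP"
    if len(segments) == 2 and segments[0] == "products":
        return "standalone PDP"
    return "other"


# Placeholder URLs standing in for the GSC export.
sample = [
    "https://example.com/collections/rings?filter.v.option.metal=18k-gold",
    "https://example.com/collections/rings?page=2",
    "https://example.com/collections/earrings/products/x",
    "https://example.com/products/x",
]
print(Counter(classify(url) for url in sample))
```

A filtered URL that is also paginated lands in the filter bucket here, on the assumption that the filter parameter is the thing to fix first.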
For the 404s: any with historical backlinks or organic traffic should be redirected to the closest live equivalent. For the redirects: check what still internally links to the old slugs — update those links to point directly to the destination.
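For the internal-link cleanup, a redirect map plus a list of crawled links is enough to produce a worklist. The sketch below assumes the redirects are available as a simple old-path to new-path mapping and the links as (source page, href) pairs; the paths shown are invented examples.

```python
def final_destination(path, redirects):
    """Follow a redirect map until a path with no further redirect is reached."""
    seen = set()
    while path in redirects:
        if path in seen:
            raise ValueError(f"redirect loop at {path}")
        seen.add(path)
        path = redirects[path]
    return path


def stale_links(internal_links, redirects):
    """Return (source page, old href, new href) for links that hit a redirect."""
    return [
        (source, href, final_destination(href, redirects))
        for source, href in internal_links
        if href in redirects
    ]


# Placeholder data: a two-step redirect chain and two internal links.
redirects = {
    "/products/old-ring": "/products/ring",
    "/products/ring": "/products/gold-ring",
}
links = [
    ("/collections/rings", "/products/old-ring"),
    ("/", "/products/gold-ring"),
]
print(stale_links(links, redirects))
```

Resolving chains to their final destination means each link only needs to be updated once.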
"Discovered — currently not indexed" (20 pages) is a crawl-priority signal. Strong internal links from indexed, well-trafficked pages will move them up the queue. Crawl with Screaming Frog, isolate orphan pages, and add links from relevant collections.
Not the recent on-page SEO work. The heading-structure and schema changes deployed recently only modified the homepage template and a handful of JSON-LD snippets. Those changes cannot mechanically deindex 36 pages — and the gradient of the drop (steady decline across three months, not a cliff after a deploy) confirms it.
Not a manual action. No GSC message in the manual actions or security issues reports. If there had been, the drop would be much sharper and more uniform.
Not a robots.txt mass block. Only 3 pages are blocked by robots.txt. A bad Disallow: directive would show hundreds.
Not a hreflang or international SEO issue in isolation. Those would show up under "Alternative page with proper canonical" at higher volumes and with clear cross-market patterns. Worth ruling in or out during the canonical audit, but unlikely to be the primary cause.
What remains, and what the evidence points to, is Shopify URL sprawl combined with Google tightening its quality threshold on thin pages. Fixable, but the fix is structural rather than cosmetic.
A direct crawl of the storefront surfaced five recurring problems. The first two have been fixed in the theme; the rest need action in Shopify admin or Search Console.
| Issue | Count | Impact |
|---|---|---|
| Duplicate H1 tags: Multiple H1s per page. | 68 pages | Google sees a confused heading hierarchy |
| 404 broken collection URLs: Material links pointing at URLs that don't resolve. | 8 pages | Wasted crawl budget, poor quality signals |
| 429 rate-limited: Shopify throttling crawler requests. | 41 pages | Google can't reliably crawl your products |
| Missing meta descriptions: No description set, so Google writes its own. | 52 pages | Lower click-through in search results |
| Duplicate meta descriptions: Same description reused across products. | 16 pages | Pages competing against each other |
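The two meta description rows can be re-derived from any crawl export, which is useful for confirming the counts after content changes. This is a minimal sketch that assumes the export is a mapping of URL to description text; the helper name and sample data are hypothetical.

```python
from collections import defaultdict


def audit_descriptions(pages):
    """pages maps URL -> meta description (None or "" when absent).

    Returns (missing, duplicates): a list of URLs with no description, and
    a dict of normalised description text -> URLs sharing it.
    """
    missing = []
    by_text = defaultdict(list)
    for url, description in pages.items():
        text = (description or "").strip()
        if not text:
            missing.append(url)
        else:
            # Normalise case and whitespace so trivial variations still match.
            by_text[" ".join(text.lower().split())].append(url)
    duplicates = {text: urls for text, urls in by_text.items() if len(urls) > 1}
    return missing, duplicates


# Placeholder crawl data.
pages = {
    "/products/a": "Handmade 18k gold ring.",
    "/products/b": "handmade 18k gold  ring.",
    "/products/c": "",
    "/products/d": "Silver hoop earrings.",
}
missing, duplicates = audit_descriptions(pages)
print(missing)
print(duplicates)
```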
Four changes landed in the theme as a result of the crawl.
"Alternative looks" (62 pages), "What Our Clients Say" (4), and "Alternative styles" (2) were all rendering as <h1> via heading_size: "h1" in the product templates. Changed to h2 across all five product template files. Only the actual product title remains an H1.
product-card.liquid was using the raw material metafield (e.g. "18k yellow gold") as the href, producing URLs like /collections/18k%20yellow%20gold. Now uses routes.collections_url with the handleize filter → /collections/18k-yellow-gold.
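To illustrate the difference, the sketch below contrasts a percent-encoded raw value with a handleized one. The `handleize` function is a rough Python approximation of the Liquid filter for plain ASCII input, not Shopify's implementation; the live theme uses the Liquid filter itself.

```python
import re
from urllib.parse import quote


def handleize(text):
    """Approximate Shopify's handleize filter: lowercase, hyphenate, trim."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


material = "18k yellow gold"

# Before: the raw metafield value ends up percent-encoded in the URL.
print("/collections/" + quote(material))      # /collections/18k%20yellow%20gold

# After: the handleized value matches the shape of a collection handle.
print("/collections/" + handleize(material))  # /collections/18k-yellow-gold
```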
Added <meta name="robots" content="noindex, nofollow"> to login, register, cart, search, and account pages. These were consuming crawl budget without contributing to rankings.
Added a full JSON-LD @graph with JewelryStore, WebPage, BreadcrumbList, ItemList, and OfferCatalog. Also fixed the existing Organization schema to filter blank social links, which had been producing invalid JSON.
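The blank-link fix follows a simple pattern: drop empty values before serialising, and let a JSON encoder handle the commas and quoting. The sketch below shows that idea in Python rather than Liquid; the `organization_schema` helper, the `example.com` URLs, and the sample links are placeholders, not the theme's actual code or the store's real profiles.

```python
import json


def organization_schema(name, url, social_links):
    """Build an Organization JSON-LD string, omitting blank social links."""
    same_as = [link.strip() for link in social_links if link and link.strip()]
    schema = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
    }
    # Only emit sameAs when at least one non-blank link remains.
    if same_as:
        schema["sameAs"] = same_as
    return json.dumps(schema, indent=2)


# Placeholder settings: two social fields were left blank in the theme editor.
output = organization_schema(
    "Figlio",
    "https://example.com",
    ["https://instagram.com/example", "", None, "  "],
)
print(output)
```

Building the object first and serialising it in one step avoids the trailing commas and empty strings that hand-assembled JSON tends to produce.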
Three issues can't be fixed in theme code — they need attention in Shopify admin or Search Console.
The handleize fix will generate /collections/18k-yellow-gold and similar, but those collections need to actually exist in Shopify. If they don't, create them or remove the material link entirely.
How the theme-level fixes should play out as Google recrawls, and the one blocker that still needs content work.
What happens: Google recrawls the homepage and picks up the new heading structure, schema markup, and noindex tags on utility pages. The structured data (JewelryStore, ItemList, OfferCatalog) should start appearing in Search Console's rich results report.
Expect: No ranking changes yet. You may see a small uptick in indexed pages as Google stops spending crawl budget on cart / login / search pages and reallocates it to actual product pages.
What happens: The duplicate H1 fixes propagate across all 68 product pages as Google recrawls them. The broken 404 collection links stop leaking link equity. Google reassesses your site's heading hierarchy and content quality signals.
Expect:
What happens: The expanded About section entity copy, improved heading hierarchy, and structured data compound. Google builds a stronger entity understanding of "Figlio" as a jewellery brand.
Expect:
This is your biggest blocker and we haven't solved it. These fixes improve quality signals, but if Google has decided most of your pages are thin or duplicative, heading fixes alone won't recover them. The likely culprits:
The theme-level fixes create the right foundation, but content uniqueness across those 399 unindexed pages is what will actually move the needle.