I recently discovered that one of my articles was labeled “excluded” when I used the Google Search Console (GSC) tool to see if it was indexed.
I checked in the first place because the article was 5 months old and I couldn’t find it on the first 5 pages of results for a closely related term (same meaning, different wording).
Based on my blog’s track record and niche, this is unusual.
I then decided to check all 70 articles published on my niche blog and found two more that were crawled but not indexed and marked as excluded.
Google says this about “excluded” pages:
“These pages are typically not indexed, and we think that is appropriate. These pages are either duplicates of indexed pages, or blocked from indexing by some mechanism on your site, or otherwise not indexed for a reason that we think is not an error.”
Of course, I was in a real hurry to get these blog posts indexed by Google so that they could start bringing me traffic and earning me money.
Troubleshooting Why My Blog Posts Were Excluded By Google
I began to go through a list of reasons why Google may have chosen to exclude my articles from indexing.
This list was created from experience, research, and suggestions from other bloggers.
Suspected reasons why blog posts might be excluded by Google include:
- The blog post was set to noindex in the WP editor by accident
- A www vs non-www URL issue (this could happen if you recently changed servers)
- Someone plagiarized or scraped your article, so a duplicate version exists somewhere on the internet, and Google thinks yours is the copy
- You’ve written a very similar article on your blog before, so Google thinks the new one doesn’t add any new information and isn’t worth indexing too
- The articles were crawled but it’s too early for Google to decide whether they contain good information that should be included in search results (i.e. a newly published article)
- Other mystery reason: Google says articles may be excluded “for a reason that we think is not an error”.
Narrowing Down the Culprit
I started going through the list above to see if one of these culprits was the reason my blog posts were excluded from Google results.
- I verified that none of the articles were set to noindex in the WordPress editor or theme.
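If you would rather confirm the noindex check outside the WordPress editor, here is a minimal Python sketch. It assumes the two usual places a noindex directive lives: a robots (or googlebot) meta tag in the page HTML, and an X-Robots-Tag response header. The names `RobotsMetaParser` and `has_noindex` are made up for this example, and this is not how I checked my own posts.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content of <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attr_map.get("content", "").lower())


def has_noindex(html, x_robots_tag=""):
    """Return True if the page HTML or the X-Robots-Tag header value asks for noindex.

    Hypothetical helper: `html` is the page source as a string and
    `x_robots_tag` is the raw header value, if the server sends one.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    sources = parser.directives + [x_robots_tag.lower()]
    return any("noindex" in directive for directive in sources)
```

To try it, paste the source of a post (for example from your browser’s “View Source”) into a string and pass it as `html`. A result of `True` would mean something on the page is asking search engines not to index it.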
- It wasn’t a www URL issue
I had not changed servers recently, and this was not a www vs non-www URL issue (which, as I understand it, could create a duplicate page in Google’s eyes if both versions exist).
The URL for my blog does not include www.
It used to. I believe this change happened when I switched servers and/or installed a security certificate on my site (I have an assistant who handles tech issues like that, so I’m not completely clear on how or when it happened).
I inspected the “excluded” URLs in Google Search Console and the “user-declared canonical” URL was the same as the one I had asked Google to inspect.
This indicates to me that the actual URL and the URL Google has assigned to my blog post match (i.e. both versions are non-www).
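The comparison I did by eye in GSC can be sketched in a few lines of Python. This is only an illustration: `compare_urls` and `strip_www` are names I made up, the example.com URLs are placeholders, and the sketch ignores http vs https differences entirely.

```python
from urllib.parse import urlsplit


def strip_www(host):
    """Lowercase a hostname and drop a leading 'www.' if present."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def compare_urls(inspected, canonical):
    """Classify how two URLs relate: 'match', 'www-mismatch', or 'different'.

    Only the hostname and path are compared; the scheme, query string,
    and fragment are ignored in this simplified sketch.
    """
    a, b = urlsplit(inspected), urlsplit(canonical)
    host_a, host_b = (a.hostname or ""), (b.hostname or "")
    if host_a == host_b and a.path == b.path:
        return "match"
    if strip_www(host_a) == strip_www(host_b) and a.path == b.path:
        return "www-mismatch"
    return "different"


# Placeholder URLs: the inspected URL vs the user-declared canonical from GSC.
print(compare_urls("https://www.example.com/my-post/", "https://example.com/my-post/"))
```

A “www-mismatch” result is the situation I was trying to rule out: the same post reachable under both the www and non-www versions of the address.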
- Plagiarism and duplicate content off your blog
I did not run my articles through a plagiarism checker, but I searched Google for my exact article title and primary keyword to see if a scraped version of my article existed and might be competing with mine.
A duplicate of my article did not show up, so I concluded there was no scraping/duplicate content issue.
Plus, I’ve had plenty of articles scraped over the years and this “excluded” thing has never happened to me before.
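If a search does turn up a suspicious page, one rough way to judge how much of your text it copied is to compare overlapping word sequences. The Python sketch below is my own illustration using 5-word “shingles” and Jaccard similarity. It is not how Google detects duplicates, and the function names and the shingle size are assumptions chosen for the example.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ('shingles') of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_score(text_a, text_b, size=5):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

You would paste your article text and the suspect page’s text in as the two strings. A score near 1.0 suggests a near copy, while a score near 0.0 suggests the pages only share a topic, not wording.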
- Duplicate content on your own blog
The niche topics of these articles were unique on my blog.
No other articles, or even parts of other articles, addressed this topic, so it wasn’t an issue of keyword cannibalization or Google thinking another one of my articles covered the topic better.
For the record, I’ve previously written two different articles on basically the same topic that targeted the same keyword and shared some duplicate information, and both were indexed by Google (however, one ranked high in search results and the other barely received any traffic).
Also, Google has said they don’t penalize you for duplicate content.
So the excluded articles were not thin content with little value.
- The articles were not too new to be properly indexed
The articles were 3.5-6 months old and articles just prior and after were indexed.
That means that it’s unlikely that these articles weren’t indexed because I didn’t wait long enough for Google to do it.
- The mystery reason was the only reason left
This left me with only a few options to try.
The only explanation as to why this happened from the quote in the intro is that for some reason Google thought excluding my articles was “appropriate” but I didn’t know why.
How I Got My Excluded Articles Indexed
This is what I did.
I’m not saying this is a solution for everyone, but I want to share my experience and the potential solutions in case anyone else discovers their articles have been excluded from indexing by Google and wants to try to correct it.
I requested reindexing of these articles and checked them again in GSC a couple hours later. It showed that they were crawled again but still deemed “excluded”.
Based on what I read, and what my SEO friends said, I determined that the “mystery reasons” could include:
- Too many spelling and grammar mistakes
- Thin content or information that was already covered well by several articles on the internet (i.e. almost the same information)
- No internal links
So I started going through this list to “correct” these possible issues for one of the articles.
- I proofread the article and improved the wording where I could
- I added some more information to the article to add value and help it stand out
- Added a few internal links (this particular article didn’t have any).
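For anyone who wants to check the internal-link count without reading through a post by hand, here is a small Python sketch. It assumes you have the article’s HTML as a string and that “internal” means a link to another page on the same hostname. `LinkCollector` and `internal_links` are hypothetical names for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def internal_links(html, page_url):
    """Return links in the HTML that point to other pages on page_url's host."""
    site_host = urlsplit(page_url).hostname
    this_page = urldefrag(page_url).url
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        # Resolve relative links and drop #fragments before comparing.
        absolute = urldefrag(urljoin(page_url, href)).url
        parts = urlsplit(absolute)
        is_web_link = parts.scheme in ("http", "https")
        if is_web_link and parts.hostname == site_host and absolute != this_page:
            found.append(absolute)
    return found
```

Pass in only the article body rather than the full page source, otherwise menu, sidebar, and footer links will be counted too. An empty list would match the situation I describe above, where the article had no internal links at all.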
Then I resubmitted the URL for indexing in Google Search Console. Almost immediately, it showed as indexed.
So it appears that what I did to improve the article made the difference. Or did it?
For the second article, I just proofread it, fixed any spelling mistakes (I think there was one), and added a couple more internal links.
I didn’t touch the third article.
When I rechecked these other two articles, they also showed as indexed.
The second time I rechecked these articles to see if they were no longer excluded, it was approximately 3 hours after my initial attempt at reindexing.
Given that one article was not improved, I can’t say for sure that improving the articles made the difference. But it possibly did.
The third article already had a lot of internal links, no spelling or grammar mistakes, and contained a lot of unique information.
So the articles being “excluded” may have been a mistake and it took a while after being reindexed for the status to reflect that they were no longer excluded.
Or perhaps it was just a mistake for this one article, and the other two were now indexed because they were improved versions that Google now thought had value.
Final Thoughts on Articles Excluded In Google Search Console
So my takeaway is this: if you have an excluded article and you think it’s in error, resubmit it for indexing first.
Even if GSC shows it was recrawled but is still “excluded”, wait longer.
Going forward, I will wait a day to see if it becomes indexed before spending the time to edit it if I don’t think it needs any improvements.
If it still shows as excluded, I would go through the list of potential reasons I listed in the first section.
If none of those appeared to be the issue, I would review/improve the article as described above (spelling, grammar, value, internal links), and resubmit it again. And wait a day again.
Improving an article never hurts the user experience, and since the blog post wasn’t ranking in search results to begin with, significantly changing it won’t hurt its position.
If it still showed excluded, I am not sure what I would do. At this point, the only reasons an article may be excluded that I’m aware of are what I listed in this article.
It’s my understanding this “excluded” status is becoming more common as more and more articles on the same topic flood the internet.
Google has come out and said they don’t index all articles published on the internet and they do this on purpose.
Therefore, I expect more information about how to fix the “excluded” status of a blog post may become available. I will update this post if I hear about any other potential causes.
Also, as more and more articles are published on the internet, and Google’s aim is to serve the best answer to people, it may become more common for articles to be crawled but “excluded”.
In other words, this may become a commonplace thing that bloggers just have to accept.
Have you had an article go from excluded in Google Search Console to indexed? If so, what worked for you?