I get a lot of emails from people who have difficulty getting their site crawled and indexed by Google. A typical one reads:
"Google is not crawling my XYZ website. I don't know why. Can you help me please? My URL is somedomain.com"
If you post such a question on a forum, as Go4Expert.com user IndianSword did in the thread "Google indexing and crawling issue?", people tend to suggest getting more incoming links for the new website. I suggested the same, but after those general answers he still had problems, so I needed to see whether something was missing on his website. Digging deeper into his website's HTML, I found Meta tag issues caused by misconfigured settings in WordPress.
Now let's look at the possible reasons why your website is not in Google's index or why Google is not crawling your website.
1. Blocked by robots.txt
We should always block certain areas of a website, like the admin area or test areas, but doing it with the right syntax is a must. Done wrongly, it can cause indexing problems in Google. Check whether your robots.txt file contains any rule that blocks Googlebot from crawling your main pages. If you know the syntax you can check this yourself; if you are not sure what the correct syntax should be, add your website to Google Webmaster Tools and see whether any URLs you want indexed are blocked by your robots.txt file.
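If you would rather check this yourself, here is a minimal sketch using Python's standard `urllib.robotparser` module. The sample robots.txt, the `example.com` URLs and the `blocked_urls` helper are all made up for illustration, and Python's parser may not match Googlebot's handling of wildcard patterns exactly, so treat the result as a first check rather than the final word.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs from `urls` that this robots.txt disallows for `user_agent`."""
    parser = RobotFileParser()
    # Parse the rules from a string instead of fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt: the second rule is meant to hide /test/ only,
# but rules are prefix matches, so it also catches /testimonials/.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /test
"""

# Hypothetical pages that should be in Google's index.
WANTED = [
    "https://example.com/",
    "https://example.com/testimonials/",
]

print(blocked_urls(SAMPLE_ROBOTS_TXT, WANTED))
```

In this sample the testimonials page comes back as blocked, because `Disallow: /test` matches every path that starts with `/test`. Writing the rule as `Disallow: /test/` would avoid that.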
2. Meta tags
Many webmasters have a noindex Meta tag on their pages by mistake, like:
- <meta name="robots" content="noindex, nofollow">
- <meta name="robots" content="noindex">
If you have any such HTML on pages you want in Google's index, get rid of that Meta tag.
Apart from noindex and nofollow, I have also seen the revisit-after Meta tag.
- <meta name="revisit-after" content="1 day">
I haven't seen any evidence that major search engines support this tag. Even if a search engine bot did honor it, <meta name="revisit-after" content="1 day"> would not be an instruction to return every day; it would rather be an instruction to stay away from your site if it hasn't been 1 day since the last visit.
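To spot such Meta tags without reading the page source by hand, something like the following sketch could help. It uses only Python's standard `html.parser`; the `RobotsMetaFinder` class and the `blocking_directives` helper are names I made up for this example, and it only looks at Meta tags named `robots` or `googlebot`, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives found in robots / googlebot Meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def blocking_directives(html):
    """Return the sorted directives that keep a page out of the index or stop link following."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return sorted(finder.directives & {"noindex", "nofollow", "none"})


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(blocking_directives(page))  # prints ['nofollow', 'noindex']
```

An empty list means the page's HTML carries none of these directives. Feed it the source of any page you expect to see in Google's index.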
3. Site is new
New websites are crawled less often by Google than more established websites, so if your website is new, chances are high that Google will not be indexing it very often. Instead of worrying about its place in Google's index, focus on the core of your business: if it's a forum or blog, getting more content, or if it's an e-commerce site, making more sales. Eventually Google will start crawling your site once you have the right mix of content and incoming links. If you are addicted to checking your site stats or Google's index, read Getting out of Recession for Blogger.
4. Less update frequency
If your website's pages do not update often, there is no reason for Googlebot to visit your site and update its cache. So when you do finally update the content, chances are that Googlebot will not pick up the new version quickly.
The same applies if you have a static HTML homepage or website: there is no need for Googlebot to come to your site very often, so when you make design or text changes, your cached copy will be slow to update. To avoid such issues you can add a blog or forum to your static site, which gets updated more frequently. The blog could be as simple as posts about what you or your company are working on.
5. Low incoming links
When your site is new you are bound to have few or no incoming links, so instead of checking your site in Google's index, focus on getting some good quality links to it. See how to Build Links to Website. You should never be satisfied with the number of links your website has; always think about getting a few more.
The possible reasons why Google may not crawl or index your website are endless, but I have tried to list a few that I have come across over time.
Still have issues? Post them in the comments and I will be more than happy to take a look.