7.5 Search engine submission
As we have seen in the "SEO tweaking" subsection, we have already made our site search engine friendly, but this is not enough. Search engines have automatic crawlers, but they are relatively slow and they do not find every page. In November 2006 the major search engines finally agreed on a common Sitemaps protocol.
Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling.
In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
For more information take a look at www.sitemaps.org, and make sure that you read the Sitemaps FAQ too. It is suggested to use sitemap.xml with Google and urllist.txt with Yahoo.
Examples of both files are shown below.
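The metadata fields described above have a few constraints in the Sitemaps protocol: changefreq takes one of seven fixed values, priority lies between 0.0 and 1.0, and loc must be an absolute URL shorter than 2,048 characters. The following is a minimal Python sketch of those rules. The helper name check_entry is hypothetical, and it only checks the plain YYYY-MM-DD form of lastmod, not the full W3C date and time format.

```python
from datetime import date

# Values the Sitemaps protocol allows for <changefreq>.
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Return a list of problems found in one sitemap entry (empty if none).

    check_entry is a hypothetical helper written for this example.
    """
    problems = []
    if not loc.startswith(("http://", "https://")):
        problems.append("loc must be an absolute URL")
    if len(loc) >= 2048:
        problems.append("loc must be shorter than 2048 characters")
    if lastmod is not None:
        try:
            # Only the date-only form (e.g. 2008-10-04) is handled here.
            date.fromisoformat(lastmod)
        except ValueError:
            problems.append("lastmod is not a YYYY-MM-DD date")
    if changefreq is not None and changefreq not in CHANGEFREQ_VALUES:
        problems.append("changefreq is not one of the allowed values")
    if priority is not None and not 0.0 <= priority <= 1.0:
        problems.append("priority must be between 0.0 and 1.0")
    return problems
```

For example, check_entry("http://www.mysite.com/book1.php", "2008-10-04", "weekly", 0.5) returns an empty list, while a relative URL or a changefreq such as "sometimes" is reported as a problem.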
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>http://www.mysite.com</loc>
    <lastmod>2008-10-04</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.mysite.com/index.php</loc>
    <lastmod>2008-10-04</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.mysite.com/book1.php</loc>
    <lastmod>2008-10-04</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.mysite.com/book2.php</loc>
    <lastmod>2008-10-04</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.mysite.com/cd.php</loc>
    <lastmod>2008-10-04</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.mysite.com/author.php</loc>
    <lastmod>2008-10-04</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
Example of sitemap.xml
http://www.mysite.com/
http://www.mysite.com/index.php
http://www.mysite.com/book1.php
http://www.mysite.com/book2.php
http://www.mysite.com/cd.php
http://www.mysite.com/author.php
Example of urllist.txt
These files are included in the '06Demo' example. Modify them to suit your needs with Notepad++.
Once you have modified them, upload them to the root of your site.
Example:
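For a site with more than a handful of pages, editing both files by hand becomes error prone. As an alternative, here is a minimal Python sketch that builds the text of both files from a single list of pages. The function names build_sitemap and build_urllist are hypothetical, and the sketch leaves out the optional xsi:schemaLocation attribute used in the example file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return the text of sitemap.xml.

    pages is a list of (loc, lastmod, changefreq, priority) string tuples.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the text for us.
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = priority
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


def build_urllist(pages):
    """Return the text of urllist.txt: one URL per line."""
    return "".join(loc + "\n" for loc, _lastmod, _freq, _prio in pages)


# Sample data taken from the example files above.
PAGES = [
    ("http://www.mysite.com/", "2008-10-04", "daily", "1.0"),
    ("http://www.mysite.com/index.php", "2008-10-04", "weekly", "0.5"),
    ("http://www.mysite.com/book1.php", "2008-10-04", "weekly", "0.5"),
]
```

Writing build_sitemap(PAGES) to sitemap.xml and build_urllist(PAGES) to urllist.txt (both saved as UTF-8) gives two files that always list the same URLs.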
http://www.mysite.com/sitemap.xml
http://www.mysite.com/urllist.txt
As you can see, urllist.txt is really easy to create and check, but sitemap.xml is a bit more complicated, so it is better to validate it before submitting it to the search engines. Once again W3C will help us check our sitemap XML file. Go to www.w3.org/QA/Tools and select the XML Schema validator (figure fig:45).
fig:45 W3C : XML and Sitemap Validator
Enter the URL of your website's sitemap (example: http://www.mysite.com/sitemap.xml) and press the 'Get Results' button (figure fig:46).
fig:46 Validate Sitemap
Take a look at the results. A valid response should look like figure fig:47.
fig:47 Sitemap Results
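Before uploading, a quick local check can catch the most common mistakes, such as an unclosed tag or a wrong namespace. The following is a minimal Python sketch using only the standard library. It is not a replacement for the schema validation described above: it only tests that the file is well-formed XML, that the root element is urlset in the Sitemaps namespace, and that every url entry has an absolute loc. The function name precheck_sitemap is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def precheck_sitemap(xml_bytes):
    """Return a list of problems found in a sitemap (empty if none).

    xml_bytes is the raw content of sitemap.xml, read in binary mode.
    This is a rough pre-check, not full XML Schema validation.
    """
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    urls = root.findall(f"{{{SITEMAP_NS}}}url")
    if not urls:
        problems.append("no <url> entries found")
    for index, url in enumerate(urls, start=1):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not loc.strip():
            problems.append(f"entry {index}: missing <loc>")
        elif not loc.strip().startswith(("http://", "https://")):
            problems.append(f"entry {index}: <loc> is not an absolute URL")
    return problems
```

A typical use is to read the file with open("sitemap.xml", "rb") and pass its content to precheck_sitemap; an empty list means none of these basic problems were found.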
Now we are ready to submit our sitemaps to the most popular search engines. Let's see how this can be done.