What is Google Sitemap and how to generate it?

What is a sitemap?

A sitemap is an XML file that lists the URLs of a website. It tells search engines that these URLs exist and that you want them crawled and shown in search results. A sitemap also lets website owners or webmasters attach additional information to each URL: lastmod, changefreq, and priority.

lastmod tells search engines when the resource at that URL was last modified on the server. changefreq tells them how frequently the content is expected to change; its possible values are always, hourly, daily, weekly, monthly, yearly, and never. priority tells them how important a URL is for crawling relative to the other URLs of the same website in that sitemap; it ranges from 0.0 to 1.0, where 1.0 marks the most important URLs.

A minimal XML sitemap looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> 
    <url>
        <loc>http://www.example.com/</loc> 
    </url>
    <url>
        <loc>http://www.example.com/file-url.html</loc> 
    </url>
</urlset>
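
The same structure, including the optional lastmod, changefreq, and priority fields, can be produced by a short script. The following is only a minimal sketch using Python's standard library; the function name build_sitemap and the dictionary layout of each entry are assumptions made for illustration, not part of the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for a list of entry dicts.

    Hypothetical helper: each entry needs a "loc" key; "lastmod",
    "changefreq" and "priority" are optional and written only if present.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                # ElementTree escapes &, < and > in element text for us.
                ET.SubElement(url, tag).text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        {"loc": "http://www.example.com/", "changefreq": "daily", "priority": 1.0},
        {"loc": "http://www.example.com/file-url.html", "lastmod": "2017-06-15"},
    ]))
```

Building the file through an XML library rather than by string concatenation means special characters in URLs are escaped without extra work.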

Other sitemap formats:

A sitemap can be submitted to search engines in three formats: XML, RSS feed, and text file. If a website or blog has an RSS or Atom feed, that feed can be submitted as its sitemap. Most open source blog software, such as WordPress, can generate an RSS feed that serves this purpose. A text sitemap is the simplest form: a plain text file with one URL of the website per line, for example:

http://www.example.com/url-1
http://www.example.com/url-2
http://www.example.com/my-file-1.html
http://www.example.com/my-file-2.html
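
A text sitemap like the one above is easy to generate from a list of URLs. Here is a small sketch; the function name text_sitemap is hypothetical, and dropping duplicates and non-absolute URLs is a design choice of this example rather than something the format itself does for you.

```python
from urllib.parse import urlsplit


def text_sitemap(urls):
    """Return text-sitemap content: one absolute URL per line, no repeats."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        parts = urlsplit(url)
        # Keep only absolute http(s) URLs and skip ones already listed.
        if parts.scheme in ("http", "https") and parts.netloc and url not in seen:
            seen.add(url)
            lines.append(url)
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(text_sitemap([
        "http://www.example.com/url-1",
        "http://www.example.com/url-2",
        "http://www.example.com/url-1",  # duplicate, will be skipped
    ]), end="")
```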

Sitemap limitations:

A single sitemap file is limited to 50,000 URLs and 50 megabytes in size. The file can be compressed with gzip to save bandwidth, but the size limit applies to the uncompressed file. A website with more than 50,000 URLs should provide a sitemap index file together with several sitemap files: the index file lists the sitemap files, and each sitemap file must stay within the limits of 50,000 URLs and 50 megabytes.

Another requirement is that all data in a sitemap, including URLs, must be entity-escaped. The ampersand (&) is written as &amp;, the single quote (') as &apos;, the double quote (") as &quot;, the less-than symbol (<) as &lt;, and the greater-than symbol (>) as &gt;.
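
The splitting and escaping rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a complete generator: the helper names chunk_urls, escape_url, and build_sitemap_index are invented for this example, and so are the sitemap file names in the usage part.

```python
from xml.sax.saxutils import escape

MAX_URLS = 50_000  # per-file URL limit described above


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def escape_url(url):
    """Entity-escape the five special characters: & ' " < >."""
    # escape() handles &, < and > first; the dict adds the two quotes.
    return escape(url, {"'": "&apos;", '"': "&quot;"})


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index XML string listing the given sitemap files."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in sitemap_urls:
        lines.append("    <sitemap><loc>%s</loc></sitemap>" % escape_url(url))
    lines.append("</sitemapindex>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical site with 120,001 URLs: needs three sitemap files.
    urls = ["http://www.example.com/page-%d" % n for n in range(120_001)]
    chunks = chunk_urls(urls)
    names = ["http://www.example.com/sitemap-%d.xml" % (i + 1)
             for i in range(len(chunks))]
    print(build_sitemap_index(names), end="")
```

Each chunk would then be written to its own sitemap file, and only the index file needs to be submitted to the search engine.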

Generate a sitemap for search engines (Google, Yahoo, Bing and others)

Creating a sitemap for a small website with a few static pages is very simple. But what if a website is built on a custom platform, without any open source software, and has hundreds of dynamic resources/URLs on the server? There are many online sitemap generators for this case. You supply the URL of your website, and the generator crawls that page and scans it for other URLs, usually found in anchor tags such as <a href="http://www.example.com/another-page.html">Another Page</a>, collecting every URL that belongs to the main domain. The generator then visits each collected URL in turn and looks for more URLs of the same domain, down to some depth level. Finally it builds a list of the collected URLs and lets the site owner or webmaster download an XML sitemap containing them. So creating an XML sitemap for a custom website is not really hard today.
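
The crawling approach described above can be sketched roughly as follows. This is not how any particular online generator is implemented; it is a minimal breadth-first sketch using Python's standard library. The names LinkParser and crawl are invented for the example, and fetch is a caller-supplied function (an assumption of this sketch) that returns the HTML of a URL, so the download step can be swapped or tested without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl of one domain; returns URLs in discovery order.

    `fetch` maps a URL to its HTML text, or None if it cannot be loaded.
    Pages found at `max_depth` are listed but not scanned for more links.
    """
    domain = urlsplit(start_url).netloc
    found = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments.
            link = urldefrag(urljoin(url, href)).url
            if urlsplit(link).netloc == domain and link not in seen:
                seen.add(link)
                found.append(link)
                queue.append((link, depth + 1))
    return found


if __name__ == "__main__":
    # Tiny in-memory "website" standing in for real HTTP requests.
    pages = {
        "http://www.example.com/":
            '<a href="http://www.example.com/another-page.html">Another Page</a>',
        "http://www.example.com/another-page.html": '<a href="/">Home</a>',
    }
    for url in crawl("http://www.example.com/", pages.get):
        print(url)
```

A real generator would also need to respect robots.txt, limit its request rate, and handle redirects and non-HTML responses, all of which are left out here.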