Umbraco XML Sitemap

Creating an XML (Google) Sitemap for Umbraco using LinqToXml

It's becoming more and more commonplace to add an XML Sitemap to your website. In a nutshell, a sitemap lets search engines discover every page in your site hierarchy so that they can crawl them. A lot of people think sitemaps are a Google thing, but they are actually an open protocol defined at http://www.sitemaps.org/. Google uses them, of course, but so do many other search engines.

So what does a sitemap look like? In essence it's a very simple XML file that contains a list of all the pages in your site. The sitemap protocol defines the structure of the XML.
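
As a rough illustration, a minimal sitemap with two pages might look like the sample below. The URLs and dates are made up for the example; only the namespace and the element names (urlset, url, loc, lastmod) come from the sitemap protocol. The protocol also allows optional changefreq and priority elements, which are not used in this post.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative sample only: example.com URLs and dates are placeholders -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2010-06-16T09:30:00Z</lastmod>
  </url>
  <url>
    <loc>http://www.example.com/about-us.aspx</loc>
    <lastmod>2010-06-01T14:05:00Z</lastmod>
  </url>
</urlset>
```

Each url element describes one page: loc holds its absolute URL and lastmod the date it was last changed.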

Creating a Sitemap for Umbraco CMS

In this post I want to concentrate on automatically generating a sitemap for an Umbraco site (Umbraco is a popular .NET CMS). There are a couple of Umbraco packages that do this, but I find my way simpler and quicker to deploy - simply copy the file into the root of your website and that is it - no packages needed.

I also aim to show you how you can use the Umbraco API and LinqToXml (introduced in .NET 3.5) to do this. The code will be in C#, but should be easily convertible to other .NET languages, and is created using a standard generic handler .ashx file.

The basic principle is to create an XDocument and then iterate over every published node using the Umbraco nodeFactory creating an XElement for each page. The handler then outputs the resulting XML document, setting the correct content-type.

The C# Code for the Handler

<%@ WebHandler Language="C#" Class="SiteMap" %>

using System;
using System.Web;
using System.Xml.Linq;
using umbraco.presentation.nodeFactory;

public class SiteMap : IHttpHandler
{
    /* Generates an XML Sitemap for Umbraco using LinqToXml */
   
    private static readonly XNamespace xmlns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    public void ProcessRequest(HttpContext context)
    {
        // Set the correct headers for XML
        context.Response.ContentType = "text/xml";
        context.Response.Charset = "utf-8";
       
        // Get the absolute base URL for this website
        Uri url = context.Request.Url;
        string baseUrl = String.Format("{0}://{1}{2}", url.Scheme, url.Host, url.IsDefaultPort ? "" : ":" + url.Port);

        // Create a new XDocument using namespace and add root element
        XDocument doc = new XDocument(new XDeclaration("1.0", "utf-8", "yes"));
        XElement urlset = new XElement(xmlns + "urlset");

        // Get the root node
        Node root = new Node(-1);

        // Iterate all nodes in site and add them to document
        RecurseNodes(urlset, root, baseUrl);
        doc.Add(urlset);

        // Write XML document to response stream
        context.Response.Write(doc.Declaration + "\n");
        context.Response.Write(doc.ToString());
    }

    // Method to recurse all nodes and create each element
    private static void RecurseNodes(XElement urlset, Node node, string baseUrl)
    {
        foreach (Node n in node.Children)
        {
            // If the document has a property called "hidePage" set to true then ignore this node
            if (n.GetProperty("hidePage") == null || n.GetProperty("hidePage").Value != "1")
            {
                string url = umbraco.library.NiceUrl(n.Id);
                // Tidy up home page so it's more canonical
                if (url.EndsWith("/home.aspx"))
                    url = url.Replace("/home.aspx", "/");
                // Create the XML node
                XElement urlNode = new XElement(xmlns + "url",
                    new XElement(xmlns + "loc", baseUrl + url),
                    new XElement(xmlns + "lastmod", n.UpdateDate.ToUniversalTime()));
                urlset.Add(urlNode);
            }

            // Check if this node has any child nodes and, if it has, recurse them
            if (n.Children != null && n.Children.Count > 0)
                RecurseNodes(urlset, n, baseUrl);
        }
    }

    public bool IsReusable
    {
        get
        {
            return false;
        }
    }
}

Download Link

XML SiteMap Handler
Posted: 16 June 2010

1 comment for “Umbraco XML Sitemap”

  • eran

    Thanks, looking forward to reading more LINQ to Umbraco posts! As I understand it, this will create a URL like www.mysite.com/sitemap.ashx, and I can then tell Google that my sitemap is located at this address? Thanks.

     
     
