Introduction
The Simple Sitemap library is an easy way to add sitemap support to your ASP.NET application: just add the SitemapLib.cs file to your project and you are ready to go. The library implements the sitemaps.org standard for creating an XML sitemap file, and includes ping support for the search engines that currently support it: Google.com, Yahoo.com and Ask.com.
Please feel free to modify, reuse or redistribute the code as you see fit.
Background on Sitemaps
The sitemap standard was created by Google, Yahoo and Microsoft to provide an easy way for web site owners to inform search engines about pages on their site that are available for crawling.
Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support sitemaps to pick up all URLs in the sitemap and learn about those URLs using the associated metadata. Using sitemaps does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.
Using SitemapLib
The library supports two main aspects of the standard: creating XML sitemaps, and pinging the major search engines to notify them of updates to the sitemap.
Creating an XML Sitemap
The code snippet below shows all the code required to create a sitemap using the library. Since the sitemap is an XML file, it is better to use a "Generic Handler" file type (ASHX) than an ASPX file. ASPX pages are designed to generate HTML, which is not what we want here. An ASHX handler, on the other hand, lets you return any type of content you like.
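using System;
using System.Web;
using SitemapLib;

public class Handler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        SitemapLib.Sitemap sitemap = new SitemapLib.Sitemap();

        // URL only
        sitemap.AddLocation("http://mysite.com/default.aspx");

        // NOTE: the URLs in the next two calls were truncated in the source;
        // "page2.aspx" and "page3.aspx" are placeholders, not the original values.

        // URL plus last-modified date
        sitemap.AddLocation("http://mysite.com/page2.aspx",
            DateTime.Today);

        // URL plus last-modified date, priority and change frequency
        sitemap.AddLocation("http://mysite.com/page3.aspx",
            DateTime.Today, "0.8", ChangeFrequency.Monthly);

        // Return the sitemap as XML rather than HTML
        context.Response.ContentType = "text/xml";
        context.Response.Write(sitemap.GenerateSitemapXML());
    }

    public bool IsReusable
    {
        get
        {
            return false;
        }
    }
}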
Once you have instantiated a SitemapLib.Sitemap object, you can use one of the AddLocation() overloads to add URLs to your sitemap, depending on which metadata you would like to include in the sitemap file.
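For orientation, here is a rough, self-contained sketch of the kind of XML a sitemap file contains, built with System.Xml.XmlWriter. It is not SitemapLib's implementation, and the names SitemapSketch and BuildSitemap are made up for this illustration; it only shows the urlset/url/loc/lastmod structure described by sitemaps.org.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration; not part of SitemapLib.
public static class SitemapSketch
{
    private const string SitemapNamespace = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Builds a minimal sitemap document with one <url> entry per input URL.
    public static string BuildSitemap(string[] urls, DateTime lastModified)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(false); // UTF-8 without a byte order mark
        settings.Indent = true;

        using (MemoryStream stream = new MemoryStream())
        {
            using (XmlWriter xml = XmlWriter.Create(stream, settings))
            {
                xml.WriteStartDocument();
                xml.WriteStartElement("urlset", SitemapNamespace);
                foreach (string url in urls)
                {
                    xml.WriteStartElement("url", SitemapNamespace);
                    // XmlWriter entity-escapes reserved characters such as '&' itself.
                    xml.WriteElementString("loc", SitemapNamespace, url);
                    xml.WriteElementString("lastmod", SitemapNamespace,
                        lastModified.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
                    xml.WriteEndElement(); // </url>
                }
                xml.WriteEndElement(); // </urlset>
                xml.WriteEndDocument();
            }
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

Writing to a stream with an explicit UTF-8 encoding keeps the XML declaration and the bytes in agreement, which matters because sitemap files are expected to be UTF-8.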
Note
The library validates the input URLs and the output file for common issues such as malformed URLs, the maximum URL length and the maximum file size, and throws an exception when a check fails. Wrap your calls in try { } catch (Exception e) { } and handle any errors thrown. Failure to handle these errors could mean that the search engines disregard your sitemap file.
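The checks described in this note can be sketched roughly as follows. This is an assumption-laden illustration, not the library's code: the class and method names (SitemapChecks, DescribeProblem, FindProblems) are hypothetical, and the limits (50,000 URLs per file, URLs shorter than 2,048 characters) mirror the ones used in the VB.NET translation posted in the comments.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of SitemapLib.
public static class SitemapChecks
{
    public const int MaxUrlsPerFile = 50000;
    public const int MaxUrlLength = 2048;

    // Returns null when the URL passes, otherwise a short description of the problem.
    public static string DescribeProblem(string url)
    {
        if (string.IsNullOrEmpty(url))
            return "missing URL";

        bool hasProtocol =
            url.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
            url.StartsWith("https://", StringComparison.OrdinalIgnoreCase);
        if (!hasProtocol)
            return "URL must start with http:// or https://";

        if (url.Length >= MaxUrlLength)
            return "URL must be shorter than 2048 characters";

        return null;
    }

    // Collects every problem instead of stopping at the first one,
    // so all of them can be logged or fixed in one pass.
    public static List<string> FindProblems(IList<string> urls)
    {
        List<string> problems = new List<string>();
        if (urls.Count > MaxUrlsPerFile)
            problems.Add("more than 50,000 URLs in one sitemap file");

        foreach (string url in urls)
        {
            string problem = DescribeProblem(url);
            if (problem != null)
                problems.Add(url + ": " + problem);
        }
        return problems;
    }
}
```

Running a check like this before generating the file lets you report bad entries up front, while the try/catch around the library calls remains the safety net.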
Notifying Search Engines with a Ping
You can notify each search engine that supports sitemaps of updates by sending it an HTTP ping. The following code shows how to use SitemapLib to ping all the search engines that support it.
Sitemap.Ping("http://mysite.com/sitemap.ashx", "Yahoo Application ID");
The first parameter should be the fully qualified URL of your sitemap file. Note that the file does not need an ".xml" extension to be valid. The only odd thing about the ping is that Yahoo breaks the standard by requiring developers to include an application ID with each ping. You can easily provision one here: http://developer.yahoo.com/search/siteexplorer/V1/updateNotification.html and include it in your program.
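As a rough sketch of what a standard ping request looks like, the helper below only builds the request URL and does not send anything. The "/ping?sitemap=" path and the Ask.com host used in the usage note come from the VB.NET translation in the comments and may no longer be live; PingUrlSketch and BuildStandardPing are hypothetical names, not SitemapLib's API.

```csharp
using System;

// Hypothetical helper for illustration; not part of SitemapLib.
public static class PingUrlSketch
{
    // Builds a ping URL of the form <base>/ping?sitemap=<encoded sitemap URL>.
    public static string BuildStandardPing(string searchEngineBase, string sitemapUrl)
    {
        // The sitemap URL is itself a query-string value, so it is URL-encoded
        // to keep characters such as ':', '/', '?' and '&' from being misread.
        return searchEngineBase + "/ping?sitemap=" + Uri.EscapeDataString(sitemapUrl);
    }
}
```

For example, BuildStandardPing("http://submissions.ask.com", "http://mysite.com/sitemap.ashx") returns http://submissions.ask.com/ping?sitemap=http%3A%2F%2Fmysite.com%2Fsitemap.ashx, which could then be requested with WebRequest.Create.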
Notes on Implementing Sitemaps on your Website
- Don't forget to add the "Sitemap:" directive to your robots.txt file
Instead of registering your sitemap with every search engine, you can support all of them by adding one line to your robots.txt file: "Sitemap: <fully qualified path to your sitemap file, or sitemap index file>".
- Sitemaps are a suggestion, not a command
The search engines take the information in your sitemaps as one of many inputs when deciding which pages to include in their index, the relative priority of each page on your site, and how frequently your pages should be crawled.
- Generally, sitemaps will not impact your page rank
They are most likely to affect the number of pages that are crawled and indexed, as well as the speed at which they are indexed. However, there are cases where sitemaps have significantly affected the page rank of a site: sites that were not being fully crawled, where the sitemap led to some really good content being included in the index.
- Give your converting pages the highest priority
Product pages, newsletter signup pages and other pages that generate the most value from your customers should receive the highest priority. Also include the pages that you know your customers will be most interested in.
- Don't give all your pages 1.0 priority
This just tells the search engines that all your pages are of equal importance; it doesn't boost the importance of any page in particular.
- Give sitemaps time to see impact
You may not see an impact right away from implementing sitemaps, but that doesn't mean they are not being used. Search engines are still optimizing their implementations and figuring out exactly how to use all of this information.
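For the robots.txt tip above, the added line might look like this. The URL is the example handler address used earlier in the article; substitute the address of your own sitemap.

```text
Sitemap: http://mysite.com/sitemap.ashx
```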
Hey! Thanks for the code. I took it one step further and altered the above VB translation to use XmlTextWriter instead of StringBuilder. All the changes needed to do so are in the GenerateSiteMapXML function. See below.
' Requires: Imports System.IO, Imports System.Text, Imports System.Xml
Public Function GenerateSiteMapXML() As String
    Dim sb As New StringBuilder
    With New XmlTextWriter(New StringWriter(sb))
        .WriteStartDocument()
        .WriteStartElement("urlset")
        .WriteAttributeString("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9")

        ' ERROR CHECK: see if there are more than 50k URLs in file
        If SiteMapList.Count > 50000 Then Throw New Exception("Sitemap file cannot contain more than 50,000 URLs. Refer to http://sitemaps.org for details")

        For Each item As SiteMapItem In Me.SiteMapList
            ' ERROR CHECK: Make sure a URL was entered
            If item.Loc.Length < 9 Then
                Throw New Exception("Sitemap entry must include URL. Refer to http://sitemaps.org for details")
            End If

            ' ERROR CHECK: URL must include protocol (http, https)
            If (Not item.Loc.Substring(0, 7).ToLower().Equals("http://") And _
                Not item.Loc.Substring(0, 8).ToLower().Equals("https://") And _
                Not item.Loc.Substring(0, 7).ToLower().Equals("feed://")) Then
                Throw New Exception("Sitemap URLs must include protocol (e.g. http://). Refer to http://sitemaps.org for details")
            End If

            ' ERROR CHECK: URL must be smaller than 2048 characters
            If item.Loc.Length >= 2048 Then
                Throw New Exception("Sitemap URLs cannot have more than 2048 characters. Refer to http://sitemaps.org for details")
            End If

            .WriteStartElement("url")
            ' XmlTextWriter escapes reserved characters itself, so _entityEscape
            ' is not applied here (doing both would double-escape the URL).
            .WriteElementString("loc", item.Loc)

            ' LAST MODIFIED FIELD ("MM" is the month; "mm" would be minutes)
            If (0L <> item.LastMod.Ticks) Then
                .WriteElementString("lastmod", item.LastMod.ToString("yyyy-MM-dd"))
            End If

            ' CHANGE FREQUENCY FIELD
            If (ChangeFrequency.DontUseThisField <> item.ChangeFreq) Then
                Select Case item.ChangeFreq
                    Case ChangeFrequency.Always
                        .WriteElementString("changefreq", "always")
                        Exit Select
                    Case ChangeFrequency.Daily
                        .WriteElementString("changefreq", "daily")
                        Exit Select
                    Case ChangeFrequency.Hourly
                        .WriteElementString("changefreq", "hourly")
                        Exit Select
                    Case ChangeFrequency.Monthly
                        .WriteElementString("changefreq", "monthly")
                        Exit Select
                    Case ChangeFrequency.Never
                        .WriteElementString("changefreq", "never")
                        Exit Select
                    Case ChangeFrequency.Weekly
                        .WriteElementString("changefreq", "weekly")
                        Exit Select
                    Case ChangeFrequency.Yearly
                        .WriteElementString("changefreq", "yearly")
                        Exit Select
                End Select
            End If

            ' PRIORITY FIELD
            If item.Priority.Length > 0 Then
                .WriteElementString("priority", item.Priority)
            End If
            .WriteEndElement()
        Next
        .WriteEndDocument()
        .Close()
    End With
    Return sb.ToString
End Function
Here's my VB.NET translation if it saves anyone any time -
Thanks for the Article!!
-------------------------------------------------------------------------
Imports Microsoft.VisualBasic
Imports System.Collections.Generic
Imports System.Net
Imports System.Text ' needed for StringBuilder

' Original from: http://www.codeproject.com/aspnet/simplesitemaps.asp
Namespace SiteMapLib

    Public Enum ChangeFrequency
        Always = 0
        Hourly
        Daily
        Weekly
        Monthly
        Yearly
        Never
        DontUseThisField
    End Enum

#Region " SiteMapItem Class "
    Public Class SiteMapItem
        Private _loc As String
        Public Property Loc() As String
            Get
                Return _loc
            End Get
            Set(ByVal value As String)
                _loc = value
            End Set
        End Property

        Private _lastmod As DateTime
        Public Property LastMod() As DateTime
            Get
                Return _lastmod
            End Get
            Set(ByVal value As DateTime)
                _lastmod = value
            End Set
        End Property

        Private _priority As String
        Public Property Priority() As String
            Get
                Return _priority
            End Get
            Set(ByVal value As String)
                _priority = value
            End Set
        End Property

        Private _changeFreq As ChangeFrequency
        Public Property ChangeFreq() As ChangeFrequency
            Get
                Return _changeFreq
            End Get
            Set(ByVal value As ChangeFrequency)
                _changeFreq = value
            End Set
        End Property

        Public Sub New()
        End Sub
    End Class
#End Region

#Region " SiteMap Class "
    Public Class SiteMap

        Public SiteMapList As List(Of SiteMapItem) = Nothing

        ' Replaces the five characters reserved in XML with their entities.
        ' The ampersand must be replaced first so the other entities are not re-escaped.
        Private Function _entityEscape(ByVal s As String) As String
            Return s.Replace("&", "&amp;").Replace("'", "&apos;").Replace("""", "&quot;").Replace(">", "&gt;").Replace("<", "&lt;")
        End Function

        Public Sub AddLocation(ByVal location As String)
            Me.AddLocation(location, New DateTime(0), "", ChangeFrequency.DontUseThisField)
        End Sub

        Public Sub AddLocation(ByVal location As String, ByVal lastmod As DateTime)
            Me.AddLocation(location, lastmod, "", ChangeFrequency.DontUseThisField)
        End Sub

        Public Sub AddLocation(ByVal location As String, ByVal lastmod As DateTime, ByVal changeFreq As ChangeFrequency)
            Me.AddLocation(location, lastmod, "", changeFreq)
        End Sub

        Public Sub AddLocation(ByVal location As String, ByVal lastmod As DateTime, ByVal priority As String, ByVal changeFreq As ChangeFrequency)
            Dim item As New SiteMapItem()
            item.Loc = location
            item.LastMod = lastmod
            item.Priority = priority
            item.ChangeFreq = changeFreq

            Me.AddLocation(item)
        End Sub

        Public Sub AddLocation(ByVal item As SiteMapItem)
            If Me.SiteMapList Is Nothing Then SiteMapList = New List(Of SiteMapItem)
            Me.SiteMapList.Add(item)
        End Sub

        Public Function GenerateSiteMapXML() As String
            Dim sb As New StringBuilder("<?xml version=""1.0"" encoding=""UTF-8""?>")
            sb.Append("<urlset xmlns=""http://www.sitemaps.org/schemas/sitemap/0.9"">")

            ' ERROR CHECK: see if there are more than 50k URLs in file
            If SiteMapList.Count > 50000 Then Throw New Exception("Sitemap file cannot contain more than 50,000 URLs. Refer to http://sitemaps.org for details")

            For Each item As SiteMapItem In Me.SiteMapList
                ' ERROR CHECK: Make sure a URL was entered
                If item.Loc.Length < 9 Then
                    Throw New Exception("Sitemap entry must include URL. Refer to http://sitemaps.org for details")
                End If

                ' ERROR CHECK: URL must include protocol (http, https)
                If (Not item.Loc.Substring(0, 7).ToLower().Equals("http://") And _
                    Not item.Loc.Substring(0, 8).ToLower().Equals("https://") And _
                    Not item.Loc.Substring(0, 7).ToLower().Equals("feed://")) Then
                    Throw New Exception("Sitemap URLs must include protocol (e.g. http://). Refer to http://sitemaps.org for details")
                End If

                ' ERROR CHECK: URL must be smaller than 2048 characters
                If item.Loc.Length >= 2048 Then
                    Throw New Exception("Sitemap URLs cannot have more than 2048 characters. Refer to http://sitemaps.org for details")
                End If

                sb.Append("<url>")

                ' LOCATION FIELD
                sb.Append("<loc>")
                sb.Append(_entityEscape(item.Loc))
                sb.Append("</loc>")

                ' LAST MODIFIED FIELD
                If (0L <> item.LastMod.Ticks) Then
                    sb.Append("<lastmod>")
                    sb.Append(item.LastMod.ToString("yyyy-MM-dd"))
                    sb.Append("</lastmod>")
                End If

                ' CHANGE FREQUENCY FIELD
                If (ChangeFrequency.DontUseThisField <> item.ChangeFreq) Then
                    sb.Append("<changefreq>")

                    Select Case item.ChangeFreq
                        Case ChangeFrequency.Always
                            sb.Append("always")
                            Exit Select
                        Case ChangeFrequency.Daily
                            sb.Append("daily")
                            Exit Select
                        Case ChangeFrequency.Hourly
                            sb.Append("hourly")
                            Exit Select
                        Case ChangeFrequency.Monthly
                            sb.Append("monthly")
                            Exit Select
                        Case ChangeFrequency.Never
                            sb.Append("never")
                            Exit Select
                        Case ChangeFrequency.Weekly
                            sb.Append("weekly")
                            Exit Select
                        Case ChangeFrequency.Yearly
                            sb.Append("yearly")
                            Exit Select
                    End Select

                    sb.Append("</changefreq>")
                End If

                ' PRIORITY FIELD
                If item.Priority.Length > 0 Then
                    sb.Append("<priority>")
                    sb.Append(item.Priority)
                    sb.Append("</priority>")
                End If

                sb.Append("</url>")
            Next

            sb.Append("</urlset>")

            ' ERROR CHECK: file must not exceed 10MB
            If sb.Length > 10485760 Then
                Throw New Exception("Sitemap file cannot be larger than 10MB. Refer to http://sitemaps.org for details")
            End If

            Return sb.ToString
        End Function

        Public Sub Ping(ByVal sitemapFileURL As String, ByVal yahooAppID As String)
            Dim STD_PING_PATH As String = "/ping?sitemap=" + sitemapFileURL

            ' Google:
            Try
                Dim request As WebRequest = System.Net.HttpWebRequest.Create("http://www.google.com/webmasters/tools" + STD_PING_PATH)
                Dim response As WebResponse = request.GetResponse()
            Catch e As Exception
                ' TODO: handle this error!
                Throw ' rethrow without resetting the stack trace
            End Try

            ' Yahoo:
            Try
                Dim request As WebRequest = System.Net.HttpWebRequest.Create("http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=" + yahooAppID + "&url=" + sitemapFileURL)
                Dim response As WebResponse = request.GetResponse()
            Catch e As Exception
                ' TODO: handle this error!
                Throw
            End Try

            ' ASK.COM:
            Try
                Dim request As WebRequest = System.Net.HttpWebRequest.Create("http://submissions.ask.com" + STD_PING_PATH)
                Dim response As WebResponse = request.GetResponse()
            Catch e As Exception
                ' TODO: handle this error!
                Throw
            End Try
        End Sub
    End Class
#End Region

End Namespace
Mike Joseph
This page converts between C# and VB.NET: http://labs.developerfusion.co.uk/convert/csharp-to-vb.aspx
Hi,
Could you explain why this character (&) is not allowed in a sitemap file? What should I do to handle it?
Thanks,
Regards,
Jariaman Barus jariamanbarus@gmail.com
Jariaman, a raw ampersand (&) is not allowed because a sitemap is an XML file, and XML reserves the ampersand: inside the file it must be written as the entity &amp;. (The file as a whole must also be UTF-8 encoded to be processed properly.)
However, you shouldn't have to worry about that. SitemapLib takes care of the escaping for you, so you can give it any valid URL and it will be escaped properly in the output. For example, if your URL was the following:
http://mysite.com/products?id=3&color=red
you can just use the following method from the library like this:
sitemap.AddLocation("http://mysite.com/products?id=3&color=red", DateTime.Today, "0.8", ChangeFrequency.Monthly);
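As a rough illustration of that escaping step, here is a minimal, self-contained sketch. The names XmlEscapeSketch and EntityEscape are hypothetical, and the library's actual code may differ; the five replacements are the standard XML entities.

```csharp
using System;

// Hypothetical helper for illustration; not part of SitemapLib.
public static class XmlEscapeSketch
{
    // Replaces the five characters reserved in XML with their entities.
    // '&' is handled first; otherwise the '&' introduced by the other
    // entities would be escaped a second time.
    public static string EntityEscape(string value)
    {
        return value
            .Replace("&", "&amp;")
            .Replace("'", "&apos;")
            .Replace("\"", "&quot;")
            .Replace(">", "&gt;")
            .Replace("<", "&lt;");
    }
}
```

With the URL above, EntityEscape returns http://mysite.com/products?id=3&amp;color=red, which is what ends up inside the <loc> element.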
regards, nate
Rather than using a custom XML escaping function, it would be better to use a System.Xml.XmlWriter to render the XML:
public string GenerateSitemapXML()
{
    using (StringWriter writer = new StringWriter())
    {
        RenderSitemapXML(writer);
        return writer.ToString();
    }
}

public void RenderSitemapXML(TextWriter writer)
{
    if (null == writer) throw new ArgumentNullException("writer");
    using (XmlWriter xml = XmlWriter.Create(writer))
    {
        RenderSitemapXML(xml);
    }
}

public void RenderSitemapXML(XmlWriter writer)
{
    if (null == writer) throw new ArgumentNullException("writer");
    ....
}
This would also allow you to render the XML directly to the response output stream:
context.Response.ContentType = "text/xml";
sitemap.RenderSitemapXML(context.Response.Output);
Also, it would be useful to provide a method to populate the sitemap from the ASP.NET web.sitemap file. The priority, change frequency and last modification time can be stored as custom attributes on the sitemap nodes.
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
That's the problem with feedback: there are too many people smarter than you. You are right, I just got a little lazy. I'll make those changes and repost the code sometime this week.
Support for the web.sitemap file will still take a little bit longer.
Something like this should do the trick:
// Requires: using System; using System.Globalization; using System.Web;
public void LoadSiteMap()
{
    LoadSiteMap(HttpContext.Current);
}

public void LoadSiteMap(HttpContext context)
{
    try
    {
        if (SiteMap.Enabled)
        {
            LoadSiteMapNode(context, SiteMap.RootNode);
        }
    }
    catch (System.Configuration.ConfigurationErrorsException ex)
    {
        // Outside a Page there is no Trace property, so go through the context.
        context.Trace.Warn("PopulateSiteMap", "Error loading site-map:", ex);
    }
}

private void LoadSiteMapNode(HttpContext context, SiteMapNode node)
{
    if (null != node)
    {
        string location = ResolveUrl(context, node.Url);
        if (!string.IsNullOrEmpty(location))
        {
            SiteMapItem item = new SiteMapItem();
            item.Loc = location;

            // Custom attributes stored on the web.sitemap node
            item.Priority = node["sitemaps.priority"];

            string value = node["sitemaps.changefreq"];
            if (!string.IsNullOrEmpty(value))
            {
                try
                {
                    item.ChangeFreq = (ChangeFrequency)Enum.Parse(typeof(ChangeFrequency), value, true);
                }
                catch (ArgumentException)
                {
                    item.ChangeFreq = ChangeFrequency.DontUseThisField;
                }
            }

            value = node["sitemaps.lastmod"];
            if (!string.IsNullOrEmpty(value))
            {
                DateTime lastMod;
                IFormatProvider provider = DateTimeFormatInfo.CurrentInfo;
                DateTimeStyles style = DateTimeStyles.AllowWhiteSpaces | DateTimeStyles.NoCurrentDateDefault;
                if (!DateTime.TryParse(value, provider, style, out lastMod)) lastMod = DateTime.MinValue;
                item.LastMod = lastMod;
            }

            this.AddLocation(item);
        }

        // Walk the rest of the site map recursively
        if (node.HasChildNodes)
        {
            foreach (SiteMapNode child in node.ChildNodes)
            {
                LoadSiteMapNode(context, child);
            }
        }
    }
}

// Turns an application-relative URL into an absolute one, or returns null.
private string ResolveUrl(HttpContext context, string value)
{
    Uri result = null;
    if (null != context && !string.IsNullOrEmpty(value))
    {
        value = VirtualPathUtility.ToAbsolute(value);
        if (!Uri.TryCreate(context.Request.Url, value, out result))
        {
            result = null;
        }
    }

    if (null == result) return null;
    return result.AbsoluteUri;
}
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
Does the .zip included on this site now include all of the latest/updated code (even from comments from other users)?
Thanks.
Nice article. Good approach which will hopefully help many folk get their dynamic content sites spidered by the search engines.
But (short of going blind) from where may one get the SitemapLib.cs file you refer to using in the article?
Wolfgang