This is a showcase review for our sponsors at CodeProject. These reviews are intended to provide you with information on products and services that we consider useful and of value to developers.
This article compares various approaches to storing, editing, and displaying structured content on the web. It shows the advantages and disadvantages of using XML and relational SQL databases.
Choosing the right content storage is the crucial decision of every dynamic web site. How to store content? How to edit it? How to transform it? This article discusses various approaches based on XML and relational SQL databases.
Choosing the Data Storage
The basic question is obviously: Where do I store the data? You can choose between:
- XML files stored on the disk,
- an XML database,
- some proprietary storage, or
- a relational SQL database.
XML files stored on the disk can be used for very basic web sites since they do not provide any data management features, such as concurrent access, indexing, security management, etc.
While XML (or object-oriented) databases can be an optimal solution in some specific cases, their lack of standards, higher cost of ownership, and limited adoption on the market will most likely let you choose a different solution. You will also most likely choose not to use a proprietary storage since you want to build a web site, not your own database system.
Traditional (relational) SQL databases seem to be a reasonable choice – they’re affordable and wide-spread. They are also optimized for high performance. It makes them the right solution for a general-purpose CMS.
Storing Structured Content
Now that we have chosen an SQL database for storing data and other system information, such as users and permissions, we need to choose how to store our content in the database. There are two basic approaches that you can see in the current content management systems:
- Storing whole documents as XML in a single field.
- Storing documents of different types in separate tables.
Storing Documents as XML
This approach assumes that you store all documents in a single database table that has a single field for holding the whole document. It may look like this:
DocumentXML field contains XML documents like this one:
<title>My first article<title>
<summary>This is a summary.</summary>
<articletext>This is full article text.</articletext>
Storing Documents in Separate Tables
If you are accustomed to standard relational database schemas, you will most likely want to have a separate database table for every document type. So you will have database tables like these:
And you may want to join the common fields into one Documents table while keeping document type-specific fields in separate tables:
Which One is Better?
Both of these approaches have their pros and cons. The XML approach is generally easier to implement and more flexible to changes. It is suitable for semi-structured content, such as a page with several text-only sections:
You will benefit from this option if you need to edit semi-structured, text-oriented content while keeping it separated from the page design. You can simply change the presentation by modifying the page template design at any time.
However, you may encounter severe problems when you need to display more than one XML document on the page. Just consider a product catalog with dozens of items: when you need to display a listing of products, you need to go through all database records, retrieve the XML document, parse it, and render. As you can imagine, this is not straightforward, and the performance will not be optimal.
Besides, the XML approach doesn’t really support typed data (such as integer, datetime, etc.) as all content is serialized to text. If you decide to sort products by price using XSLT, you will sort them as text which will result in incorrect sorting. You will also need to ensure that date-time values or decimal numbers are parsed and displayed in the correct format.
In the case of strictly structured data, such as product specifications, you will benefit from the traditional relational database approach. It allows you to easily retrieve data from the database and display them on the web site using standard ASP.NET controls, such as
DataGrid, while having a complete control over the format. You can also easily sort them or filter them on the database server, while leveraging the indexes that provide optimal performance. However, this approach results in several challenges that we discuss in the following paragraphs.
Facing the SQL Challenge
The traditional SQL approach is (by design) less flexible for content management. The database structure is not so flexible to changes. Look at this table:
Let’s consider that your HR manager wants to publish an annual salary for this job opening. What you typically have to do is:
- add a table column
- modify the SQL queries,
- modify the code you use for editing the content and add the salary field to the editing form, and
- modify the code you use for displaying job descriptions.
The solution for this issue is to add a "meta" layer to your code. Instead of writing all code by hand or generating code, you may want to create a system where you describe the data structure or editing form and let your content management engine process it.
Here’s what we have done in Kentico CMS for .NET: When you want to add a new field to the document type, you simply use the web-based user interface and add a new attribute to the document of type “Job”. You also specify that this attribute is of decimal type, and that it should be displayed as a textbox in the editing form:
The system automatically updates the related database table:
It also updates the standard SQL queries for
When you edit the job document now, you will see a dynamically rendered editing form like this one:
The only hand coding you need to do is to add an appropriate field to the transformation that represents the
ItemTemplate section of the ASP.NET
The document on the web looks like this:
How it Works
You may wonder how this works inside: All document types (news, jobs, product specifications) are described using configuration files that define the:
- data schema,
- standard SQL queries,
- editing form, and
The following figure shows how these elements work together:
The CMS engine receives a request for a page. It reads the settings (SQL queries, etc.) for the given document type. It runs an SQL query to select data from the database. Then, it binds the retrieved
DataSet to the
Repeater control (or some other control) that displays the content using a predefined transformation. The page is displayed to the visitor.
As you can see, there’s a place for both XML and SQL-oriented approach in the content management world. Both of them have their pros and cons, and you need to choose them depending on the particular situation.
Now, you can download the free trial of Kentico CMS for .NET and try it for yourself. The Kentico CMS Tutorial will guide you through the process of creating web pages with both XML-oriented and SQL-oriented design.
Kentico CMS for .NET Fast Facts
|Key content management features||
- Tree organization of documents
- Content/design separation
- Content scheduling
- Multilingual content
- Document-level permissions
- Content Staging (synchronization between servers)
|Key web site features||
- Flexible navigation using several types of navigation controls
- Full-text search in both content and files
- Secured areas for registered users
- Multilingual content
- E-commerce/Shopping cart
- On-line forms
- Image gallery
- Visual Studio .NET 2003
- Visual Studio 2005
- Visual Web Developer 2005 Express Edition
|Supported programming languages||
- Microsoft SQL Server 2000
- MSDE 2000
- Microsoft SQL Server 2005
- Microsoft SQL Server 2005 Express
|Supported user interface languages||
- (you can easily create your own translation)
|Supported content languages||
- Any language – Kentico CMS uses UNICODE encoding for all content.
|Supported web browsers for editors||
- IE 6.0+
- Mozilla 1.7.1+
- Netscape 7.1+
- FireFox 1.3+
|Supported web browsers for visitors||
- Depends mostly on your HTML code, transformations, and used controls.
Other Kentico Products
You may also want to try the following Kentico products for developers: