Beginners' guide to using MongoDB 2.2 and the official C# driver

George Swan, 9 Jan 2013
Highlights the latest developments in both the Mongo open-source document database and the open-source official C# driver.

Introduction

This article attempts to highlight the latest developments in both the Mongo open-source document database and the open-source official C# driver and to supplement the previous reviews on CodeProject in the light of these improvements.

Overview of Document Databases.

Document databases store the information relating to a record in a contiguous blob of data known as a document. A document’s structure usually follows the JSON format and consists of a series of key-value pairs. Unlike the schema of a relational database, the document’s structure does not reference empty fields. This flexible arrangement allows fields to be added and removed with ease. What’s more, there is no need to rummage about in various tables when trying to assemble the data; it’s all there in one solid block. The downside of all this is that document databases tend to be meaty. But, now that disk drives are in the bargain basement, the trade-off between speed of access and storage costs has shifted in favour of speed, and that has given rise to the increased use of document databases. The Large Hadron Collider at CERN uses a document database, but that's not why it keeps breaking down.

Hosted Web Server for MongoDB.

There is free web hosting of MongoDB at MongoHQ. The sandbox database plan provides 512 MB of storage and is a good way to test drive the database. There is no need to download the MongoDB binaries, and the web site’s user interface allows administrative tasks to be carried out. Just sign up, download the driver and you’re cooking with gas. I’ve used this service; it seems to be genuinely free and there is no badgering to upgrade.

Desktop Installation of MongoDB and the C# Driver

All you need to get started is on the MongoDB website. Installation instructions are well documented, although you might have to wade through some detritus to get the correct set for your system. You need to download both the MongoDB binaries and the C# .NET driver. The C# driver consists of two libraries: the BSON library, MongoDB.Bson.dll, and the driver itself, MongoDB.Driver.dll. There is a basic user interface situated 1000 ports above the port the database listens on. For the default installation, which listens on port 27017, this is at http://localhost:28017/. You need to have the line 'rest=true' in the mongod.cfg file to enable this interface. There are also more sophisticated open-source applications available for carrying out administration tasks on the database.

The Database Structure

The basic structure for storing data fields is the BsonElement. It’s a simple key-value pair. The Key contains a field name and the Value its value. The Value can itself contain further BsonElements, so they can be nested, Russian doll style. Records are stored as documents. A document is a collection of BsonElements. Here is an example document.

{
  _id : 50e5c04c0ea09d153c919473,
  Age : 43,
  Cars : { 0 : Humber, 1 : Riley },
  Forename : Rhys,
  Lastname : Richards
}

Not every record needs to contain every field. The only required field is the _id, and fields can be added at a future date without having to change the existing records. In this example, the Cars field is an array. Its Value contains a nested document whose elements are key-value pairs: the key is the array index number and the value is the name of the car.
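
As a rough illustration, the example document could be built by hand with the BSON library. This is only a sketch, assuming the MongoDB.Bson assembly that ships with the 1.x driver is referenced; the class name DocumentSketch and its methods are made up for the example. Note that, in code, an array is expressed as a BsonArray.

```csharp
using System;
using MongoDB.Bson;

public static class DocumentSketch
{
    // Builds a document with the same fields as the example document.
    // The _id here is freshly generated, not the value shown in the text.
    public static BsonDocument BuildMember()
    {
        return new BsonDocument
        {
            { "_id", ObjectId.GenerateNewId() },
            { "Age", 43 },
            { "Cars", new BsonArray { "Humber", "Riley" } },
            { "Forename", "Rhys" },
            { "Lastname", "Richards" }
        };
    }

    public static void Main()
    {
        BsonDocument member = BuildMember();

        // A document enumerates as its BsonElements, each a name/value pair
        foreach (BsonElement element in member)
        {
            Console.WriteLine("{0} = {1}", element.Name, element.Value);
        }

        // ToJson() renders the whole document as JSON text
        Console.WriteLine(member.ToJson());
    }
}
```

No database connection is needed to run this; the document only exists in memory until it is inserted into a collection.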

The C# driver.

The driver is used to interface your code to a Mongo database. The driver can serialize data classes to the database without the need for special attributes. All that's required is a unique Id. This is usually of type ObjectId, a 12-byte time-stamped value which is assigned automatically when a document is inserted. You can use a GUID instead, but it needs to be mapped to a string. The reason for this is that a GUID is usually stored as binary data and the driver’s Aggregation Framework has problems digesting binary data. I get the same sort of trouble with cucumber sandwiches.
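
One way the GUID-to-string mapping might be expressed is with the BSON library's serialization attributes. The sketch below assumes the 1.x driver's MongoDB.Bson assembly is referenced; the Visitor class and the GuidIdSketch helper are hypothetical and are not part of the sample code.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

// Hypothetical data class, used only for this illustration
public class Visitor
{
    // Ask the serializer to store the Guid as a string rather than binary data
    [BsonId]
    [BsonRepresentation(BsonType.String)]
    public Guid Id { get; set; }

    public string Name { get; set; }
}

public static class GuidIdSketch
{
    // ToBsonDocument() serializes the object in memory, without a database
    public static BsonDocument Serialize(Visitor visitor)
    {
        return visitor.ToBsonDocument();
    }

    public static void Main()
    {
        var visitor = new Visitor { Id = Guid.NewGuid(), Name = "Rhys" };
        Console.WriteLine(Serialize(visitor).ToJson());
    }
}
```

The printed _id should appear as a quoted string instead of a binary value.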

Connecting to the database.

The first requirement is to have a connection string. If you plan to use the hosted version, you need to sign up to MongoHQ with a username and password, create a database and register yourself as a new user for the database. Make a note of the login string provided; it will look something like this.

const string hostedWebConnectionString = 
   "mongodb://myUserName:myPassword@linus.mongohq.com:myPortNumber/";

The default connection string for the desktop server is simply mongodb://localhost. Here is the code for accessing a database named test.

const string connectionString = "mongodb://localhost";

// Create a MongoClient object by using the connection string
var client = new MongoClient(connectionString);

//Use the MongoClient to access the server
MongoServer server = client.GetServer();

// Use the server to access the 'test' database
MongoDatabase database = server.GetDatabase("test");

These calls will fail on the hosted site if the test database does not exist or you are not a registered user of the database. This is because new databases must be created in admin mode on the hosted web site. These constraints do not apply to the desktop server; it will go ahead and create a new database called ‘test’ if it does not already exist.

Accessing collections.

Documents with a similar structure are arranged as named collections of data in the database. The driver has a Collection object that acts as a proxy for a database’s collection. The following code shows how to access and enumerate a collection, named 'entities', of type ClubMember.

//Builds new Collection if 'entities' is not found
MongoCollection<ClubMember> collection = database.GetCollection<ClubMember>("entities");

Console.WriteLine("List of ClubMembers in collection ...");
MongoCursor<ClubMember> members = collection.FindAll();
foreach (ClubMember clubMember in members)
{
    clubMember.PrintDetailsToScreen();
}

It’s recommended that the foreach method is used wherever possible as it cleans up after itself. Boring housekeeping duties such as calling Connect() and Disconnect() are a thing of the past. You should avoid calling Disconnect() as it closes down the server's connection pool.

Indexes.

MongoDB indexes use a B-tree data structure. A query normally uses only one index, and a query optimiser chooses the most appropriate index for the task. The following code builds an index that sorts data by the Lastname property, then by the Forename property, both A-Z, and finally by the Age property, oldest to youngest.

//Build an index if it is not already built
IndexKeysBuilder keys = IndexKeys.Ascending("Lastname", "Forename").Descending("Age");

//Add an optional name- useful for admin
IndexOptionsBuilder options = IndexOptions.SetName("myIndex");

//This locks the database while the index is being built
collection.EnsureIndex(keys, options);

This index is great for searching on Lastname, on Lastname and Forename, or on Lastname, Forename and Age. It is not useful for sorting on Forename or Age or any combination of the two. The default behaviour is for indexes to be updated when the data is saved as this helps to prevent concurrency problems. But there is still a potential problem if newly written data is immediately read back. The way round this is to ensure that the write and read operations are performed on the same connection by enclosing the operations within the following.

using (server.RequestStart(database))
{
    // Perform the write and the subsequent read here
}

Querying Data Using Linq.

This is done by calling the Collection’s AsQueryable method before writing the Linq statements. Most of the usual methods are available. Here are a few examples.

var names =
    collection.AsQueryable()
        .Where(p => p.Lastname.StartsWith("R") && p.Forename.EndsWith("an"))
        .OrderBy(p => p.Lastname)
        .ThenBy(p => p.Forename)
        .Select(p => new { p.Forename, p.Lastname });

Console.WriteLine("Members where the Lastname starts with 'R' and the Forename ends with 'an'");
foreach (var name in names)
{
    Console.WriteLine(name.Lastname + " " + name.Forename);
}

var regex = new Regex("ar");
Console.WriteLine("List of Lastnames containing the substring 'ar'");
IQueryable<string> regexquery =
    collection.AsQueryable().Where(p => regex.IsMatch(p.Lastname)).Select(p => p.Lastname).Distinct();

foreach (string name in regexquery)
{
    Console.WriteLine(name);
}

Querying Data Using The QueryBuilder Class.

Using the query builder classes is not as exciting as writing Linq, as you don’t get the opportunity to put lots of arrows in your code, but there are still some methods that are worth highlighting.

DateTime membershipDate = DateTime.Now.AddYears(-5);

//DateTime is stored in the BsonElement as a UTC value so it needs to be converted
DateTime membershipDateUTC = membershipDate.ToUniversalTime();

//Query.GT implements a 'greater than' query.
//The parameters are a field name and its Value
MongoCursor<ClubMember> recentMembers =
    collection.Find(Query.GT("MembershipDate", membershipDateUTC));
Console.WriteLine("Members who have joined in the last 5 years ...");
foreach (ClubMember clubMember in recentMembers)
{
    clubMember.PrintDetailsToScreen();
}

There are methods to carry out most of the common sorts of comparisons. The Query.And method does a logical AND on successive Query objects. The next bit of code illustrates this by finding all members called David Jones and then updating the Forename to Dai. The Update.Set() method sets the Forename field to its new value on all documents selected. Finally, Collection.Update performs the update on the server side.

//Change the name of every David Jones to Dai Jones
IMongoQuery davidJonesQuery = Query.And(Query.EQ("Lastname", 
  "Jones"), Query.EQ("Forename", "David"));
UpdateBuilder update = Update.Set("Forename", "Dai");
collection.Update(davidJonesQuery, update, UpdateFlags.Multi);
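
The same query and update can also be written with the driver's typed builders, which take lambda expressions instead of field-name strings, so a misspelt property name is caught by the compiler. This is a sketch only, assuming a 1.x driver recent enough to include Query<T> and Update<T>; the Member class is a cut-down, hypothetical stand-in for ClubMember so that the example compiles on its own.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

// Cut-down stand-in for the ClubMember class (hypothetical)
public class Member
{
    public ObjectId Id { get; set; }
    public string Forename { get; set; }
    public string Lastname { get; set; }
}

public static class TypedQuerySketch
{
    // Same selection as davidJonesQuery, but with compile-time checked names
    public static IMongoQuery DavidJones()
    {
        return Query.And(
            Query<Member>.EQ(m => m.Lastname, "Jones"),
            Query<Member>.EQ(m => m.Forename, "David"));
    }

    // Same update as Update.Set("Forename", "Dai")
    public static IMongoUpdate RenameToDai()
    {
        return Update<Member>.Set(m => m.Forename, "Dai");
    }

    public static void Main()
    {
        // Printing the builders shows the documents that would be sent to the server
        Console.WriteLine(DavidJones().ToJson());
        Console.WriteLine(RenameToDai().ToJson());
    }
}
```

Building and printing the query needs no server; the resulting objects would be passed to Collection.Update in exactly the same way as the string-based versions.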

Querying Data Using Map Reduce.

MapReduce is a heavy-duty method used for batch processing large amounts of data. There are two main parts to it: a map function that emits a value against a key for each document, and a reduce function that reduces the values emitted for each key to a single output. There is an example using MapReduce in the sample code as it may come in handy, but for most users the Aggregation Framework is a better way of collating data.
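
To give a feel for the two parts, here is an in-memory analogue written with Linq. It is not the driver's MapReduce API (on the server the map and reduce functions are written in JavaScript) and it is not the example from the sample code; the class and method names are invented for the illustration. It counts how many times each make of car appears.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MapReduceSketch
{
    // Map: for one member's cars, emit a (key, value) pair per car: (car name, 1)
    public static IEnumerable<KeyValuePair<string, int>> Map(IEnumerable<string> cars)
    {
        return cars.Select(car => new KeyValuePair<string, int>(car, 1));
    }

    // Reduce: collapse all the values emitted for one key into a single value
    public static int Reduce(string key, IEnumerable<int> values)
    {
        return values.Sum();
    }

    public static Dictionary<string, int> Run(IEnumerable<List<string>> carsPerMember)
    {
        return carsPerMember
            .SelectMany(cars => Map(cars))                  // map phase
            .GroupBy(pair => pair.Key, pair => pair.Value)  // gather values by key
            .ToDictionary(g => g.Key, g => Reduce(g.Key, g)); // reduce phase
    }

    public static void Main()
    {
        var data = new[]
        {
            new List<string> { "MG", "Austin" },
            new List<string> { "MG", "Riley" }
        };

        foreach (var entry in Run(data))
        {
            Console.WriteLine("{0}: {1}", entry.Key, entry.Value);
        }
    }
}
```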

Querying Data Using The Aggregation Framework.

The Aggregation Framework is used to collect and collate data from various documents in the database. It’s new in version 2.2 and is an attempt to bring the functionality of SQL to a document database. The aggregation is achieved by passing a collection along a pipeline where various pipeline operations are performed consecutively to produce a result. It’s an oven-ready chicken type of production line: there is less product at the end but it is more fit for purpose. Aggregation is performed by calling the Collection’s Aggregate method with an array of documents that detail various pipeline operations.

Aggregation Example.

In this example there is a document database collection consisting of the members of a vintage car club. Each document is a serialized version of the following ClubMember class.

public class ClubMember
{
    #region Public Properties
    public int Age { get; set; }
    public List<string> Cars { get; set; }
    public string Forename { get; set; }
    public ObjectId Id { get; set; }
    public string Lastname { get; set; }
    public DateTime MembershipDate { get; set; }
    #endregion

    #region Public Methods and Operators

    public void PrintDetailsToScreen()
    {
        Console.WriteLine(String.Format("{0,-12}{1,-10}{2,4}{3,14}",
                         this.Lastname, this.Forename, 
                         this.Age, this.MembershipDate.ToShortDateString()));
    }

    #endregion
}

The ClubMember Class has an array named Cars that holds the names of the vintage cars owned by the member. The aim of the aggregation is to produce a list of owners who have joined in the last five years for each type of car in the collection.

Step 1 Match Operation.

The match operation selects only the members that have joined in the last five years. Here's the code.

var utcTime5yearsago = DateTime.Now.AddYears(-5).ToUniversalTime();

var matchMembershipDateOperation = new BsonDocument
{
 { "$match", new BsonDocument {  { "MembershipDate", 
     new BsonDocument { { "$gte",utcTime5yearsago  } } } } }
};

As you can see, the code ends up with more braces than an orthodontist, but at least IntelliSense assists when you are writing it. The keyword $gte indicates a greater than or equal query.

Step 2 Unwind Operation.

Unwind operations modify documents that contain a specified Array. For each element within the array a document identical to the original is created. The value of the array field is then changed to be equal to that of the single element. So a document with the following structure

_id:700, Lastname: "Evans", Cars: ["MG", "Austin", "Humber"]

becomes 3 documents

_id:700, Lastname: "Evans", Cars: "MG"
_id:700, Lastname: "Evans", Cars: "Austin"
_id:700, Lastname: "Evans", Cars: "Humber"

If there are two or more identical elements, say Evans has two MGs, then there will be duplicate documents produced. Unwinding an array makes its members accessible to other aggregation operations.

var unwindCarsOperation = new BsonDocument { { "$unwind", "$Cars" } };
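
For readers who think in Linq, the effect of an unwind is roughly that of SelectMany. The sketch below reproduces the Evans example in memory; it is an analogy rather than anything the driver does, and the Owner class and Unwind method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnwindSketch
{
    // Hypothetical in-memory stand-in for a stored document
    public class Owner
    {
        public int Id { get; set; }
        public string Lastname { get; set; }
        public List<string> Cars { get; set; }
    }

    // One output row per array element; the other fields are copied unchanged
    public static List<Tuple<int, string, string>> Unwind(IEnumerable<Owner> owners)
    {
        return owners
            .SelectMany(o => o.Cars.Select(car => Tuple.Create(o.Id, o.Lastname, car)))
            .ToList();
    }

    public static void Main()
    {
        var evans = new Owner
        {
            Id = 700,
            Lastname = "Evans",
            Cars = new List<string> { "MG", "Austin", "Humber" }
        };

        foreach (var row in Unwind(new[] { evans }))
        {
            Console.WriteLine("_id:{0}, Lastname:{1}, Cars:{2}", row.Item1, row.Item2, row.Item3);
        }
    }
}
```

Running it prints three rows, one for each of the cars, each carrying the same _id and Lastname.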

Step 3 Group Operation.

Define an operation to group the documents by car type. Each consecutive operation does not act on the original documents but on the documents produced by the previous operation. The only fields available are those present as a result of the previous pipeline operation. You cannot go back and pinch a field from the original documents. The $ sign is used in two ways. Firstly, to indicate a keyword and, secondly, to differentiate field names from field values. For example, Age is a field name, $Age is the value of the Age field.

var groupByCarTypeOperation = new BsonDocument
{
    {
        //Sort the documents into groups
        "$group",
        new BsonDocument
        {
            //Make the unique identifier for the group a BSON element consisting
            // of a field named Car.
            // Set its value to that of the Cars field
            // The Cars field is no longer an array because it has now been unwound
            { "_id", new BsonDocument { { "Car", "$Cars" } } },
            {
                //Add a field named Owners
                "Owners",
                new BsonDocument
                {
                    {
                        //add a value to the Owners field if it does not
                        //already contain an  identical value.
                        //This makes the field Value an array
                        "$addToSet",
                        //The value to add is a BsonDocument with an identical structure to
                        // a serialized ClubMember class.
                        new BsonDocument
                        {
                            { "_id", "$_id" },
                            { "Lastname", "$Lastname" },
                            { "Forename", "$Forename" },
                            { "Age", "$Age" },
                            { "MembershipDate", "$MembershipDate" }
                        }
                    }
                }
            }
        }
    }
};

Step 4 Project Operation.

The _id field resulting from the previous operation is a BsonElement consisting of both the field name and its Value. It would be better to drop the field name and just use the Value. The following Project operation does that.

var projectMakeOfCarOperation = new BsonDocument
{
    {
        "$project", new BsonDocument
        {
            // drop the _id field. A 0 as used here means drop
            { "_id", 0 },
            //Add a new field. Make its Value equal to the value of the _id's Car field
            { "MakeOfCar", "$_id.Car" },
            //Keep the Owners field. A 1 as used here means keep
            { "Owners", 1 }
        }
     }
};

Step 5 Sort Operation.

Define an operation to Sort the documents by car type.

var sortCarsOperation = new BsonDocument { { "$sort", new BsonDocument { { "MakeOfCar", 1 } } } };

The number 1 means perform an ascending sort. A -1 is used to indicate a descending sort.

Step 6 Run the Aggregation and output the result.

AggregateResult result = collection.Aggregate(
  matchMembershipDateOperation,
  unwindCarsOperation,
  groupByCarTypeOperation,
  projectMakeOfCarOperation,
  sortCarsOperation);

The AggregateResult class returned has a bool field named Ok. It is set to true if there were no errors. The resulting documents are returned in the AggregateResult.ResultDocuments collection. The easiest way to deserialize the collection is to call its Select method passing in the Deserialize method of the BsonSerializer as follows.

public class CarStat
{
    #region Public Properties

    public string MakeOfCar { get; set; }

    public BsonDocument[] Owners { get; set; }

    #endregion
}

IEnumerable<CarStat> carStats = 
  result.ResultDocuments.Select(BsonSerializer.Deserialize<CarStat>);
foreach (CarStat stat in carStats)
{
    Console.WriteLine("\r\nCar Marque : {0}\r\n", stat.MakeOfCar);
    IEnumerable<ClubMember> clubMembers =
        stat.Owners.AsEnumerable().Select(BsonSerializer.Deserialize<ClubMember>).OrderBy(p => p.Lastname).
            ThenBy(p => p.Forename).ThenBy(p => p.Age).Select(p => p);

    foreach (ClubMember member in clubMembers)
    {
        member.PrintDetailsToScreen();
    }
}

The sample application has an aggregation example that performs various calculations on the data set such as Count, Min, Max and Total.
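
A group operation of that kind might look something like the following. This is not the code from the sample application, just a sketch of how the $sum, $min, $max and $avg accumulators could be combined in one pipeline stage; the output field names (Count, MinAge and so on) are invented. It assumes the 1.x driver's MongoDB.Bson assembly is referenced.

```csharp
using System;
using MongoDB.Bson;

public static class GroupStatsSketch
{
    public static BsonDocument BuildGroupStage()
    {
        return new BsonDocument
        {
            {
                "$group", new BsonDocument
                {
                    // A constant _id puts every incoming document into one group
                    { "_id", BsonNull.Value },
                    // Adding 1 per document gives a count
                    { "Count", new BsonDocument { { "$sum", 1 } } },
                    { "MinAge", new BsonDocument { { "$min", "$Age" } } },
                    { "MaxAge", new BsonDocument { { "$max", "$Age" } } },
                    // Adding the field's value per document gives a total
                    { "TotalAge", new BsonDocument { { "$sum", "$Age" } } },
                    { "AverageAge", new BsonDocument { { "$avg", "$Age" } } }
                }
            }
        };
    }

    public static void Main()
    {
        // Print the stage as JSON; it would be passed to collection.Aggregate(...)
        Console.WriteLine(BuildGroupStage().ToJson());
    }
}
```

Replacing the constant _id with a field reference such as "$Lastname" would produce one set of figures per distinct value instead of a single overall result.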

GridFS.

GridFS is a means of storing and retrieving files that exceed the BsonDocument size limit of 16 MB. Instead of storing a file in a single document, GridFS divides a file into chunks and stores each of the chunks as a separate document. GridFS uses two collections to store files. One collection stores the file chunks and the other stores the file’s metadata. The default chunk size is 256 KB. The idea here is that smaller chunks of data can be stored more efficiently and consume less memory when being processed than large files. It’s generally not a good idea to store binary data in the main document as it takes up space that is best used by more meaningful data. Uploading data into GridFS is straightforward. Here are a couple of examples.

const string fullyQualifiedUpLoadName = @"C:\temp\mars.png";

//Here the uploaded file is given the name 'C:\temp\mars.png'
MongoGridFSFileInfo gridFsInfo = database.GridFS.Upload(fullyQualifiedUpLoadName);

//Here the uploaded file is given the name 'mars.png'
using (var fs = new FileStream(fullyQualifiedUpLoadName, FileMode.Open))
{
    gridFsInfo = database.GridFS.Upload(fs, "mars.png");
}
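
As a back-of-the-envelope check on how many chunk documents an upload produces, the arithmetic is a simple ceiling division. The helper below is hypothetical and does not call GridFS; it assumes a chunk size of 256 KB (262,144 bytes).

```csharp
using System;

public static class GridFsChunkSketch
{
    // 256 KB, the chunk size assumed for this illustration
    public const int DefaultChunkSize = 256 * 1024;

    public static long ChunkCount(long fileLength, int chunkSize)
    {
        if (fileLength < 0) throw new ArgumentOutOfRangeException("fileLength");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        // Ceiling division: a partly filled final chunk still needs its own document
        return (fileLength + chunkSize - 1) / chunkSize;
    }

    public static void Main()
    {
        long length = 1000000; // a hypothetical 1,000,000-byte image
        Console.WriteLine(ChunkCount(length, DefaultChunkSize)); // prints 4
    }
}
```

Three full chunks hold 786,432 bytes, so the remaining 213,568 bytes go into a fourth, partly filled chunk.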

The GridFS.Upload method returns an object of type MongoGridFSFileInfo. This contains the file’s metadata. Only basic details such as the file’s name and length are included by default but the metadata can be customised to facilitate searching. Here's how.

BsonDocument photoMetadata = new BsonDocument
{
    { "Category", "Astronomy" },
    { "SubGroup", "Planet" },
    { "ImageWidth", 640 },
    { "ImageHeight", 480 }
};
database.GridFS.SetMetadata(gridFsInfo, photoMetadata);
//Get the collection of metadata
var coll = database.GetCollection("fs.files");

//Build an index using the customised metadata fields.
IndexKeysBuilder keys = IndexKeys.Ascending("metadata.Category", "metadata.SubGroup");
coll.EnsureIndex(keys);

//Find files using the indexed metadata fields
var astronomyPics= database.GridFS.Find(Query.EQ("metadata.Category", "Astronomy"));

//use the GridFSFileInfo object to download the file
const string fullyQualifiedDownLoadName = @"C:\temp\mars2.png";
database.GridFS.Download(fullyQualifiedDownLoadName, gridFsInfo);

//Delete all files in the Astronomy category      
database.GridFS.Delete(Query.EQ("metadata.Category", "Astronomy"));

MongoDB Replica Sets.

A replica set is a cluster of MongoDB instances that replicate amongst one another so that they all store the same data. One server is the primary and receives all the writes from clients. The others are secondary members and replicate from the primary asynchronously. The clever bit is that, when a primary goes down, one of the secondary members takes over and becomes the new primary. This takes place totally transparently to the users and ensures continuity of service. Replica sets have other advantages: it is easy to back up the data, and databases with a lot of read requests can reduce the load on the primary by reading from a secondary. You cannot rely on any one instance being the primary as the primary is determined by the members of the replica set at run time.

Installing A Replica set as a Windows Service.

This example installs a replica set consisting of one primary and two secondary instances. The instances will be named MongoDB0, MongoDB1 and MongoDB2. They will use IP address localhost and listen on ports 27017, 27018 and 27019 respectively. The replica set name is myReplSet.

Step 1 Housekeeping tasks.

In the mongodb folder add three new folders named rsDataDb0, rsDataDb1, rsDataDb2. These are the data folders. Remove any instance of mongo that may be already running. In this example the service name to be removed is MongoDB. Open a command prompt in administrator mode, navigate to where mongod.exe is installed and enter:

mongod.exe --serviceName MongoDB  --remove

Step 2 Install three new service instances.

The best way to do this is to have three configuration files, one for each instance. The format of these files is very similar. Here is the config file for MongoDB0. The hash sign comments out a line.

#Use this to direct output to a log file instead of the console
#*******************************
logpath=C:\mongodb\log\rsDb0.log
#********************************
logappend = true
journal = true
quiet = true
#Enable this if you wish to use the user interface situated at 1000 ports above the server port
rest=true
#
# The port number the mongod server will listen on
# change port for each server instance
#**************************************
port=27017
#****************************************
# Listen on a specific ip address
# This is needed if running multiple servers. Comment out to access mongod remotely.
bind_ip=127.0.0.1
# This sets the database path, change the database path for each server instance
#**********************************************
dbpath=C:/mongodb/rsDataDb0
#*****************************************
# Keep the same replica set name for all servers in the set
replSet=myReplSet

The config files are included in the sample code bundle, but, basically, you change the port, dbpath and logpath for each instance. Store the config files in the bin directory and enter the following commands

C:\mongodb\bin\mongod.exe --config C:\mongodb\bin\repSetDb0.cfg --serviceName MongoDB0 --serviceDisplayName MongoDB0 --install
C:\mongodb\bin\mongod.exe --config C:\mongodb\bin\repSetDb1.cfg --serviceName MongoDB1 --serviceDisplayName MongoDB1 --install
C:\mongodb\bin\mongod.exe --config C:\mongodb\bin\repSetDb2.cfg --serviceName MongoDB2 --serviceDisplayName MongoDB2 --install

Check the log files to confirm all is well and enter the following commands to start the services.

net start MongoDB0
net start MongoDB1
net start MongoDB2

Step 3 Configure the Replica Set.

To configure the replica set you need to use the Mongo shell. Make sure you are in the \mongodb\bin directory and enter the command

mongo --port 27017

The shell will connect to the MongoDB0 instance, which is listening on port 27017. Now initialise a variable called config by entering the following:

config = { _id : "myReplSet", members : [ {_id : 0, host : "localhost:27017"},
  {_id : 1, host : "localhost:27018"}, {_id : 2, host : "localhost:27019"} ] }

Pass this variable to the rs.initiate() method by entering the command

rs.initiate(config)

You now have time to put the kettle on while Mongo takes your hard drive for a spin. When the method returns you are ready to go. You can find out the status of your replica set by entering rs.status() in the mongo shell. To connect to the Replica set with the C# driver use this connection string.

const string connectionString = 
  "mongodb://localhost/?replicaSet=myReplSet&readPreference=primary";

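A connection string can also list every member as a seed and ask for reads to go to a secondary when one is available. The sketch below only parses such a string with the driver's MongoUrl class, so no server needs to be running. It assumes a 1.x driver recent enough to understand the readPreference option; the class name ReplicaSetUrlSketch is made up.

```csharp
using System;
using System.Linq;
using MongoDB.Driver;

public static class ReplicaSetUrlSketch
{
    // All three members from the example are listed as seeds
    public const string ConnectionString =
        "mongodb://localhost:27017,localhost:27018,localhost:27019/" +
        "?replicaSet=myReplSet&readPreference=secondaryPreferred";

    public static MongoUrl Parse()
    {
        // MongoUrl only parses the string; no connection is opened
        return new MongoUrl(ConnectionString);
    }

    public static void Main()
    {
        MongoUrl url = Parse();
        Console.WriteLine("Replica set: {0}", url.ReplicaSetName);
        Console.WriteLine("Seed list size: {0}", url.Servers.Count());
        Console.WriteLine("Read preference: {0}", url.ReadPreference);
    }
}
```

Listing more than one seed means the driver still has somewhere to start if the first address happens to be down when the application connects.
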
Conclusion

There is much more to MongoDB than is detailed in this article, but the hope is that there is enough information here for you to be able to begin exploring the capabilities of this open-source software. Finally, I’d like to express my gratitude to the many developers who have worked tirelessly on the MongoDB project with little prospect of reward other than the satisfaction of having helped others. Thanks very much; I take my hat off to you.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

George Swan
Student
Wales
Article Copyright 2013 by George Swan