Click here to Skip to main content
Click here to Skip to main content

Best Practice in File Storage while Building Applications - Database (Blob Storage) Vs File System

By , 7 Aug 2008
 

Introduction

This article discusses the pros and cons of data storage for files v/s blobs in databases while designing data store application systems.

Background

For most of the web applications, this remains a common design question “whether the images should be stored in the database or in a file system?” In the web based apps like shopping cart, the site has to deal with a number of product images and the best choice is to store the images in the database together with the meta data. However, having said that, there are pros and cons for both the approaches.

While designing any data store applications, Sys Architect should take the right decision on this by taking into account the behavior of the proposed systems in a broader perspective.

As everyone knows, the IO operation is a bit faster and hence the file retrieval is when we use the file servers for storage whereas the database as the document or file store will deliver a central repository of data for the organizations and hence data management is more efficient and hence more under control. A database may follow good entity modelling and have a logical design, but if the application and users cannot access and manipulate the BLOB data stored in it efficiently, the database is not effective.

BLOBs require database designers to understand how the application will access and use the BLOBs in practice. Spend time designing how the BLOB objects will be represented in the database. How you lay out the large objects within the database affects both the response time and throughput of the application.

Advantages and Disadvantages

BLOB Storage as the Best Solution

  1. For better scalability. Although file systems are designed to handle a large number of objects of varying sizes, say files and folders, actually they are not optimized for a huge number (tens of millions) of small files. Database systems are optimized for such scenarios.
  2. For better availability. Database servers have availability features that extend beyond those provided by the file system. Database replication is a set of solutions that allow you to copy, distribute, and potentially modify data in a distributed environment whereas Log shipping provides a way of keeping a stand-by copy of a database in case the primary system fails.
  3. For central repository of data with controlled growth. DBA has the privilege to control and monitor the growth of database and split the database as and when needed.
  4. For full-text index and search operations. You can index and search certain types of data stored in BLOB columns. When a database designer decides that a table will contain a BLOB column and the column will participate in a full-text index, the designer must create, in the same table, a separate character-based data column that will hold the file extension of the file in the corresponding BLOB field. During the full-text indexing operation, the full-text service looks at the extensions listed in the character-based column (.txt, .doc, .xls, etc.), applies the corresponding filter to interpret the binary data, and extracts the textual information needed for indexing and querying.

File System Storage as the Best Solution

  1. For the application in which the images will be used requires streaming performance, such as real-time video playback.
  2. For applications such as Microsoft PhotoDraw® or Adobe PhotoShop, which only know how to access files.
  3. If you want to use some specific feature in the NTFS file system such as Remote Storage.

History

  • 7th August, 2008: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Ziyad Mohammad
Architect
United States United States
Systems Architect & Analyst with concentration in various IT technologies, especially on Microsoft platform. He is an active member of Sharepoint Community and areas of expertise are ASP.Net, SharePoint, WinForms and Workflow technologies. You can visit his blog
http://www.dotnetsoldier.blogspot.com/

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralAt least reference where you copied this frommembercmk16-Nov-08 11:49 
http://www.microsoft.com/technet/prodtechnol/sql/2000/reskit/part3/c1161.mspx?mfr=true[^]
 
What have you added that i don't get from the above (in particular the 'Database or File System' section) ?
 
...cmk
 
The idea that I can be presented with a problem, set out to logically solve it with the tools at hand, and wind up with a program that could not be legally used because someone else followed the same logical steps some years ago and filed for a patent on it is horrifying.
- John Carmack

GeneralProblem with BLOBsmemberGevorg8-Aug-08 16:41 
The biggest problem with BLOBs is backup.
Right now you can only backup database (in MS SQL), and if you have 10 thousand images in there as BLOB everytime you do a full backup (you suppose to do it everyday) your back up file will be huge....
 
With file system you can back up only new/modified files.
 
George.
 
My site - Body Jewelry
Not my site - Piercing info

GeneralRe: Problem with BLOBsmemberZiyad Mohammad9-Aug-08 18:25 
You can go for an incremental back up model with no issue and SharePoint is dealing with thousands and millions of document with all those documents stored as blobs in SQL database. The same is true for any other content management system where the database size would be in GiGs.
 
I dont see any risk!...But it is always better to have an archival policy in place if you go for database storage of blobs. And as said, after all, both approaches has its own advantages and disadvantages.. Frown | :(
 

GeneralRe: Problem with BLOBsmemberchrwal28-Jul-09 13:23 
I think you can better use a combination of the two. I once did some work for an image delivery company and what we did was maintain a server that hosted the images on the file system while the metadata and the image's href was stored on a database server. This worked quite well and was extremely efficient. Sometime its better to think outside of the box and find an answer somewhere in the middle rather than create false either-or extremes.
GeneralAnother advantage for storing in DB...memberstavinski7-Aug-08 4:56 
would be the out of the box transactional support provided by a DB, so that you aren't in the position where the details saved to the DB correctly however the image/document wasn't saved to the filesystem for some particular reason.
 
Also the maintenance is reduced because all the data is stored in 1 central repository rather than scattered
GeneralRe: Another advantage for storing in DB...memberRay Cassick7-Aug-08 7:24 
I would think you could mimic the transactional part by just wrapping the file system write inside the DB transaction begin\commit code.
 
IO agree on the maintenance part, but I wonder how blobs effect performance and if the impact on that would outweigh the maintenance issue.
 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web01 | 2.6.130619.1 | Last Updated 7 Aug 2008
Article Copyright 2008 by Ziyad Mohammad
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid