
Reading Image Headers to Get Width and Height

28 Apr 2009
Looks at techniques for getting an image's width and height quickly

Introduction

I had a requirement to cache the orientation of JPEGs within a set of folders. The easiest way is just to load each image and work out whether it is landscape or portrait based on the width and height, possibly like this:

C#
public bool IsLandscape(string path)
{
  using (Bitmap b = new Bitmap(path))
  {
    return b.Width > b.Height;
  }
}

This is fine when there are only a few images, but it is incredibly slow for a large set, as the framework has to load the image into GDI+ and then marshal it over to .NET.

Improvement One - Multi-Threading

Performance could be improved by creating a queue of image paths and then loading them on multiple threads, potentially using the ThreadPool bound to the number of physical cores on the machine.

C#
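// imagesRemaining must be set to the number of paths before any work is queued
foreach (string path in paths)
{
  // add each path as a task in the thread pool
  ThreadPool.QueueUserWorkItem(new WaitCallback(ThreadCallback), path);
}

// wait until all images have been loaded; checking the counter inside the lock
// avoids waiting forever if the last callback has already pulsed
lock (this.Landscapes)
{
  while (imagesRemaining > 0)
  {
    Monitor.Wait(this.Landscapes);
  }
}

C#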
private void ThreadCallback(object stateInfo)
{
  string path = (string)stateInfo;
  bool isLandscape = this.IsLandscape(path);
           
  lock (this.Landscapes)
  {
    if (isLandscape)
    {
      this.Landscapes.Add(path);
    }

    imagesRemaining--;

    if (imagesRemaining == 0)
    {
      // all images loaded, signal the main thread
      Monitor.Pulse(this.Landscapes);
    }
  }
}

OK, so performance is improved; however, new issues arise:

  1. The main performance bottleneck is I/O bound: loading an image from disk and converting it to a usable bitmap takes far longer than getting the size once it is in memory.
  2. Bitmaps require a lot of memory; we can be talking about upwards of ten megabytes, depending on the total number of pixels.
  3. Most computers only have a couple of cores, so threading is of limited benefit.

Improvement Two – Reading the Headers

It occurred to me that a number of applications read width and height information remarkably quickly, too fast to have read the whole file; it turns out that image files have headers which contain the width and height – bingo.
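To make the idea concrete, here is a minimal sketch for PNG, the simplest of these formats: the file starts with an 8-byte signature, followed by the IHDR chunk whose first two fields are the width and height as 4-byte big-endian integers. The class and method names (PngHeaderSketch, ReadPngSize) are made up for illustration and are not taken from the forum post mentioned below.

```csharp
using System;
using System.IO;

public static class PngHeaderSketch
{
    // The 8-byte signature that starts every PNG file.
    private static readonly byte[] PngSignature =
        { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    // Reads a 4-byte big-endian unsigned integer, the byte order PNG uses.
    private static uint ReadBigEndianUInt32(BinaryReader reader)
    {
        byte[] b = reader.ReadBytes(4);
        if (b.Length < 4)
        {
            throw new ArgumentException("Unexpected end of PNG data.");
        }
        return ((uint)b[0] << 24) | ((uint)b[1] << 16) | ((uint)b[2] << 8) | b[3];
    }

    // Reads width and height from the IHDR chunk without decoding any pixels.
    public static void ReadPngSize(Stream stream, out int width, out int height)
    {
        BinaryReader reader = new BinaryReader(stream);

        byte[] signature = reader.ReadBytes(PngSignature.Length);
        for (int i = 0; i < PngSignature.Length; i++)
        {
            if (signature.Length != PngSignature.Length || signature[i] != PngSignature[i])
            {
                throw new ArgumentException("Not a PNG file.");
            }
        }

        ReadBigEndianUInt32(reader);          // IHDR data length, not needed here
        byte[] type = reader.ReadBytes(4);    // chunk type, must be "IHDR"
        if (type.Length != 4 || type[0] != 'I' || type[1] != 'H' || type[2] != 'D' || type[3] != 'R')
        {
            throw new ArgumentException("IHDR chunk not found.");
        }

        width = (int)ReadBigEndianUInt32(reader);
        height = (int)ReadBigEndianUInt32(reader);
    }
}
```

Only the first 24 bytes of the file are touched, which is why this kind of read is so much cheaper than constructing a Bitmap.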

After some searching, I came across a buried forum post which gives a great example of how to read not only JPEG headers but also GIF, PNG and BMP.

The post is great, although I found it couldn't read all JPEG file headers for some reason. First, I modified DecodeJfif so that the chunk length could be an unsigned 16-bit integer (ushort in C#):

C#
private static Size DecodeJfif(BinaryReader binaryReader)
{
  // each JPEG segment starts with 0xff followed by a marker byte
  while (binaryReader.ReadByte() == 0xff)
  {
    byte marker = binaryReader.ReadByte();

    // JPEG stores 16-bit values big-endian, so the helper (not shown) must
    // return them in that order; the cast keeps lengths above 32767 positive
    ushort chunkLength = (ushort)ReadLittleEndianInt16(binaryReader);

    if (marker == 0xc0)
    {
      // baseline start-of-frame: skip the precision byte, then height and width
      binaryReader.ReadByte();
      int height = (ushort)ReadLittleEndianInt16(binaryReader);
      int width = (ushort)ReadLittleEndianInt16(binaryReader);
      return new Size(width, height);
    }

    // the length includes its own two bytes, so skip the rest of the segment
    binaryReader.ReadBytes(chunkLength - 2);
  }

  throw new ArgumentException(errorMessage);
}
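The reason the unsigned type matters: a JPEG segment length is a two-byte, big-endian value, so any length above 32,767 turns negative when held in a short. The sketch below illustrates this with a hypothetical helper, ReadBigEndianUInt16, which is not the helper from the forum post.

```csharp
using System;
using System.IO;

public static class JpegLengthSketch
{
    // Reads two bytes, most significant first, as an unsigned 16-bit value.
    public static ushort ReadBigEndianUInt16(BinaryReader reader)
    {
        int high = reader.ReadByte();
        int low = reader.ReadByte();
        return (ushort)((high << 8) | low);
    }

    // Reinterprets the same 16 bits as a signed value, which is what
    // happens when the length is stored in a short.
    public static short AsSigned(ushort value)
    {
        return unchecked((short)value);
    }
}
```

For the length bytes 0xC3 0x50, ReadBigEndianUInt16 gives 50000, while AsSigned(50000) gives -15536; passing the latter minus 2 to ReadBytes would throw instead of skipping the segment.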

Secondly, I added a try/catch block around getting the dimensions so that if the header isn't present, it falls back to the slow way:

C#
public static Size GetDimensions(string path)
{
  try
  {
    using (BinaryReader binaryReader = new BinaryReader(File.OpenRead(path)))
    {
      try
      {
        return GetDimensions(binaryReader);
      }
      catch (ArgumentException e)
      {
        string newMessage = string.Format("{0} file: '{1}' ", errorMessage, path);

        throw new ArgumentException(newMessage, "path", e);
      }
    }
  }
  catch (ArgumentException)
  {
    // the header could not be decoded, so do it the old-fashioned way
    // and load the whole bitmap

    using (Bitmap b = new Bitmap(path))
    {
      return b.Size;
    }
  }
}

Reading just the headers produced such a massive performance improvement that I removed the multi-threading and just used one thread to process each image sequentially.

Putting It All Together

To further increase performance, I created an XML cache file holding the width, height and last-modified date of each image, so that only images that had changed would have their headers checked. I didn't want the XML file to be saved every time an image was cached, as that would be a new bottleneck, so I added a timer which saves the data to XML 5 seconds after the save method is called. I used LINQ to XML to save the list of ImageFileAttributes to disk:

C#
class ImageListToXml
{
  private const string XmlRoot = "Cache";
  private const string XmlImagePath = "ImagePath";
  private const string XmlWidth = "Width";
  private const string XmlHeight = "Height";
  private const string XmlImageCached = "ImageCached";
  private const string XmlLastModified = "LastModified";

  public static void LoadFromXml(string filePath, ImageList list)
  {
    list.Clear();
    XDocument xdoc = XDocument.Load(filePath);
    list.AddRange(
      from d in xdoc.Root.Elements()
      select new ImageFileAttributes(
        (string)d.Attribute(XmlImagePath),
        new Size(
          (int)d.Attribute(XmlWidth),
          (int)d.Attribute(XmlHeight)),
        (DateTime)d.Attribute(XmlLastModified)));
  }

  public static void SaveAsXml(string filePath, ImageList list)
  {
    XElement xml = new XElement(XmlRoot,
      from d in list
      select new XElement(XmlImageCached,
        new XAttribute(XmlImagePath, d.Path),
        new XAttribute(XmlWidth, d.Size.Width),
        new XAttribute(XmlHeight, d.Size.Height),
        new XAttribute(XmlLastModified, d.LastModified ?? DateTime.MinValue)));

    xml.Save(filePath);
  }
}
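The delayed save mentioned above is not shown in the article. One way it might be done is sketched here with System.Threading.Timer; the class name DelayedSaver is hypothetical, and restarting the countdown on every request is an assumption about the intended behaviour rather than something the article states.

```csharp
using System;
using System.Threading;

public sealed class DelayedSaver : IDisposable
{
    private readonly Timer timer;
    private readonly int delayMilliseconds;

    public DelayedSaver(Action save, int delayMilliseconds)
    {
        if (save == null)
        {
            throw new ArgumentNullException("save");
        }

        this.delayMilliseconds = delayMilliseconds;

        // Created disabled: both due time and period are Timeout.Infinite.
        this.timer = new Timer(delegate { save(); }, null, Timeout.Infinite, Timeout.Infinite);
    }

    // Restarts the countdown, so a burst of requests results in a single
    // save once the requests have stopped for the configured delay.
    public void RequestSave()
    {
        this.timer.Change(this.delayMilliseconds, Timeout.Infinite);
    }

    public void Dispose()
    {
        this.timer.Dispose();
    }
}
```

It could be wired up with something like new DelayedSaver(() => ImageListToXml.SaveAsXml(cachePath, list), 5000), where cachePath and list stand for whatever the caller holds. The callback runs on a thread pool thread, so the list would need to be protected against concurrent modification while it is being written.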

Results

Reading the headers produces a tremendous performance improvement, and caching the dimensions read from the images takes the process from seconds to milliseconds.

Performance improvements for 563 images, from 198257ms to 69ms.

This console output gives an indication of the orders of magnitude that can be gained; we are talking about improving the total time taken from more than 3 minutes to less than one-tenth of a second.

History

  • Version 1.0 - Initial release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior) Play.com
United Kingdom
Hi, my name's Andy Wilson and I live in Cambridge, UK where I work as a Senior C# Software Developer.

Comments and Discussions

 
Question: Good article BUT I have a problem with your solution.
TJTex30-Mar-15 17:42
General: My vote of 5
Md. Marufuzzaman14-Mar-12 23:29
Question: Very helpful!
GTatham25-Jun-11 4:20
Question: Brilliant article
borgy337723-Jun-11 21:00
General: My vote of 5
toddsecond13-Feb-11 1:38
General: Support for JPEGs with progressive encoding (etc.)
Uli Hutzler25-Jan-10 4:28
General: Translation to .Net 2.0
Ankit Rajpoot4-Jun-09 13:16
Answer: Re: Translation to .Net 2.0
andywilsonuk4-Jun-09 22:43
Hi Ankit,

Thanks for the feedback, I'm glad you found the article useful!

I see what you mean about converting to .NET 2.0. My original idea was to just sort the Keys but of course you can't, and in .NET 2.0 you can't convert the Keys to a List and sort that either. D'oh! I think this should do what you want, however:

C#
int maxMagicBytesLength = 0;

foreach (byte[] bytes in imageFormatDecoders.Keys)
{
    if (bytes.Length > maxMagicBytesLength)
    {
        maxMagicBytesLength = bytes.Length;
    }
}


The LINQ to Objects extensions really do make for more compact code (9 lines to 1).

Good luck with your project.

- devwilson

General: Re: Translation to .Net 2.0
Ankit Rajpoot5-Jun-09 1:52
General: Re: Translation to .Net 2.0
eonic10-Jun-11 19:59
Answer: Re: Translation to .Net 2.0
headkaze15-Aug-12 18:30
General: Modifitication? [modified]
RussClarke7-May-09 6:02
General: Re: Modifitication?
andywilsonuk8-May-09 1:32
General: Using BitmapDecoder
gordonwatts5-May-09 17:30
General: Re: Using BitmapDecoder
andywilsonuk6-May-09 0:39
General: Re: Using BitmapDecoder
gordonwatts10-May-09 12:32
