Introduction
Recently, my boss asked me to come up with an app that can read the info tags buried inside JPEG files. Knowing nothing about metadata standards at the time, I embarked on a bumpy search for information on the subject. Unfortunately, because I did not know the acronym IPTC (International Press Telecommunications Council), I failed to find an excellent CodeProject article about it by Christian Tratz, which I only discovered when I tried to post my own work on the subject.
To cut a long story short, it took me quite some time, spent analyzing and reverse engineering cryptic fragments of PHP samples, to come up with this simple C# class. It parses a JPEG file and extracts tags from its Photoshop 3.0 section, which is stored in the APP13 marker segment (marker bytes 0xFF 0xED). I strongly recommend reading up on the theory behind metadata in JPEG files first.
The JPEGMetaData class contains a constructor that takes the path of the JPEG file on disk. It stores the section and tag headers in a separate hash table (PS3Tags) for clarity. The APP13 section opens with the marker bytes 0xFF 0xED and should contain the zero-terminated string “Photoshop 3.0”. Within the section, various tags may exist, depending on whether the author of the image, or whoever last edited it in an app such as Photoshop or Photo Mechanic, populated any of the available metadata fields. If one of the sought fields is not found in the metadata, an appropriate message is returned to the user.
public JPEGMetaData(string FileName)
{
    // Marker bytes and identifier of the Photoshop 3.0 section.
    PS3Tags.Add("PS3SectionHeader", "\u00FF\u00ED");
    PS3Tags.Add("PS3SectionIDTag", "Photoshop 3.0\u0000");
    // IPTC tag headers: 0x1C, record 2, then the tag number (5, 105, 120).
    PS3Tags.Add("PS3SectionObjNameTag", "\u001C\u0002\u0005");
    PS3Tags.Add("PS3SectionHeadlineTag", "\u001C\u0002\u0069");
    PS3Tags.Add("PS3SectionCaptionTag", "\u001C\u0002\u0078");

    JPEGContentBuffer = LoadJPEG(FileName);
    PS3SectionContentBuffer =
        ExtractPS3ContentSection(PS3Tags["PS3SectionHeader"],
                                 PS3Tags["PS3SectionIDTag"]);

    PS3TagContents.Add("PS3SectionObjNameTag",
        ExtractTag(PS3Tags["PS3SectionObjNameTag"].ToString()));
    PS3TagContents.Add("PS3SectionHeadlineTag",
        ExtractTag(PS3Tags["PS3SectionHeadlineTag"].ToString()));
    PS3TagContents.Add("PS3SectionCaptionTag",
        ExtractTag(PS3Tags["PS3SectionCaptionTag"].ToString()));
}
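The ExtractPS3ContentSection method called by the constructor is not listed here. As a rough illustration of what locating the section involves, the following sketch walks the JPEG segments on the raw bytes rather than on a string. The names App13Sketch and FindPhotoshopSegment are hypothetical and are not part of the attached source; the sketch also assumes a well-formed file with no padding bytes between segments.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the article's class.
public static class App13Sketch
{
    // ASCII bytes of "Photoshop 3.0" followed by a zero terminator.
    private static readonly byte[] Identifier =
        Encoding.ASCII.GetBytes("Photoshop 3.0\0");

    // Returns the bytes that follow the identifier in the first matching
    // APP13 segment, or null when no such segment is found.
    public static byte[] FindPhotoshopSegment(byte[] jpeg)
    {
        // A JPEG file starts with the SOI marker FF D8.
        if (jpeg.Length < 2 || jpeg[0] != 0xFF || jpeg[1] != 0xD8) return null;

        int pos = 2;
        while (pos + 4 <= jpeg.Length && jpeg[pos] == 0xFF)
        {
            byte marker = jpeg[pos + 1];
            // The segment length is big-endian and counts its own two bytes.
            int length = (jpeg[pos + 2] << 8) | jpeg[pos + 3];
            if (length < 2 || pos + 2 + length > jpeg.Length) return null;

            if (marker == 0xED
                && length - 2 >= Identifier.Length
                && StartsWith(jpeg, pos + 4, Identifier))
            {
                int start = pos + 4 + Identifier.Length;
                int count = length - 2 - Identifier.Length;
                byte[] payload = new byte[count];
                Array.Copy(jpeg, start, payload, 0, count);
                return payload;
            }

            // After SOS (FF DA) the compressed image data begins; stop there.
            if (marker == 0xDA) return null;
            pos += 2 + length;
        }
        return null;
    }

    private static bool StartsWith(byte[] data, int offset, byte[] prefix)
    {
        if (offset + prefix.Length > data.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[offset + i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Working on bytes avoids any dependence on how the file was decoded into text, at the cost of a slightly longer routine than a string search.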
The raw JPEG file is loaded and converted to a string held in a private buffer, private string JPEGContentBuffer, for further slicing.
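private string LoadJPEG(string FileName)
{
    // File.ReadAllBytes reads the whole file; a single Stream.Read call
    // is not guaranteed to fill the buffer.
    byte[] RAWdata = File.ReadAllBytes(FileName);
    // ISO-8859-1 maps every byte to the character with the same value,
    // so headers such as "\u00FF\u00ED" match whatever the system code page is.
    return Encoding.GetEncoding("ISO-8859-1").GetString(RAWdata, 0, RAWdata.Length);
}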
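Because the class searches the file contents as a string, the byte-to-character mapping matters: each byte has to become exactly one character, and the same one on every machine. A minimal sketch of a mapping with that property, assuming ISO-8859-1 (Latin-1) is an acceptable choice, is shown below. The class name Latin1Sketch and its methods are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the article's class.
public static class Latin1Sketch
{
    // ISO-8859-1 maps byte values 0..255 to the characters U+0000..U+00FF.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // One character per byte, with the same numeric value.
    public static string BytesToString(byte[] raw)
    {
        return Latin1.GetString(raw);
    }

    // Inverse of BytesToString for strings produced by it.
    public static byte[] StringToBytes(string text)
    {
        return Latin1.GetBytes(text);
    }
}
```

With this mapping, the byte pair 0xFF 0xED always decodes to the two characters "\u00FF\u00ED", which is the form the constructor searches for.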
The class exposes a single hash table, PS3TagContents, which holds the contents of the following three major IPTC tags, identified by Adobe as:
IPTC ApplicationRecord Tags
Tag ID | Tag Name | Format |
5 | ObjectName | string[0,64] |
105 | Headline | string[0,256] |
120 | Caption-Abstract | string[0,2000] |
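The decimal tag numbers in the table correspond to the last character of the three-character headers registered in the constructor: 5 is 0x05, 105 is 0x69 and 120 is 0x78. The sketch below builds such a header from a tag number. The name IptcMarkerSketch.MarkerFor is hypothetical and only illustrates the arithmetic.

```csharp
using System;

// Hypothetical helper, not part of the article's class.
public static class IptcMarkerSketch
{
    // Builds the search header: 0x1C, record number 2, then the tag number.
    public static string MarkerFor(int tagNumber)
    {
        if (tagNumber < 0 || tagNumber > 255)
            throw new ArgumentOutOfRangeException("tagNumber");
        return new string(new[] { (char)0x1C, (char)0x02, (char)tagNumber });
    }
}
```

For example, MarkerFor(105) yields the same string as the literal "\u001C\u0002\u0069" used for the Headline tag.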
The actual data extraction is performed in the ExtractTag method of the class. It searches for the corresponding tag header, reads the two-byte block length that follows it, and then extracts the content of that length from that location.
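private string ExtractTag(string currTagSought)
{
    // Ordinal search: a culture-sensitive IndexOf may skip control characters.
    int pos = PS3SectionContentBuffer.IndexOf(currTagSought, StringComparison.Ordinal);
    // IndexOf returns -1 when the tag is absent; 0 is a valid match.
    if (pos < 0 || pos + 5 > PS3SectionContentBuffer.Length)
        return currTagSought + " is not available!";

    pos += 3; // skip the three-character tag header
    // Two-byte big-endian block length.
    int BlockSize = (PS3SectionContentBuffer[pos] * 256) +
                    PS3SectionContentBuffer[pos + 1];
    pos += 2;
    // Guard against a truncated block.
    if (pos + BlockSize > PS3SectionContentBuffer.Length)
        return currTagSought + " is not available!";
    // The buffer holds one character per byte, so it can be sliced directly.
    return PS3SectionContentBuffer.Substring(pos, BlockSize);
}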
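The same lookup can also be expressed directly on bytes, which makes the layout of a tag explicit: the header 0x1C 0x02, the tag number, a two-byte big-endian length, then the content. The following is a minimal sketch under stated assumptions: IptcTagSketch and ReadTag are hypothetical names, tag values are decoded as ISO-8859-1 text, and blocks using the extended length form (high bit of the length set) are not handled.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the article's class.
public static class IptcTagSketch
{
    // Returns the content of the first matching tag, or null when the tag
    // is missing, truncated, or uses the extended length form.
    public static string ReadTag(byte[] section, byte tagNumber)
    {
        for (int pos = 0; pos + 5 <= section.Length; pos++)
        {
            // Header: 0x1C, record number 2, then the tag number.
            if (section[pos] != 0x1C || section[pos + 1] != 0x02
                || section[pos + 2] != tagNumber)
            {
                continue;
            }

            // Two-byte big-endian content length.
            int size = (section[pos + 3] << 8) | section[pos + 4];
            int start = pos + 5;
            if ((size & 0x8000) != 0 || start + size > section.Length)
            {
                return null;
            }

            // Assumption: the value is single-byte text.
            return Encoding.GetEncoding("ISO-8859-1").GetString(section, start, size);
        }
        return null;
    }
}
```

Returning null rather than a message string leaves it to the caller to decide how a missing tag is reported.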
Finally, the harvested metadata can be written to the console by invoking the DisplayAllTags() method of the class.
I hope this helps someone in their quest to process JPEG metadata tags, without the scramble I went through to get this functionality together. The full source code is attached, along with a sample test harness for the class.
History
- 27th April, 2011: Initial version