
Extracting IPTC header information from JPEG images

27 Aug 2001
A sample class that reads and modifies information (e.g. caption, author, copyright) stored in a JPEG file

This is our sample file

The sample application

File info from Adobe Photoshop

Introduction

Until I had to write an application that extracts image information from a JPEG image, I didn't know that a JPEG file can contain more than the pure image information (size, colors and image data). It can also hold a good deal of textual and other information, such as copyright notices, captions and keywords.

Adobe Photoshop can edit this information via the File->File information menu. (All Photoshop names used here are translated from a German copy of Photoshop and thus might not exactly match the names in the English version.)

So you can edit the data with Photoshop. But what if you would like to use this data in your own application, say to store the images in a database along with the extracted information? That is the task of this class. You pass a file name, and the member variables are filled. You can even modify them and write them back to the JPEG file. However, writing back currently only works with files that already contain IPTC information.

Decoding the file format

When I started developing my application I searched for an existing one to reduce my work, but I didn't find any. I even had a hard time gathering information on the JPEG file format specification (without buying books), let alone a specification of the Photoshop headers. So I fired up a hex editor and tried to find out how things are stored. The structure of the Photoshop-specific headers described here is therefore based largely on findings from a few sample files. If you have found a complete specification, feel free to drop me a line or post it in the comments.

The basic structure is as follows. More information on that can be found here and at The Graphics File Formats Page.

JPEG image
Contents     Name    Description
0xFF 0xD8    SOI     Start of image
                     Segments (see below)
0xFF 0xD9    EOI     End of image

JPEG segments
Description
Segment marker (2 bytes)
Segment size (2 bytes, big-endian; counts the size field itself but not the marker)
Segment data
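
The marker/size/data layout above can be walked with a few lines of code. The following is a minimal sketch in Python, not the code of the sample class; the function name `iter_jpeg_segments` is made up for illustration. It assumes the 2-byte size is big-endian and counts its own two bytes, and it stops at the start-of-scan marker (0xFF 0xDA) because compressed image data follows that segment.

```python
import struct

def iter_jpeg_segments(data: bytes):
    """Yield (marker_byte, payload) for each segment up to start of scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker; not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xD9:  # EOI: end of image
            return
        # The size is big-endian and includes its own two bytes,
        # but not the two marker bytes.
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if size < 2 or pos + 2 + size > len(data):
            raise ValueError("truncated or corrupt segment")
        yield marker, data[pos + 4:pos + 2 + size]
        if marker == 0xDA:  # SOS: entropy-coded data follows, stop here
            return
        pos += 2 + size
```

To list the segments of a file, read it in binary mode and iterate: `for marker, payload in iter_jpeg_segments(open(path, "rb").read()): ...`, where `path` is whatever file you want to inspect.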

Some JPEG segment markers
Contents     Name     Description
0xFF 0xE0    APP0     Application marker (JFIF header; present in most, but not all, JPEG files)
0xFF 0xDB    DQT      Quantization table
0xFF 0xC0    SOF0     Start of frame
0xFF 0xC4    DHT      Define Huffman table
0xFF 0xDA    SOS      Start of scan
0xFF 0xED    APP13    This is the marker where Photoshop stores its information

The Photoshop segment

The APP13 segment is the one we are after. Here starts the non-documented area.

APP13 segment
Contents              Description
0xFF 0xED             APP13 marker
                      Segment size (2 bytes) excl. marker
Photoshop 3.0\x00     Photoshop identification string
                      8BIM segments (see below)
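
Recognising the Photoshop data inside a 0xFF 0xED segment comes down to checking for the identification string. Here is a small sketch in Python; `photoshop_resources` is a hypothetical helper name, and `payload` is assumed to be the segment data that follows the 2-byte size field.

```python
PHOTOSHOP_ID = b"Photoshop 3.0\x00"

def photoshop_resources(payload: bytes):
    """Return the bytes after the identification string, or None.

    `payload` is the data of a 0xFF 0xED segment, without the marker
    and without the 2-byte size field.
    """
    if payload.startswith(PHOTOSHOP_ID):
        return payload[len(PHOTOSHOP_ID):]
    return None
```

Returning None rather than raising keeps the caller simple, because other applications may use the same marker for their own data.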

A JPEG file from Photoshop has various 8BIM headers (Adobe's documentation calls them image resource blocks). The one with the type 0x04 0x04 contains the textual information. The image URL is stored in a different header, which is why the demo class currently does not support it. Other headers contain a thumbnail image and other information.

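Photoshop 6 introduced a slight variation in this header segment: what looked like fixed zero padding in older files can now be a header description text of variable length. The updated sample can handle these files as well. This matches the image resource block layout in Adobe's Photoshop file format documentation, where the name is a Pascal string and the size is a 4-byte value. With an empty name and less than 64 KB of data, the name and the upper two size bytes appear as four zero bytes followed by a 2-byte size.

8BIM segment
Contents    Description
8BIM        Segment marker (4 bytes)
            Segment type (2 bytes, big-endian)
            Segment name (Pascal string: 1 length byte plus text, padded to an even total; 0x00 0x00 when empty)
            Segment size (4 bytes, big-endian; size of the data only)
            Segment data (padded to an even length)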
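
A sketch of how these blocks could be walked, in Python. It is not the sample class's code, and `iter_8bim_blocks` is a made-up name. Note that it assumes the layout from Adobe's Photoshop file format documentation: a Pascal-string name padded to an even length, a 4-byte big-endian data size, and data padded to an even length. The input is assumed to be the bytes that follow the "Photoshop 3.0" identification string.

```python
import struct

def iter_8bim_blocks(data: bytes):
    """Yield (block_type, name, block_data) for each 8BIM block."""
    pos = 0
    # The smallest possible block is 12 bytes: marker, type, empty name, size.
    while pos + 12 <= len(data) and data[pos:pos + 4] == b"8BIM":
        (block_type,) = struct.unpack(">H", data[pos + 4:pos + 6])
        name_len = data[pos + 6]
        name = data[pos + 7:pos + 7 + name_len]
        pos += 7 + name_len
        if (1 + name_len) % 2:  # length byte + text are padded to even
            pos += 1
        if pos + 4 > len(data):
            raise ValueError("truncated 8BIM block header")
        (size,) = struct.unpack(">I", data[pos:pos + 4])
        pos += 4
        if pos + size > len(data):
            raise ValueError("truncated 8BIM block data")
        yield block_type, name, data[pos:pos + size]
        pos += size + (size % 2)  # data is padded to an even length
```

The block carrying the textual information is then the one whose type equals 0x0404, for example `next((d for t, _, d in iter_8bim_blocks(res) if t == 0x0404), None)`, with `res` being the bytes after the identification string.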

The data of the 8BIM header that holds the text is itself divided into smaller blocks, each prefixed by 0x1C 0x02. These blocks finally contain the information. Multiple blocks with the same type (e.g. Keywords) form a list.

0x1C 0x02 segment
Contents     Description
0x1C 0x02    Segment marker (2 bytes)
             Segment type (1 byte)
             Segment size (2 bytes excl. marker, type and size)
             Segment data
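
Collecting these blocks into lists per type might look like the following Python sketch. The name `parse_iptc` is hypothetical, the 2-byte size is assumed to be big-endian, and sizes with the high bit set (the extended form defined by the IPTC standard) are rejected rather than handled. Values are returned as raw bytes, since the text encoding is not stated here.

```python
import struct
from collections import defaultdict

def parse_iptc(data: bytes):
    """Return {segment_type: [value, ...]} for all 0x1C 0x02 blocks."""
    fields = defaultdict(list)
    pos = 0
    while pos + 5 <= len(data) and data[pos] == 0x1C:
        record, seg_type = data[pos + 1], data[pos + 2]
        (size,) = struct.unpack(">H", data[pos + 3:pos + 5])
        if size & 0x8000:
            raise ValueError("extended block sizes are not handled")
        value = data[pos + 5:pos + 5 + size]
        if record == 0x02:  # only the 0x1C 0x02 blocks described above
            fields[seg_type].append(value)
        pos += 5 + size
    return dict(fields)
```

Repeated types end up in one list, so keywords stored as several blocks of type 25 come back as `fields[25]` with one entry per keyword. In the IPTC standard, type 120 is the caption, 80 the by-line (author) and 116 the copyright notice.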

The sample application

With a description of the format at hand, it is easy to write an application that scans through the file and extracts the interesting bytes. This is exactly what the provided sample class does.

Have fun with it!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Christian Tratz
Web Developer
Germany
If you are not living on the edge you are wasting space ;)

Last Updated 28 Aug 2001
Article Copyright 2001 by Christian Tratz