While searching the Internet I came across a link about reading and editing metadata contained in image files.

I read the article and thought to myself: "If an image file of type PNG contains such information as Name, Size, Item Type, Date Created and so forth, what type of information is stored in other file types?"

To answer that question I prowled around the Internet some more and, using various examples, created a simple Windows application that uses the Microsoft Shell Controls and Automation reference (Shell32.dll) to display the metadata contained in file formats like EXE, DLL, DOCX and VBPROJ, just to name a few.

While looking at the metadata contained in one of my Word documents I noticed the "Author" tag contained an email address I no longer use. Then I started to wonder, "Can I edit the incorrect metadata item?" Unfortunately, the Shell32.dll does not allow this.

So my question is: since metadata is written by various applications, what namespace (or whatever it may be) do I use in my application to edit specific tags stored in the metadata?

Thanks,
MRM
Updated 25-Dec-13 7:56am
Comments
Quecumber256 25-Dec-13 17:40pm    
SA,
Strange answer. Was the question somehow unclear? I'm proceeding on the assumption that it was.

To clarify: using Shell32.dll I read the metadata from After_the_Storm_23-Jan-2013.docx. Shell32.dll returned the following metadata:

0: Name: After_the_Storm_23-Jan-2013
1: Size: 11.4 KB
2: Item type: Microsoft Office Word Document
3: Date modified: 1/23/2013 4:49 PM
4: Date created: 1/23/2013 4:49 PM
5: Date accessed: 1/23/2013 4:49 PM
6: Attributes: A
7: Offline status:
8: Availability: Available offline
9: Perceived type: Document
10: Owner: MyPCName\User
11: Kind: Document
12: Date taken:
13: Contributing artists:
14: Album:
15: Year:
16: Genre:
17: Conductors:
18: Tags:
19: Rating: Unrated
20: Authors: Unused_Email_Address@someplace.com
21: Title:
22: Subject:
23: Categories:
24: Comments:
25: Copyright:
26: #:

To me this is binary data contained inside the file itself. I want to create an application where I can change the information in tag number 20 from Authors: Unused_Email_Address@someplace.com to Authors: Current_Email_Address@theotherplace.com.

This information is placed inside the file when it is created by the application itself, in this case MS Word. If the data can be written, it can be modified. My educated guess would be that writing metadata is a function somewhere in the System.IO namespace, but so far I haven't been able to find any information on how developers can write their own metadata into the files their applications create, or whether it is even possible using Microsoft Visual Studio Professional 2008.

Thanks,
MRM

1 solution

Strange question. It looks like, for some reason, you are convinced that everything in the world (or at least every single widely used media format) has some "namespace" (as if namespaces were libraries), or is in one way or another exposed through the API you use, such as the .NET FCL. This is not so and was never meant to be so. Sometimes you just need to grab the standard (if it is even publicly available), read it, and implement reading and writing it according to its semantics. Or find some 3rd-party library.

And this case is far more typical than the opposite: the file format is indirectly accessible (say, you can view an image and save your image in this format), but detailed access to the details of the format is not available. This is quite explainable. Exposing it all would require exposing even the details which remain untouched by the library. Doing so would enormously bloat the library code, and it would bloat the API even more. And this bloat would be for the sake of some 0.1% of the users, if not less. No such thing could really be kept separate. Say, PE files (DLL and EXE) are written by compilers directly and through CodeDOM (which also uses compilers), but how many perverts would want to dig directly into the internal structure of PE?
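To give a sense of what "digging directly" means in practice, here is a minimal sketch in Python (not the asker's VB.NET; chosen only because it is compact). It reads three fields of the COFF file header from the raw bytes of a PE file. The function name `read_pe_header` and the returned dictionary keys are made up for this illustration; the offsets come from the published PE layout (the 4-byte value at offset 0x3C of the DOS header points to the "PE\0\0" signature, and the COFF header follows it).

```python
import struct


def read_pe_header(data):
    """Return a few COFF header fields from the raw bytes of a PE file.

    Hypothetical helper for illustration; it does no validation beyond
    checking the two signatures.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF file header starts right after the 4-byte signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    machine, n_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {"machine": machine, "sections": n_sections, "timestamp": timestamp}
```

Typical use would be `read_pe_header(open("some.dll", "rb").read())`, where the file name is a placeholder. Everything beyond these few fields (optional header, section table, resources) has to be decoded the same way, by hand, from the specification.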

At the same time, descriptions of many formats are available. It's funny, but I can find documentation for all the formats you listed in your question. What to see:

  1. PE files (EXE, DLL and more):
    http://en.wikipedia.org/wiki/Portable_Executable.

    Described here:
    http://msdn.microsoft.com/en-us/windows/hardware/gg463119.aspx,
    http://msdn.microsoft.com/en-us/magazine/cc301805.aspx,
    http://msdn.microsoft.com/en-us/magazine/cc301808.aspx.
  2. DOCX:
    This is part of the Office Open XML standard, standardized as ECMA-376 and ISO/IEC 29500:
    http://en.wikipedia.org/wiki/Open_XML.

    The ECMA standard, as always, is publicly available: http://www.ecma-international.org/publications/standards/Ecma-376.htm
  3. VBPROJ:
    A VBPROJ file is an MSBuild project file. It conforms to the corresponding XML schema, which is comprehensively described:
    http://msdn.microsoft.com/en-us/library/5dy88c2e.aspx.

    See also:
    http://msdn.microsoft.com/en-us/library/dd393574.aspx,
    http://msdn.microsoft.com/en-us/library/0k6kkbsd.aspx.
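
For the DOCX case specifically, "read the standard and implement it" can be quite small. A .docx file is a ZIP package, and its core properties live in the part docProps/core.xml, where the author is stored in the dc:creator element. The following is a minimal sketch in Python (again, not VB.NET), assuming the "Authors" value reported by the shell corresponds to dc:creator. The function name `set_docx_author` is made up for this illustration, and the sketch writes a new file rather than editing in place.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml in an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}
CORE_PART = "docProps/core.xml"


def set_docx_author(src_path, dst_path, new_author):
    """Copy the package at src_path to dst_path with dc:creator replaced.

    Hypothetical helper for illustration. Returns the previous author text
    (None if there was none). src_path and dst_path must be different files.
    """
    # Keep the conventional prefixes so values such as
    # xsi:type="dcterms:W3CDTF" still resolve after re-serialization.
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)

    old_author = None
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PART:
                root = ET.fromstring(data)
                creator = root.find("dc:creator", NS)
                if creator is None:
                    creator = ET.SubElement(root, "{%s}creator" % NS["dc"])
                old_author = creator.text
                creator.text = new_author
                data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
            # Every other part is copied through unchanged.
            dst.writestr(item, data)
    return old_author
```

With the file from the question this would be called as `set_docx_author("After_the_Storm_23-Jan-2013.docx", "After_the_Storm_edited.docx", "Current_Email_Address@theotherplace.com")`, where the output file name is a placeholder. In .NET the same idea should be expressible with a ZIP/packaging library or the Open XML SDK; check what is available for your framework version before relying on it.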


Enjoy,
—SA
 
This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


