I want to know the real file type even after the file has been renamed.
 
For example, if I changed .jpeg to .txt, how can I find the original extension using the file header?
Posted 2-May-13 19:59pm
Edited 2-May-13 20:35pm

Solution 4

You are looking for something like the file command on Linux with its magic data file.
 
If you only want to identify a few image file types, just read some bytes from the beginning of the file and check them for image file specific sequences:
 
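#include <stdio.h>
#include <string.h>

/* lpszFile: path of the file to inspect (any name or extension). */
const char *lpszType = NULL;
unsigned char pBuffer[16] = { 0 }; /* unsigned, so bytes >= 0x80 compare correctly */
FILE *f = fopen(lpszFile, "rb");
if (f)
{
    /* Read the header; bytes beyond the end of a short file stay zero. */
    size_t nRead = fread(pBuffer, 1, sizeof(pBuffer), f);
    fclose(f);
    /* BMP: "BM" plus four zero (reserved) bytes at offset 6 */
    if (nRead >= 10 && 0 == memcmp(pBuffer, "BM", 2) &&
        0 == memcmp(pBuffer + 6, "\0\0\0", 4))
        lpszType = "BMP";
    /* TIFF: "II" (little endian) or "MM" (big endian), then 42 in that byte order */
    else if (nRead >= 4 && (0 == memcmp(pBuffer, "II*\0", 4) ||
        0 == memcmp(pBuffer, "MM\0*", 4)))
        lpszType = "TIFF";
    else if (nRead >= 8 && 0 == memcmp(pBuffer, "\x89PNG\r\n\x1A\n", 8))
        lpszType = "PNG";
    else if (nRead >= 6 && (0 == memcmp(pBuffer, "GIF87a", 6) ||
        0 == memcmp(pBuffer, "GIF89a", 6)))
        lpszType = "GIF";
    /* JPEG: FF D8 FF, a marker byte in the range E0..EF, and a
       4 character null terminated identifier at offset 6 */
    else if (nRead >= 11 && 0 == memcmp(pBuffer, "\xFF\xD8\xFF", 3) &&
        pBuffer[3] >= 0xE0 && pBuffer[3] <= 0xEF &&
        NULL == memchr(pBuffer + 6, 0, 4) && 0 == pBuffer[10])
        lpszType = "JPEG";
}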
Comments
p.uday kishore at 3-May-13 6:07am
   
Can I have a brief explanation of this, please?
Jochen Arndt at 3-May-13 6:24am
   
This will open the file whose name is specified by lpszFile and read 16 bytes into pBuffer.
The content of the buffer is then checked for signatures indicating specific image file types. E.g. bitmap files begin with the upper case letters 'B' and 'M' and have four null bytes at offset 6. JPEG files begin with the hex bytes FF, D8, and FF, followed by a byte in the range E0 to EF. At offset 6 of JPEG files is a 4 character wide null terminated string (this indicates the type; e.g. 'Exif').
 
To learn the format of an image file type, look up that type on Wikipedia.
 
If you don't know the functions used (fopen, fread, fclose, memcmp, strcmp) so far, look them up in the MSDN. They are all standard C library functions.
p.uday kishore at 6-May-13 7:03am
   
Thanks for the explanation and the code. I have one more doubt: is there any change in the binary format in newer versions of JPEG, GIF, PNG, etc.?
Jochen Arndt at 6-May-13 7:15am
   
It depends. Some formats have a version field which may be used to indicate how the file must be parsed. But for the basic checks from my code there should be no changes in the future (otherwise, it would be a new format).
 
JPEG is a little bit special because it covers multiple types. The code from my example is mainly for JFIF data. But with all JPEG formats, there is a 4 character wide null terminated string at offset 6 indicating the type (e.g. "JFIF", "JFXX", "Exif"). When checking only for the first two bytes (0xFF, 0xD8), all future formats should also be caught.
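 
As a rough illustration of the identifier check described above, the following sketch copies the 4 character string at offset 6 out of a header buffer. The function name jpeg_identifier and its parameters are hypothetical and not part of the solution code; it assumes the caller has already read at least the first 11 bytes of the file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original solution).
   Copies the 4 character identifier at offset 6 of a JPEG header
   into id. Returns 1 on success, 0 if the buffer does not look like
   a JPEG with such an identifier. */
int jpeg_identifier(const unsigned char *buf, size_t len, char id[5])
{
    assert(buf != NULL && id != NULL);
    id[0] = '\0';
    if (len < 11)
        return 0;
    /* Start of image (FF D8) followed by a marker in the range FF E0..FF EF */
    if (buf[0] != 0xFF || buf[1] != 0xD8 || buf[2] != 0xFF)
        return 0;
    if (buf[3] < 0xE0 || buf[3] > 0xEF)
        return 0;
    /* Four non-null bytes at offset 6, terminated by a null byte */
    if (memchr(buf + 6, 0, 4) != NULL || buf[10] != 0)
        return 0;
    memcpy(id, buf + 6, 4);
    id[4] = '\0';
    return 1;
}
```

For a JFIF file the identifier comes back as "JFIF"; a file that only passes the two byte FF D8 test but has no such string makes the function return 0, so the looser check would have to be done separately.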
p.uday kishore at 10-May-13 8:28am
   
Hi, I have a new task. Can you help me with it?
 
How can I get the file type if the input file is given without an extension?
Are there any criteria to find that?
Jochen Arndt at 10-May-13 10:33am
   
Your question is unclear.
When passing a file name without extension to the above code, you will get the type. E.g., when having a JPEG file named 'image', the above code will set lpszType to "JPEG".
p.uday kishore at 10-May-13 12:35pm
   
Yes, for images it is clear how to use that. Is there something similar for the remaining file types?
Are there any predefined Microsoft functions for that?
Jochen Arndt at 10-May-13 12:42pm
   
No. You may look for a Windows port of the Linux file command, or look up the file formats of interest and add the checks to the code from my solution.
p.uday kishore at 13-May-13 1:15am
   
What does a Windows port of the Linux file command mean? Can you please explain briefly?
Jochen Arndt at 13-May-13 3:04am
   
The Linux file command is a command line utility that tries to identify the type of a file. Asking Google for Windows ports finds this: http://gnuwin32.sourceforge.net/packages/file.htm.
p.uday kishore at 14-May-13 2:19am
   
Is there any way to identify a text file?
Jochen Arndt at 14-May-13 2:46am
   
As with binary files, this can be done by analyzing the content. Text files do not contain null bytes, and UTF encoded files may (UTF-8) or must (UTF-16) begin with a BOM.
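 
A minimal sketch of that content analysis follows. The function name guess_text_type and the returned labels are made up for illustration. It checks the BOM first, because UTF-16 text does contain null bytes, and it treats the absence of null bytes only as a hint: a binary file can also happen to have none in the bytes inspected. A UTF-32 BOM is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: guesses whether a buffer holds text.
   Returns a label for the detected kind of text, or NULL when the
   data looks binary (or the buffer is empty). Heuristic only. */
const char *guess_text_type(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    /* Byte order marks: UTF-8 is EF BB BF, UTF-16 is FF FE or FE FF */
    if (len >= 3 && 0 == memcmp(buf, "\xEF\xBB\xBF", 3))
        return "UTF-8";
    if (len >= 2 && 0 == memcmp(buf, "\xFF\xFE", 2))
        return "UTF-16LE";
    if (len >= 2 && 0 == memcmp(buf, "\xFE\xFF", 2))
        return "UTF-16BE";
    /* No BOM: accept as text only if no null byte occurs */
    if (len > 0 && NULL == memchr(buf, 0, len))
        return "text";
    return NULL;
}
```

The buffer would typically be the first block read from the file in binary mode; checking more than the first few bytes makes the null byte test more reliable.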
p.uday kishore at 16-May-13 0:52am
   
Me again. If I change the file name to filename.txt.jpg, I am not able to read it even in binary mode. I am getting wrong data while reading.
Jochen Arndt at 16-May-13 2:47am
   
How do you open and read the data from the file?
p.uday kishore at 16-May-13 5:31am
   
E.g. I have filename.txt.jpg. With the extension hidden, only filename.txt is shown. I passed that as the input and was not able to get the data. Later, when I gave the name with the full extension, I got output.
I learned that the file path must be complete. I missed that while scanning. Thank you.
Jochen Arndt at 16-May-13 5:36am
   
You must of course use the real file name, not the one shown by the Explorer when hiding the extension of known types (which is generally a bad idea for security reasons).
p.uday kishore at 16-May-13 5:47am
   
Yes. Now the issue goes deeper: I have to deal with the registry key (to unhide extensions) when passing input to the scan.
Jochen Arndt at 16-May-13 5:51am
   
How did you get the list of file names?
When using a shell function, there is probably a member variable or function to get the real name instead of the display name.
p.uday kishore at 16-May-13 6:04am
   
Using CFileFind in MFC.
With it I get the file names in a directory one by one. If the extension is hidden (when there are two extensions), I can't scan the file. That's why I have to deal with the registry key to unhide the extension and then scan for images.
Jochen Arndt at 16-May-13 6:26am
   
I have just tested it:
Using CFileFind with hidden extension for known file types in the Explorer returns the real file name and path when using GetFileName() and GetFilePath().
There is no need to fiddle with the registry.
 
Background: CFileFind uses the API functions FindFirstFile / FindNextFile which return the real names.
 
To get the name as shown by the Explorer, you can call SHGetFileInfo().
p.uday kishore at 16-May-13 7:08am
   
Thank you again.

Solution 3

It sounds as if you mean that you want to be able to find out what the content of a file is, regardless of filename and extension. In order to do so you would need to research various file format specifications.
 
Wrt JPEGs, all JPEGs always begin with hexadecimal FF D8 (ÿØ in ISO 8859-1) - the rest is wrapper specific. You can search for the specification yourself. However, I did find this handy page with a lot of different file format identifiers (though I cannot guarantee the accuracy) including a set of various JPEG format headers: http://www.garykessler.net/library/file_sigs.html
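 
A list of file format identifiers like that lends itself to a table-driven check. The sketch below is one possible shape, not a definitive implementation: the struct, the table name k_signatures and the function lookup_type are hypothetical, and the table holds only a few well-known signatures that all start at offset 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature table entry: type label plus the bytes
   expected at the very start of the file. */
struct signature {
    const char *type;
    size_t length;
    const char *bytes;
};

static const struct signature k_signatures[] = {
    { "PNG",  8, "\x89PNG\r\n\x1A\n" },
    { "GIF",  6, "GIF87a" },
    { "GIF",  6, "GIF89a" },
    { "JPEG", 2, "\xFF\xD8" },
    { "TIFF", 4, "II*\0" },
    { "TIFF", 4, "MM\0*" },
    { "PDF",  5, "%PDF-" },
    { "ZIP",  4, "PK\x03\x04" },
    { "BMP",  2, "BM" },
};

/* Returns the label of the first matching signature, or NULL. */
const char *lookup_type(const unsigned char *buf, size_t len)
{
    size_t i;
    assert(buf != NULL);
    for (i = 0; i < sizeof(k_signatures) / sizeof(k_signatures[0]); i++) {
        const struct signature *s = &k_signatures[i];
        if (len >= s->length && 0 == memcmp(buf, s->bytes, s->length))
            return s->type;
    }
    return NULL;
}
```

Short signatures such as "BM" can match unrelated data by chance, so they are placed last here and would normally be combined with further checks on the header.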
 
Regards,
Ian.

Solution 1

It can't be done. The only way to do this kind of thing is to record the filename before it is renamed and look at the cached value.

Solution 2

Basically you can't, because even if you read all files as binary, you need to know the file format of each file to interpret the data, and for obvious reasons you don't have it.
Second, you could keep a log: store all file names in a log file or in variables before the name change, and cross-check after the rename.
As H.Brydon says, you may check the cached value, but you will get only recent entries, not all updates.
Comments
p.uday kishore at 3-May-13 2:32am
   
The idea behind this: in my application I have to find pornographic images in a given set of images.
Even if someone has changed the extension of such an image, I still have to find it. For that I need to know how to read a file in binary mode.
Coder Block at 3-May-13 2:36am
   
Yup, then it becomes even simpler. Do you know the path of the file where that image resides?
p.uday kishore at 3-May-13 2:52am
   
Yes, we will have the path.
I have to check the extension and then send the file to scanning. So is there any way to get that information?
Coder Block at 3-May-13 3:00am
   
The best way: you have the binary data from before the rename and the files after the rename, so just compare both. If they match, you have the file that you want.
Maybe in the following manner:
//Before the file is renamed:
//Read the file data in binary format and store it in a CByteArray.
//The user will change the file extension or name, but not the content of the file, so:
//Scan all files in the same path, compare the stored data with the data of each file, and if a match is found, work with that file.
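 
The comparison step outlined in those comments could look roughly like this in plain C. It is a sketch under assumptions: same_content is a made-up name, it works on two already opened binary streams instead of a CByteArray, and it treats a read error like a mismatch in length.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if both streams hold identical bytes
   from their current positions to the end, 0 otherwise. */
int same_content(FILE *a, FILE *b)
{
    unsigned char bufA[4096], bufB[4096];
    size_t nA, nB;
    assert(a != NULL && b != NULL);
    do {
        nA = fread(bufA, 1, sizeof(bufA), a);
        nB = fread(bufB, 1, sizeof(bufB), b);
        /* Different block sizes mean different lengths */
        if (nA != nB || 0 != memcmp(bufA, bufB, nA))
            return 0;
    } while (nA > 0);
    return 1;
}
```

Both files would be opened with fopen(path, "rb") before the call. This only works when a copy of the original data is available to compare against.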
p.uday kishore at 3-May-13 3:03am
   
But we can't get that information in an investigation. That's why we have to find the file type ourselves and send the file to scanning.
Coder Block at 3-May-13 3:15am
   
That's a restriction... :(
p.uday kishore at 3-May-13 6:10am
   
But what can I do? It is a requirement.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Web02 | 2.8.141216.1 | Last Updated 3 May 2013
Copyright © CodeProject, 1999-2014