
Simplest PDF Generating API for JPEG Image Content

By , 19 Dec 2008
 

Introduction

I was working on a project in which I needed to wrap a JPEG file in a PDF. The program had to be written in C, and after searching the Internet I could not find anything to refer to. Most open source PDF engines are based on either Java or PHP, and the few C PDF engines are huge and would add a lot of unnecessary code to my project. So I decided to write this simple JPEG-to-PDF wrapper. It is the result of reverse-engineering the simplest PDF file that contains a single JPEG image. I want to share this API so that you can grab it and use it if you have a similar requirement.

Using the Code

Here is an example that shows how to generate a two-page PDF file from two JPEG files. Please refer to testMain.c for details. The idea is:

PJPEG2PDF pPDF;
UINT32 pdfByteSize, pdfOutByteSize;
UINT8 *pdfBuf;

/* pdfW, pdfH: page size in inches (1 inch = 25.4 mm); Letter size is 8.5 x 11 */
pPDF = Jpeg2PDF_BeginDocument(8.5, 11);

if (NULL != pPDF) {

    Loop For All JPEG Files {
        ... Prepare the current JPEG file to be inserted.
        /* You need to know the dimensions of the JPEG image
           and its size in bytes */
        Jpeg2PDF_AddJpeg(pPDF, JPEG_IMGW, JPEG_IMGH, JPEG_BYTE_SIZE,
                         JPEG_DATA_POINTER, IS_COLOR_JPEG);
    }

    /* Call this after all of the JPEG images have been inserted.
       The return value is the PDF file size in bytes */
    pdfByteSize = Jpeg2PDF_EndDocument(pPDF);

    /* Allocate the buffer for the PDF output */
    pdfBuf = malloc(pdfByteSize);

    /* In: the size of pdfBuf. Out: the number of bytes actually used */
    pdfOutByteSize = pdfByteSize;

    /* Output the PDF to pdfBuf */
    Jpeg2PDF_GetFinalDocumentAndCleanup(pPDF, pdfBuf, &pdfOutByteSize);

    ... Do whatever you want with the PDF file in memory.
}
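
One thing you might do with the PDF in memory is save it to disk. The following is a minimal sketch, assuming a standard C file system is available; save_buffer is a hypothetical helper and is not part of the Jpeg2PDF API.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of Jpeg2PDF): writes len bytes from buf
   to the file at path. Returns 0 on success, -1 on failure. */
static int save_buffer(const char *path, const unsigned char *buf, size_t len)
{
    FILE *fp = fopen(path, "wb");   /* binary mode: no newline translation */
    size_t written;
    int failed;

    if (fp == NULL)
        return -1;

    written = fwrite(buf, 1, len, fp);
    failed = (written != len);      /* short write means an error occurred */

    if (fclose(fp) != 0)            /* fclose can also report a write error */
        failed = 1;

    return failed ? -1 : 0;
}
```

With the variables from the example above, the call would look like save_buffer("output.pdf", pdfBuf, pdfOutByteSize), followed by free(pdfBuf) once the buffer is no longer needed.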

There are several places that you can fine-tune in the Jpeg2PDF.h file:

#define MAX_PDF_PAGES        256     /* Currently only supports less than 256 Images */

#define PDF_TOP_MARGIN        (0.0 * PDF_DOT_PER_INCH)    /* Currently No Top Margin */
#define PDF_LEFT_MARGIN        (0.0 * PDF_DOT_PER_INCH)   /* Currently No Left Margin */

That's it, guys. Enjoy.

History

  • Updated [2008-12-19] - Added code that scans the current folder and automatically obtains each JPEG image's dimensions from the JPEG file itself, instead of using the hard-coded values used before.
    The JPEG image dimension code is borrowed from here.
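
For readers who want to see how dimensions can be read from a JPEG buffer, here is an illustrative sketch. It is not the borrowed code mentioned above: jpeg_get_info is a hypothetical name, and the function simply walks the JPEG marker segments until it finds a Start Of Frame header, which stores the height, the width and the number of color components.

```c
#include <stddef.h>

/* Hypothetical helper, not the code used by the article.
   Scans the marker segments of a JPEG held in memory for a Start Of
   Frame (SOFn) header. Returns 0 on success, -1 if none is found. */
static int jpeg_get_info(const unsigned char *p, size_t len,
                         unsigned *width, unsigned *height,
                         unsigned *components)
{
    size_t i = 2;

    /* Every JPEG stream starts with the SOI marker FF D8. */
    if (len < 4 || p[0] != 0xFF || p[1] != 0xD8)
        return -1;

    while (i + 4 <= len) {
        unsigned marker, seglen;

        if (p[i] != 0xFF)
            return -1;                  /* lost track of the marker structure */
        marker = p[i + 1];

        if (marker == 0xFF) {           /* fill byte before a marker: skip it */
            i++;
            continue;
        }
        /* TEM, RSTn and SOI carry no length field. */
        if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD8)) {
            i += 2;
            continue;
        }
        /* EOI or SOS reached before any frame header was seen. */
        if (marker == 0xD9 || marker == 0xDA)
            return -1;

        seglen = ((unsigned)p[i + 2] << 8) | p[i + 3];
        if (seglen < 2)
            return -1;

        /* SOF0..SOF15, excluding DHT (C4), JPG (C8) and DAC (CC). */
        if (marker >= 0xC0 && marker <= 0xCF &&
            marker != 0xC4 && marker != 0xC8 && marker != 0xCC) {
            /* Layout: length(2) precision(1) height(2) width(2) components(1) */
            if (seglen < 8 || i + 2 + seglen > len)
                return -1;
            *height     = ((unsigned)p[i + 5] << 8) | p[i + 6];
            *width      = ((unsigned)p[i + 7] << 8) | p[i + 8];
            *components = p[i + 9];
            return 0;
        }

        i += 2 + (size_t)seglen;        /* jump to the next marker */
    }
    return -1;
}
```

A component count of 3 usually means a color image and 1 a grayscale image, so the count could plausibly drive the IS_COLOR_JPEG argument of Jpeg2PDF_AddJpeg; that mapping is an assumption, not something the article states.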

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Hao Hu
Software Developer
United States
No Biography provided


Comments and Discussions

 
General: Desired enhancements (member monday2000, 14-Aug-10 0:09)
Hello, Hao Hu.

I'm planning to enhance your program a bit.

First of all, the 256-page limit needs to be removed. This might be done by using a temp file instead of RAM. This is currently the main flaw: the maximum number of files shouldn't be limited at all.

I'd also like to take the real DPI of each file into account instead of the hard-coded 72 dpi.

Besides, it is not a good idea to use memory buffers of a preset size. What guarantees that their length will always be enough?

I used the A4 paper size in my clone of your program. I write the canvas size to the PDF as a double (%.6f), whereas you write it as an integer (%.d), which is not precise. I am from Russia, where we do not use inches; we use only millimeters.

Also, in your program a small JPG is stretched to fill the whole canvas. I will add an option to keep the JPG at its original size, placed in the center of the canvas.

I also found a similar PDF library, libharu.org, but I don't know how to compile it with my MS VC++ 6.0 for Windows.
General: Re: Desired enhancements (member Hao Hu, 14-Aug-10 15:12)
Thanks a lot for your enthusiasm about this small library.

The original purpose of the library was to provide a starting point for anyone who needs simple library code like this. Feel free to change it and use it at your own risk.

Regarding some of your concerns:

Computers always have trade-offs. For example, UNIX timestamps also have a limited time range. The trade-off of changing 256 to an unlimited number is the simplicity of the program. For my original motivation for writing this piece of code, 256 was enough. So I just want to keep the code simple.

Regarding 72 DPI, I'm not 100% sure, but I think this is the predefined value in the PDF spec. So I think changing this number to something else will only cause you trouble.

Memory buffer size is also a trade-off made for coding simplicity. I think you will not be able to find a case that causes a buffer overflow, since all the estimated sizes are safe enough. To give you an example, 2.45 m is the current world record for the men's high jump, so 3 m is a safe enough number to say no human can jump over it. With this kind of safe estimate you might waste a little memory, but you don't need dynamic memory allocation, which would take extra time and require a lot of extra error-checking logic.

For the paper size, I think that even if you use a floating-point number, PDF will actually only use the integer value. You can easily verify this by opening any A4 document generated by another PDF engine. I guess the reason is very simple: the error will be at most 0.5*25.4/72=0.176mm, which is not significant at all.

Of course, there are many PDF libraries, and they might provide more features than the code I have here. If you need those features, you should consider switching.
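
The arithmetic in this exchange can be sketched in a few lines of C. The sketch assumes 72 units per inch, which is the default user space unit in PDF; the names mm_to_points and round_points are hypothetical and do not come from Jpeg2PDF.

```c
#define POINTS_PER_INCH 72.0   /* PDF default user space: 1 unit = 1/72 inch */
#define MM_PER_INCH     25.4

/* Converts a length in millimeters to PDF points (1/72 inch). */
static double mm_to_points(double mm)
{
    return mm * POINTS_PER_INCH / MM_PER_INCH;
}

/* Rounds a non-negative point value to the nearest whole point. */
static long round_points(double pt)
{
    return (long)(pt + 0.5);
}
```

For A4 (210 x 297 mm), mm_to_points gives about 595.28 x 841.89 points, and round_points turns that into 595 x 842. Rounding can move a value by at most half a point, which is the 0.5*25.4/72 = 0.176 mm figure quoted in the reply.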
General: Re: Desired enhancements (member monday2000, 15-Aug-10 19:44)
> Computers always have trade-offs.
> So I just want to keep the code simple.
I totally agree with you. I will also do my best to keep the derived code as simple as possible.
> The trade-off of changing 256 to an unlimited number is the simplicity of the program.
By the way, I still do not understand where this limitation comes from. Is it perhaps a limit on RAM size? If so, then introducing a temp file would not significantly break the trade-off. The code would remain simple and understandable enough.
> Of course, there are many PDF libraries, and they might provide more features than the code I have here. If you need those features, you should consider switching.
That is largely an illusion, actually. I need something specifically for C++, not C# and not Java. And the libraries that qualify are too complicated even to compile (not to mention anything else).

The trade-off you are speaking about may well be broken if I stop using memory buffers of a preset length. So I will think this issue through thoroughly before changing anything.
General: I made my program based on this one (member monday2000, 13-Aug-10 1:12)
Hello, Hao Hu.
 
I made my own program based on your utility.
 
My program is called "fi2pdf". See more details here:
 
https://sourceforge.net/projects/freeimage/forums/forum/36111/topic/3721193/index/page/1
 
My program is intended to use the FreeImage library to create PDFs.
General: Re: I made my program based on this one (member monday2000, 16-Mar-11 22:03)
I made a new version of my program:
 
fi2pdf v2.1
 
See the details in the same place:
 
https://sourceforge.net/projects/freeimage/forums/forum/36111/topic/3721193/index/page/1
General: Re: I made my program based on this one (member Hao Hu, 18-Mar-11 20:16)
That's great.
The idea of posting code here is just to share it with others.
And I'm happy to see that people can use it and make changes as necessary.

In fact, the original reason for developing this code was to generate PDF files in an embedded environment, so I didn't have the luxury of a file system. Everything needed to be done in memory: the JPEGs are buffered in memory, the PDF is generated in memory, and the result is then sent out over the network.

I saw that you took advantage of the file system to reduce memory usage. That's totally fine for the kind of utility you want.
General: Bug found (member monday2000, 13-Aug-10 1:08)
Hello, Hao Hu.

I found a bug in your program.

Inside the function

STATUS Jpeg2PDF_GetFinalDocumentAndCleanup(PJPEG2PDF pPDF, UINT8 *outPDF, UINT32 *outPDFSize)

you should comment out the statement:

if(outPDF && (*outPDFSize >= pPDF->currentOffSet))

To understand why, initialize *outPDFSize with zero (in the calling function), and your program will stop working (at least on large 24-bit JPEGs).

It currently relies on *outPDFSize being randomly initialized to a non-zero value, which works in Debug but not in Release (on large 24-bit JPEGs).
General: Re: Bug found [modified] (member Hao Hu, 13-Aug-10 5:43)
Thanks for your message.

However, I think the place you point out is not the right one. The issue is in the test code. Basically, when calling Jpeg2PDF_GetFinalDocumentAndCleanup(), I want the user to pass in their buffer size to avoid an overflow. So *outPDFSize should be initialized by the caller to the size of the outPDF buffer. The function then changes that value to let the caller know the exact number of bytes that were used. This is a very common way for an API to get and then set a buffer size.

It seems complicated for me to update this article right now. (CodeProject asks me to submit my change to their editors for the update, which is much more complicated than before.)

So what should be done is in testMain.c:
Just remove the declaration of pdfFinalSize and replace all uses of pdfFinalSize with pdfSize.
pdfSize then becomes an In/Out variable.
In: the caller lets jpeg2pdf know the byte size of pdfBuf.
Out: jpeg2pdf lets the caller know the actual byte size that was used.

Thanks.
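
The In/Out size convention described in this reply can be illustrated with a small stand-alone sketch. copy_out is a hypothetical function written only to show the pattern; it is not the Jpeg2PDF implementation.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical function illustrating an In/Out size parameter.
   In:  *size is the capacity of out, in bytes.
   Out: *size is the number of bytes the source data occupies.
   Returns 0 if the data was copied, -1 if out is NULL or too small. */
static int copy_out(const unsigned char *src, size_t srcLen,
                    unsigned char *out, size_t *size)
{
    size_t capacity = *size;    /* read the caller's capacity first */

    *size = srcLen;             /* then report the size actually needed */

    if (out == NULL || capacity < srcLen)
        return -1;              /* refuse to overflow the caller's buffer */

    memcpy(out, src, srcLen);
    return 0;
}
```

A caller that leaves the size variable uninitialized, or sets it to zero, hits the capacity check and gets nothing copied. That matches the behavior reported in the bug message above when *outPDFSize was zero.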

modified on Saturday, August 14, 2010 4:26 AM

Question: May I re-use your code under the "GPL 2 and later" license? (member monday2000, 2-Jul-10 1:32)
Hello Hao Hu.
 
May I re-use your code under the "GPL 2 and later" license?
Answer: Re: May I re-use your code under the "GPL 2 and later" license? (member Hao Hu, 2-Jul-10 7:33)
Although I wrote this small module for an existing product, I don't see any problem that would prevent you from using it. Feel free to use it, even in a commercial product.

Thanks a lot for the 5 stars.
General: My vote of 5 (member monday2000, 1-Jul-10 23:06)
I desperately sought a solution like this. It's the only one on the whole Web! Thanks a lot.
General: Bmp 2 pdf function (member Junson_Feng, 13-May-10 20:32)
Hi Hao:

Thanks for sharing this; it's useful for me to study.

But can you add a function that converts BMP format images to PDF?
 
Jason.Feng
General: Re: Bmp 2 pdf function (member Hao Hu, 13-May-10 21:20)
Hi, Jason,
 
What you need is a BMP to JPEG converter. Then you can pack JPEG into PDF.
NConvert is a very good image conversion program.
 
Good Luck.
Question: How to keep image's aspect ratio? (member Super Garrison, 6-Nov-09 17:50)
I'd appreciate it if someone could share how.
When the ratio changes, the PDF doesn't look good.
 
Super.
Answer: Re: How to keep image's aspect ratio? (member Hao Hu, 6-Nov-09 20:27)
This code inserts one JPEG file per page.
So possible solutions are:
* If all your image files are the same size, define the PDF page with the same aspect ratio.
* You can also set the correct margins based on your image files in the
"/* Contents Object in Page Object */" section of the Jpeg2PDF.c file.

Or, if you really need a complicated PDF generator, find a full-size PDF library.

Good luck.
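
The margin approach suggested above comes down to a small calculation: scale the image by the smaller of the two page-to-image ratios, then split the leftover space evenly. A minimal sketch follows; Rect and fit_centered are hypothetical names, and how the resulting values would be written into the page contents in Jpeg2PDF.c is not shown here.

```c
/* Hypothetical types and helper, not part of Jpeg2PDF. */
typedef struct {
    double x, y;   /* left and bottom (or top) margin */
    double w, h;   /* drawn size of the image */
} Rect;

/* Fits an imgW x imgH image inside a pageW x pageH page without
   changing its aspect ratio, and centers it. The result is in the
   same units as the page size. */
static Rect fit_centered(double imgW, double imgH,
                         double pageW, double pageH)
{
    Rect r = { 0.0, 0.0, 0.0, 0.0 };
    double sx, sy, s;

    if (imgW <= 0.0 || imgH <= 0.0)
        return r;                    /* nothing sensible to draw */

    sx = pageW / imgW;               /* scale that fills the width */
    sy = pageH / imgH;               /* scale that fills the height */
    s  = (sx < sy) ? sx : sy;        /* the smaller one fits both ways */

    r.w = imgW * s;
    r.h = imgH * s;
    r.x = (pageW - r.w) / 2.0;       /* equal margins left and right */
    r.y = (pageH - r.h) / 2.0;       /* equal margins top and bottom */
    return r;
}
```

For a 640 x 480 image on a Letter page of 612 x 792 points, this gives a drawn size of 612 x 459 with a vertical margin of 166.5 on each side. To keep a small image at its original size instead of enlarging it, the scale s could be capped at 1.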
Question: How to add more than 2 images? (member WaleedH, 11-Jul-09 4:54)
Hi...
Great App I gotta say :)
I need to create a PDF with 10 PDF images.... how can I do this?
 
Thanks

Answer: Re: How to add more than 2 images? (member Hao Hu, 11-Jul-09 13:03)
Thanks for your comment.
 
Just put all your images inside the same folder and run the program.
It should pick up all the JPEG images and generate the PDF file for you.
 
Good luck.
General: Re: How to add more than 2 images? [modified] (member WaleedH, 11-Jul-09 19:58)
Thanks....
But I tried, and only 3 images can be converted :(
Do the images need to be a fixed size?
 
Thanks
modified on Sunday, July 12, 2009 2:05 AM

General: Re: How to add more than 2 images? (member Hao Hu, 11-Jul-09 20:33)
I just tried and it picks up all 12 JPEG files.
It should be able to handle up to 256 JPEG files.
(If you modify the macro in the source file and recompile, you can increase the upper limit.)

If a JPEG file is larger than 8M, it might not be inserted correctly due to a Windows fread() issue.
(A single fread() can't read in more than 8M.)

In this case, let me know and I'll update the source code.
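
If a single large fread() is a concern, as this reply suggests, one possible workaround is to read the file in smaller pieces. The sketch below is an assumption about how that could look, not the author's fix; read_in_chunks and READ_CHUNK are hypothetical names, and the 64 KB chunk size is arbitrary.

```c
#include <stdio.h>
#include <stddef.h>

#define READ_CHUNK (64u * 1024u)   /* arbitrary chunk size for this sketch */

/* Hypothetical helper: reads up to bufSize bytes from fp into buf using
   repeated fread() calls of at most READ_CHUNK bytes each.
   Returns the total number of bytes read. */
static size_t read_in_chunks(FILE *fp, unsigned char *buf, size_t bufSize)
{
    size_t total = 0;

    while (total < bufSize) {
        size_t want = bufSize - total;
        size_t got;

        if (want > READ_CHUNK)
            want = READ_CHUNK;

        got = fread(buf + total, 1, want, fp);
        total += got;

        if (got < want)
            break;                 /* end of file or read error */
    }
    return total;
}
```

The caller can compare the return value with the expected file size to detect a short read.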
General: Re: How to add more than 2 images? (member WaleedH, 11-Jul-09 21:30)
Yeah, it worked!
I'm Sorry, I was using the demo :)
I have a small question: do you have this code in C#? I intend to use it within my application written in C#.
Thanks in advance.
 
Thanks

General: Re: How to add more than 2 images? (member Hao Hu, 11-Jul-09 21:48)
Glad to know that it works for you.
Sorry, I don't have it in C#.
But since the code requires only very basic ANSI C, it should be very straightforward to port it to any language.
Best of luck.
General: hi (member xyang_2009, 28-May-09 14:55)
thanks lot
General: Re: hi (member Hao Hu, 28-May-09 15:17)
My pleasure.
General: very good! (member ganfeng2003, 20-Dec-08 16:14)
very good program!
 
ganfeng

General: Re: very good! (member Hao Hu, 20-Dec-08 22:01)
Thanks a lot. Man. ;-P

Last Updated 19 Dec 2008
Article Copyright 2008 by Hao Hu