I have a PDF that contains three tables of purchase details. My task is to extract all three tables from the PDF and convert each one into an Excel sheet (three sheets in total) using C# code. I searched for three days, but all I could find was code that extracts the text from a PDF without any formatting. I can't purchase any third-party tools. I need a way to at least extract the text in a proper table layout, which I will then convert to Excel using Interop, or code that converts directly to Excel. Whatever the solution is, I need it urgently. Please help.
Updated 29-Sep-13 2:57am

The sample below prints an Excel workbook to PDF through Universal Document Converter. See the links after it for more information.

////////////////////////////////////////////////////////////////////////////////////////////////////
// This example was designed for use in Microsoft Visual C# from 
// Microsoft Visual Studio 2003 or above.
//
// 1. Microsoft Excel 97 or above should be installed and activated on your PC.
//
// 2. Before using this example, please read this article from Microsoft Excel 2003 knowledge base:
//    http://support.microsoft.com/kb/320369/en-us/
//    A workaround for this issue is available in this example.
//
// 3. Universal Document Converter 5.2 or above should be installed, too.
//
// 4. Add references to "Microsoft Excel XX.0 Object Library" and "Universal Document Converter Type Library"
//    using the Project | Add Reference menu > COM tab.
//    XX is the Microsoft Office version installed on your computer.
////////////////////////////////////////////////////////////////////////////////////////////////////
 
using System;
using System.IO;
using UDC;
using Excel = Microsoft.Office.Interop.Excel; //using Excel; in VS2003
 
namespace ExcelToPDF
{
    class Program
    {
        static void PrintExcelToPDF(string ExcelFilePath)
        {
            //Create a UDC object and get its interfaces
            IUDC objUDC = new APIWrapper();
            IUDCPrinter Printer = objUDC.get_Printers("Universal Document Converter");
            IProfile Profile = Printer.Profile;
 
            //Use the Universal Document Converter API to change the settings of the converted document
            Profile.PageSetup.ResolutionX = 600;
            Profile.PageSetup.ResolutionY = 600;
 
            Profile.FileFormat.ActualFormat = FormatID.FMT_PDF;
 
            Profile.FileFormat.PDF.ColorSpace = ColorSpaceID.CS_TRUECOLOR;
            Profile.FileFormat.PDF.Multipage = MultipageModeID.MM_MULTI;
 
            Profile.OutputLocation.Mode = LocationModeID.LM_PREDEFINED;
            Profile.OutputLocation.FolderPath = @"c:\UDC Output Files";
            Profile.OutputLocation.FileName = @"&[DocName(0)] -- &[Date(0)] -- &[Time(0)].&[ImageType]";
            Profile.OutputLocation.OverwriteExistingFile = false;
 
            Profile.PostProcessing.Mode = PostProcessingModeID.PP_OPEN_FOLDER;
 
            //Create an Excel Application object
            Excel.Application ExcelApp = new Excel.ApplicationClass();
 
            Object ReadOnly = true;
            Object Missing = Type.Missing; //Passed whenever we do not want to supply a value
 
            //If you run an English version of Excel on a computer whose regional settings are configured for a non-English language, you must set the CultureInfo before calling Excel methods.
            System.Threading.Thread.CurrentThread.CurrentCulture = new System.Globalization.CultureInfo("en-US");
            //Open the document from a file
            Excel.Workbook Workbook = ExcelApp.Workbooks.Open(ExcelFilePath, Missing, ReadOnly, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing);
 
            //Change active worksheet settings and print it
            Excel.Worksheet Worksheet = (Excel.Worksheet)Workbook.ActiveSheet;
            Excel.PageSetup PageSetup = Worksheet.PageSetup;
 
            PageSetup.Orientation = Excel.XlPageOrientation.xlLandscape;
 
            Object Preview = false;
            Worksheet.PrintOut(Missing, Missing, Missing, Preview, "Universal Document Converter", Missing, Missing, Missing);
 
            //Close the spreadsheet without saving changes
            Object SaveChanges = false;
            Workbook.Close(SaveChanges, Missing, Missing);
 
            //Close Microsoft Excel
            ExcelApp.Quit();
        }
 
        static void Main(string[] args)
        {
            string TestFilePath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "TestFile.xls");
            PrintExcelToPDF(TestFilePath);
        }
    }
}


For more info:

http://social.msdn.microsoft.com/Forums/vstudio/en-US/a56b093b-2854-4925-99d5-2d35078c7cd3/converting-pdf-file-into-excel-file-using-c

http://stackoverflow.com/questions/769246/xls-to-pdf-conversion-inside-net

Convert data from PDF invoice to Excel CSV file in C# using PDF Extractor SDK

http://bytescout.com/products/developer/pdfextractorsdk/extract-from-pdf-to-excel-csv-in-csharp

How To Convert PDF to Excel in .NET Framework

http://www.moretechtips.net/2013/01/how-to-convert-pdf-to-excel-in-net.html

I hope this helps.
 
Comments
sundaram meenakshi 29-Sep-13 9:26am    
Thank you @Sampath Lokuge for your reply, but the code and the first two links are for converting Excel to PDF; I want PDF to Excel. The Bytescout link produces exactly the output I want, but I need free DLLs, so please suggest some.
Sampath Lokuge 29-Sep-13 10:07am    
What about the last link? Did you check that?
sundaram meenakshi 30-Sep-13 0:05am    
@Sampath Lokuge The last link requires running a tool from the command-line interface, but I want my WinForms application to perform this process. That is the problem :(
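A command-line converter can usually still be driven from a WinForms application by starting it as a hidden child process. The following is only a minimal sketch under stated assumptions: the class name `ConverterLauncher`, the tool path, and the `<input> <output>` argument order are hypothetical and must be adapted to whichever converter is actually used. Only `System.Diagnostics.Process` and `ProcessStartInfo` from the .NET Framework are relied on.

```csharp
using System;
using System.Diagnostics;

static class ConverterLauncher
{
    // Builds the start settings for a command-line converter.
    // ASSUMPTION: the tool takes "<input> <output>" as its arguments;
    // check the real tool's documentation for its actual syntax.
    public static ProcessStartInfo BuildStartInfo(string toolPath, string pdfPath, string outputPath)
    {
        return new ProcessStartInfo
        {
            FileName = toolPath,
            // Quote both paths so that spaces in file names survive.
            Arguments = "\"" + pdfPath + "\" \"" + outputPath + "\"",
            UseShellExecute = false,      // needed for redirection and a hidden console
            CreateNoWindow = true,        // no console window pops up over the form
            RedirectStandardError = true  // lets the caller read error messages
        };
    }

    // Starts the tool, waits for it to finish, and returns its exit code.
    public static int Run(ProcessStartInfo startInfo)
    {
        using (Process process = Process.Start(startInfo))
        {
            // Read stderr before waiting so a full pipe cannot block the tool.
            string errors = process.StandardError.ReadToEnd();
            process.WaitForExit();
            if (process.ExitCode != 0)
                Console.Error.WriteLine(errors);
            return process.ExitCode;
        }
    }
}
```

From a button click handler this could be called as `ConverterLauncher.Run(ConverterLauncher.BuildStartInfo(toolPath, pdfPath, csvPath))`. For long conversions it is probably better to run it on a background thread so the form stays responsive.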
For that you need a third-party tool, because I don't think .NET supports this on its own. Many third-party DLLs are available that you can use in your project to implement the desired functionality. Some of them are:
PDF Converter Services

iTextSharp

Excel to PDF .NET
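Whichever library extracts the text, the table layout usually has to be rebuilt from the raw lines afterwards. The sketch below uses only the .NET base class library and makes a strong assumption: each table row arrives as one line of text, with columns separated by at least two whitespace characters. Whether extracted text really keeps such gaps depends on the PDF and on the extractor, so treat this as an illustration and not as a general solution. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class TableText
{
    // Splits one line of extracted text into cells wherever two or more
    // whitespace characters occur in a row (ASSUMED column separator).
    public static string[] SplitColumns(string line)
    {
        return Regex.Split(line.Trim(), @"\s{2,}");
    }

    // Quotes a cell for CSV when it contains a comma, a quote, or a line break.
    public static string EscapeCsv(string cell)
    {
        if (cell.IndexOfAny(new[] { ',', '"', '\n', '\r' }) < 0)
            return cell;
        return "\"" + cell.Replace("\"", "\"\"") + "\"";
    }

    // Converts the non-blank lines into CSV text, one row per line.
    public static string ToCsv(IEnumerable<string> lines)
    {
        var rows = lines
            .Where(l => !string.IsNullOrWhiteSpace(l))
            .Select(l => string.Join(",", SplitColumns(l).Select(EscapeCsv)));
        return string.Join("\n", rows);
    }
}
```

The resulting text can be saved with `File.WriteAllText` as a `.csv` file, which Excel opens directly, or the cells returned by `SplitColumns` can be written into a worksheet through Interop.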
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


