
FDF .NET Parser

Introduction

FDF stands for Forms Data Format. Similar to XML, FDF is used to store data, for example for archiving purposes.

The MIME type for FDF files is application/vnd.fdf, and they can be opened by the Acrobat PDF plug-in.

Background

If you look inside an FDF file, you will see that its structure is straightforward: it consists of a list of field name-value pairs, followed by a URL to the actual PDF file containing the form to be filled with this data.

Using the code

Since the structure of the file is straightforward, parsing it was easy too. I used a regular expression (regex) to find the field name-value pairs and the URL. I also added a property, PDF, that downloads the PDF file on first access and returns it as a byte array.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.IO;
using System.Net;

namespace FDF
{
    public class Parser
    {
        public static FDFData Parse(String FileName)
        {
            FDFData result = new FDFData();

            // Read the whole file; "using" closes the reader even if reading throws.
            String content;
            using (StreamReader reader = new StreamReader(FileName))
            {
                content = reader.ReadToEnd();
            }

            // Matches either a field entry, << /V (value) /T (name)  >>,
            // or the form location, /F (url).
            string strRegex = 
                @"<<\s/V\s\((?<Value>.*?)\)\s/T\s\((?<Name>.*?)\)\s\s>>|/F\s\((?<URL>.*?)\)";
            Regex myRegex = new Regex(strRegex, RegexOptions.None);
            foreach (Match myMatch in myRegex.Matches(content))
            {
                if (!String.IsNullOrEmpty(myMatch.Groups["URL"].Value))
                    result.Url = myMatch.Groups["URL"].Value;
                else
                    // Remove the backslashes used to escape characters in values.
                    result.Fields[myMatch.Groups["Name"].Value] =
                            myMatch.Groups["Value"].Value.Replace("\\", "");
            }

            return result;
        }
    }
    public class FDFData
    {
        private Dictionary<String, String> _Fields;
        private String _Url;
        private byte[] _PDF;

        public byte[] PDF
        {
            get 
            {
                // Download the form lazily, on first access, and cache the bytes.
                if (_PDF == null)
                    using (WebClient client = new WebClient())
                        _PDF = client.DownloadData(this.Url);
                return _PDF; 
            }
            set { _PDF = value; }
        }
        public FDFData()
        {
            _Fields = new Dictionary<string, string>();
            _Url = "";
        }

        public String Url
        {
            get { return _Url; }
            set { _Url = value; }
        }

        public Dictionary<String, String> Fields
        {
            get { return _Fields; }
            set { _Fields = value; }
        }

    }
} 
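
To make the pattern easier to follow, here is a minimal sketch that runs the same regular expression against an inline string instead of a file. The sample text, the class name FdfRegexDemo, the method Extract, and the example.com URL are all made up for illustration; the sample is spaced exactly the way the pattern expects (one space after "<<", two spaces before ">>"), and real FDF files may be laid out differently.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FdfRegexDemo
{
    // Hypothetical FDF fragment, written to match the pattern's spacing.
    public const string Sample =
        "<< /V (John) /T (FirstName)  >>\n" +
        "<< /V (Smith) /T (LastName)  >>\n" +
        "/F (http://example.com/form.pdf)";

    // Same pattern as in Parser.Parse: a field entry or the /F form URL.
    private const string Pattern =
        @"<<\s/V\s\((?<Value>.*?)\)\s/T\s\((?<Name>.*?)\)\s\s>>|/F\s\((?<URL>.*?)\)";

    // Returns the field name-value pairs and hands back the URL via "url".
    public static Dictionary<string, string> Extract(string fdf, out string url)
    {
        var fields = new Dictionary<string, string>();
        url = "";
        foreach (Match m in Regex.Matches(fdf, Pattern))
        {
            // Only one side of the alternation takes part in each match.
            if (m.Groups["URL"].Success)
                url = m.Groups["URL"].Value;
            else
                fields[m.Groups["Name"].Value] = m.Groups["Value"].Value;
        }
        return fields;
    }

    public static void Main()
    {
        string url;
        Dictionary<string, string> fields = Extract(Sample, out url);
        foreach (KeyValuePair<string, string> pair in fields)
            Console.WriteLine(pair.Key + " = " + pair.Value);
        Console.WriteLine("URL: " + url);
    }
}
```

Unlike Parser.Parse, this sketch checks Groups["URL"].Success rather than testing for an empty string, and it does not strip backslashes from the values.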

Points of Interest

FDF files can include duplicate field names with the same or different values. If you run into this case, something is probably wrong in the FDF creation process. To keep the fields searchable by name, I used a Dictionary, which means that when a name is repeated, the last value read replaces the earlier one.
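
If you want to detect duplicates rather than silently keep the last value, one possible approach is to collect every value per field name and then look for names that occur more than once. The following is only a sketch under that assumption: the class FdfDuplicateDemo, the method CollectAll, and the sample data are hypothetical and not part of the parser above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FdfDuplicateDemo
{
    // Field half of the parser's pattern; the /F URL part is not needed here.
    private const string FieldPattern =
        @"<<\s/V\s\((?<Value>.*?)\)\s/T\s\((?<Name>.*?)\)\s\s>>";

    // Maps each field name to every value found for it, in file order.
    public static Dictionary<string, List<string>> CollectAll(string fdf)
    {
        var all = new Dictionary<string, List<string>>();
        foreach (Match m in Regex.Matches(fdf, FieldPattern))
        {
            string name = m.Groups["Name"].Value;
            List<string> values;
            if (!all.TryGetValue(name, out values))
            {
                values = new List<string>();
                all[name] = values;
            }
            values.Add(m.Groups["Value"].Value);
        }
        return all;
    }

    public static void Main()
    {
        // Made-up sample in which "Color" is defined twice.
        string sample =
            "<< /V (Red) /T (Color)  >>\n" +
            "<< /V (Blue) /T (Color)  >>\n" +
            "<< /V (Large) /T (Size)  >>";

        foreach (KeyValuePair<string, List<string>> pair in CollectAll(sample))
            if (pair.Value.Count > 1)
                Console.WriteLine(pair.Key + " is duplicated: " +
                                  string.Join(", ", pair.Value));
    }
}
```

A caller could then decide whether to reject the file, keep the first value, or keep the last one as the Dictionary in the parser does.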

History 

Version 1.0 10/02/2012.

