Hi,

I am using iTextSharp to read data from a PDF file, but I am having trouble reading the value of a checkbox. I can't use OCR here, because that would mean converting the PDF to an image first and then running OCR on it.

Without OCR, using only iTextSharp, is there any way to read a checkbox's value based on its coordinates?

I am able to read all other content from the PDF using iTextSharp; only the checkbox value is missing.

Please help me with this.

1 solution

You might try the code below. As I remember, it is not entirely straightforward: you have to read the value of the field itself. If the checkbox does not have a field name, you will need to get a reference to it and read its values. The result does not automagically translate to a bool.
C#
// Requires: using System.Text; using iTextSharp.text.pdf;
public string GetCheckboxValue(string src, string name)
{
   PdfReader reader = new PdfReader(src);
   try
   {
      AcroFields fields = reader.AcroFields;
      // name is the name of a check box field, e.g. "CP_1"
      string[] values = fields.GetAppearanceStates(name);
      StringBuilder sb = new StringBuilder();
      foreach (string value in values)
      {
         sb.Append(value);
         sb.Append('\n');
      }
      return sb.ToString();
   }
   finally
   {
      reader.Close();
   }
}

Or you can use this (from "Get the Export Value of a Checkbox"):
C#
public string GetCheckBoxExportValue(AcroFields fields, string cbFieldName)
{
    AcroFields.Item item = fields.GetFieldItem(cbFieldName);
    // GetFieldItem returns null when no field has this name
    if (item != null && item.values.Count > 0)
    {
        PdfDictionary valueDict = item.values[0] as PdfDictionary;
        PdfDictionary appDict = valueDict.GetAsDict(PdfName.AP);

        if (appDict != null)
        {
            PdfDictionary normalApp = appDict.GetAsDict(PdfName.N);

            // /N may be missing or may not be a dictionary of states
            if (normalApp != null)
            {
                foreach (PdfName curKey in normalApp.Keys)
                {
                    if (!PdfName.OFF.Equals(curKey))
                    {
                        // string will have a leading '/' character
                        return curKey.ToString();
                    }
                }
            }
        }

        PdfName curVal = valueDict.GetAsName(PdfName.AS);
        if (curVal != null)
        {
            return curVal.ToString();
        }

    }

    return null;
}
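
Both snippets return state names as strings (for example "Off", "Yes" or "/Yes"), not a bool. A minimal sketch of the last step is below. `CheckboxState` and `IsChecked` are hypothetical names, not part of iTextSharp, and the sketch assumes you pass in the field's current state, for example the string returned by `fields.GetField(name)` or the `/AS` name read above, rather than the list of possible states.

```csharp
using System;

public static class CheckboxState
{
    // Hypothetical helper, not part of iTextSharp.
    // rawValue: the current state of the check box as a string,
    // e.g. "Yes", "/Yes", "Off" or "" (assumed to be read elsewhere).
    public static bool IsChecked(string rawValue)
    {
        if (string.IsNullOrEmpty(rawValue))
        {
            return false; // no value stored: treat as unchecked
        }

        // PdfName.ToString() keeps the leading '/', so strip it first.
        string state = rawValue.TrimStart('/');

        // "Off" is the conventional name of the unchecked state;
        // any other non-empty name is treated as checked.
        return state.Length > 0 && !state.Equals("Off", StringComparison.Ordinal);
    }
}
```

The name of the "on" state varies from form to form ("Yes", "On", "1", ...), which is why the sketch tests against "Off" instead of a fixed "on" value.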
 
Comments
KVPalem 15-Jun-15 3:16am    
Thanks for the response. I think the above solution will work for e-form PDFs but not for vector or flattened PDFs, because in normal PDFs (i.e. anything other than an e-form) we can't get the AcroFields using iTextSharp.

I don't have any problem with e-forms; I am able to read all the data. The only problem is with non-editable PDF forms.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


