Is there a magic solution? Is it possible to create a PDF document based on an existing document without losing all its accessibility tags?
Our situation is quite complex here and I am looking for a clean and elegant solution to our problem that we have been trying to solve for quite some time.
Our system lets the customer upload a PDF file and provides a UI in which the customer can draw on the file, write text, and add signatures,
and finally download the modified file through the browser.
The problem occurs when the customer uploads a PDF file that is tagged for accessibility (for users with disabilities).
When the customer downloads the PDF, it is no longer accessible and all its tags are gone.
We use PDFsharp as the library for all the manipulations on the PDF.
Our code does not reference the accessibility tags at all, so we are now trying to work out how to find them and keep them while the code runs.
In a nutshell, this is what our code does:
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;
using System;
using System.Collections.Generic;
using System.IO;

private PdfDocument DrawOnPdf(Stream stream, dynamic layers, dynamic parameters, bool bIsAutomated = false)
{
    // This is the object generated from the PDF the client uploaded
    PdfSharp.Drawing.XPdfForm formOriginal = XPdfForm.FromStream(stream);
    // We now create a new PdfSharp.Pdf.PdfDocument
    PdfDocument outputDocument = new PdfDocument();
    XGraphics gfx = null;
    XRect box;
    outputDocument.PageLayout = PdfPageLayout.SinglePage;
    // We're iterating through all of the uploaded PDF pages
    for (int idx = 1; idx <= formOriginal.PageCount; idx += 1)
    {
        // Add a new page to the output document
        PdfPage page = outputDocument.AddPage();
        // Set page number (which is one-based)
        formOriginal.PageNumber = idx;
        page.Orientation = formOriginal.Page.Orientation;
        page.Width = formOriginal.Page.Width;
        page.Height = formOriginal.Page.Height;

        double width = page.Width;
        double height = page.Height;
        gfx = XGraphics.FromPdfPage(page);
        box = new XRect(0, 0, width, height);
        // Draw the page identified by the page number like an image
        gfx.DrawImage(formOriginal, box);
        dynamic currentPage = GetPageLayer(layers, idx, idx == formOriginal.PageCount);
        // This method of ours can write text, date, block, radio, circle etc.
        DrawPdfLayer(gfx, currentPage, parameters, bIsAutomated);
    }

    return outputDocument;
}

private static void DrawPdfLayer(XGraphics gfx, dynamic page, dynamic parameters, bool bIsAutomated = false)
{
    object item = null;

    try
    {
        foreach (var layer in page.layer)
        {
            item = layer;

            double x = layer.x;
            double y = layer.y;

            // Resolve the value bound to this layer: either a JSON array or a single string
            string input = null;
            dynamic arrInput = null;
            if (parameters != null && layer.input_id != null &&
                    parameters[layer.input_id.ToString()] != null)
            {
                var tmp = parameters[layer.input_id.ToString()];
                if (tmp is Newtonsoft.Json.Linq.JArray)
                    arrInput = tmp;
                else
                    input = tmp.ToString();
            }

            bool clear = (bool)layer.cleardash;
            if (clear && input != null)
                input = input.Replace("-", "   ");

            // Keep only the part before '@'
            bool emailUser = (bool)layer.emailuser;
            if (emailUser && !String.IsNullOrEmpty(input))
            {
                // Guard against a missing '@': Remove(-1) would throw
                int at = input.IndexOf('@');
                if (at >= 0)
                    input = input.Remove(at);
            }

            // Keep only the part after '@'
            bool emailDomain = (bool)layer.emaildomain;
            if (emailDomain && !String.IsNullOrEmpty(input))
                input = input.Remove(0, input.IndexOf('@') + 1);

            var brush = XBrushes.Black;
            if (layer.color != null)
                brush = new XSolidBrush(XColor.FromArgb(255, (int)layer.color.R, (int)layer.color.G, (int)layer.color.B));

            string layerType = layer.type.ToString();
            switch (layerType)
            {
                case "text":
                    WriteText(ref gfx, brush, layer, input, x, y);
                    break;

                case "numberinput":
                    int maxLength = (int)layer.maxlength;
                    if (input != null)
                    {
                        if (input.Length < maxLength && maxLength > 0)
                            input = input.PadLeft(maxLength);
                    }
                    WriteText(ref gfx, brush, layer, input, x, y);
                    break;

                case "date":
                    WriteDate(ref gfx, brush, layer, input, x, y);
                    break;

                case "block":
                    WriteBlock(ref gfx, brush, layer, input, x, y);
                    break;

                case "circle":
                case "radio":
                    if (input != null) arrInput = new string[] { input };

                    if (arrInput != null)
                    {
                        foreach (var val in arrInput)
                        {
                            if (layerType == "radio")
                                WriteRadio(ref gfx, brush, layer, val.ToString(), x, y, bIsAutomated);
                            else
                                WriteCircle(ref gfx, brush, layer, val.ToString(), x, y);
                        }
                    }
                    break;
            }
        }
    }
    catch (Exception ex)
    {
        // (handler body omitted in the post)
    }
}

What I have tried:

What we tried so far was cloning the PdfDocument object, saving all of its pages to a temporary list, removing all of the original pages, and then working on the emptied PDF, which we expected to still carry the accessibility tags. We found that the tags were missing there as well.
Updated 20-Dec-22 12:02pm
[no name] 19-Dec-22 9:29am
oronsultan 19-Dec-22 10:27am    
Thanks for the answer, Gerry, but we saw that post too. Read till the end :-). We tried that; it's not working right.

1 solution

You can try to copy the accessibility tags from the original document to the new document. The tags are stored in the document's structure tree, which the PDF catalog references through its /StructTreeRoot entry. The snippet below assumes the library exposes that tree as a PdfDocument.StructureTreeRoot property and that it can be cloned with a Clone method; check both against the PDFsharp version you use, since they may not be available there.
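To see whether a given file carries a structure tree in the first place, the catalog can be inspected directly. This is a minimal sketch, assuming PDFsharp's PdfDocument.Internals.Catalog accessor and its Elements dictionary; the class and method names (TaggedPdfCheck, HasStructureTree) are hypothetical and not part of the library.

```csharp
using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// Hypothetical helper, not part of PDFsharp.
public static class TaggedPdfCheck
{
    // Returns true when the document catalog has a /StructTreeRoot entry,
    // i.e. the file has a structure tree that can hold accessibility tags.
    public static bool HasStructureTree(Stream pdfStream)
    {
        // Import mode is sufficient for read-only inspection.
        using (PdfDocument document = PdfReader.Open(pdfStream, PdfDocumentOpenMode.Import))
        {
            PdfDictionary catalog = document.Internals.Catalog;
            return catalog.Elements.ContainsKey("/StructTreeRoot");
        }
    }
}
```

Running the check on the uploaded file and again on the generated file should show at which step the entry disappears. PdfReader.Open reads from the stream, so rewind it (stream.Position = 0) before handing it to other code.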

So, after copying the pages, also copy the StructureTreeRoot, like this:

// Load the original document
PdfDocument originalDocument = PdfReader.Open("original.pdf");

// Create a new document
PdfDocument newDocument = new PdfDocument();

// Copy the pages from the original document to the new document
foreach (PdfPage page in originalDocument.Pages)
{
    PdfPage newPage = newDocument.AddPage();
    newPage.Size = page.Size;
    newPage.Orientation = page.Orientation;
    XGraphics gfx = XGraphics.FromPdfPage(newPage);
}

// Copy the structure tree from the original document to the new document
if (originalDocument.StructureTreeRoot != null)
{
    newDocument.StructureTreeRoot = originalDocument.StructureTreeRoot.Clone();
}

// Save the new document
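
An alternative worth testing is to skip the new document entirely and draw on the pages of the uploaded file. A freshly created PdfDocument starts with its own catalog, so nothing from the original catalog comes along unless it is copied explicitly, whereas a document opened in Modify mode keeps its existing objects. The sketch below assumes that approach; InPlaceAnnotator and DrawOnExistingPages are hypothetical names, and the rectangle is only a placeholder for the real drawing code (for example a call to DrawPdfLayer). Whether the tags survive saving depends on the PDFsharp version and on the file, so the result should be verified with an accessibility checker.

```csharp
using System.IO;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// Hypothetical helper, not part of PDFsharp.
public static class InPlaceAnnotator
{
    public static PdfDocument DrawOnExistingPages(Stream uploadedPdf)
    {
        // Modify mode opens the uploaded file itself for editing
        // instead of importing its pages into a new document.
        PdfDocument document = PdfReader.Open(uploadedPdf, PdfDocumentOpenMode.Modify);

        foreach (PdfPage page in document.Pages)
        {
            // Append adds the new drawing after the page's existing content.
            using (XGraphics gfx = XGraphics.FromPdfPage(page, XGraphicsPdfPageOptions.Append))
            {
                // Placeholder for the real drawing code.
                gfx.DrawRectangle(XPens.Black, 40, 40, 100, 50);
            }
        }

        return document;
    }
}
```

The content drawn this way is itself untagged, so a validator may still flag the added text or shapes even if the original tags are preserved.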

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
