Tags: C#, HTML, PDF, iTextSharp
I keep getting a null reference exception when I attempt to parse an HTML string, built in a StringBuilder object, into a list. If I swap out the StringBuilder object for some hard-coded HTML, it goes off without a hitch. I know the StringBuilder is actually doing its job because I have it outputting to a textbox. It displays flawless HTML, but the exception is thrown when I attempt to pass that string to the HTMLWorker class. Does anyone have any clue why it would generate output when it is referenced as a string, but not when I attempt to drop it in the list object? I've included my code; the error occurs on the line that calls HTMLWorker.ParseToList.
 
private void createPDF()
{
    openFileDialog.Filter = "HTML Files (.html)|*.html";
    openFileDialog.FilterIndex = 0;
    if (openFileDialog.ShowDialog() == DialogResult.OK)
    {
        System.Text.StringBuilder store = new System.Text.StringBuilder();
        string fileName = openFileDialog.FileName;
 
        try
        {
            using (System.IO.StreamReader htmlReader = new System.IO.StreamReader(openFileDialog.FileName))
            {
                string line;
                while ((line = htmlReader.ReadLine()) != null)
                {
                    store.Append(line + Environment.NewLine);
                }
                this.textBox1.Text = store.ToString();
                string html = store.ToString();
                Document document = new Document(PageSize.LETTER, 20, 25, 35, 35);
 

                PdfWriter.GetInstance(document, new FileStream("c:\\Data/my.pdf", FileMode.Create));
                document.Open();
                // The NullReferenceException is thrown by the ParseToList call on the next line.
                System.Collections.Generic.List<IElement> htmlarraylist = new List<IElement>(HTMLWorker.ParseToList(new StringReader(html), new StyleSheet()));
                foreach (IElement element in htmlarraylist)
                {
                    document.Add(element);
                }
                document.Close();
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.ToString());
        }
    }
}

Like I said, if I replace the line that assigns the StringBuilder contents to the string 'html' with a hard-coded HTML string, it works fine. The null reference exception occurs every time on the HTMLWorker.ParseToList line.
Posted 6-Jul-12 6:14am
Comments
Wes Aday at 6-Jul-12 11:30am
   
Try changing new StringReader(html) to new StringReader(html).ToString() and see what happens.
MikeVaros at 6-Jul-12 11:43am
   
"Cannot convert from 'string' to 'System.IO.TextReader'" I'm almost positive I tried that myself yesterday, but I figured it may be worth a shot. Thank you for the response though. It seems to think that the "store" String is empty. If I manually enter the html as a string, it works perfectly, but this isn't dynamic enough to suit my needs. I'm banging my head off the wall trying to figure this out.
 
I'm just going out on a limb here, but if the size of the string exceeds Stringbuilder's default capacity, could it return a null exception?
Wes Aday at 6-Jul-12 11:48am
   
Probably, but it would have to be a really huge string to exceed the capacity. My bad, I read your line wrong anyway. What is it on that line that is null?
MikeVaros at 6-Jul-12 11:55am
   
I'll paste in my stack trace. It's pretty large to describe easily.
 
at iTextSharp.text.html.simpleparser.HTMLWorker.CreateLineSeparator(IDictionary`2 attrs)
at iTextSharp.text.html.simpleparser.HTMLTagProcessors.HTMLTagProcessor_HR.StartElement(HTMLWorker worker, String tag, IDictionary`2 attrs)
at iTextSharp.text.html.simpleparser.HTMLWorker.StartElement(String tag, IDictionary`2 attrs)
at iTextSharp.text.xml.simpleparser.SimpleXMLParser.ProcessTag(Boolean start)
at iTextSharp.text.xml.simpleparser.SimpleXMLParser.Go(TextReader reader)
at iTextSharp.text.xml.simpleparser.SimpleXMLParser.Parse(ISimpleXMLDocHandler doc, ISimpleXMLDocHandlerComment comment, TextReader r, Boolean html)
at iTextSharp.text.html.simpleparser.HTMLWorker.Parse(TextReader reader)
at iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(TextReader reader, StyleSheet style, IDictionary`2 tags, Dictionary`2 providers)
at iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(TextReader reader, StyleSheet style, Dictionary`2 providers)
at iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(TextReader reader, StyleSheet style)
 
Thank you for your help,
 
Mike
Wes Aday at 6-Jul-12 12:08pm
   
Sorry. Nothing is standing out to me. The only thing I can think of is that the StringReader is returning null... ah ha.... I think I see it now. In your StringBuilder you are appending newlines (CR/LF), which is not good HTML. In HTML you would want <br>, so I think that your StringReader is choking on that.
Wes Aday at 6-Jul-12 12:09pm
   
Meant to say your StringReader is choking on the newlines not the stringbuilder.
MikeVaros at 6-Jul-12 13:36pm
   
I feel like an idiot now, but I suppose I'd prefer it was something simple as opposed to a logic error. I was using the horizontal line tag (<hr>) in each of the HTML pages I was trying to run through the program, and it doesn't look like HTMLWorker has any way of handling this tag. The reason my hard-coded HTML worked was that I didn't use that tag in there; I was just using a short snippet in a string. I figured it out the second I tried to insert one of those "hr" tags into that string. I instantly got a null reference exception, and it clicked.
 
Thank you for helping, and I'll keep that in mind about the <br> tags if I notice any formatting issues with the string builder.
 
Mike
MikeVaros at 6-Jul-12 11:59am
   
It only seems to error out when that string "html" is created with a stringbuilder object. So the problem is there. The only problem is, that sb object renders the correct output to the textbox, so I know it isn't empty. And it doesn't truncate any text, so I know it isn't going beyond capacity.
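
On the capacity question raised in the comments: as far as I know, a StringBuilder simply grows its internal buffer as text is appended, so going past the default capacity does not produce a null string. The sketch below is only an illustration of that behavior; the class and method names (CapacityDemo, BuildLongString) are made up for this example and are not part of the original code.

```csharp
using System;
using System.Text;

// Hypothetical helper, not from the original post.
public static class CapacityDemo
{
    // Appends many lines to a StringBuilder that starts at its default
    // (small) capacity and returns the resulting string.
    public static string BuildLongString(int lineCount)
    {
        StringBuilder sb = new StringBuilder(); // no capacity specified
        for (int i = 0; i < lineCount; i++)
        {
            // The buffer is enlarged automatically whenever it fills up.
            sb.AppendLine("<p>line " + i + "</p>");
        }
        return sb.ToString(); // never null, however long the text is
    }
}
```

Calling CapacityDemo.BuildLongString(1000) returns the full text of all 1000 lines, which matches what the textbox output already suggested: the StringBuilder is not the source of the exception.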

1 solution

Solution 1

Glad you got it working. Solution here is to make the question drop off the unanswered list.
 
The problem was a null reference exception thrown by HTMLWorker because the HTML contained horizontal rule (<hr>) tags, which the parser was unable to handle. The solution was to remove those tags.
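
If editing the HTML files by hand is not practical, one option is to strip the tags in code before handing the string to HTMLWorker. This is only a minimal sketch under some assumptions: the names HtmlPreprocessor and StripHorizontalRules are hypothetical, and the regular expression assumes simple <hr> tags (it is not a full HTML parser and would be confused by a ">" inside an attribute value).

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of iTextSharp or the original post.
public static class HtmlPreprocessor
{
    // Matches <hr>, <hr/>, <hr /> and <hr ...attributes...>, ignoring case.
    // The \b keeps it from matching other tags that merely start with "hr".
    private static readonly Regex HrTag =
        new Regex(@"<hr\b[^>]*>", RegexOptions.IgnoreCase);

    // Returns the HTML with every horizontal rule tag removed.
    public static string StripHorizontalRules(string html)
    {
        return HrTag.Replace(html, string.Empty);
    }
}
```

In the code from the question, this would be applied just before parsing, for example `string html = HtmlPreprocessor.StripHorizontalRules(store.ToString());`, with the rest of the method left unchanged.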

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 6 Jul 2012
Copyright © CodeProject, 1999-2014