Hello,
I'm trying to remove whitespace text and comments from an HTML file, so I wrote the code below, but it throws an error.
Any idea how to fix it?
Thank you.
public void RemoveWhAndComments(DHtmlNode node)
{
    DHtmlText text = node as DOL.DHtml.DHtmlParser.Node.DHtmlText;
    if (text != null)
    {
        if (text.IsWhiteSpace)
        {
            node.Parent.Nodes.RemoveAt(text.NodeID);
        }
        return;
    }
    DHtmlElement element = node as DOL.DHtml.DHtmlParser.Node.DHtmlElement;
    if (element != null)
    {
        for (int i = 0; i < element.Nodes.Count; i++)
        {
            RemoveWhAndComments(element.Nodes[i]);
        }
        return;
    }
}
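The error message isn't shown, so this is only a guess: removing entries from Nodes while a forward loop is still walking the same collection tends to skip siblings, and NodeID may not be the node's position in its parent's list. Below is a minimal sketch of the usual workaround, which iterates backwards and removes by index. Node, TextNode, ElementNode and WhitespaceCleaner are hypothetical stand-ins, not the library's classes, and comment nodes are left out.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the library's node types; not the DOL.DHtml API.
public abstract class Node { }

public sealed class TextNode : Node
{
    public string Text = "";
    public bool IsWhiteSpace => string.IsNullOrWhiteSpace(Text);
}

public sealed class ElementNode : Node
{
    public readonly List<Node> Nodes = new List<Node>();
}

public static class WhitespaceCleaner
{
    // Removes whitespace-only text nodes from the whole subtree.
    public static void RemoveWhitespace(ElementNode element)
    {
        // Walk backwards so RemoveAt(i) never shifts an index we still have to visit.
        for (int i = element.Nodes.Count - 1; i >= 0; i--)
        {
            Node child = element.Nodes[i];
            if (child is TextNode text)
            {
                if (text.IsWhiteSpace)
                {
                    // Remove by position in the parent's list, not by an ID.
                    element.Nodes.RemoveAt(i);
                }
            }
            else if (child is ElementNode childElement)
            {
                RemoveWhitespace(childElement);
            }
        }
    }
}
```

The parent does the removing here, so the recursion never needs node.Parent; the same loop shape should carry over to the real classes if their Nodes collection supports Count, an indexer and RemoveAt, as the code above suggests.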
Hello James,
You originally licensed this work under the GPL. Later you explicitly permitted use in commercial applications and modification of the source code (see the posts below). Is the project in the public domain, or is it under a license such as BSD or LGPL? I would like to modify your library, so I want to know the exact license.
Kind regards, Tom
When I hover over or select text in the HTML, the left frame shows the node for that text. I can then move to the parent or child node and get the DOM XPath.
Many thanks.
Is there a WebBrowser control in this parser?
_____________________________
Don't download it, make it.
Visual Basic /C#
That's just what I needed. Thanks very much!
Is there an innerHTML property available?
Take a look at the InnerText and TransformHtml methods.
Hi all,
I'm implementing a WinForms app built around an HTML parser.
In my app, the user enters a URL (such as www.amazon.com) and the app shows that page in a web browser control.
I want to let users select an area on that page and have a label control show all the text in the selected area. How can I do that?
In other words, how can I determine which HTML tags in the page enclose the selected text?
Example:
HTML:
<html>
<body>
</body>
</html>
Page:
selected text
none selected text
When I drag the mouse around "selected text", I want to determine that the table with id=1 is selected, and "selected text" should be shown in a label control.
Please share your ideas.
Thanks in advance.
mns
Can this code be modified to parse/resolve .css files?
Ashish
I am trying to write unit tests for my library, which is based on yours, but I always get an "assertion failed" error. Am I doing something wrong, or do you know why the error occurs?
By the way, why are some classes marked as sealed?
1. I tried adding unit tests to my library through a Visual Studio unit test project, and they work well.
Most assertion failures come from failed argument checks.
For example:
[TestMethod()]
public void ComapctWSCStringTest()
{
    string str = null;
    // string str = "Some test string you need assign.";
    string expected = null;
    // string expected = "You need assign expected return value.";
    string actual;
    actual = DOL.DHtml.DHtmlTextProcessor.ComapctWSCString(str);
    Assert.AreEqual(expected, actual, "DOL.DHtml.DHtmlTextProcessor.ComapctWSCString unexpected return value");
}
2. I did not expect the classes marked as sealed to be extended by inheritance, because the creation of their instances is fixed in class DHtmlGeneralParser. I understand that you may have reasons to extend these classes, so I will remove the "sealed" keyword and implement an abstract factory or factory method pattern in class DHtmlGeneralParser to meet your requirement.
-- modified at 20:43 Wednesday 1st August, 2007
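To illustrate the factory method idea from point 2, here is a minimal sketch in which the parser creates nodes through an overridable method instead of calling a constructor directly. Every type here (ElementSketch, ParserSketch, TrackedElement, TrackingParser) is hypothetical and only stands in for the library's classes; the real DHtmlGeneralParser may be structured quite differently.

```csharp
// Hypothetical types for illustration; not the library's classes.
public class ElementSketch
{
    public string Name { get; }
    public ElementSketch(string name) { Name = name; }
}

public class ParserSketch
{
    // Factory method: subclasses override this to supply their own element type.
    protected virtual ElementSketch CreateElement(string name)
    {
        return new ElementSketch(name);
    }

    public ElementSketch ParseTag(string tagName)
    {
        // The parser never calls "new" on the element type itself,
        // so node creation stays overridable.
        return CreateElement(tagName.ToLowerInvariant());
    }
}

// A user-defined element type, possible only if the base class is not sealed.
public class TrackedElement : ElementSketch
{
    public bool Visited { get; set; }
    public TrackedElement(string name) : base(name) { }
}

public class TrackingParser : ParserSketch
{
    protected override ElementSketch CreateElement(string name)
    {
        return new TrackedElement(name);
    }
}
```

With this shape, new TrackingParser().ParseTag("DIV") returns a TrackedElement named "div", while the base parser keeps producing plain ElementSketch objects.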
Because the 'DHtmlTextProcessor' class modifies a static StringBuilder instance (m_builder) in a few of its methods, you can very easily run into race conditions when using this library in multiple threads. Please keep in mind that I am not asking for these classes to be thread-safe, but rather suggesting that you re-work the DHtmlTextProcessor code so that 2 (or more) DHtmlDocument instances can be created (parsing can occur) on different threads at the same time. Currently, parsing is only safe on one thread at a time. You could synchronize access to m_builder by locking it in each method of DHtmlTextProcessor, however this would cause undue performance overhead. I think you are better off creating a new StringBuilder instance in each method that needs one.
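As a rough sketch of the last suggestion, the method below builds its result in a local StringBuilder, so concurrent callers never share a buffer. TextProcessorSketch and CompactWhitespace are hypothetical names, and the whitespace-collapsing logic is only an assumed example of the kind of work such a method does; it is not the library's code.

```csharp
using System.Text;

// Hypothetical helper, not the library's DHtmlTextProcessor.
public static class TextProcessorSketch
{
    // Collapses each run of whitespace into a single space.
    public static string CompactWhitespace(string input)
    {
        if (string.IsNullOrEmpty(input)) return string.Empty;

        // Local builder: each call (and therefore each thread) gets its own buffer,
        // so no lock and no shared static field are needed.
        var builder = new StringBuilder(input.Length);
        bool previousWasSpace = false;
        foreach (char c in input)
        {
            if (char.IsWhiteSpace(c))
            {
                if (!previousWasSpace) builder.Append(' ');
                previousWasSpace = true;
            }
            else
            {
                builder.Append(c);
                previousWasSpace = false;
            }
        }
        return builder.ToString();
    }
}
```

Sizing the builder to input.Length up front means it should not need to grow during the call, which may offset part of the allocation cost that a shared instance was meant to avoid.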
I think you are right. I used a single StringBuilder instance because I was concerned about the cost of allocating memory in each method, since DHtmlTextProcessor is the main performance bottleneck of this library ;P. However, I have changed it to create a new StringBuilder instance in each method that needs one, and the performance is OK. I will update the code accordingly. Thank you for the suggestion.
This is great work. One thing that looks to be broken is support for a colon in an attribute name which may be there for namespaces (<html xmlns:vml="urn:schemas-microsoft-com:vml" ...>). On a related note, and I'm not sure what the HTML specs are on this, but most browsers will accept a period in the attribute name.
As far as I know, the HTML specification and the HTML DTD do not define the concept of a namespace, whereas XHTML does, because XHTML is a kind of XML document. Many browsers accept a colon or period in an element or attribute name, but may ignore it when rendering. Section 3.2, "HTML Lexical Syntax", of HTML 2.0 (RFC 1866) seems to permit such names, so I retain them when parsing.
It is perfect!
Do you mind if I use the CSS parser as a library in commercial applications?
It's OK
Thank you very much!
And can I modify some of the code for my own purposes?
Sure.
If you can, please give me some suggestions for this library. Thanks!
Thank you~!
I want to add some code to achieve the following functions:
1. At-rules in CSS2 can also be parsed.
For example: @import "test.css" -> get the name of the CSS file;
@media print{...} -> parse into selectors that carry the media type.
2. CSS values can also be parsed.
For example:
font-family:'ヒラギノ角ゴ Pro W3','Hiragino Kaku Gothic Pro', 'ＭＳ Ｐゴシック', 'MS PGothic', Osaka, sans-serif;
-> The list of string values can be parsed out. Values such as lengths, URIs, integers and colors can be parsed out as well.
3. Invalid tokens are ignored.
For example: H3, H4 & H5 {color: red }
-> Ignore the whole line, and do not set the color of H3 to red.
background: "red" -> Ignore the whole line.
-- modified at 2:47 Monday 16th July, 2007
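For point 2, here is a minimal sketch of how a comma-separated value such as a font-family list might be split into its items while keeping commas that sit inside quotes. CssValueListSketch is a hypothetical helper, not part of the library, and it deliberately ignores CSS escape sequences and the other value types (lengths, URIs, colors).

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of the library.
public static class CssValueListSketch
{
    // Splits a comma-separated CSS value, treating commas inside
    // single or double quotes as part of the item.
    public static List<string> Split(string value)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        char quote = '\0'; // '\0' means "not inside a quoted string"

        foreach (char c in value)
        {
            if (quote != '\0')
            {
                if (c == quote) quote = '\0'; // closing quote, dropped from the output
                else current.Append(c);
            }
            else if (c == '\'' || c == '"')
            {
                quote = c; // opening quote, dropped from the output
            }
            else if (c == ',')
            {
                AddItem(items, current);
            }
            else
            {
                current.Append(c);
            }
        }
        AddItem(items, current);
        return items;
    }

    // Stores the trimmed item if it is not empty, then resets the buffer.
    private static void AddItem(List<string> items, StringBuilder current)
    {
        string item = current.ToString().Trim();
        if (item.Length > 0) items.Add(item);
        current.Clear();
    }
}
```

Calling Split on the part after "font-family:" (without the trailing semicolon) yields one entry per font name, with the quotes removed.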
Wow, you know CSS very well! In fact, the library ignores all at-rules in CSS declarations and implements only a subset of CSS2. Sorry. I hope the original parsing structure does not hinder you from adding these functions. If you need my help, please feel free to tell me. I am very happy that this library can help you.
You are very kind.
Thank you very much.
Parsers are a popular and interesting field of study these days.
I tried your demo, and it got me very interested.
I would like to try using your library.
Do you have more detailed documentation
(e.g. a class diagram, method usage and so on)?
Could you provide it for my reference?
Thank you very much!!