|
Perhaps a better solution would be to allow an optional pml document header that allows the writer to specify what the bracket characters are, and the tokenizer adjusts appropriately? Make the default [] and let the rest of the world pick ones that work better for their keyboard layouts?
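To make the suggestion concrete, here is a minimal Python sketch of a tokenizer that reads an optional bracket header. The header form `!brackets XY` and the function name `tokenize` are made up for illustration; nothing here is taken from the actual PML parser.

```python
import re


def tokenize(source: str, default_brackets: str = "[]") -> list[str]:
    """Split source into OPEN, CLOSE and text tokens.

    Assumption: an optional first line of the form '!brackets XY'
    (a hypothetical header, not published PML syntax) selects the pair.
    """
    open_ch, close_ch = default_brackets
    first_line, _, rest = source.partition("\n")
    header = re.fullmatch(r"!brackets (\S)(\S)", first_line.strip())
    if header:
        open_ch, close_ch = header.group(1), header.group(2)
        source = rest  # the header line is not part of the document body

    tokens: list[str] = []
    text: list[str] = []
    for ch in source:
        if ch in (open_ch, close_ch):
            if text:
                tokens.append("".join(text))
                text = []
            tokens.append("OPEN" if ch == open_ch else "CLOSE")
        else:
            text.append(ch)
    if text:
        tokens.append("".join(text))
    return tokens


print(tokenize("[p hi]"))
print(tokenize("!brackets {}\n{p a[0]}"))
```

With the header in place, `[` and `]` are ordinary text, so the second call keeps `a[0]` intact inside the text token.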
|
|
|
|
|
Yes, that's a good suggestion. Thank you.
See my answer to another member who suggested a BRACKETS parameter.
|
|
|
|
|
I agree 100% with the rationale behind PML and pXML.
|
|
|
|
|
Glad you liked it. Thank you so much.
|
|
|
|
|
|
|
Meanwhile, in Lisp syntax:
(doc title:"Test"
    (ch title:"An Unusual Surprise"
        (p Look at the following picture:)
        (image source:"images/strawberries.jpg")
        (p Text of paragraph 2)
        (p Text of paragraph 3)))
Seems like you've got a crippled re-implementation of Lisp syntax.
Every year modern programming languages get a little bit closer to the feature set of mid-eighties Lisp, so I'm not surprised that the syntax itself is being independently rediscovered.
TLDR - All languages converge on Lisp features, and all syntax converges on S-expressions.
|
|
|
|
|
Member 13301679 wrote: Seems like you've got a crippled re-implementation of Lisp syntax.
So you took a pXML example, replaced [] by () , replaced the indentation at the end with ))) (to make it look like Lisp), and then tell people that pXML is just "a crippled re-implementation of Lisp syntax". Unbelievable! Terribly unfair and simply wrong!
Member 13301679 wrote: All languages converge on Lisp features
Many languages use brackets to define boundaries (e.g. C- and Java-like languages use {} , XML uses <> , pXML uses [] ). But that doesn't make these languages "Lisp-like" if you just replace the brackets with () . Wrong and useless statement.
Member 13301679 wrote: all syntax converges on S-expressions
Wrong too (unless you mean that all syntax is just a list of tokens). pXML does not use s-expressions, although it might seem like that for people who don't understand the basic concepts. Consider this:
s-expression: (a b c d)
pXML: [a b c d]
The first example (s-expression) denotes a list with four elements: a , b , c , and d .
The second example (pXML), which is conceptually the same as writing <a>b c d</a> in XML, denotes a tree node with name a and with string "b c d" as content.
That's semantically very different!
Moreover, the syntax for attributes in XML and pXML (e.g. a = "b c d" ) is totally unrelated to s-expressions or Lisp.
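The difference between the two readings can be sketched in a few lines of Python. This is only an illustration of the distinction described above, limited to flat input without nesting or attributes; the function names are hypothetical and this is not how the pXML parser is implemented.

```python
def read_sexpr(text: str) -> list[str]:
    """Read a flat s-expression such as '(a b c d)' as a list of atoms."""
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError("expected a parenthesized list")
    return inner[1:-1].split()


def read_pxml_node(text: str) -> dict:
    """Read a flat node such as '[a b c d]' as a name plus text content."""
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("expected a bracketed node")
    # The first word is the node name; everything after it is one string.
    name, _, content = inner[1:-1].partition(" ")
    return {"name": name, "text": content}


print(read_sexpr("(a b c d)"))      # four separate elements
print(read_pxml_node("[a b c d]"))  # one name, one text string
```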
|
|
|
|
|
Quote: So you took a pXML example, replaced [] by (), replaced the indentation at the end with ))) (to make it look like Lisp), and then tell people that pXML is just "a crippled re-implementation of Lisp syntax". Unbelievable! Terribly unfair and simply wrong!
Actually, Lisp came first, so yeah, pXML simply replaces all the '(' with '[' and all the ')' with ']'. This is exactly what I meant - pXML takes Lisp syntax and superficially changes some characters.
Quote:
Member 13301679 wrote:
All languages converge on Lisp features
Many languages use brackets to define boundaries (e.g. C- and Java-like languages use {}, XML uses <>, pXML uses []). But that doesn't make these languages "Lisp-like" if you just replace the brackets with (). Wrong and useless statement.
That's not "features", that's "syntax". When I say that all programming languages converge on Lisp feature-wise, I mean that they eventually get features that were in Lisp 30 years ago.
Syntax is separate.
Quote: Wrong too (unless you mean that all syntax is just a list of tokens)
It literally is. That's literally what the AST in all programming languages is!
Quote: The first example (s-expression) denotes a list with four elements: a, b, c, and d.
The second example (pXML), which is conceptually the same as writing <a>b c d</a> in XML, denotes a tree node with name a and with string "b c d" as content.
That's semantically very different!
Not in Lisp. This valid lisp code:
(defmacro a (&body rest)
  (progn
    (format t "<a>")
    (dolist (r rest)
      (format t "~a " r))
    (format t "</a>")))
Turns any occurrence of "(a b c d)" into a tree "<a>b c d</a>". Running this in my terminal gives me this:
$ cat t.lisp
(defmacro a (&body rest)
  (progn
    (format t "<a>")
    (dolist (r rest)
      (format t "~a " r))
    (format t "</a>")))
(a b c d)
$ clisp -q < t.lisp
A
<a>B C D </a>
So, yeah, in Lisp every first element of an s-expression can be turned into the root of a tree that holds every other element, recursively.
I know this, because in 2001 I wrote a system for a client to generate the HTML and DOM from s-expressions. Generating html was as simple as this:
(html
  (body
    (h1 "Login")
    (form action: /myform.php method: get
      (span "Enter username")
      (input name: username)
      (span "Enter password")
      (input name: password type: password)
      (checkbox name: must_remember "Remember me!")
      (button type: submit "Login"))))
And, IIRC, I wasn't the only one doing stuff like this. XML is a subset of Lisp s-expressions, html is a subset of XML. Your proposal is html transformed, hence it's a subset of Lisp s-expressions.
|
|
|
|
|
It also reminded me of Lisp when I first saw it. The critique may be harsh but I think mostly correct.
The Lisp-HTML approach shown seems a little bit more readable to me, and if I'm not wrong there are some systems that still use it today for some webpages.
I think some comparisons to HTML templating engines(?) (e.g. Emmet, pug, ...) might be good, as they also try to reduce complexity or make it more human-readable while still supporting the full HTML feature set, since they are transpiled into HTML.
I'm also not sure whether pXML is really more readable for humans. The nesting shown seems not necessarily easier to read for deep and complex XML documents. And with good syntax highlighting XML is good enough.
|
|
|
|
|
I dislike your proposal, which is not simple on a French keyboard, where the [ and ] characters are not easy to type! What you propose is interesting, but it is only a bad dream. You never mention positional attributes or named attributes.
|
|
|
|
|
schlebe wrote: it is only a bad dream
"I direct my efforts to dreams that light my fire and are worth my time and energy to pursue."
-- Alan Cohen
|
|
|
|
|
you lost me at:
<i>foo</i>
becomes
<i foo>
because now the text content of i has become an attribute with no value. These two are not the same thing. Text is free, attributes are constrained by the DTD. If you change to say "any unrecognised attribute is the text", that's not great:
- what happens when there are multiple unknown attributes?
- what happens when some short text becomes confused with a new attribute in the future?
Your solution after this is to put attribute pairs in parentheses (), but what about if text contains parentheses? I suppose we could escape the parens in text - but you don't mention doing so, so as it stands, I'd expect undefined behavior.
XML may not be great, but I'm not really seeing the win here, sorry. XML is verbose as-is, but compresses incredibly well because of the repeated content. Size over the wire, therefore, from any modern web server implementing gzip compression, isn't an issue. Readability is not improved (personal opinion) - much like how LISP isn't "easier to read" than XML (you've got something LISP-like here, with square brackets instead of parens).
You could introduce a new node, eg t
------------------------------------------------
If you say that getting the money
is the most important thing
You will spend your life
completely wasting your time
You will be doing things
you don't like doing
In order to go on living
That is, to go on doing things
you don't like doing
Which is stupid.
|
|
|
|
|
Davyd McColl wrote: now the text content of i has become an attribute with no value
The code <i foo> in the article is just an intermediary step (neither valid XML nor valid pXML code) towards achieving the final pXML syntax: [i foo] . BTW, an "attribute with no value" would be invalid in XML and pXML.
Davyd McColl wrote: Your solution after this is to put attribute pairs in parentheses ()
Exactly. Therefore there is no ambiguity with [i foo] in pXML. foo is clearly text, and surely not an "attribute with no value".
Davyd McColl wrote: but what about if text contains parentheses?
That's a good question. I forgot to mention it in the article.
Suppose that the text content of a node named foo is (a=b) .
Then [foo (a=b)] doesn't work because this is the syntax for assigning b to attribute a .
There are two solutions:
1. Write [foo() (a=b)] to make it clear that there are no attributes
2. Escape the ( with \( , like this: [foo \(a=b)]
Both methods are already implemented in the pXML parser (to be published next month).
I will update the article in the coming days to mention this edge case.
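For readers who want to see how these two rules could interact, here is a rough Python sketch. It is not the pXML parser mentioned above (which is not shown in this thread); `parse_node` is a hypothetical helper that handles one flat node, unquoted attribute values without spaces, and only the `\(` escape.

```python
import re


def parse_node(source: str) -> dict:
    """Parse '[name (k=v ...) text]', '[name() text]' or '[name text]'."""
    match = re.fullmatch(r"\[(\w+)(.*)\]", source.strip(), re.DOTALL)
    if not match:
        raise ValueError("not a single bracketed node")
    name, rest = match.group(1), match.group(2)

    attributes: dict[str, str] = {}
    stripped = rest.lstrip()
    if stripped.startswith("("):
        # An unescaped '(' right after the name opens the attribute list,
        # which may be empty, as in '[foo() ...]'.
        attr_text, _, stripped = stripped[1:].partition(")")
        for pair in attr_text.split():
            key, _, value = pair.partition("=")
            attributes[key] = value.strip('"')

    # Whatever remains is text; '\(' stands for a literal '('.
    text = stripped.lstrip().replace("\\(", "(")
    return {"name": name, "attributes": attributes, "text": text}


print(parse_node("[foo (a=b)]"))     # a=b is an attribute
print(parse_node("[foo() (a=b)]"))   # empty attribute list, text is (a=b)
print(parse_node(r"[foo \(a=b)]"))   # escaped paren, text is (a=b)
```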
Davyd McColl wrote: XML is verbose as-is, but compresses incredibly well
The goal of the pXML syntax is to make it more human-friendly: easier to read and write for humans. And because pXML is less verbose, it will probably also produce smaller compressed sizes than XML.
|
|
|
|
|
Quote: The code <i foo> in the article is just an intermediary step (neither valid XML nor valid pXML code) towards achieving the final pXML syntax: [i foo]. BTW, an "attribute with no value" would be invalid in XML and pXML.
- incorrect: XML attributes may be empty (see "Can an XML attribute be the empty string?" on Stack Overflow, or ask anyone who has written directives / custom attributes for front-end frameworks).
- the intermediate presentation still suffers the same problem. The syntax change from angle brackets to square brackets doesn't fix the inherent flaw.
Quote: The goal of the pXML syntax is to make it more human-friendly:
With the requirement to step carefully around empty attributes and the inability to match the closing tag of a large node to its parent by type (sure, you can use an editor to match brackets, but consider a giant node, e.g. "<customer-data>", that ends off-screen at "</customer-data>"), I don't think you're necessarily achieving your stated goal.
If this works for the set of problems you have to deal with, great! I don't find this more readable and I'm definitely not introducing a new parser to working code.
The power of existing formats comes largely from how easy it is to consume them. XML and JSON (and now even YAML) parsers are a dime a dozen and found for free in practically every programming environment. In addition, they're formats that I can confidently hand off to a third party to deal with. To disrupt that, you're going to need to provide such an astounding edge as to make it impossible to refuse your format.
And I'm simply not convinced.
Again, I don't want to be "that guy". If this tool works for the tasks you have lined up for it and you don't have to share with anyone else and you don't need support on a plethora of programming environments, then I wish you all the best, friend (:
|
|
|
|
|
Davyd McColl wrote: incorrect: xml attributes may be empty
In your first comment you spoke about "an attribute with no value", referring to the syntax <i foo> .
I replied that such an "attribute with no value" would be invalid in XML and pXML. Which is true (<i foo> generates an error in an XML validator).
However, now you are talking about xml attributes that "may be empty" (e.g. <i foo="" /> ). Of course attributes can be empty (in XML and pXML) by assigning an empty string. But these are two different cases, unless I totally misunderstand your point.
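The two cases can be checked with any XML parser. Below is a small sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is made up for this example, and a closing tag is added to the first input so that the missing attribute value is the only problem.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Attribute name without '=' and a value: rejected by the parser.
print(is_well_formed("<i foo></i>"))

# Attribute explicitly set to the empty string: accepted.
print(is_well_formed('<i foo=""/>'))
print(ET.fromstring('<i foo=""/>').attrib)
```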
Davyd McColl wrote: I wish you all the best, friend (:
Thank you.
|
|
|
|
|
I think the first mistake is equating HTML to XML. Only the out-of-fashion XHTML requires strict adherence to XML.
HTML does not require self-closing tags: <br> comes to mind, or if its use offends you, <link> or <input>. Also, it is extremely common to use “disabled” as an attribute with no value.
These are not exotic examples. I like the idea of a simpler syntax to HTML, but pXML seems to only fit a particular use case very well, and others less so. In contrast, XML fits many more uses all equally “well”.
|
|
|
|
|
Andre_Prellwitz wrote: I think the first mistake is equating HTML to XML
When I refer to "HTML" in the article, I mean of course XHTML (because this article is about XML syntax), but I should indeed have been more explicit (e.g. writing "XML/XHTML", instead of "XML/HTML"). As far as I know, all modern popular browsers support XHTML syntax, so pXML could be used to create web pages with a pXML-to-XML converter.
Andre_Prellwitz wrote: pXML seems to only fit a particular use case very well, and others less so. In contrast, XML fits many more uses all equally “well”
Sorry, I have to disagree, unless I misunderstand your point. Could you please show an example of XML code that cannot be written with the pXML syntax?
|
|
|
|
|
The last example of the config file example is, to my eyes, missing a ) between green" and ] .
[config
    [size XL]
    [colors (background=black foreground="light green"]
    [transparent true]
]
Cheers,
Peter
Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012
|
|
|
|
|
Well spotted! Thanks a lot.
It's now fixed.
|
|
|
|
|
The two links to pml-lang.dev in pXML Predecessor are not working.
You say "element names within metadata don't need to be prefixed with #", however
I would instead lean towards saying they should in fact be mandatory.
The biggest obstacle for adoption is getting browsers to support the format natively.
Open source converters are a good first step and I'd recommend adding a task to
rosettacode, as long as you can keep it reasonably simple, with examples in both
Java and PPL. Converters in C++, JavaScript, and Python are probably key to widespread
adoption, along with a formal and complete test/verification set. Like any good idea,
it would need ramming down people's throats.
Before you get ahead of yourself, similar replacements for CSS, XQuery, XPath, XML and
JSON Schema, all native not "install x convert to y then z", well, you see the problem.
I think I have spotted a potential Achilles heel: in html+js, pretty much the only thing you need to look out for is (say)
k = src.indexOf("</script>"); ==> k = src.indexOf("</" + "script>");
However with [script ...] you are well and truly hosed (escape every single ] in js, no thanks). Therefore I suggest a more hybrid approach, with <script></script> and similar still valid, in cases that need it.
While somewhat moot, the xml <phones><phone>123</phone><phone>456</phone></phones> equates to the invalid json "phones": {"phone": "123", "phone": "456"}, either that or the given vice-versa of
"phones": [ "123", "456"] to <phones><phone>123</phone><phone>456</phone></phones> ain't quite right.
Pete Lomax
modified 16-Mar-21 8:48am.
|
|
|
|
|
Glad you liked the article. Thank you.
Pete Lomax Member 10664505 wrote: The two links to pml-lang.dev in pXML Predecessor are not working.
I just tried all the 6 links in chapter 'pXML Predecessor', and they all worked. Maybe a server was down when you tried. If you still encounter problems then could you tell me please which two links don't work.
Pete Lomax Member 10664505 wrote: I would instead lean towards saying they should in fact be mandatory.
Interesting point. Could you please explain why you think the # should be mandatory in child elements?
Pete Lomax Member 10664505 wrote: I'd recommend adding a task to
rosettacode
Thanks for the tip
Pete Lomax Member 10664505 wrote: the given vice-versa of
"phones": [ "123", "456"] to <phones><phone>123</phone><phone>456</phone></phones> ain't quite right
True. But XML doesn't have native arrays (or lists), unlike JSON.
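As a rough illustration of the mismatch, the following Python sketch (standard library only) turns the repeated `<phone>` children into a JSON array, and shows what one JSON parser does with the duplicate-key form. The function name `phones_to_json` is hypothetical, and the mapping chosen here is just one possible convention, not the one used in the article.

```python
import json
import xml.etree.ElementTree as ET


def phones_to_json(xml_text: str) -> str:
    """Map repeated child elements to a JSON array keyed by the parent tag."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: [child.text for child in root]})


xml_text = "<phones><phone>123</phone><phone>456</phone></phones>"
print(phones_to_json(xml_text))

# With duplicate keys, Python's json module silently keeps the last value,
# so the first phone number would be lost.
print(json.loads('{"phone": "123", "phone": "456"}'))
```

Note that the element name `phone` disappears in the array form, which is why the round trip back to XML is not exact.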
|
|
|
|
|
links: ok, seems my isp (O2) is probably blocking that domain.
#mandatory: consistency - should you accidentally clip any ] after [#GUI_data it will treat [content] as an attribute, breaking "No confusion" and probably eventually resulting in an error much further away from where it c/should be.
I also corrected/encoded <script></script> in my original post, it should make more sense now.
Pete Lomax
modified 16-Mar-21 8:56am.
|
|
|
|
|
Pete Lomax Member 10664505 wrote: accidentally clip any ] after [#GUI_data
In that case the document becomes invalid, and the parser reports an error. So there would be no risk of accidentally turning content into metadata.
However a problem could arise if a child-node of metadata (without # ) is copy-pasted into another place, and the user forgets to add the # . That risk could indeed be avoided by making the # mandatory.
Also, in case of big metadata elements with lots of child-elements, looking at a child-element without # in the middle or end of the metadata-tree makes it less obvious to the human eye that it's looking at metadata.
So, maybe it would be better to make the # in child nodes mandatory, as you suggested. It could still be made optional with a parser flag.
Pete Lomax Member 10664505 wrote: eventually resulting in an error much further away
That's true, and it can be annoying, especially in big documents. As said in the article, this can be avoided in two ways:
1. Use the more verbose closing tag syntax ][/tag] for big nodes.
2. Quote: "Note, however, that this problem can be largely mitigated when elements are indented, and the parser emits a warning if the indentation of the opening [ and closing ] are different."
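The indentation warning from point 2 could look roughly like the Python sketch below. This is an assumption about how such a check might work, not the parser's actual code: `indentation_warnings` is a hypothetical helper that ignores escapes and only compares a `]` that is the first character on its line against the indentation of the line holding its matching `[`.

```python
def indentation_warnings(source: str) -> list[str]:
    """Report closing brackets whose indentation differs from their opener."""
    warnings: list[str] = []
    stack: list[tuple[int, int]] = []  # (line number, indent) per open '['
    for line_no, line in enumerate(source.splitlines(), start=1):
        indent = len(line) - len(line.lstrip(" "))
        for pos, ch in enumerate(line):
            if ch == "[":
                stack.append((line_no, indent))
            elif ch == "]":
                if not stack:
                    warnings.append(f"line {line_no}: unmatched ']'")
                    continue
                open_line, open_indent = stack.pop()
                # Only a ']' that starts its line is compared.
                if pos == indent and open_indent != indent:
                    warnings.append(
                        f"line {line_no}: ']' indented {indent}, "
                        f"but '[' on line {open_line} indented {open_indent}"
                    )
    return warnings


good = "[config\n    [size XL]\n]"
bad = "[config\n    [size XL]\n  ]"
print(indentation_warnings(good))
print(indentation_warnings(bad))
```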
|
|
|
|
|
From practice, I've seen that the reason why you want to keep your child data as elements and not attributes is because child data is bound to be restructured. For instance, today the data may be available as a 1:1 relationship, but that doesn't mean it can't change. The data could very well have a 1:many relationship, and thus keeping it as an attribute is impossible.
My own practice when using XML as a structured data file has been to default all child data to attributes unless there is more data structure (i.e., more children), and to change that structure into elements only when needed to support it. I avoid using the PCDATA of an element to hold data and instead prefer an attribute, as it is more defined. In the future, if the nature of that child data changes, you can just revise the XSD and let your subscribers know that a new revision of the XSD is available.
After working with JSON Schema, I find that it is a very good practice, but I have yet to see it gain wider acceptance. Instead, I can very quickly determine the level of competence of a development team that insists on using JSON as an exchange format without also insisting on validation with JSON Schema.
The other benefit I've found from this structure is that with verbose starting and ending tags, it is fairly obvious when the XML is malformed, even without white space and indentation. It's a lot more difficult to spot this in JSON.
Another issue I've found with JSON is the range of supported encodings. By default, JSON is UTF-8, while almost any encoding can be used for XML as long as it is specified in the declaration.
I've used YAML in the past and found the same problem that you did: with larger data structures, it was difficult to follow the data structure because it relies so heavily on white space and indentation.
I will take a deeper look into the pXML that you are proposing. I'm not likely to change my current practice, mostly because it is practical and succinct enough as it is and once compressed, the XML and JSON files are practically the same size. And XML Schema is very mature and robust in just about every development language.
|
|
|
|
|