|
Well spotted! Thanks a lot.
It's now fixed.
|
|
|
|
|
The two links to pml-lang.dev in pXML Predecessor are not working.
You say "element names within metadata don't need to be prefixed with #", however
I would instead lean towards saying they should in fact be mandatory.
The biggest obstacle for adoption is getting browsers to support the format natively.
Open source converters are a good first step and I'd recommend adding a task to
rosettacode, as long as you can keep it reasonably simple, with examples in both
Java and PPL. Converters in C++, JavaScript, and Python are probably key to widespread
adoption, along with a formal and complete test/verification set. Like any good idea,
it would need ramming down people's throats.
Before you get ahead of yourself: you would also need similar replacements for CSS, XQuery, XPath, XML and
JSON Schema, all native rather than "install x, convert to y, then z". Well, you see the problem.
I think I have spotted a potential Achilles heel: in HTML+JS, pretty much the only thing you need to look out for is (say)
k = src.indexOf("</script>"); ==> k = src.indexOf("</" + "script>");
However with [script ...] you are well and truly hosed (escape every single ] in js, no thanks). Therefore I suggest a more hybrid approach, with <script></script> and similar still valid, in cases that need it.
While somewhat moot, the xml <phones><phone>123</phone><phone>456</phone></phones> equates to the invalid json "phones": {"phone": "123", "phone": "456"}, either that or the given vice-versa of
"phones": [ "123", "456"] to <phones><phone>123</phone><phone>456</phone></phones> ain't quite right.
Pete Lomax
modified 16-Mar-21 8:48am.
|
|
|
|
|
Glad you liked the article. Thank you.
Pete Lomax Member 10664505 wrote: The two links to pml-lang.dev in pXML Predecessor are not working.
I just tried all the 6 links in chapter 'pXML Predecessor', and they all worked. Maybe a server was down when you tried. If you still encounter problems then could you tell me please which two links don't work.
Pete Lomax Member 10664505 wrote: I would instead lean towards saying they should in fact be mandatory.
Interesting point. Could you please explain why you think the # should be mandatory in child elements?
Pete Lomax Member 10664505 wrote: I'd recommend adding a task to
rosettacode
Thanks for the tip
Pete Lomax Member 10664505 wrote: the given vice-versa of
"phones": [ "123", "456"] to <phones><phone>123</phone><phone>456</phone></phones> ain't quite right
True. But XML doesn't have native arrays (or lists), unlike JSON.
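To illustrate the mismatch, here is a minimal Python sketch (the helper name children_to_json is made up for this example) that gathers repeated child elements into a JSON array. Note that the child tag name "phone" has nowhere to go on the JSON side, which is why the round trip is not exact.

```python
import json
import xml.etree.ElementTree as ET

def children_to_json(xml_text):
    """Collect the text of all child elements into a JSON array.

    Hypothetical helper for illustration only: it keeps the parent tag
    as the key and silently drops the child tag names.
    """
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: [child.text for child in root]})

xml_text = "<phones><phone>123</phone><phone>456</phone></phones>"
print(children_to_json(xml_text))  # {"phones": ["123", "456"]}
```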
|
|
|
|
|
links: ok, seems my isp (O2) is probably blocking that domain.
#mandatory: consistency - should you accidentally clip any ] after [#GUI_data it will treat [content] as an attribute, breaking "No confusion" and probably eventually resulting in an error much further away from where it could/should be.
I also corrected/encoded <script></script> in my original post, it should make more sense now.
Pete Lomax
modified 16-Mar-21 8:56am.
|
|
|
|
|
Pete Lomax Member 10664505 wrote: accidentally clip any ] after [#GUI_data
In that case the document becomes invalid, and the parser reports an error. So there would be no risk of accidentally turning content into metadata.
However a problem could arise if a child-node of metadata (without # ) is copy-pasted into another place, and the user forgets to add the # . That risk could indeed be avoided by making the # mandatory.
Also, in case of big metadata elements with lots of child-elements, looking at a child-element without # in the middle or end of the metadata-tree makes it less obvious to the human eye that it's looking at metadata.
So, maybe it would be better to make the # in child nodes mandatory, as you suggested. It could still be made optional with a parser flag.
Pete Lomax Member 10664505 wrote: eventually resulting in an error much further away
That's true, and it can be annoying, especially in big documents. As said in the article, this can be avoided in two ways:
1. Use the more verbose closing tag syntax ][/tag] for big nodes.
2. Quote: "Note, however, that this problem can be largely mitigated when elements are indented, and the parser emits a warning if the indentation of the opening [ and closing ] are different."
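The indentation check in option 2 could look roughly like the sketch below. It is not the actual pXML parser: the function name is hypothetical, and it assumes a simplified syntax in which every [ opens a node and every ] closes one, ignoring escapes, quoted values and comments. It only compares brackets that are the first non-blank character on their line.

```python
def indentation_warnings(text):
    """Return (open_line, close_line) pairs whose bracket indentation differs.

    Simplified sketch: every '[' opens a node, every ']' closes one.
    Escape sequences, quoted values and comments are NOT handled.
    """
    warnings = []
    stack = []  # entries: (line number, column, bracket starts its line?)
    for line_no, line in enumerate(text.splitlines(), start=1):
        indent = len(line) - len(line.lstrip())
        for col, ch in enumerate(line):
            if ch == "[":
                stack.append((line_no, col, col == indent))
            elif ch == "]" and stack:
                open_line, open_col, open_starts_line = stack.pop()
                close_starts_line = col == indent
                # Only multi-line nodes with both brackets leading their lines
                if (open_line != line_no and open_starts_line
                        and close_starts_line and open_col != col):
                    warnings.append((open_line, line_no))
    return warnings

doc = """[book
    [title Example]
    [chapter
        [p Some text]
  ]
]"""
print(indentation_warnings(doc))  # [(3, 5)]
```

Here the ] on line 5 closes the node opened on line 3 but sits at a different column, so a warning would point at the spot where a bracket was probably misplaced.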
|
|
|
|
|
From practice, I've seen that the reason why you want to keep your child data as elements and not attributes is because child data is bound to be restructured. For instance, today the data may be available as a 1:1 relationship, but that doesn't mean it can't change. The data could very well have a 1:many relationship, and thus keeping it as an attribute is impossible.
My own practice when using XML as a structured data file has been to default all child data to attributes unless there is more data structure (i.e., more children), and to change that structure into elements only when needed to support the structure. I avoid using the PCDATA of an element to hold data and instead prefer to use an attribute, as it is more defined. In the future, if the nature of that child data changes, you can just revise the XSD and let your subscribers know that a new revision of the XSD is available.
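As a small illustration of the 1:1 versus 1:many point, here is a Python sketch using the standard library's ElementTree. The element and attribute names (person, phone, number) are invented for the example.

```python
import xml.etree.ElementTree as ET

# 1:1 relationship: a single phone number fits in an attribute.
person_v1 = ET.fromstring('<person name="Ann" phone="123"/>')
print(person_v1.get("phone"))  # 123

# An attribute name may appear only once per element, so a second
# phone number cannot simply be added as another attribute.
try:
    ET.fromstring('<person name="Ann" phone="123" phone="456"/>')
except ET.ParseError as err:
    print("not well-formed:", err)

# 1:many relationship: the data moves into repeated child elements,
# while each value can still live in an attribute rather than in PCDATA.
person_v2 = ET.fromstring(
    '<person name="Ann"><phone number="123"/><phone number="456"/></person>'
)
numbers = [phone.get("number") for phone in person_v2.findall("phone")]
print(numbers)  # ['123', '456']
```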
After working with JSON Schema, I find that it is a very good practice, but I have yet to see it gain wider acceptance. Instead, I can very quickly determine the level of competence of a development team that insists on using JSON as an exchange format without also insisting on validation with JSON Schema.
The other benefit I've found from this structure is that with verbose starting and ending tags, it is fairly obvious when the XML is malformed, even without white space and indentation. It's a lot more difficult to spot this in JSON.
Another issue I've found with JSON is the range of supported encodings. By default, JSON is UTF-8, while almost any encoding can be used for XML as long as it is specified in the declaration.
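A quick way to see the encoding difference in Python's standard library (the sample data is made up): the XML parser honours the encoding named in the declaration, while json.loads() on raw bytes only auto-detects UTF-8, UTF-16 and UTF-32.

```python
import json
import xml.etree.ElementTree as ET

# XML: the declaration names the encoding, so the parser can decode the bytes.
xml_bytes = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>Zo\u00eb</name>'
).encode("iso-8859-1")
xml_name = ET.fromstring(xml_bytes).text
print(xml_name == "Zo\u00eb")  # True

# JSON: the same ISO-8859-1 bytes are assumed to be UTF-8 and fail to decode.
json_bytes = '{"name": "Zo\u00eb"}'.encode("iso-8859-1")
try:
    json.loads(json_bytes)
    json_error = None
except UnicodeDecodeError as err:
    json_error = err
print(json_error is not None)  # True
```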
I've used YAML in the past and found the same problem that you did: with larger data structures, it was difficult to follow the data structure because it relies so heavily on white space and indentation.
I will take a deeper look into the pXML that you are proposing. I'm not likely to change my current practice, mostly because it is practical and succinct enough as it is and once compressed, the XML and JSON files are practically the same size. And XML Schema is very mature and robust in just about every development language.
|
|
|
|
|
Pragmatic and very nice approach, thanks for sharing.
|
|
|
|
|
|
I love the idea of what you're proposing. I just hate the idea of what Microsoft, Google, and all of the other big vendors would turn it into - assuming you can get their attention at all.
For all its faults, XML has given us a fantastic structured data format with excellent tools (XPath, XQuery, XSLT) for data retrieval and transformation.
JSON is nice for when you need to pass simple data structures to a user interface, especially if that UI is written in Javascript.
YAML gives us nice whitespace issues and a very awkward place for storing commands.
SignalR and Proto have given us high-performance wire transfers for making RPC calls performant and scalable.
XML could certainly stand to be improved - I don't see why we couldn't have something like <item Toothbrush /> where the schema would define that there's a string defined inside item . Maybe a mashup of Relax NG is in order.
Another useful mashup might be introducing DolDoc, but I don't think the web kids are ready for that.
|
|
|
|
|
I never heard about 'DolDoc'. Will have a look at it. Thanks for sharing.
|
|
|
|
|
I have changed my previous opinion because I think I read your article too quickly!
I focused on the differences between pXML and JSON, and not on HTML, where indeed everything is a character string...
Sorry.
modified 11-Mar-21 3:44am.
|
|
|
|
|
DidierO wrote: The format that you offer does not guarantee the restitution of the original data
I'm sorry, but I don't understand what you mean. Could you give us an example please, and explain exactly what you mean by "does not guarantee the restitution of the original data".
DidierO wrote: no guarantee on the type
All values in XML documents are just strings. That's how XML works, and therefore the same is true in pXML. There is no native way in standard XML to specify 'types'. You can add 'type information' with metadata, and you can define XML schemas to validate string values. But it's not like in JSON, where native values can be strings, numbers, booleans, or null. It seems that you are not aware of the fundamental differences between XML and other formats.
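The difference is easy to see with Python's standard parsers. In this sketch (element and key names are invented for the example), every value read from XML comes back as a string, while JSON delivers native types:

```python
import json
import xml.etree.ElementTree as ET

xml_item = ET.fromstring("<item><count>3</count><active>true</active></item>")
json_item = json.loads('{"count": 3, "active": true}')

print(type(xml_item.findtext("count")).__name__)   # str
print(type(xml_item.findtext("active")).__name__)  # str
print(type(json_item["count"]).__name__)           # int
print(type(json_item["active"]).__name__)          # bool

# On the XML side, any typing is applied by the consumer (or by a schema).
count = int(xml_item.findtext("count"))
print(count + 1)  # 4
```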
DidierO wrote: loss of white characters at the ends of the values
That's simply not true, unless I totally misunderstand your point. If you write [name foo ] in pXML, then the trailing space after "foo" is part of the value of name . Please provide an example if this is not what you are talking about.
I honestly think that your vote is totally unjustified (because your arguments are wrong). You might consider reevaluating your arguments and vote.
|
|
|
|
|
ChristianNeumanns wrote: You can add 'type information' with metadata
Isn't that true for any format when you serialize for storage?
|
|
|
|
|
Jörgen Andersson wrote: Isn't that true for any format when you serialize for storage?
Yes, true for most formats.
My intention is to (later) add types as an optional extension to pXML. Besides predefined types like boolean, number variations, date, time, list, map, etc. it must be easy for a user to add customized types. I have a very concrete idea about how to do that (without changing pXML's syntax), and I might publish a "Suggestion for types in pXML" article in the future, and consider feedback from the community.
|
|
|
|
|
DidierO wrote: Sorry.
No problem. Glad you changed your mind. Cheers.
|
|
|
|
|
Most large documents are created by WYSIWYG editors. Style is preset by the developers of the editor and difficult to change. The current solution is Cascading Style Sheets (CSS). These can easily become a maintenance nightmare. What is needed is named blocks — sort of like subroutines in code. A syntax is needed to define a name and its pXML code block, both with and without the use of an external style file.
An important feature to simplify maintenance is to prevent redefinition of a block name using different code within a document. This prevents a block named StyleFoo from being redefined in a sub-sub-document and screwing up the formatting from that point on. This problem often arises when multiple documents become merged into a larger document, such as short stories in an anthology or chapters in a user manual.
In my experience, the designers of XML documents design a style sheet which they know and understand and use very effectively. Years later, maintenance must modify the document, but the time to understand the style sheets is not available, so the maintainers use local formatting for the modifications. When the style sheets change, such as happens when two companies merge or the company's graphics change, the document becomes an instant mess. I have never seen management budget for the time required to fix these document issues.
__________________
Lord, grant me the serenity to accept that there are some things I just can’t keep up with, the determination to keep up with the things I must keep up with, and the wisdom to find a good RSS feed from someone who keeps up with what I’d like to, but just don’t have the damn bandwidth to handle right now.
© 2009, Rex Hammock
|
|
|
|
|
Jalapeno Bob wrote: What is needed is named blocks — sort of like subroutines in code.
Could you please provide an example of such a 'subroutine' (maybe pseudo-code), and explain its benefits. Thank you.
|
|
|
|
|
Cascading Style Sheets is a good example. They were not mentioned in the description of the proposed syntax.
The problem with CSS is that styles can be redefined, causing the document to screw up after the redefinition. For a style that is used only occasionally, finding the redefinition can be time consuming, and management never allocates sufficient (if any) time for document modification.
I suggest that if a named block definition is repeated identically, a warning should be displayed. If the definition differs, an error should be displayed and the original definition should be retained.
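The rule suggested above could be sketched in Python as follows. This is only an illustration of the proposed behaviour, not part of pXML; the class and method names (BlockRegistry, define, get) are hypothetical.

```python
import warnings

class BlockRedefinitionError(Exception):
    """Raised when a block name is defined again with different content."""

class BlockRegistry:
    def __init__(self):
        self._blocks = {}

    def define(self, name, body):
        if name not in self._blocks:
            self._blocks[name] = body
        elif self._blocks[name] == body:
            # Identical repeat: harmless, but worth a warning.
            warnings.warn(f"block {name!r} defined again with identical content")
        else:
            # Conflicting repeat: report an error, keep the original definition.
            raise BlockRedefinitionError(
                f"block {name!r} is already defined with different content"
            )

    def get(self, name):
        return self._blocks[name]

registry = BlockRegistry()
registry.define("StyleFoo", "font-weight: bold")
try:
    registry.define("StyleFoo", "font-style: italic")
except BlockRedefinitionError as err:
    print("error:", err)
print(registry.get("StyleFoo"))  # font-weight: bold
```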
I have seen a hierarchy of CSS files redefine the style for the same element — usually <title> or <hn> and, of course, various table elements — multiple times. Of course, changing a CSS file to fix one document may well break another document that relies on the same file.
Disclaimer: I am a software developer and maintainer who uses XML codes in documentation. I do not have the time to study the chain of CSS files used by existing documents that I have to modify. I am not, by any stretch of the imagination, an expert in XML document tags.
modified 12-Mar-21 20:34pm.
|
|
|
|
|
XML is definitely not terse. If just representing data is the goal there are, as you mention, many other syntaxes to use.
But the reason to use XML is because there can be a schema or (for the old school) a DTD. These definitions can describe in very great detail the structure of XML instance documents (the ones with tags and data). This allows the creator of an instance document to check that it contains valid content which covers not just structure but element values. The recipient of the document can also verify the document is valid.
Because the XML specification is as old as the hills, most languages include features to validate an XML instance document against a schema document.
|
|
|
|
|
Bill Seddon wrote: the reason to use XML is because there can be a schema
Yes, that's one of the very useful additions to XML. As said in the article, an XML schema can also be applied to a document using the pXML syntax. Once a pXML document is parsed into an XML structure, all these great XML additions and tools can still be used (including XML schema). I plan to publish a follow-up article to show examples of how XML technology can be used with pXML as well.
|
|
|
|
|
Easy 5. Thanks for sharing.
The converters would be a nice addition, btw.
|
|
|
|
|
Thank you.
I plan to publish a dedicated article for the converter, once it's open-sourced.
|
|
|
|
|