
Introduction
This DLL provides routines to manipulate UTF-8 encoded XML files. The set provided is not all-singing-and-dancing but a useful, small collection. For example, several co-operating executables sharing one UTF-8 encoded XML file can each read their own operating parameters from it and set parameters for the others.
Background
Initially, only the read functions were implemented, to avoid the large overhead of incorporating a proprietary interface. From this grew a certain understanding of the mechanism. Write and delete routines were added next, followed by stream routines that let the user program supply and recover the UTF-8 encoded XML data without using disk files, and finally some super (i.e. over-arching) routines to shrink the user's code.
Using the code
VC 6.0 projects: Place the XM8DLL.dll in a directory on your path variable. Add the library XM8DLL.lib to the project resources. Add the module XM8calls.h to the project. Use the routines therein.
VB 6.0 projects: Register the XM8DLL.dll with regsvr32. Add the module XM8DLL.bas to the project. Use the public routines therein.
The following C/C++ fragment builds a small order document and writes it out (fileName is assumed to hold the output path):
XM8_newFile("Order");
XM8_getFrstGroup("Order",0);
XM8_newAttPutVal("number","1234");
XM8_pokeNewGrpPutVal("Date","2000/1/1");
XM8_newGrpPutVal("Customer","Acme < & > \" ' Ltd");
XM8_newAttPutVal("ID","1234A");
XM8_getFrstGroup("Order",0);
XM8_newGroup("ITEM");
XM8_newGroup("ITEM");
XM8_getFrstGroup("ITEM",1);
XM8_newAttPutVal("ID","01");
XM8_newGrpPutVal("Part-number","E16-25A");
XM8_newAttPutVal("warehouse","Warehouse11");
XM8_getFrstGroup("ITEM",1);
XM8_pokeNewGrpPutVal("Description","Production-Class Widget A");
XM8_newGrpPutVal("Quantity","16");
XM8_getLastGroup("ITEM",1);
XM8_newAttPutVal("ID","02");
XM8_newGrpPutVal("Part-number","E23-45B");
XM8_newAttPutVal("warehouse","Warehouse11");
XM8_getLastGroup("ITEM",1);
XM8_pokeNewGrpPutVal("Description","Production-Class Widget B");
XM8_newGrpPutVal("Quantity","12");
XM8_writeFile(fileName);
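The Customer value in the listing contains all five characters that XML reserves (<, &, >, " and '). The article does not show how the DLL treats them on write, so the following is only a minimal sketch of the usual approach, replacing each with its predefined entity. The function name xml_escape and its signature are hypothetical and are not part of XM8DLL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of XM8DLL: copies src into dst, replacing
   the five XML-reserved characters with their predefined entities.
   Returns the length written (excluding the terminator), or (size_t)-1
   if dst is too small. */
size_t xml_escape(const char *src, char *dst, size_t dstSize)
{
    size_t used = 0;
    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; ++src) {
        const char *rep;
        char single[2];
        size_t len;
        switch (*src) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:                      /* ordinary byte: copy unchanged */
            single[0] = *src;
            single[1] = '\0';
            rep = single;
            break;
        }
        len = strlen(rep);
        /* keep one byte in reserve for the terminator */
        if (used + len + 1 > dstSize) return (size_t)-1;
        memcpy(dst + used, rep, len);
        used += len;
    }
    if (used + 1 > dstSize) return (size_t)-1;
    dst[used] = '\0';
    return used;
}
```

Because every byte of a multi-byte UTF-8 sequence is 0x80 or above, this byte-wise scan cannot mistake part of a non-ASCII character for one of the five reserved ASCII characters.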
Points of Interest
- Throughout this article, the acronym UTF means UTF-8.
- Four 'conversion' routines are also supplied; these are not used internally by XM8DLL. They come in two pairs: XM8_UTFtoUCS and XM8_UCStoUTF, and XM8_UTF8toUTF16 and XM8_UTF16toUTF8.
- After installing the relevant character sets on W2K, I was able to display the Japanese streams.
- For C/C++ only users, a static library can be built using workspace & project files provided.
- The private routines in the XM8DLL.bas module work around C/C++ <-> VB differences:
  - The representation of 'true' (C/C++ 1, VB -1).
  - Passing VB string addresses to C/C++ routines.
  - The VB return-string parameter is handled in the DLL.
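The article does not describe the internals of the conversion routines. As a rough illustration of what a UTF-8 to UTF-16 conversion involves, here is a minimal sketch. The names utf8_decode and utf8_to_utf16 and their signatures are hypothetical, not the DLL's API, and the sketch assumes 16-bit unsigned short; it handles the 1-4 byte forms only and does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, NOT the XM8DLL routine: decode one UTF-8 sequence
   (1-4 bytes) starting at s into *cp. Returns the number of bytes
   consumed, or 0 on a malformed or truncated sequence. */
size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    size_t need, i;
    if (s[0] < 0x80)                { *cp = s[0];        return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; need = 1; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; need = 2; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; need = 3; }
    else return 0;
    for (i = 1; i <= need; ++i) {
        /* continuation bytes look like 10xxxxxx; a '\0' fails this too */
        if ((s[i] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (unsigned long)(s[i] & 0x3F);
    }
    return need + 1;
}

/* Convert a NUL-terminated UTF-8 string to UTF-16 code units.
   Returns the number of units written (excluding the terminator), or
   (size_t)-1 on bad input or insufficient space. */
size_t utf8_to_utf16(const char *src, unsigned short *dst, size_t dstUnits)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t out = 0;
    assert(src != NULL && dst != NULL);
    while (*s != 0) {
        unsigned long cp;
        size_t n = utf8_decode(s, &cp);
        if (n == 0 || cp > 0x10FFFFUL) return (size_t)-1;
        if (cp >= 0x10000UL) {
            /* outside the BMP: emit a surrogate pair */
            if (out + 3 > dstUnits) return (size_t)-1;
            cp -= 0x10000UL;
            dst[out++] = (unsigned short)(0xD800 | (cp >> 10));
            dst[out++] = (unsigned short)(0xDC00 | (cp & 0x3FF));
        } else {
            if (out + 2 > dstUnits) return (size_t)-1;
            dst[out++] = (unsigned short)cp;
        }
        s += n;
    }
    if (out + 1 > dstUnits) return (size_t)-1;
    dst[out] = 0;
    return out;
}
```

VB 6.0 strings hold 16-bit units, which is presumably why a UTF-8/UTF-16 pair sits alongside the UTF/UCS pair.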
History
- 1.9 Corrections to XMJ_deProfundis.
- 1.8 XM8_sNew.cpp bug fixed in putThing.
- 1.7 Encryption using the Tiny Encryption Algorithm (TEA).
  - XM8_crypt_vb.zip: demonstration of TEA applied to XML files.
  - Four encryption routines to implement TEA: XMLteaCryptKey, XMLteaEncrypt, XMLteaEncryptVal and XMLteaDecrypt.
- 1.6 Default is now 1-4 byte UTF-8, 21 bit UNICODE usage.
  - New routine XM8_fullCODE reverts to 1-6 byte UTF-8, 31 bit usage.
- 1.5 XM8_sNew.cpp new loop routine XM8_deProfundis.
  - Third VB demo: XLS files to XML files.
- 1.4 XM8_sNew.cpp bug fixed in XM8_newStream.
- 1.3 handles <, &, >, " and ' within values; both read & write.
  - XM8DLL.bas bug fixed in XM8_UTF8toUTF16.
- 1.2 handles group to attribute & attribute to attribute white space.
  - What took 661 ms now takes 231 ms.
- 1.1 XM8 handles ASCII encoded XML files because they are a sub-set of UTF-8. Therefore, XMJ may be replaced by XM8. Because XM8 works internally in UCS, it is about 30% slower than XMJ. Any observations on the code that might recover this loss will be much appreciated.
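The 1-4 byte and 1-6 byte forms mentioned under 1.6 differ only in how large a UCS value they can carry: a 4-byte sequence holds 3 + 6 + 6 + 6 = 21 payload bits, and a 6-byte sequence holds 1 + 5 x 6 = 31. The sketch below illustrates this. The function ucs_to_utf8 and its allowLong switch are hypothetical stand-ins, not the DLL's XM8_UCStoUTF or XM8_fullCODE, whose signatures are not given here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, NOT the DLL's code: encode one UCS value as UTF-8.
   allowLong == 0 : 1-4 byte forms only (values up to 21 bits)
   allowLong != 0 : original 1-6 byte forms (values up to 31 bits)
   Returns the number of bytes written to out (at most 6), or 0 if the
   value does not fit the selected form. */
size_t ucs_to_utf8(unsigned long ucs, int allowLong, unsigned char *out)
{
    size_t len, i;
    unsigned char lead;
    assert(out != NULL);
    if      (ucs < 0x80UL)       { out[0] = (unsigned char)ucs; return 1; }
    else if (ucs < 0x800UL)      { len = 2; lead = 0xC0; }  /* 11 bits */
    else if (ucs < 0x10000UL)    { len = 3; lead = 0xE0; }  /* 16 bits */
    else if (ucs < 0x200000UL)   { len = 4; lead = 0xF0; }  /* 21 bits */
    else if (ucs < 0x4000000UL)  { len = 5; lead = 0xF8; }  /* 26 bits */
    else if (ucs < 0x80000000UL) { len = 6; lead = 0xFC; }  /* 31 bits */
    else return 0;
    if (len > 4 && !allowLong) return 0;
    /* fill continuation bytes from the end, six payload bits each */
    for (i = len - 1; i > 0; --i) {
        out[i] = (unsigned char)(0x80 | (ucs & 0x3F));
        ucs >>= 6;
    }
    out[0] = (unsigned char)(lead | ucs);
    return len;
}
```

Values below 0x80 come out as a single unchanged byte, which is why an ASCII encoded file is already valid UTF-8.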