CExtensibleMarkupLanguageElement

$Revision: 73 $

Description

This class is the workhorse of the XML classes. It does all of the parsing and outputting of the elements that make up an XML document in a very simplistic manner. It does not assume that you are connected to the Internet or that networking even exists (or files, for that matter). To parse an XML document, you must first retrieve the document. How the document is retrieved is left to the caller of this class. This class only parses what it is given. It will not make any attempt to retrieve data on its own.

Construction

static CExtensibleMarkupLanguageElement *NewElement( CExtensibleMarkupLanguageElement * parent_element = NULL, DWORD type = typeElement, CExtensibleMarkupLanguageDocument * document = NULL )

Creates another CExtensibleMarkupLanguageElement. There is no constructor for this class. If you want to create a CExtensibleMarkupLanguageElement object you must use NewElement(). The type parameter can be one of the following values:

typeUnknown - We don't know what it is.
typeProcessingInstruction - Start Tag is "<?" End Tag is "?>"
typeComment - Start Tag is "<!--" End Tag is "-->"
typeCharacterData - Start Tag is "<![CDATA[" End Tag is "]]>"
typeElement - A user's element
typeTextSegment - That which lies between sub-elements
typeMetaData - Anything in a "<!" field that ain't typeComment or typeCharacterData

If the document parameter is NULL, the newly created element will inherit the document of the parent_element. The newly created element will also have been made a child of parent_element.

Methods

BOOL AddAttribute( const CString& name, const CString& value )

Attributes are those parts of an element that ain't supposed to be data. They contain meta data (data about data). Sometimes though, people put data into the attributes. Consider the following sample:

<DATE data="19630502"/>

The part data="19630502" is considered to be the attribute. The attribute name is "data" and its value is "19630502". AddAttribute() returns TRUE if the attribute was successfully added.

void AddChild( CExtensibleMarkupLanguageElement * element_p )

Makes element_p a child of this element. A copy of element_p is not made, so don't send AddChild() the address of a local variable. Also, element_p will have its document member set to the document of this element.

BOOL AddText( const CString& text_segment )

This method will create a child element of type typeTextSegment and set its contents to text_segment.

void Copy( const CExtensibleMarkupLanguageElement& source )

Copies the contents of source. It will copy all attributes and children of source.

DWORD CountChildren( const CString& name ) const

Returns the number of children by that name. Consider the following XML snippet:

<Southpark>
   <Characters>
      <Boy>Cartman</Boy>
      <Boy>Kenny</Boy>
      <Boy>Kyle</Boy>
      <Boy>Stan</Boy>
   </Characters>
   <Characters>
      <Girl>Wendy</Girl>
      <Boy>Chef</Boy>
      <Girl>Ms. Ellen</Girl>
   </Characters>
</Southpark>

If you wanted to know how many "Girl" children there are in the second set of characters, you would use a name of "Southpark.Characters(1).Girl". Instance numbers are zero-based, so Characters(1) is the second Characters element. If you set the parent/child separator character (via the SetParentChildSeparatorCharacter() method) to a forward slash, you could use a name of "Southpark/Characters(1)/Girl".
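The path naming described above can be sketched in standalone C++. This is only an illustration of the naming convention, not the code WFC uses: split_path() and PathStep are hypothetical names, and the sketch assumes a well-formed path whose instance numbers are plain decimal digits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One step of a path such as "Characters(1)": a tag name plus a
// zero-based instance number (0 when there is no "(n)" suffix).
struct PathStep
{
   std::string name;
   std::size_t instance;
};

// Hypothetical helper, not a WFC function. Assumes a well-formed path.
std::vector<PathStep> split_path( const std::string& path, char separator = '.' )
{
   std::vector<PathStep> steps;
   std::size_t start = 0;

   while ( start < path.size() )
   {
      std::size_t end = path.find( separator, start );

      if ( end == std::string::npos )
      {
         end = path.size();
      }

      PathStep step;
      step.name = path.substr( start, end - start );
      step.instance = 0;

      const std::size_t open = step.name.find( '(' );

      if ( open != std::string::npos && step.name.back() == ')' )
      {
         // Accumulate the decimal digits between the parentheses
         for ( std::size_t i = open + 1; i + 1 < step.name.size(); i++ )
         {
            step.instance = ( step.instance * 10 ) + static_cast<std::size_t>( step.name[ i ] - '0' );
         }

         // Drop the "(n)" suffix so only the tag name remains
         step.name.erase( open );
      }

      if ( !step.name.empty() )
      {
         steps.push_back( step );
      }

      start = end + 1;
   }

   return steps;
}
```

Passing '/' as the separator handles the forward slash form of the same path. A tag name that itself contains the separator character would be split in the wrong place, which is presumably why the separator is configurable.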

void DestroyAttributeByName( const CString& name )

Lets you get rid of a particular attribute.

void DestroyAttributeByValue( const CString& value )

Lets you get rid of all attributes with a particular value.

void DestroyAttributes( void )

Removes all of the attributes associated with this element.

void DestroyChildren( void )

Removes all children (this will destroy all grandchildren too).

void Dump( CDumpContext& dump_context ) const

Present only in debug builds of WFC. It helps in debugging. It will print the contents of this object in human readable form to dump_context.

void Empty( void )

Destroys all children and attributes. Empties the tag name and text fields. It does not reset the parent document (SetDocument()) pointer nor the parent element pointer.

BOOL EnumerateAttributes( DWORD& enumerator ) const

Allows you to enumerate through the attributes. It will initialize enumerator and return TRUE if there are any attributes to enumerate through. It will return FALSE when the number of attributes is zero (i.e. there ain't no attributes to enumerate).

BOOL EnumerateChildren( DWORD& enumerator ) const

Allows you to enumerate through the children. It will initialize enumerator and return TRUE if there are any children to enumerate through. It will return FALSE when the number of children is zero (i.e. there ain't no children to enumerate).

BOOL GetAbortParsing( void ) const

Returns whether you should abort parsing or not. This method is here so you can abort the parsing of an XML stream whenever a callback function deems it necessary.

BOOL GetAttributeByName( CExtensibleMarkupLanguageAttribute& attribute ) const
BOOL GetAttributeByName( const CString& name, CString& value ) const
CExtensibleMarkupLanguageAttribute * GetAttributeByName( const CString& name ) const

Retrieves the value of an attribute based on its name. If you retrieve the value, you will get the attribute value after all entities in the string have been resolved. If you retrieve the entire attribute (in a CExtensibleMarkupLanguageAttribute), the value will not pass through the entity resolution filter. The reason for this is there is no link between a CExtensibleMarkupLanguageAttribute and a CExtensibleMarkupLanguageDocument. Only the CExtensibleMarkupLanguageDocument knows about entities.
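To make the difference between a raw and a resolved value concrete, here is a minimal standalone sketch of entity resolution. It is not WFC's resolver: resolve_predefined_entities() is a hypothetical name, and it handles only the five entities predefined by XML (&lt; &gt; &amp; &quot; &apos;). Entities declared by a document would need a lookup in that document, which is the link a bare attribute does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper, not a WFC function. Replaces the five predefined
// XML entities in a single left-to-right pass. Anything else that starts
// with '&' is copied through unchanged.
std::string resolve_predefined_entities( const std::string& raw )
{
   static const struct { const char* name; char value; } entities[] =
   {
      { "&lt;",   '<'  },
      { "&gt;",   '>'  },
      { "&amp;",  '&'  },
      { "&quot;", '"'  },
      { "&apos;", '\'' }
   };

   std::string resolved;
   std::size_t index = 0;

   while ( index < raw.size() )
   {
      bool matched = false;

      if ( raw[ index ] == '&' )
      {
         for ( const auto& entity : entities )
         {
            const std::size_t length = std::strlen( entity.name );

            if ( raw.compare( index, length, entity.name ) == 0 )
            {
               resolved += entity.value;
               index += length;
               matched = true;
               break;
            }
         }
      }

      if ( !matched )
      {
         resolved += raw[ index ];
         index++;
      }
   }

   return resolved;
}
```

Because the pass never rescans its own output, a raw value of "&amp;lt;" resolves to the text "&lt;" rather than to "<".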

BOOL GetAttributeByValue( CExtensibleMarkupLanguageAttribute& attribute ) const

Retrieves a copy of an attribute by its value.

void GetBeginning( CParsePoint& parse_point ) const

Fills parse_point with the point at which this element began in the data stream.

CExtensibleMarkupLanguageElement * GetChild( const CString& child_name ) const

Returns the child of the given child_name. Consider the following XML snippet:

<Southpark>
   <Characters>
      <Boy>Cartman</Boy>
      <Boy>Kenny</Boy>
      <Boy>Kyle</Boy>
      <Boy>Stan</Boy>
   </Characters>
   <Characters>
      <Girl>Wendy</Girl>
      <Boy>Chef</Boy>
      <Girl>Ms. Ellen</Girl>
   </Characters>
</Southpark>

To retrieve the element for Cartman, child_name should be "Southpark.Characters.Boy". If you want Ms. Ellen (even though she doesn't play for the home team) you would use "Southpark.Characters(1).Girl(1)". If you set the parent/child separator character (via the SetParentChildSeparatorCharacter() method) to a forward slash, you could use a child_name of "Southpark/Characters(1)/Girl(1)".
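The lookup can be pictured as a walk down the tree, one path step at a time. The sketch below is a standalone model, not WFC code: Node, Step, nth_child(), resolve() and build_sample() are hypothetical names, the path is assumed to be already split into (tag, zero-based instance) pairs, and the walk is assumed to start at an element that sits above Southpark.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an element: a tag, its text and its children.
struct Node
{
   std::string tag;
   std::string text;
   std::vector<std::unique_ptr<Node>> children;

   Node* add( const std::string& child_tag, const std::string& child_text = "" )
   {
      std::unique_ptr<Node> child( new Node );
      child->tag = child_tag;
      child->text = child_text;
      children.push_back( std::move( child ) );
      return children.back().get();
   }
};

// A path step: tag name and zero-based instance number.
typedef std::pair<std::string, std::size_t> Step;

// Returns the instance-th child of parent whose tag equals name, or nullptr.
const Node* nth_child( const Node& parent, const std::string& name, std::size_t instance )
{
   for ( const auto& child : parent.children )
   {
      if ( child->tag == name )
      {
         if ( instance == 0 )
         {
            return child.get();
         }

         instance--;
      }
   }

   return nullptr;
}

// Follows every step downward from start. Returns nullptr if any step fails.
const Node* resolve( const Node& start, const std::vector<Step>& steps )
{
   const Node* current = &start;

   for ( std::size_t i = 0; i < steps.size() && current != nullptr; i++ )
   {
      current = nth_child( *current, steps[ i ].first, steps[ i ].second );
   }

   return current;
}

// Builds the Southpark sample under an unnamed top-level node.
std::unique_ptr<Node> build_sample()
{
   std::unique_ptr<Node> top( new Node );
   Node* southpark = top->add( "Southpark" );

   Node* first = southpark->add( "Characters" );
   first->add( "Boy", "Cartman" );
   first->add( "Boy", "Kenny" );
   first->add( "Boy", "Kyle" );
   first->add( "Boy", "Stan" );

   Node* second = southpark->add( "Characters" );
   second->add( "Girl", "Wendy" );
   second->add( "Boy", "Chef" );
   second->add( "Girl", "Ms. Ellen" );

   return top;
}
```

A step without an instance number is treated as instance 0, which is why "Southpark.Characters.Boy" lands on Cartman.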

void GetCompleteName( CString& name ) const

This is the inverse to GetChild(). It will give you the complete path name for this element.

void GetContents( CString& contents ) const

Retrieves the contents of this element only. It will not resolve any entities nor will it retrieve the contents of any children elements. Returns the data (or contents) portion of the element. For example, if the XML element was:

<LINE n="1">Darmok and Jilad at Tenagara!</LINE>

GetContents() would fill contents with NOTHING! If you want to retrieve "Darmok and Jilad at Tenagara!" you will have to call GetText(). The reason for this is the parser has to support embedded elements in the text portion of this element. Consider the following:

<LINE n="1">Darmok and <ITALICS>Jilad</ITALICS> at Tenagara!</LINE>

Notice there is an embedded element named ITALICS right in the middle of a perfectly good text.

CExtensibleMarkupLanguageDocument * GetDocument( void ) const

Returns the pointer to the parent document.

void GetEnding( CParsePoint& parse_point ) const

Fills parse_point with the point at which this element ended in the data stream.

void GetName( CString& name ) const

Gives you the name (and possibly instance number) of this element.

BOOL GetNextAttribute( DWORD& enumerator, CExtensibleMarkupLanguageAttribute*& attribute_p ) const

Returns TRUE if attribute_p was filled with an attribute's pointer. Returns FALSE (and sets attribute_p to NULL) when no attribute has been retrieved.

BOOL GetNextChild( DWORD& enumerator, CExtensibleMarkupLanguageElement *& element_p ) const

Returns TRUE if element_p was filled with a child's pointer. Returns FALSE (and sets element_p to NULL) when no child has been retrieved.
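The enumerate/get-next pairing is easiest to see in a loop. The sketch below uses a small stand-in class rather than WFC itself, so it can be compiled on its own: Element and child_tags() are hypothetical, and only the calling pattern (initialize the enumerator, then call GetNextChild() until it returns false) is taken from the descriptions above. How WFC stores children internally is not assumed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in that mimics the documented enumeration contract.
class Element
{
   public:

      explicit Element( const std::string& tag ) : m_Tag( tag ) {}

      // Like the documented AddChild(), no copy is made of the child.
      void AddChild( Element* child ) { m_Children.push_back( child ); }

      const std::string& Tag() const { return m_Tag; }

      // Initializes enumerator. Returns false when there is nothing to enumerate.
      bool EnumerateChildren( unsigned long& enumerator ) const
      {
         enumerator = 0;
         return !m_Children.empty();
      }

      // Fills element_p and advances, or sets element_p to nullptr and returns false.
      bool GetNextChild( unsigned long& enumerator, Element*& element_p ) const
      {
         if ( enumerator < m_Children.size() )
         {
            element_p = m_Children[ enumerator ];
            enumerator++;
            return true;
         }

         element_p = nullptr;
         return false;
      }

   private:

      std::string m_Tag;
      std::vector<Element*> m_Children;
};

// Collects the tag of every child using the enumeration pattern.
std::vector<std::string> child_tags( const Element& parent )
{
   std::vector<std::string> tags;
   unsigned long enumerator = 0;

   if ( parent.EnumerateChildren( enumerator ) )
   {
      Element* child = nullptr;

      while ( parent.GetNextChild( enumerator, child ) )
      {
         tags.push_back( child->Tag() );
      }
   }

   return tags;
}
```

The same shape applies to EnumerateAttributes() and GetNextAttribute().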

DWORD GetNumberOfChildren( void ) const

Returns the number of children this element has. Consider the following XML snippet:

<Southpark>
   <Characters>
      <Boy>Cartman</Boy>
      <Boy>Kenny</Boy>
      <Boy>Kyle</Boy>
      <Boy>Stan</Boy>
   </Characters>
   <Characters>
      <Girl>Wendy</Girl>
      <Boy>Chef</Boy>
      <Girl>Ms. Ellen</Girl>
   </Characters>
</Southpark>

If you have the element for Southpark and call GetNumberOfChildren(), it will return

DWORD GetNumberOfAttributes( void ) const

Returns the number of attributes this element has.

CExtensibleMarkupLanguageElement * GetParent( void ) const

Returns the pointer to the immediate parent of this element.

CExtensibleMarkupLanguageElement * GetParent( const CString& name ) const

Returns the parent element that matches name. The element returned may be the parent, grandparent or great-grandparent.

void GetTag( CString& tag ) const

Returns the tag value. For example, if the XML element was:

<DATE data="1963-05-02"/>

GetTag() will return "DATE"

void GetText( CString& text_string ) const

Retrieves all text segments and resolves all entities in the text. This parser is very very particular about whether white-space is holy or not. XML is for data so we will treat every character in the XML document as though it is data. Consider the following:

<WFC>
<AUTHOR>Samuel R. Blackburn
wfc@pobox.com</AUTHOR>
</WFC>

The WFC element has three children. The first child is of type typeTextSegment and contains the data between the end of the <WFC> tag and the beginning of the <AUTHOR> tag (i.e. a new-line character). The next child is of type typeElement and contains the AUTHOR element. The last child is of type typeTextSegment and contains the data between the end of the </AUTHOR> tag and the beginning of the </WFC> tag (another new-line character).
NOTE: There is one thing you must keep in mind when calling GetText(). Remember that we have a pointer to our parent CExtensibleMarkupLanguageDocument. That parent has a property called IgnoreWhiteSpace and is boolean. If the user sets that property to TRUE, GetText() will ignore text segments (typeTextSegment child elements) that contain nothing but space characters.
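The effect of IgnoreWhiteSpace can be sketched as a filter over the text segments. This is a standalone illustration under assumptions, not WFC's code: join_text() and is_all_white_space() are hypothetical names, the segments are passed in as plain strings, and "space characters" is taken to mean whatever std::isspace() accepts.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// True when text holds nothing but space-like characters.
// An empty string counts as all white space in this sketch.
bool is_all_white_space( const std::string& text )
{
   for ( unsigned char character : text )
   {
      if ( !std::isspace( character ) )
      {
         return false;
      }
   }

   return true;
}

// Concatenates text segments in order. When ignore_white_space is true,
// segments made up entirely of white space are skipped.
std::string join_text( const std::vector<std::string>& segments, bool ignore_white_space )
{
   std::string text;

   for ( const auto& segment : segments )
   {
      if ( ignore_white_space && is_all_white_space( segment ) )
      {
         continue;
      }

      text += segment;
   }

   return text;
}
```

White space inside a segment that also holds other characters is kept either way. Only segments that are nothing but white space are dropped.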

DWORD GetTotalNumberOfChildren( void ) const

This tells you how many children this element has plus the number of children they have and so on and so forth.
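The "and so on and so forth" part is a plain recursion. The sketch below is a standalone model with a hypothetical Node type and total_children() function. It counts element nodes only. The parser described here also creates typeTextSegment children, so counts from a real parsed document will be higher than this model suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an element: nothing but its children.
struct Node
{
   std::vector<Node> children;
};

// Direct children plus, recursively, the children of each child.
std::size_t total_children( const Node& node )
{
   std::size_t total = node.children.size();

   for ( const Node& child : node.children )
   {
      total += total_children( child );
   }

   return total;
}
```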

DWORD GetType( void ) const

Tells you what type of element this object is. It will return one of the following values:

typeUnknown - We don't know what it is.
typeProcessingInstruction - Start Tag is "<?" End Tag is "?>"
typeComment - Start Tag is "<!--" End Tag is "-->"
typeCharacterData - Start Tag is "<![CDATA[" End Tag is "]]>"
typeElement - A user's element
typeTextSegment - That which lies between sub-elements
typeMetaData - Anything in a "<!" field that ain't typeComment or typeCharacterData

BOOL IsAllWhiteSpace( void ) const

Returns TRUE if this is a typeTextSegment and it contains nothing but space-like characters.

BOOL IsRoot( void ) const

Returns TRUE if this element is the root element. The root element holds the data from the XML identifier. The XML identifier line looks something like this:

<?xml version="1.0"?>

static CExtensibleMarkupLanguageElement *NewElement( CExtensibleMarkupLanguageElement * parent_element = NULL, DWORD type = typeElement, CExtensibleMarkupLanguageDocument * document = NULL )

Creates another CExtensibleMarkupLanguageElement. There is no constructor for CExtensibleMarkupLanguageElement. If you want to create a CExtensibleMarkupLanguageElement you must use NewElement(). The type parameter can be one of the following values:

typeUnknown - We don't know what it is.
typeProcessingInstruction - Start Tag is "<?" End Tag is "?>"
typeComment - Start Tag is "<!--" End Tag is "-->"
typeCharacterData - Start Tag is "<![CDATA[" End Tag is "]]>"
typeElement - A user's element
typeTextSegment - That which lies between sub-elements
typeMetaData - Anything in a "<!" field that ain't typeComment or typeCharacterData

If the document parameter is NULL, the newly created element will inherit the document from the parent_element.

BOOL Parse( const CParsePoint& beginning, const CDataParser& parser )

Tells the element to start reading itself from parser, beginning at the location given in beginning. It returns TRUE if the element (and all of its children) parsed successfully.

void RemoveChild( CExtensibleMarkupLanguageElement * element_p )

Removes element_p from the list of children. It will not destroy the child. This is the method you would use to steal a child from one element and give it to another.

void SetAbortParsing( BOOL abort_parsing = TRUE )

This sets a flag in the class to abort parsing. This is usually called from callback functions. When an element is finished parsing, it is sent to the document's list of callback functions. One of those callbacks may choose to abort parsing of the document. One example of why you would want to do this is searching. You would parse the document until you found what you're looking for then stop parsing the document (because the document may be very large).
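The search scenario can be sketched as a loop that checks the abort flag before each element. This is a standalone model, not WFC's parser: ParseSession, ElementCallback and parse_all() are hypothetical names, and elements are reduced to their tag names. It shows only the control flow, in which a callback sets the flag and the loop stops before touching the next element.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical holder for the abort flag.
class ParseSession
{
   public:

      void SetAbortParsing( bool abort_parsing = true ) { m_AbortParsing = abort_parsing; }
      bool GetAbortParsing() const { return m_AbortParsing; }

   private:

      bool m_AbortParsing = false;
};

// Called once for every element that has finished parsing.
typedef std::function<void( ParseSession&, const std::string& )> ElementCallback;

// "Parses" tags in order and returns how many were processed before
// the callback asked to stop (or before the input ran out).
std::size_t parse_all( const std::vector<std::string>& tags, const ElementCallback& callback )
{
   ParseSession session;
   std::size_t parsed = 0;

   for ( const auto& tag : tags )
   {
      if ( session.GetAbortParsing() )
      {
         break;
      }

      parsed++;
      callback( session, tag );
   }

   return parsed;
}
```

A callback that calls SetAbortParsing() on the first "Girl" element would stop the loop right after that element, leaving the rest of a large document unread.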

void SetDocument( CExtensibleMarkupLanguageDocument * document_p )

Tells the element which CExtensibleMarkupLanguageDocument this element belongs to.

void SetTag( const CString& tag_name )

Sets the tag name.

void SetType( DWORD element_type )

element_type can be one of the following values:

typeUnknown - We don't know what it is.
typeProcessingInstruction - Start Tag is "<?" End Tag is "?>"
typeComment - Start Tag is "<!--" End Tag is "-->"
typeCharacterData - Start Tag is "<![CDATA[" End Tag is "]]>"
typeElement - A user's element
typeTextSegment - That which lies between sub-elements
typeMetaData - Anything in a "<!" field that ain't typeComment or typeCharacterData

If element_type is not one of the above values, the type will be set to typeUnknown.

void WriteTo( CByteArray& destination )

Writes the element (and sub-elements) to the byte array in XML form.

Operators

CExtensibleMarkupLanguageElement& operator = ( const CExtensibleMarkupLanguageElement& source )

Basically calls Copy().

Example

#include <wfc.h>
#pragma hdrstop

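// Returns TRUE if the element has an attribute named "ID"
BOOL contains_id_attribute( CExtensibleMarkupLanguageElement * element_p )
{
   WFCTRACEINIT( TEXT( "contains_id_attribute()" ) );

   // Don't dereference a NULL element
   if ( element_p == NULL )
   {
      return( FALSE );
   }

   CExtensibleMarkupLanguageAttribute * attribute_p = NULL;

   // Use the overload that returns a pointer to the attribute
   attribute_p = element_p->GetAttributeByName( TEXT( "ID" ) );

   if ( attribute_p == NULL )
   {
      return( FALSE );
   }

   return( TRUE );
}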

Copyright, 2000, Samuel R. Blackburn
$Workfile: CExtensibleMarkupLanguageElement.cpp $
$Modtime: 1/26/00 5:58p $