Parsing XML in C++ using the YARD Parser

Christopher Diggins
21 Dec 2004
Provides a set of tools for building XML parsers in C++ using the YARD recursive descent parser.
yard.zip
- html_test.hpp
- parser.hpp
- parser_input_stream.hpp
- re_ops.hpp
- rules.hpp
- scanner.hpp
- test.html
- test.txt
- test.xml
- tokenizer.hpp
- utils.hpp
- xml_grammar.hpp
- xml_test.hpp
- yard.cpp
- yard.hpp
- yard.vcproj
<?xml version="1.0" encoding="utf-8"?>
<heronsite>
  <pageheader><![CDATA[[
    <body bgcolor='#666633' leftmargin='0' topmargin='0' link='#666633' vlink='#666633'>
    <table cellspacing='0' cellpadding='0' width='95%'>
    <tr valign='top'>
    <td><a href='http://www.heron-language.com'><img src='logo.gif' border='0'></a></td>
    <td width='99%'><table width='100%' cellspacing='0' cellpadding='0'><tr>
    <td width='100%'><a href='http://www.heron-language.com'><img src='heron-programming.gif' border='0'></a></td></tr><tr><td width='100%'>
    <table width='100%' bgcolor='#ffffff' cellpadding='10'><tr><td width='100%'>
    <font face='Trebuchet MS' size='-1'>
  ]]></pageheader>
  <pagefooter><![CDATA[[
    <center><h4>
      <a href='doc2html.html'>This site was generated from XML using Heron</a><br/>
      <a href='index.html'>Home</a>
      | <a href='downloads.html'>Downloads</a>
      | <a href='toc.html'>Documentation</a>
      | <a href='people.html'>People</a>
    </h4></center>
    <div align='right'>Heron home page - <a href='http://www.heron-language.com'>http://www.heron-language.com</a></div>
    <div align='right'>Copyright 2004, Christopher Diggins,
    <a href='http://www.cdiggins.com'>http://www.cdiggins.com</a></div>
    <div align='right'>The contents of this site are licensed under the
    <a href='http://opensource.org/licenses/osl-2.1.php'>Open Software License version 2.1</a></div>
    </font></td></tr></table></td></tr></table></td></tr><tr></tr></table></body></html>
  ]]></pagefooter>
  <herondoc>
    <section label="overview" name="Overview">
      <topic label="about" name="About Heron">
        Heron is a modern general-purpose programming language which
        focuses on the production of efficient and robust software by
        supporting modern software development techniques and methodologies and
        facilitating code reuse.
        <p/>
        <subtopic label="lineage" name="Heron Lineage">
          Heron has many similarities to C++, Java and Pascal, but many of these similarities are superficial.
          The differences between Heron and other languages are more substantive than those of many other
          programming languages that have appeared
          over the last few years. The Heron syntax was chosen to appeal to as broad an audience as possible.
          Perfectly good languages and technologies can very easily be overlooked if they appear too imposing
          or different.
        </subtopic>
        <subtopic label="design_goals" name="Design Goals">
          Heron follows several inter-connected design goals, guidelines and principles.
          I have listed many of them here but in no particular order:
          <ul>
            <li>Correct code is infinitely more desirable than efficient code</li>
            <li>Heron should make it easy to reuse code as it is the single most effective way to improve programmer productivity.</li>
            <li>Programmers familiar with the syntax and semantics of C++, Java and Pascal should find Heron recognizable and easy to learn.</li>
            <li>Heron should be easy to use and write without explicit knowledge of how to properly design
              and write new types (i.e. classes and interfaces)</li>
            <li>It should be hard (if not impossible) to write software which can damage the operating environment.</li>
            <li>Heron should address all stages of the software development process including: design, prototyping,
              implementation, testing, release, modifying, reusing.</li>
            <li>The correctness of code should be easily verifiable.</li>
            <li>Programmers should be able to write efficient software easily.</li>
            <li>Efficiency of software starts with algorithms, and appropriate data types before the efficiency of low-level operations.
              <aside>
                This is related to the statement by Tony Hoare and restated by Donald Knuth:
                "Premature optimization is the root of all evil."
              </aside>
            </li>
            <li>The intention of a programmer should be clear and unambiguous from their code.</li>
            <li>Heron should minimize the modification of existing idioms established in C++ and Java.</li>
            <li>Correctness of software is the first priority of a programmer; Heron should strive to emphasize this.</li>
            <li>Heron code should be easy to read and understand before being quick and terse to write.</li>
            <li>The best way to accomplish something using Heron should be a straightforward and obvious choice for a programmer.</li>
            <li>The Heron grammar should facilitate the development of compilers, translators, and other parsing tools.</li>
            <li>Correct and adequate documentation of source code is rarely available, so it is better for a programming
              language to force explicit disambiguation of programmer intention.</li>
            <li>Heron should follow its own rules as much as possible in the implementation of core types, operations, and structures.</li>
            <li>What can be accomplished through libraries should be accomplished through libraries</li>
            <li>A language is evaluated by software developers primarily by the libraries that are available for it</li>
            <li>Heron should be type-safe</li>
            <li>Heron should strive to reduce the chance of memory access errors to nil</li>
            <li>Heron should minimize the amount of undefined and implementation specific behaviour</li>
            <li>Intuitiveness : The effect of Heron code should be intuitive</li>
            <li>Principle of Least Surprise : Heron code should surprise a programmer as little as possible. For instance,
              there should be very little chance of side effects. </li>
          </ul>
          <p/>
          An informal yet equally important design principle was kept in mind throughout the design of Heron:
          <p/>
          <ul>
            <li>KISS : Keep It Simple Stupid!
              <aside>
                This is related to the extreme programming principle of "The Simplest Thing That Could Possibly Work"
                or TSTTCPW,  though I do not consider them the same. TSTTCPW as a guiding principle in my opinion
                can be at odds with the principle of improving genericity. Flat out ignoring genericity can be the
                best way to get a job done quickly and, more importantly, correctly in cases where a programming
                language opposes genericity.
              </aside>
            </li>
          </ul>
          <p/>
          Finally there is a single Heron design anti-principle:
          <p/>
          <ul>
            <li>TMTOWTDI : "There's More Than One Way To Do It". This is the motto for the Perl programming language.
              This is at odds with everything that Heron stands for. TMTOWTDI is in direct conflict with principles
              of KISS, least surprise, readability, scalability etc.
              <aside>
                I am not anti-Perl, but Perl is what Perl is, and Heron is most definitely not Perl.
              </aside>
            </li>
          </ul>
          <p/>
        </subtopic>
      </topic>
    </section>
    <section label="language-comparisons" name="Language Comparisons">
      From an academic standpoint I would prefer not to attempt to do language comparisons,
      but in the end a programmer needs a point of reference from which to understand
      and gauge a new language's applicability for their needs, and a language comparison
      is one of the best ways to do this.
      <p/>
      The reader should be well aware that I have a blatant bias when attempting
      to do a language comparison, which I don't even bother to try and overcome.
      That would be hypocritical since I have a vested interest in promoting the
      strong points of Heron.
      <p/>
      The main language I will compare Heron to is C++ as it applies to many other
      languages as well.
      <p/>
      <topic label="heron-cpp-comparison" name="Heron Compared to C++">
        The standard against which most modern imperative languages have to compare themselves
        is C++. C++ has been around a long time and has proven itself to be
        an effective programming language for a broad spectrum of application domains.
        <p/>
        Here are the major differences between C++ and Heron:
        <p/>
        <ol>
          <li>Heron does not have virtual functions. This may be surprising, but the interface support
            makes them redundant. This leads to simpler and better performing code without the
            so-called "abstraction penalty" often associated with object oriented code.
          </li>
          <li>Heron performance is optimized towards polymorphic object-oriented code</li>
          <li>Heron run-time type information (RTTI) is always available and does not have a performance penalty</li>
          <li>Heron memory access and management is safer and more restricted</li>
          <li>Heron syntax does not have backwards compatibility with any language and is therefore
            simpler</li>
          <li>Heron has program objects which can be redirected to and from streamable types</li>
          <li>Heron supports many modern software development methodologies and techniques that C++ does
            not have direct support for:
            <ol>
              <li>Interfaces</li>
              <li>Delegations</li>
              <li>Concept checking</li>
              <li>Aspect Oriented Programming</li>
              <li>Design by Contract</li>
            </ol>
          </li>
          <li>Enhanced enums - can create sets of enumerated values of any instantiable type</li>
          <li>Heron has enhanced metaprogramming support, for instance the language specification
            comes with several built in compile-time data structures, control-flow structures and
            compile-time functions. Heron also allows a parameterized type_def which
            greatly facilitates the writing of compile-time meta-functions.
          </li>
          <li>Heron arrays are parameterized library types as opposed to easily violated primitives</li>
        </ol>
      </topic>
      <topic label="more-language-comparisons" name="Heron Compared to Other Languages">
        Some similarities that Heron shares with C++ that certain other imperative languages lack:
        <ol>
          <li>Full support for generic programming</li>
          <li>Overloadable operators</li>
          <li>User definable primitive equivalent types</li>
          <li>const modifiers for mutable types</li>
        </ol>
      </topic>
    </section>
    <section label="introduction" name="Introduction">
      <topic label="hello-world" name="Hello World">
        Here is the obligatory hello world program:
        <code><![CDATA[[
          1: program HelloWorld;
          2: functions {
          3:   _main() {
          4:     String sHelloWorld("Hello World");
          5:     @sHelloWorld |&gt; StdOut();
          6:    }
          7: }
          8: end
        ]]></code>

        Line 1: All programs in Heron are named. This is because a Heron source file can contain multiple programs,
          and programs can be reused.
        <p/>
        Line 2: Heron modules, that is programs or units, have optional function sections where functions with a scope
          encompassing the entire module are declared. The order of these functions is unimportant.
        <p/>
        Line 3: Heron programs must have a single entry point function named _main() which has no return type
          and no parameters.
        <p/>
        Line 4: Heron variables must always have a type (in this case String) and that type is either
          a class, a reference to a class (either strong or weak reference, more on that later), an enum type,
          a type_def, or a primitive. A variable whose type is a class is
          said to be an instance of that class, or an object. In this case sHelloWorld is said to be a String object.
        <p/>
          <tt>"Hello World"</tt> is what is known as a string constant literal. There are several kinds of constant
          literals in Heron, and they have special predefined types. String constant literals have a type of <tt>_string</tt>.
          The variable <tt>sHelloWorld</tt> is initialized with the value of "Hello World". When a variable
        <p/>
        Line 5: The @ symbol is an address-of operator, similar to the same operator in Pascal or the ampersand (&amp;) in C/C++. It
          was chosen over the ampersand because of the other role the ampersand has of declaring references.
        <p/>
          The operator <tt>|&gt;</tt>, which as far as I can tell is unique to Heron, is a throughput or pipe operator. It is a
          cross between the <tt>&gt;&gt;</tt> and <tt>&lt;&lt;</tt> operators in C++, and the pipe (<tt>|</tt>) operator
          common in many operating system shells. The pipe operator accepts on the left either a program or a reference
          to an object which implements the interface IBufferedRead. The right hand side of a pipe operator is a program object
          or a reference to an object
        <p/>
          StdOut() is a standard library function call which returns a reference to the standard output stream which, unless
          redirected internally or externally, outputs the string to the screen.
        <p/>
        Line 8: The end statement acts as a delimiter for Heron modules (units and programs).
      </topic>
    </section>
    <section label="basics" name="Basics">
      <topic label="types" name="Types">
        All Heron variables (including function parameters, and return values) are typed.
        More specifically all Heron expressions have a type that is known at compile time, hence
        we say Heron is strongly typed.
        <aside>
          For those new to programming, the type of a variable is what describes the
          behaviour and set of valid values that a variable can represent. For instance a variable
          of type Bool can hold either one of two values: true or false. A Bool variable can be used
          with boolean operations like <tt>and</tt>, <tt>or</tt> and <tt>not</tt>
          but can not be used in arithmetic expressions.
          In contrast a variable of type Int can hold valid whole number values from
          approximately -2,000,000,000 to +2,000,000,000. An Int type allows almost all arithmetic
          operations, except for division, but does not support the boolean operations of <tt>and</tt>, <tt>or</tt>
          and <tt>not</tt>.
        </aside>
        Heron variables can be declared as values of a given type, or can be declared as a reference
        to a type using the reference declaration modifiers <tt>&amp;</tt> (for weak references) and
        <tt>^</tt> (for strong references).
        <p/>
        Heron has two main kinds of types: classes and interfaces.
        <p/>
        <subtopic label="typedefs" name="Type_defs">
          A type_def is a synonym for an existing type. This means that a type_def can be used interchangeably with
          its synonymous type. Type_defs can be parameterized in Heron.
          <p/>Type_defs are used extensively in metaprogramming to
          define meta-variables and meta-functions.
        </subtopic>
        <subtopic label="enums" name="Enums">
          Enums are sets of values of a given type. In C++ enums are always of an integer type, but in Heron
          enums can be of any type. The set of values that comprise an enum have their own collective type,
          which is interchangeable with the base type, just like a type_def.
        </subtopic>
      </topic>
      <topic label="refs" name="References">
        The term reference is often used interchangeably with pointers. Heron references have semantics which place them in between
        C++ references and C++ pointers. A Heron reference has the following semantics different from pointers:
        <ul>
          <li>Automatically dereferenced</li>
          <li>Does not support arithmetic as if it were an integer representation of a memory location</li>
        </ul>
        Heron references differ from C++ references in that they have the following properties:
        <ul>
          <li>They can be compared using <tt>==</tt> and <tt>!=</tt></li>
          <li>They can be reassigned using the = operator</li>
          <li>When reassigning a Heron reference to a new variable of value type, that variable must be prefaced by the
            address-of operator</li>
          <li>Heron references can be compared to NULL and assigned to NULL</li>
          <li>Heron references are always initialized to refer to NULL.</li>
        </ul>
        There are three kinds of Heron reference: weak, strong, and new references.
        <subtopic label="weak_refs" name="Weak References">
          A weak reference in Heron is declared as follows:
          <code><![CDATA[[
            MyType my_value;
            MyType&amp; my_weak_ref;
            my_weak_ref = @my_value;
          ]]></code>
          A "weak reference" is the most common kind of reference, as it can legally refer to any value type, is lightweight
          and is relatively safe. Weak references can not be freed, nor can they be assigned from the result of a new operation.
        </subtopic>
        <subtopic label="new_refs" name="New References">
          A "new reference" is a reference that is never used directly by a Heron programmer, as it is the kind of reference returned
          from a new operation. This kind of reference exists to prevent the easy mistake of assigning the result
          of a new operation to a weak reference.
        </subtopic>
        <subtopic label="strong_refs" name="Strong References">
          A "strong reference" is a reference that automatically destroys the memory it refers to when the last strong reference
          to that memory is destroyed. This technique is known as reference counting and is widely misunderstood.
          Strong references also allow explicit destruction of memory by calling "free"; note that memory freed
          explicitly is destroyed even if it is still referenced by another strong reference. Strong references
          can not be constructed from, or assigned from, weak references, but the converse is allowed (i.e. weak references can
          refer to strong references).
          <aside>
            Strong references are closely based on the shared_ptr class of the boost library.
            See <a href="http://www.boost.org/libs/smart_ptr/smart_ptr.htm">http://www.boost.org/libs/smart_ptr/smart_ptr.htm</a> for more
            information on smart pointers in the boost library.
          </aside>
        </subtopic>
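        <p/>
        For readers coming from C++, the reference counting behind strong references can be sketched with
        <tt>std::shared_ptr</tt>, the standard library descendant of the boost class mentioned above. This is only
        an illustrative C++ sketch under stated assumptions, not Heron code: the type <tt>Tracked</tt> and the function
        <tt>strong_reference_demo</tt> are hypothetical names, and a raw pointer merely stands in for a Heron weak reference.
        <code><![CDATA[[
```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type: it bumps a caller-owned counter so that
// construction and destruction can be observed from outside.
struct Tracked {
    explicit Tracked(int& live) : live_(live) { ++live_; }
    ~Tracked() { --live_; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
    int& live_;  // number of Tracked objects currently alive
};

// Returns true if the object was destroyed exactly when the last
// owning ("strong") pointer went away.
inline bool strong_reference_demo() {
    int live = 0;
    {
        std::shared_ptr<Tracked> a = std::make_shared<Tracked>(live);
        std::shared_ptr<Tracked> b = a;   // second strong reference
        assert(a.use_count() == 2);

        Tracked* weak = b.get();          // non-owning, like a weak reference
        a.reset();                        // drop one strong reference
        assert(live == 1);                // object survives: b still owns it
        assert(weak == b.get());          // the non-owning pointer is still valid
    }                                     // b goes out of scope here
    return live == 0;                     // last strong reference gone, object destroyed
}
```
        ]]></code>
        The non-owning pointer never affects the lifetime of the object, which mirrors the rule that weak
        references can not free memory.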
        <rationale>
          Heron references are based on several observations:
          <ul>
            <li>Unchecked raw memory access is extremely error prone.
            </li>
            <li>The responsibility of a particular program element to free memory is not an easily expressible
              concept in most programming languages, and is often left as a task to the documentation. Adequate documentation
              is rare, and requiring documentation is in general an inferior solution when compared to explicit unambiguous
              code in the language itself. A programming language can easily allow this kind of concern to be
              expressed by creating separate classifications of memory access types (i.e. destroyable and non-destroyable references).
            </li>
            <li>Garbage collection systems do not remove the concern of memory management; they simply hide this concern
              and make it hard for a programmer to manage it when necessary. For example, memory that a programmer
              intends to return to the system might not be released when expected because of undetected
              references to the same memory in other modules. This is difficult, if not impossible, to detect with most
              garbage collection schemes.
            </li>
            <li>Automated garbage collection leads to a pervasive attitude that system resources are virtually unlimited and can be
              treated as such. This leads to badly written software which performs at a substandard level.
            </li>
            <li>The performance of automated garbage collection systems can always be matched by a manual memory management system, and is
              surpassed in the large majority of cases.
            </li>
          </ul>
        </rationale>
      </topic>
    </section>
    <section label="generic-programming" name="Generic Programming">
      <quote source="Bjarne Stroustrup, The C++ Programming Language">
        Generic programming [is a] technique of programming using types as parameters.
      </quote>
      Far too many programmers have been intimidated by the apparent complexity of code using
      templates and have shied away from familiarizing themselves with generic programming. The truth
      is that every programmer that has ever used an array has already used a form of generic programming.
      I find that approaching it from this angle helps to tame the paper tiger that is generic programming.
      <aside>
        A secret from a language designer: parameterized types are typically introduced at too late a phase
        in a language's development. A language that is too mature can not easily handle the necessary changes
        to introduce intuitive parametric behaviour and syntax. Heron has been designed from the ground up
        with templates and generic programming in mind.
      </aside>
      <topic label="parameterized-types" name="Parameterized Types / Templates">
        Heron uses parameterization of classes, interfaces and type_defs to support generic programming.
        The best way to explain parameterization is through an example.
        Here is code for instantiating an array of integers:
        <code><![CDATA[[
          Array&lt;Int&gt; a;
        ]]></code>
        Here is a linked list of strings :
        <aside>
          A linked list is a data structure which allows faster insertion of elements in the middle of the collection
          than an array, but typically takes much longer to access individual elements using an index.
        </aside>
        <code><![CDATA[[
          List&lt;String&gt; l;
        ]]></code>
        And now here is an array of linked lists of Strings:
        <code><![CDATA[[
          Array&lt;List&lt;String&gt;&gt; al;
        ]]></code>
      </topic>
    </section>
    <section label="oop" name="Object Oriented Programming">
      <topic label="cla" name="Classes">
        A class is a data type that contains methods (also known as member functions) and / or data. It may seem
        strange but some data types do not contain data fields.
        <p/>
        <aside>
          It may be argued by some that, since not every single program element or construct in Heron is
          an object, it is not a "pure object oriented language". I wish to challenge that
          notion by saying that a pure object oriented language, is not a
          widely accepted or recognized concept in computer science. Rather the notion of purity with regards to
          an object oriented language was more of a marketing gimmick than anything else.
        </aside>
        <p/>
        In Heron the fields of a class are always "protected", which means they are not accessible in scope
        outside of the class itself and derived classes. This enforces the object-oriented principle of
        information hiding, and the Heron principle of restricting the bad choices that a programmer can make.
        <p/>
        Heron is not really as Draconian as you may think. If you want to define a type which is transparent
        and has the performance of a C-style struct, you still can, because
        functions that return references can be read from and assigned to just like a struct field.
        A programmer can easily define two accessor functions which return weak references to the internal fields.
        Here is an example :

        <code><![CDATA[[
          Complex {
            public {
              Real() : Float& {
                return @r;
              }
              Imaginary() : Float& {
                return @i;
              }
            }
            fields {
              Float r;
              Float i;
            }
          }
          ...
          Complex cx;
          cx.Real() = 3;
          cx.Imaginary() = cx.Real() - 1;
        ]]></code>

        <rationale>
          The extra level of abstraction required to provide public access to fields like this
          costs nothing but a few keystrokes and makes the software much more
          maintainable and manageable in the long run. An example of this advantage is that one can make changes
          later on, like defining these accessor functions in an interface, and perhaps cross-cutting these accessor
          functions. This facilitates aspect oriented programming and design by contract.
        </rationale>
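        <p/>
        For comparison, the same accessor idiom can be written in C++. The following is a minimal sketch, assuming
        <tt>double</tt> in place of Heron's Float; the member names <tt>r_</tt> and <tt>i_</tt> and the function
        <tt>complex_usage</tt> are hypothetical. It shows that a function returning a reference can appear on
        either side of an assignment while the fields themselves stay protected.
        <code><![CDATA[[
```cpp
#include <cassert>

class Complex {
public:
    // Accessors return references, so callers can both read and assign.
    double& Real() { return r_; }
    double& Imaginary() { return i_; }

protected:
    // Fields are hidden from outside code, as Heron fields always are.
    double r_ = 0.0;
    double i_ = 0.0;
};

// Mirrors the usage lines of the Heron example above.
inline Complex complex_usage() {
    Complex cx;
    cx.Real() = 3;                    // assign through the returned reference
    cx.Imaginary() = cx.Real() - 1;   // read one accessor, assign the other
    assert(cx.Imaginary() == 2.0);
    return cx;
}
```
        ]]></code>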
      </topic>
      <topic label="constructors" name="Constructors">
        A constructor (sometimes abbreviated in the literature as ctor or c'tor) is the first method that is called
        when an object is allocated on the heap (through the new operator) or on the stack (through a variable declaration).
        Heron constructors are declared in the public section of a class as functions with the name
        <tt>_init</tt>.
        <p/>
        There are three separate kinds of user-defined constructors:
        <p/>
        <ol>
          <li>Default constructor - A constructor with no parameters.
          </li>
          <li>Initializing constructor - A constructor with one or more parameters that is not a copy constructor.
          </li>
          <li>Copy constructor - A constructor that takes one non-reference parameter of type <tt>self</tt>.
            Self is a local alias for the type of the class within which it is found. <b>Note:</b> constructors that spell
            out the type as a parameter such as _init(Array&lt;Int&gt; x) are also legal copy constructors which will be
            accepted but are considered bad style, and may not be accepted in later versions.
            <aside>
              Unlike C++, when passing a parameter to a function, the copy constructor is not called.
              Also unlike C++, copy constructors are not generated automatically.
            </aside>
          </li>
        </ol>
        <p/>
        In Heron all classes automatically call an auto-generated constructor when first created. The auto-generated
        constructor calls the auto-generated constructor for each member field in the order it appears in the class fields
        declaration then it calls the default constructor (if declared) for each member field in the same order.
      </topic>
      <topic label="destructors" name="Destructors">
        A destructor is simply the last function called when the memory allocated for an object is about to be
        released back to the system for reclamation, be it an object that was created on the stack (using
        a simple variable declaration, or as the result of an expression) or an object that was created on the heap
        (using a dynamic allocation function such as the new operator).
      </topic>
    </section>
    <section label="delegation" name="Delegations">
      Delegation is a technique of deferring the implementation of an interface by an object to
      another object, usually a member field.
      <aside>
        This should not be confused with what C# calls a delegate, which is essentially
        a member function pointer.
      </aside>
      In Heron delegation is directly supported and is a crucial part of the language.
      Delegations play an important part in Heron in techniques such as Aspect Oriented Programming
      and Design by Contract.
      <p/>
      Heron Delegations are included as a section within a class declaration as follows:
      <code><![CDATA[[
        class StringArray {
          delegates {
            IArray&lt;String&gt; : mData;
          }
          fields {
            Array&lt;String&gt; mData;
          }
        }
      ]]></code>
      In other imperative languages such as C++ and Java, delegation is a commonly used technique,
      but does not have any direct language support. This means that in these languages using the
      technique of delegation is monotonous, error prone, and expensive to modify and maintain.
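      <p/>
      To make that cost concrete, here is a minimal C++ sketch of delegation written by hand. The interface
      <tt>IStringArray</tt>, its three functions and the function <tt>delegation_usage</tt> are hypothetical stand-ins
      chosen for this illustration; they are not part of any Heron or C++ library. Every function of the interface
      has to be forwarded to the member field individually, and each forwarding function must be revisited
      whenever the interface changes.
      <code><![CDATA[[
```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interface, standing in for Heron's IArray of String.
struct IStringArray {
    virtual ~IStringArray() = default;
    virtual std::size_t Count() const = 0;
    virtual void Append(const std::string& s) = 0;
    virtual const std::string& At(std::size_t i) const = 0;
};

// StringArray implements the interface purely by forwarding each call
// to its member field: delegation without language support.
class StringArray : public IStringArray {
public:
    std::size_t Count() const override { return data_.size(); }
    void Append(const std::string& s) override { data_.push_back(s); }
    const std::string& At(std::size_t i) const override { return data_.at(i); }

private:
    std::vector<std::string> data_;  // the object the work is delegated to
};

inline std::size_t delegation_usage() {
    StringArray a;
    a.Append("heron");
    a.Append("yard");
    assert(a.At(1) == "yard");
    return a.Count();
}
```
      ]]></code>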
    </section>
    <section label="design-by-contract" name="Design by Contract">
      <i>Design by Contract is a trademark of Interactive Software Engineering</i>

      <topic label="intro" name="Introduction">
        Design by Contract (DBC or DbC) is a method of designing and developing software using contracts to explicitly
        state and test design requirements. In DBC the contract is used to define the obligations and benefits of
        program elements such as subroutines and classes.

        <subtopic label="why-dbc-works" name="Why DBC works">
          First and foremost DBC emphasizes the use of assertions, which in and of itself can only help
          reduce the number of bugs as well as make it easier to recover from coding and design errors.
          DBC also emphasizes commenting of code. This improves readability, and can only help improve
          the software development process.
          <p/>
          DBC is quite rigorous and complete. It can serve as a methodology to build many different kinds of software from the ground up
          in a very well defined and organized manner. This helps to promote consistency and removes guesswork and the creation of ad-hoc
          designs.
          <p/>
          DBC has much in common with other kinds of TDD (test driven design) approaches like extreme programming, which have shown
          excellent results in practice primarily due to the almost obsessive testing of every feature as it is
          integrated into the software.
        </subtopic>

        <subtopic label="contracts" name="Contracts">
          Contracts are made up of three major elements, referred to as clauses, which are:
          preconditions, postconditions and class invariants.
          <p/>
          Preconditions and postconditions are clauses that are evaluated at the beginning and at the end of specific routines,
          respectively. From a design standpoint a precondition represents the obligations on the context invoking the
          routine, while a postcondition represents the obligations of the routine itself.
          <p/>
          Pre/post-conditions are typically implemented as run-time assertions in various implementations of DBC.
          A run-time assertion is a boolean expression that is required to always evaluate to true. If an assertion
          is violated (i.e. evaluates to false) typically an exception is thrown.
          <p/>
          A class invariant is a property of each instance of a class that is required to evaluate to true before and after every
          external call to a public function. One way to think of a class invariant is as a clause that is and-ed with
          the pre- and post-condition of each public method.
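          <p/>
          As a rough illustration of how the three clauses fit together, here is a minimal C++ sketch. The class
          <tt>BoundedStack</tt> and the helper <tt>check</tt> are hypothetical names invented for this example; they are not part
          of Heron or its library. Pre- and post-conditions throw an exception when violated, while the invariant is checked with a
          plain <tt>assert</tt> on entry to and exit from each public function. That split is only one possible choice.
          <code><![CDATA[[
```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: throws when a pre- or postcondition evaluates to false.
inline void check(bool clause, const char* description) {
    if (!clause) throw std::logic_error(description);
}

// Illustrative class, invented for this sketch.
class BoundedStack {
public:
    explicit BoundedStack(std::size_t capacity) : capacity_(capacity) {}

    void push(int x) {
        assert(invariant());  // invariant on entry
        check(items_.size() < capacity_, "precondition: stack is not full");
        const std::size_t old_size = items_.size();
        items_.push_back(x);
        check(items_.size() == old_size + 1, "postcondition: size grew by one");
        assert(invariant());  // invariant on exit
    }

    int pop() {
        assert(invariant());
        check(!items_.empty(), "precondition: stack is not empty");
        const std::size_t old_size = items_.size();
        const int top = items_.back();
        items_.pop_back();
        check(items_.size() == old_size - 1, "postcondition: size shrank by one");
        assert(invariant());
        return top;
    }

    std::size_t size() const { return items_.size(); }

private:
    // Class invariant: the stack never holds more items than its capacity.
    bool invariant() const { return items_.size() <= capacity_; }

    std::size_t capacity_;
    std::vector<int> items_;
};
```
          ]]></code>
          Note how the invariant check has to be repeated by hand in every public function of this sketch.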
        </subtopic>
        <subtopic label="separation" name="Separation of implementation and contract">
          The definition of the contracts by which our classes and subroutines must abide is ideally
          kept separate from their implementations. A contract can easily be more complex than
          the implementation itself. Mixing contracts and implementations makes it hard to read and parse
          either of them. A contract is also conceptually closer to the interface
          of a class than its actual implementation.
          <p/>
          One of the reasons that a language like Eiffel is so successful is that it conceptually separates
          the contract from the implementation, making it easy for external tools (like the integrated development environment) to
          expose either the contract or the implementation or both simultaneously.
        </subtopic>
      </topic>
      <topic label="heron" name="DbC in Heron">
        The Heron standard library uses contracts extensively. To illustrate, I will use
        the Heron library implementation of the Array class. Here is the implementation of the Array
        class:
        <code><![CDATA[[
          type_def APPLY_CONTRACTS : meta_bool&lt;true&gt;

          ...

          class Array&lt;type T&gt; {
            inherits {
              meta_if
              &lt;
                APPLY_CONTRACTS,
                ArrayContract&lt;ArrayImpl, T&gt;,
                ArrayImpl&lt;T&gt;
              &gt;;
            }
            public {
              _init(Int i) {
                SetCount(i);
              }
            }
          }
        ]]></code>

        Surprisingly short, isn't it? What is happening here is that all of the implementation details are hidden in
        the class ArrayImpl. The statement meta_if returns the type ArrayContract&lt;ArrayImpl, T&gt; or
        the type ArrayImpl&lt;T&gt; depending on whether the meta-value APPLY_CONTRACTS is equal to true or false
        (actually meta_bool&lt;true&gt; or meta_bool&lt;false&gt;).
        <aside>
          The benefit of this system is that the contract is completely optionally compilable, which means
          that when turned off (i.e. APPLY_CONTRACTS set to false),
          the contract code is completely ignored, not even checked for errors.
        </aside>
        <p/>
        So in other words if contract compilation is on, the Array inherits from ArrayContract&lt;ArrayImpl, T&gt;.
        This then leads us to the implementation of ArrayContract.

        <code><![CDATA[[
          class ArrayContract&lt;template&lt;type&gt; Array_T, type T&gt; {
            inherits {
              Array_T&lt;T&gt;;
            }
            public {
              const GetAt(Int n) : T {
                pre_assert(n &gt;= 0);
                pre_assert(n &lt; Count());
                return inherited;
              }
              SetAt(Int n, T x) {
                pre_assert(n &gt;= 0);
                pre_assert(n &lt; Count());
                inherited;
              }
              SetCount(Int n) {
                pre_assert(n &gt;= 0);
                inherited;
              }
              Push(T x) {
                Int n = Count();
                pre_assert(n &gt;= 0);
                inherited;
                post_assert(Count() == n + 1);
              }
              Pop() : T {
                Int n = Count();
                pre_assert(n &gt; 0);
                result = inherited; // store the result so that the postcondition below is still evaluated
                post_assert(Count() == n - 1);
              }
            }
          }
        ]]></code>

        The only slightly clever technique used here is that ArrayContract inherits from its template parameter.
        We override several functions, adding precondition assertions, postcondition checks, and whatever local
        variables are needed to verify that the results are as expected. This technique
        maintains the desirable property of separating the implementation from the specification of the contract.
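        <p/>
        For readers more at home in C++, the same layering can be approximated as shown here. This is only a sketch under my own
        assumptions: the member names mirror the Heron code, <tt>std::conditional</tt> stands in for <tt>meta_if</tt>, the constant
        <tt>kApplyContracts</tt> is a hypothetical replacement for the contract switch, and the unsigned index type makes the
        non-negativity checks unnecessary.
        <code><![CDATA[[
```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for ArrayImpl: the implementation, with no checking.
template <typename T>
class ArrayImpl {
public:
    std::size_t Count() const { return items_.size(); }
    void SetCount(std::size_t n) { items_.resize(n); }
    const T& GetAt(std::size_t n) const { return items_[n]; }
    void SetAt(std::size_t n, const T& x) { items_[n] = x; }
private:
    std::vector<T> items_;
};

// Contract layer: inherits from its template parameter and wraps the
// inherited functions with precondition checks.
template <template <typename> class ArrayT, typename T>
class ArrayContract : public ArrayT<T> {
public:
    const T& GetAt(std::size_t n) const {
        if (n >= this->Count()) throw std::out_of_range("precondition: n < Count()");
        return ArrayT<T>::GetAt(n);
    }
    void SetAt(std::size_t n, const T& x) {
        if (n >= this->Count()) throw std::out_of_range("precondition: n < Count()");
        ArrayT<T>::SetAt(n, x);
    }
};

// Hypothetical compile-time switch for the contract layer.
constexpr bool kApplyContracts = true;

// The public class picks its base at compile time.
template <typename T>
class Array : public std::conditional<kApplyContracts,
                                      ArrayContract<ArrayImpl, T>,
                                      ArrayImpl<T>>::type {
public:
    explicit Array(std::size_t n) {
        this->SetCount(n);
        assert(this->Count() == n);
    }
};
```
        ]]></code>
        With the switch set to false, <tt>Array</tt> derives directly from the implementation class and the contract layer is never instantiated.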
        <p/>
        The only part of DbC not covered by this example is invariant checking. Clearly we could just check the invariant
        conditions before and after each function, but that would add reams of extra code for a simple concept.
        This is where aspect oriented programming comes in handy. So, brave reader, continue on to <a href="aop.html">aspect oriented
        programming</a> ...

      </topic>
    </section>
    <section label="aop" name="Aspect Oriented Programming">
      <quote source="Christopher Diggins, Aspect Oriented Programming in C++, August 2004, Dr Dobbs Journal">
        Aspect Oriented Programming (AOP) is a technique for separating and isolating
        crosscutting concerns into modular components called aspects. A crosscutting
        concern is a behaviour that "cuts" across the boundaries of assigned
        responsibility for a given modular element. Examples of crosscutting concerns
        are process synchronization, location control, execution timing constraints,
        persistence, and failure recovery. There is also a wide range of algorithms and
        design patterns which are more naturally expressible using AOP.
      </quote>
      <topic label="heron-aop" name="AOP in Heron">
        Heron's support for Aspect Oriented Programming (AOP) is very straightforward but deceptively powerful.
        For an object oriented language the ability to crosscut an object with a concern is extremely useful
        but is missing from all the major imperative programming languages.
        In Heron crosscutting is done in the delegation section of a class using a crosscut modifier
        to the delegation.
        <p/>
        Heron aspects override one or more of the following functions, which are called in relation
        to each function in the interface that is being crosscut.
        <p/>
        <ul>
          <li><tt>_before()</tt> - Called at the beginning of each function in the interface being crosscut</li>
          <li><tt>_after()</tt> - Called after the end of each function in the interface being crosscut</li>
        </ul>
        <p/>
        The following example is taken directly from the <tt>heron.core</tt> unit in the Heron standard
        library:
        <code><![CDATA[[
          class SizedArray_contract&lt;template&lt;type, meta_int&gt; SIZED_ARRAY_T, type T, meta_int SIZE&gt; {
            delegates {
              ISizedArray&lt;T, SIZE&gt; : m / this;
            }
            public {
              _before() {
                invariant_assert(Invariant());
              }
              _after() {
                invariant_assert(Invariant());
              }
              Invariant() : Bool {
                return m.Count() &lt;= meta_value&lt;SIZE&gt;;
              }
            }
            fields {
              SIZED_ARRAY_T&lt;T, SIZE&gt; m;
            }
          }
        ]]></code>
        The crosscut modifier takes the form <tt>/ <i>aspect</i></tt> and occurs in the delegation clause.
        This indicates that for every function in the interface being delegated, expression._before() is called upon entry
        into that function and expression._after() is called before exiting from the function.
        <p/>
        The most common crosscutting example is the invariant checking done by a contract, so
        most often we see crosscutting using "<tt>this</tt>" as the aspect, but the aspect can be any expression which returns an
        object which supports the interface IAspect. IAspect contains only two functions: _before() and _after().
        The aspect can be a function call, a field, or even a module level variable.
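        <p/>
        C++ has no crosscut modifier, but the before/after bracketing can be imitated by hand with a small guard object. The
        following is a sketch under that assumption only: <tt>Crosscut</tt> and <tt>BoundedCounter</tt> are hypothetical names, and
        unlike the Heron version the guard has to be written into every function that should be crosscut.
        <code><![CDATA[[
```cpp
#include <cassert>

// Guard standing in for the crosscut modifier: it calls before() on entry
// and after() on every exit path of the enclosing function.
template <typename Aspect>
class Crosscut {
public:
    explicit Crosscut(Aspect& aspect) : aspect_(aspect) { aspect_.before(); }
    ~Crosscut() { aspect_.after(); }
    Crosscut(const Crosscut&) = delete;
    Crosscut& operator=(const Crosscut&) = delete;
private:
    Aspect& aspect_;
};

// Illustrative class that uses itself as the aspect, similar to "m / this".
class BoundedCounter {
public:
    explicit BoundedCounter(int limit) : limit_(limit) {}

    void Increment() {
        Crosscut<BoundedCounter> guard(*this);
        if (value_ < limit_) ++value_;
    }
    int Value() {
        Crosscut<BoundedCounter> guard(*this);
        return value_;
    }

    // The two aspect functions; both check the class invariant and count
    // how often they ran.
    void before() { ++checks_; assert(Invariant()); }
    void after() { ++checks_; assert(Invariant()); }
    bool Invariant() const { return value_ >= 0 && value_ <= limit_; }
    int Checks() const { return checks_; }

private:
    int limit_;
    int value_ = 0;
    int checks_ = 0;
};
```
        ]]></code>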

        <aside>
          Heron sought to maximize AOP support while minimizing the language complexity.
          The result is the crosscut modifier to delegations. When used with metaprogramming techniques, it is sufficiently
          expressive to match and even surpass the power and flexibility of many existing AOP languages and pre-parsers.
          Later on we will extend the documentation to demonstrate advanced usage of the crosscut modifier to emulate
          structures found in other AOP models.
        </aside>
      </topic>
    </section>
    <section label="standard-library" name="Heron Standard Library">
      <topic label="over" name="Overview">
        The Heron standard library was designed with the following design goals in mind, listed
        in approximate order of priority.
        <p/>
        <ol>
          <li>Ease of use
            <rationale>
              Without ease of use as the first priority, a library is wasted on everyone except
              advanced users. Lack of usability is one of the few valid criticisms outstanding
              of the C++ STL (Standard Template Library).
              Ease of use encourages reusability.
            </rationale>
          </li>
          <li>Efficiency
            <rationale>
              Without efficiency as a significant design goal reusability
              is sacrificed. A standard library has a responsibility to be acceptably efficient for a
              sufficiently large number of application domains.
            </rationale>
          </li>
        </ol>
      </topic>
    </section>
    <section label="operator-overloading" name="Operator Overloading">
        The following overridable operators are transformed into function calls according to the following rules:
        <p/>
        <code><![CDATA[[
        x = y ::== x._assign(y)
        x + y ::== _plus(x, y)
        x - y ::== _minus(x, y)
        x * y ::== _star(x, y)
        x / y ::== _div(x, y)
        x % y ::== _mod(x, y)
        x += y ::== _pluseq(x, y)
        x -= y ::== _minuseq(x, y)
        x *= y ::== _stareq(x, y)
        x /= y ::== _diveq(x, y)
        x %= y ::== _modeq(x, y)
        x &gt; y ::== _gt(x, y)
        x &lt; y ::== _lt(x, y)
        x &gt;= y ::== _gteq(x, y)
        x &lt;= y ::== _lteq(x, y)
        x == y ::== _eq(x, y)
        x != y ::== _noteq(x, y)
        x |&gt; y ::== _pipe(x, y)
        x |&gt; y |&gt; z ::== _pipe(x, y, z)
        x++ ::== x._postinc()
        x-- ::== x._postdec()
        x[y] ::== _dereference(x._subscript(y))
        ]]></code>
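        <p/>
        C++ programmers can read these rules as the reverse of what they normally write: in C++ the operator is the function, whereas
        here the operator is shorthand for a named function. The sketch below forwards a few C++ operators to functions named after
        the table. The type <tt>Money</tt> and the namespace <tt>heron_sketch</tt> are hypothetical, and only a handful of the rules
        are covered.
        <code><![CDATA[[
```cpp
#include <cassert>

namespace heron_sketch {  // hypothetical namespace; keeps the underscore names out of the global scope

// Illustrative value type.
struct Money {
    long cents;
    // x++  ::==  x._postinc()
    Money _postinc() { Money old = *this; ++cents; return old; }
};

// x + y  ::==  _plus(x, y)
inline Money _plus(const Money& x, const Money& y) { return Money{x.cents + y.cents}; }
// x < y  ::==  _lt(x, y)
inline bool _lt(const Money& x, const Money& y) { return x.cents < y.cents; }
// x == y  ::==  _eq(x, y)
inline bool _eq(const Money& x, const Money& y) { return x.cents == y.cents; }

// The C++ operators simply forward to the named functions.
inline Money operator+(const Money& x, const Money& y) { return _plus(x, y); }
inline bool operator<(const Money& x, const Money& y) { return _lt(x, y); }
inline bool operator==(const Money& x, const Money& y) { return _eq(x, y); }
inline Money operator++(Money& x, int) { return x._postinc(); }

// Usage: the operator form and the function form give the same answer.
inline void CheckRewriteRules() {
    Money a{150};
    Money b{250};
    assert(_eq(a + b, _plus(a, b)));
    assert((a < b) == _lt(a, b));
}

}  // namespace heron_sketch
```
        ]]></code>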
    </section>
    <section label="tools" name="Heron Tools">
      <topic label="hrn2cpp" name="hrn2cpp - The Heron to C++ translator">
        Hrn2cpp is a translator written in Delphi that takes a list of Heron source files, merges them
        and outputs a single C++ source file. Hrn2cpp verifies that the input program is a well-formed Heron program,
        and does some type identifier resolution. Other symbol resolution and type checking is left up to the C++ compiler to perform.
        <p/>
        <note>
          Hrn2cpp was written for and tested exclusively on Windows, but porting it to Kylix (Delphi for Linux) should be quite smooth
          as there is very little Windows-specific code.
        </note>
        <p/>
        The Heron translator is currently a prototype that is being used to test and make changes
        to the Heron language on the fly. The current version of hrn2cpp is prototype 3.0.0. This means that even though
        it is relatively robust, there are always changes occurring to the code.
        <p/>
        <subtopic label="hrn2cpp-usage" name="Using hrn2cpp">
          Using hrn2cpp requires three command line arguments:
          <ol>
            <li>the name of the target program, that is the program that will serve as the main entry point for the executable. </li>
            <li>a semicolon (;) delimited list of files which are to be compiled. Wildcards (* and ?) are also allowed</li>
            <li>the path and file name of the output file (which will be a C++ file)</li>
          </ol>
        </subtopic>
        <subtopic label="hrn2cpp-output" name="The Output of hrn2cpp">
          Hrn2cpp aims to produce standards compliant C++. Unfortunately very few C++ compilers can handle the
          sophisticated C++ code that is output by hrn2cpp. At this point I can only try to make hrn2cpp work with Visual C++ 7.1.
          My Borland and Digital Mars C++ compilers crash and burn with any non-trivial Heron program (for instance
          the <a href="doc2html.html">doc2html</a> Heron program used to generate this site).
        </subtopic>
        <subtopic label="extending-hrn2cpp" name="Extending hrn2cpp">
          The Heron translator requires the file "heron.hpp" which contains the C++ definitions of the primitives used
          and assumed by Heron. This file is undocumented but for the ambitious programmer, the behaviour
          of Heron primitive types can be changed by modifying this file. This is only recommended for expert level
          C++ programmers.
        </subtopic>
        <subtopic label="upcoming-features" name="Upcoming features">
          Planned feature list :
          <ul>
            <li>Type checking</li>
            <li>Full symbol resolution with scope checking</li>
            <li>Pre-optimization of C++ code</li>
          </ul>
        </subtopic>
      </topic>
    </section>
    <section label="metaprogramming" name="Metaprogramming in Heron">
      <topic label="about-meta" name="What is Metaprogramming?">
        Metaprogramming is the use of compile time algorithms and data structures to generate code, evaluate expressions,
        and construct data types at compile time. Possibly the most important and well known example of metaprogramming
        is the use of the preprocessor directives <tt>#define</tt>, <tt>#ifdef</tt>, <tt>#elif</tt>, <tt>#else</tt>, <tt>#endif</tt> and <tt>#if</tt>
        in the C and C++ languages.
        <p/>
        The other common example of metaprogramming in many other compiled languages is the ability of the compiler to evaluate constant
        arithmetic expressions at compile time. Previously this was viewed by the majority of programmers as little more than a
        somewhat trivial optimization feature. With the increasing popularity and exploration of generic programming techniques
        the ability to evaluate and compare simple types at compile time creates the possibility of creating code that uses more
        complex decision making algorithms for the construction of types.
        <rationale>
          Metaprogramming in general helps programmers to write more efficient, general and portable code. It
          is also a much more expressive, maintainable, and robust method of producing multiple executable releases
          from a single code base. The need for metaprogramming facilities becomes apparent when writing generic libraries.
        </rationale>
        <aside>
          Heron metaprogramming facilities are heavily inspired by the work of Aleksey Gurtovoy, David Abrahams and others
          on the Boost C++ Metaprogramming Library (MPL). For more information on metaprogramming in C++ see the MPL
          documentation online at <a href="http://www.boost.org/libs/mpl/doc/">http://www.boost.org/libs/mpl/doc/</a>.
          Some more recommended reading on an introduction to C++ metaprogramming techniques can be found at
          <a href="http://osl.iu.edu/~tveldhui/papers/Template-Metaprograms/meta-art.html">http://osl.iu.edu/~tveldhui/papers/Template-Metaprograms/meta-art.html</a>
        </aside>
      </topic>
      <topic label="heron-cpp-mp" name="Heron Metaprogramming Compared to C++">
        This topic is aimed at programmers who already have had exposure to metaprogramming techniques in C++.
        <p/>
        Heron metaprogramming is very closely related to C++ metaprogramming but Heron provides several
        metaprogramming enhancements over C++, and incorporates them more deliberately in the language.
        <p/>
        One of the big differences between Heron and C++ metaprogramming is that Heron allows
        parameterized typedefs. In C++, meta-functions have to be declared in a roundabout way by
        creating a superfluous struct. This will be appreciated most by those who have experience
        writing C++ meta-functions.
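        <p/>
        To make the "superfluous struct" concrete, here is a small C++ sketch with a hypothetical meta-function
        <tt>AddPointer</tt>. The first form is the classic roundabout idiom. The second uses an alias template, a feature that the
        later C++11 standard added and that comes close in spirit to a parameterized typedef.
        <code><![CDATA[[
```cpp
#include <type_traits>

// Classic C++ meta-function: the result has to live inside a helper struct
// and is read back through the nested name "type".
template <typename T>
struct AddPointer {
    typedef T* type;
};

// C++11 alias template: the parameterized name is the result itself.
template <typename T>
using AddPointerT = T*;

// Both spellings name the same type.
static_assert(std::is_same<AddPointer<char>::type, AddPointerT<char> >::value,
              "struct form and alias form agree");
```
        ]]></code>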
        <p/>
        Another difference is that in Heron constants and constant
        literals are separate from meta-values. This makes it much easier to disambiguate compile-time evaluated
        expressions from run-time evaluated expressions.
        <p/>
        <rationale>
          It may on the surface seem unnecessarily restrictive for Heron to require simple arithmetic
          expressions (such as <tt>2 * PI</tt>) to be expressed in the more verbose form
          <tt><![CDATA[[meta_int_mult&lt;meta_int&lt;2&gt;, PI&gt;]]></tt> in order to be evaluated
          at compile-time. These stricter rules are considered important to Heron in several regards:
          <ul>
            <li>they emphasize the difference between the value 2 as an Int and meta_int&lt;2&gt;.
              The difference is quite simple really: the constant
              expression <tt>2</tt> is guaranteed to trigger a call to the Int constructor,
              the Int destructor, and also provides access to const functions.
            </li>
            <li>they make it much easier to identify, read, understand, and verify compile-time code
              by both programmers and automated tools (such as compilers, debuggers and editors).
            </li>
            <li>they maintain the distinction (which is confusing to many) between compile time evaluated code
              and run-time code. For instance an expression like <tt>x + y</tt> can safely be assumed to always be evaluated
              only at run time, while <tt>meta_int_add&lt;x, y&gt;</tt> can safely be assumed to always be evaluated at compile
              time.
            </li>
          </ul>
        </rationale>
        <p/>
        <note>
          The subtle difference in C++ between run-time constant expressions and compile-time constant expressions
          is a source of consternation to many programmers and compiler vendors. An example of this
          is the surprise that some C++ programmers get when they try to create a template that accepts a
          floating point value or an array of chars as a template value parameter.
        </note>
        <aside>
          A simple rule of thumb in Heron for identifying metacode (compile time evaluated expressions)
          is that whenever you see the prefix <tt>meta_</tt> or <tt>_meta_</tt> you are
          more than likely looking at compile-time evaluated code (also known as metaprogramming code).
        </aside>
      </topic>
      <topic label="meta-values" name="Metaprogramming Values">
        <subtopic label="meta-vars" name="Metaprogramming Variables">
          A meta-variable is a value that is known at compile time and is declared using the <tt>type_def</tt>
          meta-operator. Here is an example of the declaration of a couple of metaprogramming variables:
          <code><![CDATA[[
            type_def WORD_SIZE : meta_int&lt;32&gt;
            type_def DWORD_SIZE : meta_int_mult&lt;WORD_SIZE, meta_int&lt;2&gt;&gt;
            type_def DEBUG : meta_bool&lt;true&gt;
          ]]></code>
        </subtopic>
        <subtopic label="meta-constant" name="Metaprogramming Constants">
          A meta-constant is an unlabelled value that is known at compile time and declared using one of
          the meta-primitive declarations.
          <p/>
          The one important predefined meta-constant is <tt>meta_null</tt>.
        </subtopic>
        <subtopic label="meta-primitives" name="Metaprogramming Primitives">
          The meta-type primitives are the set of predefined compile time meta-types. They correspond very closely
          to the regular primitive data types. To declare a new meta-constant, it must be one of the following types:
          <ul>
            <li><tt>meta_int&lt;<i>integer_literal</i>&gt;</tt></li>
            <li><tt>meta_bool&lt;<i>boolean_literal</i>&gt;</tt></li>
            <li><tt>meta_char&lt;<i>char_literal</i>&gt;</tt></li>
            <!--
            <li><tt>meta_float&lt;<i>float_literal</i>&gt;</tt></li>
            <li><tt>meta_string&lt;<i>string_literal</i>&gt;</tt></li>
            -->
          </ul>
        </subtopic>
        <subtopic label="meta-value" name="The meta_value Operator">
          There is a special operation, called <tt>meta_value</tt>, for converting a compile-time meta-primitive into a value
          that is usable in an ordinary run-time expression.
          It is invoked like any meta-function, as follows:
          <code><![CDATA[[
            type_def BIG_INT : meta_int&lt;1000000&gt;;
            ...
            IsBigNumber(Int x) : Bool {
              result = x &gt; meta_value&lt;BIG_INT&gt;;
            }
          ]]></code>
          <aside>
            My way of understanding the meta_value operator is by thinking of it as a bridge from compile time
            expressions to run-time expressions.
          </aside>
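          <p/>
          A loosely comparable bridge exists in C++, where a compile-time constant encoded as a type exposes its value through a
          static member. The sketch below assumes that <tt>std::integral_constant</tt> is an acceptable stand-in for
          <tt>meta_int</tt> and that reading <tt>::value</tt> plays the role of <tt>meta_value</tt>; the names <tt>BigInt</tt> and
          <tt>ExampleUsage</tt> are mine.
          <code><![CDATA[[
```cpp
#include <cassert>
#include <type_traits>

// A compile-time value encoded as a type, comparable to meta_int<1000000>.
typedef std::integral_constant<int, 1000000> BigInt;

// Reading ::value brings the compile-time constant into a run-time expression.
inline bool IsBigNumber(int x) {
    return x > BigInt::value;
}

// Usage: the argument is an ordinary run-time value.
inline void ExampleUsage() {
    assert(IsBigNumber(2000000));
    assert(!IsBigNumber(1000000));
}
```
          ]]></code>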
        </subtopic>
      </topic>
      <topic label="meta-functions" name="Metaprogramming Functions">
        Meta-functions are defined using type_def just like meta-variables.
        Heron currently supports the following built-in meta-functions :
        <code><![CDATA[[
          meta_if&lt;meta_bool COND, type T1, type T2&gt; : type;
          meta_eq&lt;type T1, type T2&gt; : meta_bool;
          meta_is_null&lt;type T&gt; : meta_bool;
          meta_and&lt;meta_bool T1, meta_bool T2&gt; : meta_bool;
          meta_or&lt;meta_bool T1, meta_bool T2&gt; : meta_bool;
          meta_not&lt;meta_bool T&gt; : meta_bool;
          meta_xor&lt;meta_bool T1, meta_bool T2&gt; : meta_bool;
          meta_int_gt&lt;meta_int T1, meta_int T2&gt; : meta_bool;
          meta_int_lt&lt;meta_int T1, meta_int T2&gt; : meta_bool;
          meta_int_eq&lt;meta_int T1, meta_int T2&gt; : meta_bool;
          meta_int_gteq&lt;meta_int T1, meta_int T2&gt; : meta_bool;
          meta_int_lteq&lt;meta_int T1, meta_int T2&gt; : meta_bool;
          meta_int_noteq&lt;meta_int T1, meta_int T2&gt; : meta_bool;
          meta_int_plus&lt;meta_int T1, meta_int T2&gt; : meta_int;
          meta_int_negative&lt;meta_int T&gt; : meta_int;
          meta_int_minus&lt;meta_int T1, meta_int T2&gt; : meta_int;
          meta_inc&lt;meta_int T&gt; : meta_int;
          meta_dec&lt;meta_int T&gt; : meta_int;
          meta_list_head&lt;meta_list T&gt; : type;
          meta_list_tail&lt;meta_list T&gt; : meta_list;
        ]]></code>
        <subtopic label="meta-if" name="The meta_if Metaprogramming Function">
          Perhaps the single most important meta-function is <tt>meta_if</tt> because it is used
          in the majority of metaprogramming functions and compile time decision making code.
          The way <tt>meta_if</tt> works is very simple: if the compile-time boolean value that is passed to it (<tt>COND</tt>)
          has a true value (specifically <tt>meta_bool&lt;true&gt;</tt>),
          then the entire <tt>meta_if</tt> expression is taken to be equivalent to T1, otherwise the expression is equivalent
          to T2.
          <p/>
          A useful property of <tt>meta_if</tt> is that only one of T1 and T2 is expanded, never both.
          This not only improves the efficiency of the compiler's metaprogramming expansions, it also shields us from
          undesirable compile time errors in situations where the unselected meta-expression is known to be erroneous.
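          <p/>
          The same shielding effect can be observed in C++. In the sketch below, <tt>MetaIf</tt> is a hand-written stand-in for
          <tt>meta_if</tt> that takes a plain <tt>bool</tt> rather than a boolean meta-type, and <tt>OnlyForIntegers</tt> and
          <tt>Fallback</tt> are hypothetical types. Merely naming a template specialization as an argument does not instantiate
          it, so the branch that would fail to compile is never expanded.
          <code><![CDATA[[
```cpp
#include <type_traits>

// Compile-time selection: the primary template handles "true", the partial
// specialization handles "false".
template <bool Cond, typename T1, typename T2>
struct MetaIf {
    typedef T1 type;
};
template <typename T1, typename T2>
struct MetaIf<false, T1, T2> {
    typedef T2 type;
};

// Illustrative template that fails to compile when instantiated with a
// non-integer type.
template <typename T>
struct OnlyForIntegers {
    static_assert(std::is_integral<T>::value, "T must be an integer type");
    typedef T type;
};

struct Fallback {};

// OnlyForIntegers<double> is only named here, never instantiated, so its
// static_assert does not fire.
typedef MetaIf<false, OnlyForIntegers<double>, Fallback>::type Chosen;
```
          ]]></code>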
        </subtopic>
      </topic>
      <topic label="meta-list" name="Meta-Lists : Compile-Time Abstract Data Types">
        There is one basic metaprogramming abstract data type, the meta-list. A meta-list is a collection of
        zero or more elements. A meta-list element can be another meta-list, a meta-value, a meta-function
        or a type.
        <aside>
          The observant reader who is new to metaprogramming may have noticed that meta-lists, meta-values, meta-variables
          and meta-functions are really in the end just plain old parameterized types. This is the magic of metaprogramming,
          it is all done with types.
        </aside>
        There are three meta-list operators :
        <code><![CDATA[[
          meta_list&lt;type T1, type T2, ... , type TN&gt;; // constructs a meta_list from an arbitrary number of arguments.
          meta_list_head&lt;meta_list T&gt; : type; // returns the first element of a list
          meta_list_tail&lt;meta_list T&gt; : meta_list; // returns a list consisting of all elements of the list except the first
        ]]></code>
      </topic>
      <topic label="meta-example" name="Metaprogramming Example">
        To summarize how metaprogramming in Heron works we give the example of how to write meta-functions and use meta-types.
        Since there are no meta-programming looping constructs, metaprogramming relies heavily on recursion to achieve
        the desired algorithms. Designing recursive algorithms may take a small period of adjustment for some,
        but recursion is a very powerful and useful technique to become familiar with.
        <p/>
        By way of example, here are two functions that people who use meta-lists often may find useful. One outputs whether a
        list is empty or not (note that a list that contains an empty list is not considered empty). The other function
        outputs the number of elements in a meta_list.
        <code><![CDATA[[
          type_def meta_list_is_empty&lt;meta_list T&gt;
          :
            meta_eq&lt;meta_list_head&lt;T&gt;, meta_null&gt;;

          type_def meta_list_count&lt;meta_list T&gt;
          :
            meta_if
            &lt;
              meta_list_is_empty&lt;T&gt;,
              meta_int&lt;0&gt;,
              meta_int_add&lt;meta_int&lt;1&gt;, meta_list_count&lt;meta_list_tail&lt;T&gt;&gt;&gt;
            &gt;;
        ]]></code>
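        <p/>
        For comparison, here is how the same two meta-functions might look in C++ with variadic templates. This is a sketch, not a
        translation produced by hrn2cpp: all names are invented to echo the Heron ones, and the base case of the recursion is
        supplied by an explicit specialization rather than by a conditional. As in the Heron version, a list is treated as empty
        when its head is the null type.
        <code><![CDATA[[
```cpp
#include <cstddef>
#include <type_traits>

// Variadic type list, comparable to meta_list<T1, ..., TN>.
template <typename... Ts>
struct MetaList {};

// Stand-in for meta_null.
struct MetaNull {};

// Head: first element, or MetaNull for the empty list.
template <typename List> struct MetaListHead;
template <> struct MetaListHead<MetaList<> > { typedef MetaNull type; };
template <typename T, typename... Rest>
struct MetaListHead<MetaList<T, Rest...> > { typedef T type; };

// Tail: every element except the first.
template <typename List> struct MetaListTail;
template <> struct MetaListTail<MetaList<> > { typedef MetaList<> type; };
template <typename T, typename... Rest>
struct MetaListTail<MetaList<T, Rest...> > { typedef MetaList<Rest...> type; };

// A list is empty when its head is MetaNull.
template <typename List>
struct MetaListIsEmpty
    : std::is_same<typename MetaListHead<List>::type, MetaNull> {};

// Count: recursion replaces the loop; one plus the count of the tail.
template <typename List>
struct MetaListCount
    : std::integral_constant<
          std::size_t,
          1 + MetaListCount<typename MetaListTail<List>::type>::value> {};

// Base case of the recursion: the empty list has zero elements.
template <>
struct MetaListCount<MetaList<> > : std::integral_constant<std::size_t, 0> {};
```
        ]]></code>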
      </topic>
    </section>
    <section label="last" name="Last Words">
      <topic label="plans" name="Plans for Upcoming Versions of Heron">
        Here are a few (very sketchy) working ideas that I am considering for upcoming versions of Heron:
        <ul>
          <li>aspect oriented programming - There are now a couple of competing ideas I have for
            how to introduce aspect oriented programming into Heron</li>
          <li>foreach - built in loop structure for generic collections</li>
          <li>parameterized modules - can pass parameters to imported modules</li>
          <li>try / except / finally / success</li>
          <li>safe and unsafe exceptions - Exceptions can be considered as either safe or unsafe
            (i.e. corrupting type safety or memory). The approach that I am exploring
            would transform user thrown exceptions and certain internal exceptions into
            halting_exceptions when there is potential to corrupt the type system or cause
            memory leaks, etc. One possible example could be that an exception that propagates out of a constructor
            would be transformed into a halting_exception. This needs more exploration, but I am reasonably confident
            that the distinction between safe and unsafe exceptions (with unsafe exceptions causing a program
            halt, and reclamation of memory) could lead to full type safety and better memory safety. This is based
            on the recent discovery of flaws in Java type safety when exceptions are thrown.
          </li>
          <li>Differentiate between typedefs and typealias. I want to make a typedef support only implicit downcasts,
            while a typealias would support implicit upcasts and downcasts.
            The logic is that in C++ a typedef is sometimes intended as a separate
            type, while at other times it is intended only as a convenient shorthand.
            These two uses would be better represented through separate semantics than through documentation.</li>
          <li>meta-value simplification - the current requirement that meta-values always be declared with their
            type is perhaps too strict. What could be done is conversion of literal constants to meta-values
            when it is relatively obvious such a thing is desired. This would be a big boon for those working
            on high performance matrix libraries.
            <p/>
            The other possibility in response to the needs of people doing advanced matrix meta-programming
            is to simply introduce into the language specification, meta_matrix and meta_vector code.
            I am a bit on the fence on this issue. I don't want to give up the strict type declaration rules
            on meta-constants because of the complications that arise when trying to distinguish between
            ints and floats and the complete lack of a proper meta-type-checking system at compile-time.
            The other side of the coin is that, once I start twisting rules for one type of application, other
            applications desire it as well.
            <p/>
            Finally the other option which is also attractive is to simply introduce typed meta-arrays.
            <code><![CDATA[[
              meta_array&lt;meta_array&lt;meta_float&gt;&gt;
              &lt;
                &lt;1, 1, 2&gt;,
                &lt;1, 2, 1&gt;,
                &lt;2, 1, 1&gt;
              &gt;
            ]]></code>
            The explanation is that a meta-array has meta-type-checking. Confusing, but it does balance
            the desire for strict meta-type declaration, with ease of typing. This kind of logic makes
            me wonder about the possibility of declaration of meta-abstract data types. For instance a
            <code><![CDATA[[
              meta_array&lt;meta_pair&lt;meta_string, meta_int&gt;&gt;
              &lt;
                &lt;"one", 1&gt;,
                &lt;"two", 2&gt;
              &gt;
            ]]></code>
            This kind of code would facilitate the creation of data structures such as compile-time associative
            arrays (i.e. maps). This is a compile-time data structure that I personally would have found desirable
            on several occasions.
          </li>
        </ul>
      </topic>
    </section>
  </herondoc>
  <page name="Extensions" label="extensions">
    <h1>Interface Extensions, Highly Reusable Function Definitions</h1>

    <h4>Abstract</h4>

    The Heron programming language introduces the notion of an interface extension, which is a set of functions defined
    within an interface, that can only call other functions belonging to that particular interface. We explore the
    advantages of extensions from a design standpoint and demonstrate how this improves the expressive capacity
    of an interface.

    <h4>Introduction</h4>

    For the purpose of this paper we will use a simplification of the definition of interface that corresponds to
    the Java and Heron language specifications. An interface is a type that cannot be instantiated but can be
    used as a reference to any object that implements said interface. For instance in Heron :

    <code><![CDATA[[
      IComparable&lt;Int&gt;&amp; i; // declare an interface reference variable named i
      Int x(3), y(4); // initialize two ints (x, y) with the values of (3, 4) respectively
      i = x; // set the variable i to refer to the variable x
      Bool b(i.Compare(y)); // b is initialized with the value of true
    ]]></code>

    In the above example the IComparable interface has a contract consisting of a single function, Compare(T x), where T is a parameterized type.
    <p/>
    An extension is a set of functions that belongs to the interface and whose definitions may refer only to each other and to other functions
    of the interface. In order to differentiate between extension and non-extension functions, the non-extension
    set of functions of an interface is referred to as the contract in Heron.
    <p/>

    <h4>When to use Extensions</h4>

    Whenever a function of a class does not require access to any implementation-specific functionality, such as private functions
    or member fields, and calls only functions belonging to a single interface, it is a good candidate for the extension
    of that interface. Sometimes part of an interface can be moved from the contract to the extension, thus reducing complexity.
    <h4>Example</h4>

    An excellent example of an interface contract and extension division can be demonstrated through a straightforward
    Binary Search Tree implementation as described in "Introduction to Algorithms" [Cormen, Leiserson, Rivest; McGraw Hill] and implemented
    in Heron.

    <code><![CDATA[[
      interface ITreeNode&lt;VALUE_TYPE&gt; {
        requires {
          VALUE_TYPE supports ICompare&lt;VALUE_TYPE&gt;;
        }
        public {
          GetValue() : VALUE_TYPE;
          GetLeftChild() : ITreeNode&lt;VALUE_TYPE&gt;&amp;;
          GetRightChild() : ITreeNode&lt;VALUE_TYPE&gt;&amp;;
          GetParent() : ITreeNode&lt;VALUE_TYPE&gt;&amp;;
        }
        extension {
          GetMinimum() : ITreeNode&lt;VALUE_TYPE&gt;&amp; {
            result = this;
            while (result.GetLeftChild() !~ null) {
              result = result.GetLeftChild();
            }
          }
          GetMaximum() : ITreeNode&lt;VALUE_TYPE&gt;&amp; {
            result = this;
            while (result.GetRightChild() !~ null) {
              result = result.GetRightChild();
            }
          }
          GetSuccessor() : ITreeNode&lt;VALUE_TYPE&gt;&amp; {
            if (GetRightChild() !~ null) {
              result = GetRightChild().GetMinimum();
            } else {
              result = this;
              ITreeNode&lt;VALUE_TYPE&gt;&amp; tmp = GetParent();
              while ((tmp !~ null) &amp;&amp; (result ~~ tmp.GetRightChild())) {
                result = tmp;
                tmp = tmp.GetParent();
              }
              result = tmp; // the successor is the first ancestor reached from its left child
            }
          }
          GetPredecessor() : ITreeNode&lt;VALUE_TYPE&gt;&amp; {
            if (GetLeftChild() !~ null) {
              result = GetLeftChild().GetMaximum();
            } else {
              result = this;
              ITreeNode&lt;VALUE_TYPE&gt;&amp; tmp = GetParent();
              while ((tmp !~ null) &amp;&amp; (result ~~ tmp.GetLeftChild())) {
                result = tmp;
                tmp = tmp.GetParent();
              }
              result = tmp; // the predecessor is the first ancestor reached from its right child
            }
          }
        }
      }
    ]]></code>

    The point of this example is that GetMinimum(), GetMaximum(), GetSuccessor() and GetPredecessor() can all be defined in terms
    of other parts of the ITreeNode interface. Therefore any tree implementation that implements the ITreeNode interface gets those
    functions for free.
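    <p/>
    For readers more at home in C++, the same idea can be approximated with function templates. What follows is only a minimal
    sketch under stated assumptions: IntNode is a hypothetical node type invented for illustration, raw pointers stand in for
    Heron references, and only GetMinimum and GetSuccessor are shown.

    <code><![CDATA[[
```cpp
#include <cassert>

// Hypothetical node type, invented for this sketch. It plays the role of
// the contract: three accessors and nothing else are required.
struct IntNode {
  int value;
  IntNode* left;
  IntNode* right;
  IntNode* parent;
  IntNode* GetLeftChild() const { return left; }
  IntNode* GetRightChild() const { return right; }
  IntNode* GetParent() const { return parent; }
};

// "Extension" functions: written once, in terms of the contract only.
template <typename Node>
Node* GetMinimum(Node* n) {
  assert(n != nullptr);
  while (n->GetLeftChild() != nullptr) {
    n = n->GetLeftChild();
  }
  return n;
}

template <typename Node>
Node* GetSuccessor(Node* n) {
  assert(n != nullptr);
  if (n->GetRightChild() != nullptr) {
    return GetMinimum(n->GetRightChild());
  }
  // Climb until we arrive at a parent from its left child.
  Node* up = n->GetParent();
  while (up != nullptr && n == up->GetRightChild()) {
    n = up;
    up = up->GetParent();
  }
  return up;  // null when n held the largest value
}
```
    ]]></code>

    Any other node type that offers the same three accessors could be passed to these templates unchanged, which is the sense
    in which the functions come for free.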

    <h4>Equivalency to Abstract Base Classes</h4>

    If we thought of an interface as being equivalent to an abstract base class, that is, a class with no
    member fields and only abstract or pure virtual functions, then an extension would be any functions that we
    added to that class that are non-virtual and have a definition.
    <p/>
    The problem with the abstract base class model is that conceptually all functions are grouped together equivalently.
    Traditional OOP allows programmers to define virtual functions with a default definition. This confuses the role of
    an ABC function : is it supposed to be part of the contract, or is it an extension? In this scenario an undisciplined
    mess of overriding can occur. Extensions force programmers to be careful about the choices they make and to
    be explicit about the roles they assign to different functions.
    <p/>
    Extension functions do differ from virtual functions, though, in that they do not propagate up through an inheritance tree;
    rather, they trickle down from the top. What this means is that an extension that calls a contract function calls the
    same contract function as the user of an object would. One way to understand this subtle difference from virtual function
    calls is to think of an extension function such as :

    <code><![CDATA[[
      SomeInterface.SomeExtensionFunction() {
        SomeContractFunction();
      }
    ]]></code>

    as being semantically equivalent to

    <code><![CDATA[[
      SomeExtensionFunction(SomeInterface&amp; param) {
        param.SomeContractFunction();
      }
    ]]></code>

    <h4>Summary</h4>

    An extension is reusable whenever we implement an interface because it allows us to write new methods that operate on any object that implements a particular interface in
    a clear and well specified manner. Extensions respect the software design notion of separation of concerns. The contract part of the interface is
    separated clearly from the extensions. Extensions have an advantage of being well identified within software for what their role is, without
    the need to resort to documentation.

    <h4>Future directions</h4>
    Interfaces with extensions and a mechanism for delegating implementation to member fields can entirely replace class inheritance
    as a means for providing polymorphic objects. Such a system would have the advantage that it supports, yet clearly distinguishes between,
    inheritance of a role (interface) and inheritance of implementation.
  </page>
  <page name="Heron FAQ (Frequently Asked Questions)" label="faq">
    <h1>Heron F.A.Q</h1>
    <ol>
      <li><b>Why would I use Heron?</b><br/>
        Instead of directly answering this question, because I do not know your specific needs and preferences, I will
        answer this question for myself.
        <p/>
        I use Heron because I wanted to use a freely available language that is sufficiently expressive,
        efficient, and easy to use, and that can shield me from significant errors.
      </li>
      <p/>
      <li><b>Why no garbage collector in Heron?</b><br/>
        Garbage collectors are evil. The Heron strong / weak reference model is much better. It is more efficient and more
        explicit. The programmer controls when memory is released back to the system.
      </li>
    </ol>
  </page>
  <page name="Why ABC's make bad Interfaces" label="abc-iop">
    <h1>Why ABC's make bad Interfaces</h1>
    <p/>
    Many programmers make the mistake of assuming that an interface is equivalent to an Abstract Base Class (ABC).
    This is an easy mistake to make because it is common practice to implement an interface using ABC's.
    <p/>
    The difference is that an interface is not a set of virtual functions like an ABC. If I declare a class
    as implementing an interface, it is not a tacit agreement to make those functions virtual.
    <p/>
    When using an ABC to simulate an interface you are wrongly making all of the interface functions virtual.
    This leads to the problem of extra virtual table pointers within your objects, and unnecessary
    dynamic dispatching of the functions.
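    <p/>
    The size penalty is easy to observe. The sketch below is my own illustration rather than code from the download: PlainCircle
    and AbcCircle are hypothetical names, and the exact number of extra bytes is implementation-defined, although typical
    compilers add one hidden pointer per ABC base.

    <code><![CDATA[[
```cpp
#include <cassert>
#include <cstddef>

struct Point { int x; int y; };

// No virtual functions, so no hidden vtable pointer is stored per object.
struct PlainCircle {
  void Draw() const {}
  int GetSize() const { return size; }
  Point center;
  int size;
};

// Two ABCs standing in for interfaces.
struct AbcDrawable {
  virtual void Draw() const = 0;
  virtual ~AbcDrawable() {}
};
struct AbcSizeable {
  virtual int GetSize() const = 0;
  virtual ~AbcSizeable() {}
};

// Same data as PlainCircle, but every interface function is now virtual.
struct AbcCircle : AbcDrawable, AbcSizeable {
  void Draw() const {}
  int GetSize() const { return size; }
  Point center;
  int size;
};

// Bytes per object spent on dispatch rather than on data. The exact figure
// is implementation-defined.
inline std::size_t VirtualOverhead() {
  assert(sizeof(AbcCircle) >= sizeof(PlainCircle));
  return sizeof(AbcCircle) - sizeof(PlainCircle);
}
```
    ]]></code>

    Calling VirtualOverhead() reports how many bytes each AbcCircle carries purely so that it can be used through its two ABCs.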
    <p/>
    For an example of interfaces versus ABC's in C++ visit
    <a href="http://www.heron-language.com/heronfront.html">http://www.heron-language.com/heronfront.html</a>
    <p/>
  </page>
  <page name="Heron Specificiation" label="spec">
    <h2>The Heron Specification has Moved</h2>
    <h4>The Heron Specification has become the Heron Documentation and the Table of Contents is now
    available at <a href="toc.html">toc.html</a></h4>
  </page>
  <page name="Why I don't use C#" label="C-sharp-critique" >
    <h1>Why I don't use C#</h1>

    <h3>Preamble</h3>

    I am very hesitant to post a critique of any programming language, due to the obvious conflict of interest
    (I am the designer of the <a href="http://www.heron-language.com">Heron programming language</a>) but the
    sheer force of the marketing hype behind the C# language along with lack of a solid critique has
    motivated me to write this article. For an illustration of what I mean, consider what is purported to
    be an "academic critique" by Microsoft at
    <a href="http://www.msdnaa.net/content/?ID=1588">http://www.msdnaa.net/content/?ID=1588</a>. One of their
    so-called critiques is written by the lead language designer Anders Hejlsberg while the other is by an
    author of the book "C# Essentials" published by O'Reilly.

    <h3>Introduction</h3>

    C# (C-Sharp) is a new language which was designed with a set of goals
    which emphasized flexibility and early adoption of the technology over making decisions
    which would have better facilitated large-scale development of low-defect software.
    <p/>
    That having been said, there are certain language features which overall could be deemed detrimental to
    effective development of non-trivial software. These are the issues that I wish to address.
    <p/>
    I cannot overemphasize that even though I have attempted to impart objectivity to this article,
    I have an inescapable bias.

    <h3>Unmanaged / Unsafe Contexts</h3>

    The marketing department at Microsoft would like us to believe that C# has the advantages of C++ like
    efficiency (when compared to Java) because we can use raw memory access inside of unmanaged and unsafe
    contexts. At the same time we are expected to believe that because the contexts are optional we have
    the safety associated with Java-like memory protection and a garbage collector. Memory safety only
    occurs when an entire program is safe, which implies that we do not have the efficiency of C++.
    Therefore you either have safety or you have efficiency, you never have both.

    <h3>Attributes</h3>

    Attributes are one of the biggest differences between C# and other similar languages.
    An attribute is a class that is used declaratively to express extra information about program elements.
    On the positive side using attributes is a powerful and flexible programming technique.
    On the downside, this kind of flexibility can often be achieved through other programming techniques
    available in C# such as object oriented programming or generic programming.
    <p/>
    When a programming language
    supports too many programming techniques there is a strong tendency for programmers to use one technique they
    understand well to poorly approximate another technique. One common example is for programmers to
    use run time type information to achieve a kind of object polymorphism when it would sometimes
    be more appropriate to have declared a common base class.

    <h3>Garbage Collection</h3>

    Without going into any of the disadvantages of a garbage collection system in general, which is a very contentious
    topic in its own right, C# has a couple of language-specific problems with regard to its usage of a GC
    (garbage collector) :

    <blockquote>
      Problem #1 : Non-deterministic destructors.
      GC with destructors means the destructors can be called at unpredictable moments during
      execution. This requires extra care on the part of programmers. Many
      programmers falsely make the assumption that the destructor is called once the last reference is removed.
      This is not something that can be counted on with a GC and means that control flow with regards to destructors
      is not consistent or predictable.
      <p/>
      Problem #2 : There is a problem of being able to modify managed types from within an unsafe context. This significantly
      undermines the effectiveness of a GC and renders all unsafe code as potentially having undefined behaviour.
      From the language specification :

      <i><blockquote>Modifying objects of managed type through fixed pointers can result in undefined behavior. For example, because strings
      are immutable, it is the programmer's responsibility to ensure that the characters referenced by a pointer to a fixed string are not
      modified.<br/>
      - <a href="http://msdn.microsoft.com/library/en-us/csspec/html/vclrfcsharpspec_A_6.asp">http://msdn.microsoft.com/library/en-us/csspec/html/vclrfcsharpspec_A_6.asp</a>
      </blockquote></i>
    </blockquote>

    Both of these problems are serious because they both introduce potential for new and significant errors into software.

    <h3>Value Types and Reference Types</h3>

    Separating types into two groups (reference and value), instead of allowing the programmer to declare
    whether a variable is a reference to a type or a value of a type, brings certain additional problems.
    First, there is no longer
    a consistent rule for passing an argument to a subroutine : whether it is done by value or by reference depends on the
    type. This violates the programming language rule of consistent expected behaviour.
    <p/>
    Since objects are always automatically references and can never be instantiated by value, this leads to several
    small performance penalties when using objects locally which would not otherwise have been necessary :

    <ul>
      <li>An extra word is needed for the pointer</li>
      <li>There are more allocations on the heap than are necessary, which leads to more work for the garbage collector</li>
      <li>There is a dereference of pointer penalty</li>
    </ul>

    In order to have some kind of unification (i.e. reconciliation of the difference between the two groups of types) C# chose
    to use a boxing / unboxing semantic. Boxing and unboxing are both subtle and confusing, as they can lead to a surprising
    interpretation of straightforward code.

    <h3>Special Primitives and Immutability</h3>

    Special rules exist for so-called "primitive" types. This significantly reduces the expressiveness and flexibility
    of the language. The programmer cannot add new types that are fully on par with existing primitives.
    For example, instances of user-defined mutable types cannot be declared as being immutable.
    <p/>
    To illustrate, you can declare an integer constant (<tt>const int x = 5</tt>) but you cannot write such code for user-defined
    types.

    <h3>Namespace Restrictions</h3>

    C# does not allow static functions outside of a class and in the global namespace. Consider the following example :

    <code><![CDATA[[
    using System;
    class Hello
    {
       static void Main() {
          Console.WriteLine("hello, world");
       }
    }
    ]]></code>

    The System namespace can not export a function named WriteLine. It could be argued all static data
    should occur outside of classes since it is not directly related to objects themselves.

    <h3>Lack of Modules</h3>

    C# lacks the ability to declare modules. Modules must be emulated through the use of a static class with only static data.
    The weakness of such a design pattern is that one must programmatically enforce the non-instantiation rule of a module.

    <h3>Explicit Interface Implementation</h3>

    C# does not support delegating interface implementations to member fields. This significantly restricts the ability to
    effectively use interfaces.

    <h3>Late Added Generics</h3>

    Because generics were added late in the language's development, generic programming is not supported as deeply as it could be.
    One such example is that there are two kinds of arrays : <tt>int[] a;</tt> and <tt>array&lt;int&gt; a;</tt>.
    There is also a fair amount of distance to go before C# will support advanced metaprogramming techniques available
    in other languages with more mature and well-integrated generics.

    <h3>Public Fields</h3>

    Classes can expose their fields as being public. This violates the encapsulation principle of objects. This is not
    necessary at all, even for reasons of syntactic simplicity, when properties are available. An example of how this is a
    problem would be in an unsafe context where a pointer is taken to a public field.

    <h3>By Value or By Reference Argument Passing</h3>

    Whether an argument is passed by value or by reference is a characteristic of the type
    (i.e. whether it is a value type or a reference type).
    This is restricting and requires extra contextual information to parse.

    <h3>Virtual Functions</h3>

    Every language since C++ which claims to be object oriented has virtual functions,
    but virtual functions are not strictly necessary, and even less so with the inclusion of interfaces. Virtual functions make
    it difficult to write correct and easily reusable code. It is my opinion that one of the biggest obstacles to reuse of
    object oriented code is improper design involving virtual functions. Virtual functions also come with
    a performance and memory usage overhead which is not necessary for implementing run-time polymorphic objects.

    <h3>Properties</h3>

    Properties could be argued as a positive or negative point for the language. From a negative standpoint, they obfuscate what
    o.x = y means, i.e. is this a simple variable assignment or actually a function call?
    One big disadvantage is that such a statement could surprise a programmer by having side effects or
    throwing an exception unrelated to assignment.

    <h3>Assignment Ambiguity</h3>

    With all of the features C# offers, one of the problems is that it now has a huge number of different ways to
    interpret certain statements depending on context. One of the most basic statements in an imperative programming
    language is the assignment, for which C# has too many potentially different meanings depending on context. For example :

    <code><![CDATA[[
      o.x = y;
    ]]></code>

    can mean very different things depending on the following conditions :

    <ul>
      <li>Is x or y a property or variable?</li>
      <li>Is the type of x a subtype of the type of y?</li>
      <li>Is the type of one of x and y an object and the other a value type?</li>
      <li>Are x and y both value types?</li>
      <li>Are x and y both reference types?</li>
      <li>Is there an implicit conversion happening?</li>
      <li>Is there an overloaded = operator?</li>
      <li>In an unsafe context is x or y a pointer?</li>
      <li>Is o an object reference or a struct value?</li>
    </ul>

    <h3>Lack of an Open-Source Development Process</h3>

    C# development is firmly controlled and guided by Microsoft. Like Java, any variation from the C# specifications set out by
    Microsoft will likely lead to lawsuits. With the language specification not open source we can not trust an organization
    with a vested commercial interest in the technology to allow the technology to move and adapt to the needs of the community
    when they potentially conflict with the financial goals of the corporation.
    <p/>
    Another way of stating this case is that by refusing to be open-source Microsoft is preventing any real bazaar-style
    development of the technology. See the document titled
    <a href="http://www.catb.org/~esr/writings/cathedral-bazaar/"><i>"The Cathedral and the Bazaar"</i></a> for more
    information about the bazaar style approach to software development.

    <h3>Summary</h3>

    Some of the points I have made are not necessarily bad language design choices individually,
    but they do significantly contribute to my choice to not use C# for software development.
  </page>
  <page name="HeronFront : Introducing Interfaces into C++" label="heronfront">
    <h1>HeronFront : Introducing Interfaces into C++</h1>

    HeronFront is a pre-processor which introduces interface types into C++ by translating HeronFront files into standard C++.
    HeronFront was intended partly as a proof of concept of the viability of the <a href="cpp-iop.html">proposal to add interfaces to C++</a>.
    HeronFront is also by itself a very useful tool for generating standards compliant C++
    code that allows run-time polymorphism of objects without the speed and memory costs associated with using virtual functions.
    The Circle example class below, for instance, executes up to twice as fast and takes up half the memory when using interfaces.

    <blockquote>
    <b>-&gt; <a href="hfront.zip">download hfront.zip version 1.2.0 here</a> &lt;-</b>
    </blockquote>

    <h3>Why HeronFront is useful and important</h3>

    Emulating run-time polymorphic interfaces in C++ requires the usage of ABC's. This technique comes at the cost of requiring
    virtualization of the desired interface functions. This is not always desirable from a design standpoint; in addition, virtualization of
    functions can lead to decreased performance and a larger memory footprint for the object. The technique used by HeronFront
    can be accomplished manually but is almost prohibitively verbose and complex for a programmer to manage.

    <h3>About HeronFront files and interface types</h3>

    A HeronFront file is made up of interface declarations which look like the following example included in the download (shapes.hfront):

    <code><![CDATA[[
    interface IDrawable {
      contract:
        void Draw() const;
    };
    interface IMoveable {
      contract:
        void MoveTo(const Point& x);
        void MoveBy(const Point& x);
    };
    interface ISizeable {
      contract:
        void SetSize(int x);
        int GetSize() const;
    };
    interface IShape : IDrawable, IMoveable, ISizeable {
    };
    ]]></code>

    Interfaces are made up of a set of function declarations and are compatible with any type that provides implementations
    of the functions of the interface. An interface variable can be assigned from any other interface variable with the same
    type and can be assigned from any compatible object. An interface object internally stores a pointer to the passed object,
    so care must be taken when assigning temporaries to interface objects.
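    <p/>
    To give a feel for what such an interface object involves, here is a minimal hand-written sketch. It is not the code that
    HeronFront generates; SizeableRef, Square and Resize are hypothetical names, and the layout (one object pointer plus one
    pointer to a table of function pointers) is an assumption chosen because it occupies two pointers.

    <code><![CDATA[[
```cpp
#include <cassert>

// Hand-written stand-in for an interface such as ISizeable: one object
// pointer plus one pointer to a static table of function pointers.
class SizeableRef {
  struct Table {
    void (*set_size)(void*, int);
    int (*get_size)(const void*);
  };

  // One table per concrete type T, shared by every reference to a T.
  template <typename T>
  struct TableFor {
    static void CallSetSize(void* p, int x) {
      static_cast<T*>(p)->SetSize(x);
    }
    static int CallGetSize(const void* p) {
      return static_cast<const T*>(p)->GetSize();
    }
    static const Table* Get() {
      static const Table table = { &CallSetSize, &CallGetSize };
      return &table;
    }
  };

  void* object_;
  const Table* table_;

 public:
  // Stores the address of x, so x must outlive this reference.
  template <typename T>
  SizeableRef(T& x) : object_(&x), table_(TableFor<T>::Get()) {}

  void SetSize(int x) { table_->set_size(object_, x); }
  int GetSize() const { return table_->get_size(object_); }
};

// An ordinary class with no virtual functions and no base class.
struct Square {
  Square() : side(0) {}
  void SetSize(int x) { side = x; }
  int GetSize() const { return side; }
  int side;
};

// Works through the interface only and knows nothing about Square.
inline void Resize(SizeableRef shape, int size) {
  shape.SetSize(size);
  assert(shape.GetSize() == size);
}
```
    ]]></code>

    A Square can be passed directly to Resize, and the call is dispatched through the table rather than through a vtable stored
    inside the Square. Because only the address is kept, passing a temporary would leave the reference dangling.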
    <p/>
    Here is a snippet of code taken from the download (hfront-test.cpp):

    <code><![CDATA[[
      struct Circle {
        Circle() { mSize = 0; };
        void Draw() const { /* noop */ };
        void MoveTo(const Point& x) { mCenter.MoveTo(x); Draw(); };
        void MoveBy(const Point& x) { MoveTo(GetPos().Plus(x)); };
        Point GetPos() const { return mCenter; };
        void SetSize(int x) { mSize = x; };
        int GetSize() const { return mSize; };
        Point mCenter;
        int mSize;
      };

      struct AbcDrawable {
        virtual void Draw() const = 0;
      };

      struct AbcPosition {
        virtual Point GetPos() const = 0;
      };

      struct AbcMoveable {
        virtual void MoveTo(const Point& x) = 0;
        virtual void MoveBy(const Point& x) = 0;
      };

      struct AbcSizeable {
        virtual void SetSize(int x) = 0;
        virtual int GetSize() const = 0;
      };

      struct AbcShape : public AbcDrawable, public AbcPosition, public AbcMoveable, public AbcSizeable {
      };

      struct NaiveCircle : public AbcShape {
        NaiveCircle() { mSize = 0; };
        void Draw() const { /* noop */ };
        void MoveTo(const Point& x) { mCenter.MoveTo(x); Draw(); };
        void MoveBy(const Point& x) { MoveTo(GetPos().Plus(x)); };
        Point GetPos() const { return mCenter; };
        void SetSize(int x) { mSize = x; };
        int GetSize() const { return mSize; };
        Point mCenter;
        int mSize;
      };

      template&lt;typename T&gt; void RunTest(T x, const char* s) {
        const int ITERS = 5000000;
        Point pt(0, 0);
        cout &lt;&lt; "sizeof(" &lt;&lt; s &lt;&lt; ") = " &lt;&lt; sizeof(T) &lt;&lt; endl;
        cout &lt;&lt; "timing " &lt;&lt; ITERS &lt;&lt; " calls to " &lt;&lt; s &lt;&lt; "::MoveBy() ... ";
        TimeIt t;
        x.MoveTo(pt);
        for (int i=0; i &lt; ITERS; i++)  {
          x.MoveBy(Point(i, i));
        }
      };

      int main()
      {
        Circle c1;
        NaiveCircle c2;

        RunTest&lt;IShape&gt;(IShape(c1), "IShape");
        RunTest&lt;AbcShape&&gt;(c2, "AbcShape&");
        RunTest&lt;IMoveable&gt;(IMoveable(c1), "IMoveable");
        RunTest&lt;AbcMoveable&&gt;(c2, "AbcMoveable&");
        RunTest&lt;Circle&&gt;(c1, "Circle&");
        RunTest&lt;NaiveCircle&&gt;(c2, "NaiveCircle&");
        RunTest&lt;Circle&gt;(c1, "Circle");
        RunTest&lt;NaiveCircle&gt;(c2, "NaiveCircle");

        cin.get();
        return 0;
      }
    ]]></code>

    This code when compiled with Borland 5.5 on my Windows XP, Intel Celeron
    system outputs the following results:

    <code><![CDATA[[
      sizeof(IShape) = 8
      timing 5000000 calls to IShape::MoveBy() ... time elapsed (msec): 187
      sizeof(AbcShape&) = 16
      timing 5000000 calls to AbcShape&::MoveBy() ... time elapsed (msec): 485
      sizeof(IMoveable) = 8
      timing 5000000 calls to IMoveable::MoveBy() ... time elapsed (msec): 203
      sizeof(AbcMoveable&) = 4
      timing 5000000 calls to AbcMoveable&::MoveBy() ... time elapsed (msec): 484
      sizeof(Circle&) = 12
      timing 5000000 calls to Circle&::MoveBy() ... time elapsed (msec): 156
      sizeof(NaiveCircle&) = 28
      timing 5000000 calls to NaiveCircle&::MoveBy() ... time elapsed (msec): 469
      sizeof(Circle) = 12
      timing 5000000 calls to Circle::MoveBy() ... time elapsed (msec): 79
      sizeof(NaiveCircle) = 28
      timing 5000000 calls to NaiveCircle::MoveBy() ... time elapsed (msec): 344
    ]]></code>

    <p/><hr/><p/>
  </page>
  <page name="People" label="people">
    <h2>People Involved in Heron</h2>

    Heron has been conceived, designed, implemented, and marketed by <a href="http://www.cdiggins.com">Christopher Diggins</a>.
    <p/>
    Several people have contributed directly and indirectly to the development of Heron:

    <ul>
      <li>Melanie Charbonneau - My beautiful and talented wife who made the Heron logo, designed the site for me, and
        has supported me both emotionally and financially for a long time.</li>
      <li>Jon Erikson - Editor of the C/C++ Users Journal and Dr. Dobb's Journal</li>
      <li>Peter Grogono - To whom I am extremely appreciative for his support, his time and his ideas</li>
      <li>My proof readers - Kris Unger, Daniel Zimmerman, Muthana Kubba and Matthew Wilson who have been a great
        support and very helpful reviewers</li>
      <li>Everyone at Skrud.net - Skrud for hosting a forum on Heron, and the other members for being so encouraging and supportive</li>
      <li>Bjarne Stroustrup - Whose integrity and ingenuity has served as an inspiration for me.</li>
    </ul>

    <h3>Seminars and Presentations</h3>

    If your organization is interested in learning more about Heron,
    <a href="http://www.cdiggins.com">Christopher Diggins</a> is prepared to give PowerPoint presentations on
    the Heron programming language customized to your specific needs and level of technical expertise.

    <h3>Contributing to Heron</h3>

    Contact <a href="http://www.cdiggins.com">Christopher Diggins</a>
    if you want to start contributing to Heron. Or jump into the discussion forums.

    <h3>Heron on your next project</h3>

    Take advantage of the fact that I am currently offering free support and consulting with regard
    to Heron. Contact <a href="http://www.cdiggins.com">Christopher Diggins</a>.

    <h3>Other Consultants</h3>

    If you are interested in offering your products and services related to Heron here, I encourage you
    to send me your links!

    <p/>

    My email address:<br/>
    <b><a href="mailto:cdiggins@videotron.ca">cdiggins@videotron.ca</a></b>
  </page>
  <page name="C++ Proposal to Add Interfaces" label="cpp-iop">

    <h1>C++ Proposal : Interfaces</h1>
    <b>Original Author</b> : Christopher Diggins<br/>
    <b>Date of Original Proposal</b> : April 13, 2004<br/>
    <b>Last Modification</b> : April 29, 2004<br/>

    <h2>Motivation #1 - Interfaces without Virtual Functions</h2>
    <ol>
    <li><h3>Problem</h3>
    Interface constructs as found in languages like Java are used to model
    looks-like and behaves-like object relationships. C++ does not support
    interfaces.
    <p/>
    </li>
    <li><h3>Common Workaround</h3>
    What is typically used to compensate for this deficiency are abstract base
    classes (classes with one or more pure virtual functions) commonly referred to
    as ABC's.
    </li>
    <p/>
    <li><h3>Deficiency of Common Workaround</h3>
    The principal problem of using an ABC as an interface is that it automatically
    declares the functions intended to model the interface as virtual.
    This is unnecessary and incorrect with regard to
    most definitions of an interface. This also leads to two major practical
    problems: 1) performance penalties due to superfluous dispatching and
    inability of the compiler to inline calls where normally it would be
    appropriate; 2) object size penalties which increase linearly with the number of
    interfaces modeled due to extra vtable pointers within the objects.
    The second problem is especially troublesome because design models that use interfaces typically
    call for multiple interfaces, thus compounding the penalties, and making
    many perfectly acceptable designs unusable in practice.
    </li>
    <p/>
    <li><h3>Writing out Interfaces Manually</h3>
    Given that the common solution of using ABC's to emulate interfaces may be deemed unacceptable
    for various reasons, the programmer is left with the alternative of writing their own
    interface types. This requires a significant amount of coding and is a complex
    endeavour. The redundancy of typing can be overcome to a certain degree through the
    clever use of macros. The macro approach is often considered an undesirable solution
    for many reasons, which leaves us in a position to consider a proposal for a change to the language.
    </li>
    </ol>

    <h2>Motivation #2 - Template Parameter Requirements Checking</h2>

    <ol>
    <li><h3>Problem</h3>
    As outlined in Stroustrup 2003, Concept Checking - A more abstract complement to type checking
    ( <a href="http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1510.pdf">http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1510.pdf</a> ),
    C++ lacks a more general and abstract facility than type checking to express template parameter requirements.
    Interfaces are almost exactly the same as the function matching approach described in the Stroustrup paper, which he
    characterizes as being promising.
    </li>
    </ol>
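
    To make the idea of checking a template parameter against a set of required functions concrete, here is a rough
    approximation in present-day C++. It is my own sketch and is not taken from the Stroustrup paper or from this proposal:
    ImplementsIFuBar, Widget, Gadget and UseFuBar are hypothetical names, and the check covers only a single function FuBar().

    <code><![CDATA[[
```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Primary template: by default a type does not match the interface.
template <typename T, typename = void>
struct ImplementsIFuBar : std::false_type {};

// Chosen only when the expression x.FuBar() is valid and yields void,
// i.e. when T offers the one function that the interface requires.
template <typename T>
struct ImplementsIFuBar<
    T, typename std::enable_if<std::is_same<
           decltype(std::declval<T&>().FuBar()), void>::value>::type>
    : std::true_type {};

struct Widget {
  Widget() : calls(0) {}
  void FuBar() { ++calls; }
  int calls;
};

struct Gadget {
  int Other() { return 0; }
};

// A template that states its requirement up front instead of failing
// somewhere deep inside its body.
template <typename T>
void UseFuBar(T& x) {
  static_assert(ImplementsIFuBar<T>::value, "T must provide void FuBar()");
  x.FuBar();
}
```
    ]]></code>

    Passing a Widget to UseFuBar compiles, whereas passing a Gadget stops at the static_assert with a readable message. An
    interface declaration would let the compiler derive this kind of check from the interface itself.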

    <h2>Description of Proposal for non-assignable interface variables</h2>

    Interfaces are an excellent candidate for consideration as a new feature for the C++ language
    due to the fact that they represent a clear and well-defined semantic construct with
    relatively low impact on other aspects of the language itself.

    <ol>
    <li><h3>Declaring Interfaces</h3>

    Allow declaration of an interface type, like a class/struct but with only
    function declarations, i.e.:

    <code><![CDATA[[
      interface IFuBar {
        void FuBar();
      };
    ]]></code>
    </li>
    <p/>
    <li><h3>Interface Implementations</h3>

    Any object that has definitions for a complete set of functions with signatures matching those of
    a given interface is said to implement that interface.
    </li>
    <p/>
    <li><h3>Interface Variables</h3>

    An interface variable is a non-assignable variable that can refer to any object that implements
    that interface. An interface variable stores a pointer to an object that implements
    an interface. An interface variable allows any function that is part of the interface
    to be invoked using dot notation. It is the intention of this proposal that
    an interface variable model its behaviour as closely on that of a reference type as can reasonably
    be expected.
    </li>
    <p/>
    <li><h3>Constructing and Initializing Interface Variables</h3>

    Interface variables must be initialized with a variable which can either be an instance of a class or struct which
    implements the corresponding interface, or can be an interface variable of the same or derived type.

    <code><![CDATA[[
      SomeInterface i = x;
    ]]></code>

    Interpretation of the above statement:
    <ul>
      <li>if x is an instance of a class or struct then i stores internally the address of x</li>
      <li>if x is a pointer to a class or struct then i stores internally the value of x</li>
      <li>if x is another interface variable then i stores the address of the object referred to by x.</li>
    </ul>
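
    The three cases listed above can be sketched in current C++ with overloaded constructors. This is an
    assumption-laden emulation, not the proposed language feature: the Get() member, the target() accessor,
    the Widget type and the demo function are hypothetical and exist only so that the stored address can be inspected.

    <code><![CDATA[[
```cpp
#include <cassert>

// Hypothetical emulation of an interface with a single function: int Get();
class SomeInterface {
public:
    // Case 1: x is an object, so store the address of x.
    template <typename T>
    SomeInterface(T& x) : object_(&x), get_(&invoke<T>) {}

    // Case 2: x is a pointer to an object, so store the value of x.
    template <typename T>
    SomeInterface(T* x) : object_(x), get_(&invoke<T>) {}

    // Case 3: x is another interface variable, so store the address of
    // the object that x refers to (not the address of x itself).
    SomeInterface(const SomeInterface&) = default;
    SomeInterface(SomeInterface& x)
        : SomeInterface(static_cast<const SomeInterface&>(x)) {}

    SomeInterface& operator=(const SomeInterface&) = delete;

    int Get() const { return get_(object_); }

    // Exposed only so this example can check which address is stored.
    const void* target() const { return object_; }

private:
    template <typename T>
    static int invoke(void* p) { return static_cast<T*>(p)->Get(); }

    void* object_;
    int (*get_)(void*);
};

struct Widget {
    int value;
    int Get() { return value; }
};

// Returns true when all three initializations refer to the same Widget.
bool demo_initialization() {
    Widget w{7};
    Widget* p = &w;
    SomeInterface a = w;  // case 1
    SomeInterface b = p;  // case 2
    SomeInterface c = a;  // case 3
    return a.target() == &w && b.target() == &w &&
           c.target() == &w && c.Get() == 7;
}
```
    ]]></code>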

    <!--
    <p/>
    <b>Special case:</b> : In order for an interface variable to refer to another interface variable instead of what it refers to
    would require a special operation which hasn't been covered in this proposal. It is recommended that this degenerate case simply not
    be addressed.
    -->
    <p/>
    </li>
    <li><h3>Assignment</h3>

    Interface variables cannot be assigned to (i.e. they cannot be treated as l-values).
    </li>
    <p/>
    <li><h3>Invalid interface variables</h3>

    If an interface variable's target object is invalidated (i.e. prematurely destroyed)
    then the interface variable exhibits undefined behaviour when a member function is invoked.
    <p/>
    </li>
    <li><h3>Const Qualification</h3>

    Interface variables can be declared as const:

    <code><![CDATA[[
      const SomeInterface i = SomeObject;
    ]]></code>

    Const-qualified interface variables behave like const references: they only allow const member functions to be called, etc.
    </li>
    <p/>
    <li><h3>Extending the Lifetime of temporaries</h3>

    Initializing an interface variable with a temporary extends the lifetime of that temporary in the same way that
    binding a temporary to a reference does.

    </li>
    <p/>
    <li><h3>Typecasting Precedence</h3>

    A conversion from an object to an interface variable has precedence just above that of a user-defined typecast, but lower than all other
    conversions.
    </li>
    <p/>
    <li><h3>dynamic_cast</h3>

    Interface variables can be cast to the type of the object they refer to using dynamic_cast; this returns a type-safe pointer
    to the internal object.
    </li>
    <p/>
    <li><h3>Interface Comparison</h3>

    An interface variable can be compared using <tt>==</tt>
    to another interface variable with the result being equivalent to a comparison of the
    internal object pointers.
    </li>
    <p/>
    <li><h3>Interface Arguments to Template Parameters</h3>

    Parameters to templates can be restricted to only interface types through the following syntax:

    <code><![CDATA[[
      <b>template</b>&lt;<b>interface</b> <i>argument_name</i>&gt; // ...
    ]]></code>
    </li>
    <p/>
    <li>
    <h3>Visibility Modifiers</h3>

    All interface functions are always public, therefore no visibility modifiers are allowed.
    </li>
    <p/>
    <li><h3>Inheritance</h3>

    An interface can inherit from one or more interfaces. Syntax of inheritance of interfaces is as follows:

    <code><![CDATA[[
      <i>inheritance_list</i> ::= <i>interface_name</i> [, <i>interface_name</i>]*

      <b>interface</b> <i>interface_name</i> : <i>inheritance_list</i> {
        // ...
      };
    ]]></code>

    Notice that there are no qualifiers allowed before an interface name from the inheritance list.
    Interfaces only ever publicly inherit from other interfaces. Classes or structs cannot inherit
    from interfaces.
    </li>
    <p/>
    <li><h3>Template Parameter Requirements Checking</h3>
    Interfaces would be an excellent construct for checking requirements on template parameters. This could easily be done
    with the following syntax:
    <code><![CDATA[[
      <b>template</b>&lt;<b>class</b> <i>T</i> : <i>SomeInterface</i>&gt; <b>class</b> <b>SomeClass</b> {
        // ...
      };
    ]]></code>
    This would mean that T is required to implement SomeInterface implicitly.
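    <p/>
    For comparison, a similar check can be approximated in current C++ with a detection trait and a static_assert.
    This is only a sketch under assumptions: the trait name implements_IFuBar, the reuse of the IFuBar interface from
    the declaration example, and the HasFuBar and NoFuBar types are all hypothetical. It also checks only that a call
    to FuBar() compiles, which is weaker than the signature matching that the proposal describes.

    <code><![CDATA[[
```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Primary template: by default T does not implement the interface.
template <typename T, typename = void>
struct implements_IFuBar : std::false_type {};

// Chosen when the expression t.FuBar() is well-formed for an lvalue t of type T.
// Note: this only checks that the call compiles, not the exact signature.
template <typename T>
struct implements_IFuBar<T, decltype(std::declval<T&>().FuBar(), void())>
    : std::true_type {};

// Rough stand-in for a class template whose parameter must implement IFuBar.
template <typename T>
class SomeClass {
    static_assert(implements_IFuBar<T>::value, "T must implement IFuBar");

public:
    explicit SomeClass(T& t) : target_(t) {}
    void Run() { target_.FuBar(); }

private:
    T& target_;
};

struct HasFuBar {
    int calls = 0;
    void FuBar() { ++calls; }
};

struct NoFuBar {};

// Returns how many times FuBar() ran through SomeClass.
int demo_requirements() {
    HasFuBar h;
    SomeClass<HasFuBar> s(h);
    s.Run();
    assert(h.calls == 1);
    return h.calls;
}
```
    ]]></code>

    Instantiating SomeClass with NoFuBar would fail at compile time with the message given to static_assert.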
    </li>
    </ol>
  </page>
  <page name="Xml to Html using Heron" label="doc2html">
    <h1>Xml to Html</h1>
    In order to demonstrate that Heron is a fully functional and effective language with
    a wide range of applicability, I wrote an XML to HTML generator called doc2html entirely in Heron.
    I used this program to generate the entire site, along with the table of contents
    for the documentation, from a single XML file.
    <p/>
    The source for this program is available with the heron download at <a href="downloads.html">downloads.html</a>.
  </page>
  <page name="Downloads" label="downloads">
    <h1>Heron Downloads</h1>

    The Heron download package includes the following files in a Windows .zip file:
    <ul>
      <li>Delphi Source code and Win32 binaries for hrn2cpp version 3.0.0, the Heron to C++ translator</li>
      <li>output2exe - The Visual C++ project files I use for compiling output from hrn2cpp</li>
      <li>heron.hpp - The library which is required to make output from hrn2cpp compilable</li>
      <li>core.heron - The heron core library file</li>
      <li>utils.parser.heron - The heron parser library</li>
      <li>utils.xmlparser.heron - The heron xml parser library</li>
      <li>doc2html.heron - The heron site generator program used to generate this site</li>
    </ul>

    <p/>
    --&gt; <a href="heron.zip">Download Heron</a> &lt;--
  </page>
  <page label="index" name="The Heron Programming Language">
    <h3><font color="#666633">About Heron</font></h3>

    Heron is a brand new open source, general purpose, imperative programming language.
    Heron is a language that focuses on facilitating the production
    of code that is efficient, low in defects and highly reusable.
    <p/>
    Heron supports several software development techniques and methodologies, including,
    but not limited to, object oriented programming (OOP), interface oriented programming (IOP),
    aspect oriented programming (AOP), generic programming, policy driven design
    and design by contract (DbC).
    The Heron syntax intentionally resembles a mix of C++ and Java, to be familiar and
    easy to learn for programmers who already know the syntax of these languages.

    <h3><font color="#666633">Heron Links</font></h3>

    <ul>
      <li><b><a href="downloads.html">download</a></b>
        Download source and binaries of hrn2cpp along with the doc2html program
        written in Heron that was used to generate this entire site from a single
        xml file.</li>
      <p/>
      <li><b><a href="toc.html">Heron Documentation / Specification</a></b>
        The Heron documentation describes the Heron programming language
        somewhat informally in a tutorial-like manner. The documentation
        is intended to be appropriate for programmers with varying levels
        of experience with programming.</li>
      <p/>
      <li><b><a href="heronfront.html">HeronFront</a></b>
        HeronFront is a small translator which generates efficient C++ code from
        interface declarations. It is the humble beginning of hrn2cpp and part of a proof of concept
        for a <a href="cpp-iop.html">proposal to add interfaces to C++</a></li>
    </ul>

    <h3><font color="#666633">Heron Links - Off Site</font></h3>

    <ul>
      <li><b><a href="http://www.cdiggins.com">CDiggins.com</a></b>
      Home page for the seriously overtired inventor of Heron</li>
      <p/>
      <li><b><a href="http://forum.skrud.net/viewforum.php?f=18">Skrud.net</a></b>
      Online forums for discussing Heron.</li>
    </ul>

    <h3><font color="#666633">Heron Licensing</font></h3>

    Heron is completely free and open-source.
    One of the goals of Heron is to provide a new technology that can
    be advanced by the community at large
    without fear of reprisal from overly
    litigious corporations.
    <p/>
    Heron, and everything else on this site, is
    released under the Open Software License version 2.1.
    This means you can do practically anything you want with the
    materials, including selling them, which I encourage.
  </page>
  <page label="benchmarks" name="Heron Benchmarks and Performance">
    <h3>The Heron benchmarks are currently unavailable</h3>
    <b>The benchmarks are high priority and are expected back by July 14th at the latest.</b>
  </page>
</heronsite>
License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
Written By
Christopher Diggins
Software Developer Ara 3D
Canada
I am the designer of the Plato programming language and I am the founder of Ara 3D. I can be reached via email at cdiggins@gmail.com