Multicolor Text Strings with Managed Code!

Dragonaur

16 Apr 2007

.NET makes multi-color strings well-nigh impossible. In this article, we dredge around in Win32 and GDI32 to find the solution we want for a C# editor project.

Introduction

How many times have you wanted multi-colored text when working in managed .NET code? Always, never? Taking a hard look at the System.Drawing namespace, the author here must assume that the answer is never! Why? After scouring the Graphics.DrawString() API, I simply can't find any combination of calls that will help me easily render something like you see in the opening image for this article.

Background

From my point of view, this particular frustration is actually not much of a surprise. System.Drawing looks to be designed for easy use with the System.Windows.Forms code base. How many times have you seen a multi-color string in a boring in-house Windows Forms application? Again the answer is probably: close to zero.

So why bother? I'm sure my heading image gives me away! Text editing...from the first time I started tapping the membrane keys of a Timex-Sinclair micro computer, I've wanted to master the art of rendering text to the screen. Say what you will about that machine, but for me, it was the entry way to the grand world of programming. Upwards from my Vic-20, Commodore-64, and through my first IBM-PC/XT computer, I've blitted and rendered circles, lines, and the Utah teapot. I've printed text documents with an NROFF style text processor and written no end of text parsers. I've even written my own mini forms package for an in-house cash register system. But one program I haven't written in all these years is a word processor of any sort!

In this modern age, text editors are the one thing we buy or acquire through open source in ready made form. One of the last things a modern programmer is going to be asked to write is a text editor. And even though I spend most my day working on somebody else's programming problems, the hobbyist in me refuses to die! So recalling the fun of programming on those glory boxes of old, let us consider anew, the problem of rendering text, from the context of .NET and the Microsoft Windows APIs.

Windows Programming

No real program can be written without consideration of the environment it will be executed in. Given my background in Windows programming and the nature of the site I'm writing this article for, it's no surprise I would choose to make my editor run under some version of Windows. And just as time has marched on from those computers of old, we've witnessed evolution in the operating system world as well! Unmanaged code is passé. Managed code is in!

At the risk of further delaying the introduction of real code in this article, I would like to say a few words about managed coding. I've watched C style programming develop into C++, Sun's Java and eventually Microsoft's C#. I've written code using the C runtime and malloc() functions. Then I moved on to Microsoft's Component Object Model (COM) and its IUnknown reference counting mechanism. I don't think I'm saying much when I say writing managed code is a dream compared to malloc() and any IUnknown unmanaged memory scheme! So it was natural that I would start with the System.Drawing namespace while I looked for a way to solve my text rendering problem. But all the examples therein seemed to revolve around the MeasureString() method. Here is one of the smaller C++ samples I lifted from Microsoft MSDN. The C# example for MeasureString() is huge, but revolves around GDI+ style solutions like the one below:

Graphics graphics(hdc);
WCHAR string[] = L"Measure Text";
Font font(L"Arial", 16);
RectF layoutRect(0, 0, 100, 50);
RectF boundRect;

graphics.MeasureString(string, 12, &font, layoutRect, &boundRect);
Pen pen(Color(255, 0, 0, 0)); // a named Pen, so we never take the address of a temporary
graphics.DrawRectangle(&pen, boundRect);

But MeasureString() takes strings for its arguments and there are all these rectangles that have to be passed around. Busy, busy. What I really need is the width of each character that I intend to render. I had no luck finding a way to get that in the managed world. If you've seen otherwise, let me know. But for the length of this article, the die is already cast!

Also frustrating, all these text rendering functions keep revolving around use of the string class. Displaying a multi-color string was going to involve some gross string surgery, likely being a huge waste of both time and space. Moreover, as any informed managed coder will tell you, managed strings are immutable. Any text editor is going to run into some serious design issues if it relies on string to do the heavy lifting.
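To make the immutability point concrete, here is a minimal sketch of the alternative: a line of text kept in a char[] and edited in place. The LineBuffer class and its members are hypothetical names made up for this illustration; they are not part of the article's project.

```csharp
using System;

// Hypothetical illustration: a line of text held in a mutable char[]
// rather than in an immutable string.
public class LineBuffer {
    private char[] _rgLine = new char[16];
    private int    _iCount = 0;

    public int Length { get { return _iCount; } }

    // Insert one character at iIndex, growing the array only when it is full.
    public void Insert(int iIndex, char cChar) {
        if( iIndex < 0 || iIndex > _iCount )
            throw new ArgumentOutOfRangeException( "iIndex" );

        if( _iCount == _rgLine.Length )
            Array.Resize( ref _rgLine, _rgLine.Length * 2 );

        // Shift the tail right by one slot; no new string is allocated.
        Array.Copy( _rgLine, iIndex, _rgLine, iIndex + 1, _iCount - iIndex );
        _rgLine[iIndex] = cChar;
        ++_iCount;
    }

    public override string ToString() {
        return new string( _rgLine, 0, _iCount );
    }
}
```

Each keystroke here costs one array shift, where the same edit on a string would build a whole new string every time.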

Back to the Future: GDI32 and USER32 to the Rescue!

Fortunately, I'm an old timer. I know Windows has had great text APIs, I just needed to look beyond the managed code world to find my answers. When I did look, I found my answer in about 10 minutes!

BOOL GetCharABCWidthsFloat(
  HDC hdc,            // handle to DC
  UINT iFirstChar,    // first character in range
  UINT iLastChar,     // last character in range
  LPABCFLOAT lpABCF   // array of character widths
);

But the poor GetCharABCWidthsFloat API is locked behind the bars of the unmanaged code world. However, it turns out that getting at the API and all the others I would eventually need is quite easy. It seems the Microsoft guys knew that no matter how great the .NET Framework was, there would always be some old coot who wanted access to some dusty old API. I'm not sure if these are C# 1.0 or 2.0 only constructs, but getting access to those calls is as easy as the following definitions:

[StructLayout( LayoutKind.Sequential, CharSet = CharSet.Auto )]
public struct AbcFloat {
    public float flA;
    public float flB;
    public float flC;
};

unsafe public class Gdi32 {
    public static float flInchesPerPoint = .013837F; // 1/72.27 inch, the printer's point (close to 1/72)

    /// <summary>
    /// The intensity for each argument is in the range 0 through 255.
    /// If all three intensities are zero, the result is black. If all
    /// three intensities are 255, the result is white.
    /// </summary>
    /// <remarks>
    /// This function is directly from WinGDI.h
    /// </remarks>
    public static UInt32 SetRGB(byte r, byte g, byte b) {
        return ( (UInt32)( r | ( (UInt16)g << 8 ) ) | ( ( (UInt32)b << 16 ) ) );
    }

    [DllImport( "gdi32.DLL", EntryPoint = "TextOutW", SetLastError = true )]
    public static extern bool TextOut(IntPtr hDC, Int32 iXStart, Int32 iYStart,
                                          char* pwcStart, Int32 iLength);
    [DllImport( "gdi32.DLL", EntryPoint = "SelectObject", SetLastError = true )]
    public static extern IntPtr SelectObject(IntPtr hDC, IntPtr hGDIObject);
    [DllImport( "gdi32.DLL", EntryPoint = 
                         "GetCharABCWidthsFloatW", SetLastError = true )]
    public static extern bool GetCharABCWidthsFloat
                                      (IntPtr hDC, UInt32 iFirst, UInt32 iLast,
                                      [In, Out] AbcFloat[] rgAbc);
    [DllImport( "gdi32.DLL", EntryPoint = "CreateSolidBrush", SetLastError = true )]
    public static extern IntPtr CreateSolidBrush(UInt32 argbColor);
    [DllImport( "gdi32.DLL", EntryPoint = "DeleteObject", SetLastError = true )]
    public static extern bool DeleteObject(IntPtr hBrush);
    [DllImport( "gdi32.DLL", EntryPoint = "SetTextColor", SetLastError = true )]
    public static extern UInt32 SetTextColor(IntPtr hDC, UInt32 uiColor);
    [DllImport( "gdi32.DLL", EntryPoint = "SetBkColor", SetLastError = true )]
    public static extern UInt32 SetBackColor(IntPtr hDC, UInt32 uiColor);
}

unsafe public class User32 {
    [DllImport( "User32.DLL", EntryPoint = "GetDC", SetLastError = true )]
    public static extern IntPtr GetDC(IntPtr hWnd);
    [DllImport( "User32.DLL", EntryPoint = "ReleaseDC", SetLastError = true )]
    public static extern int ReleaseDC(IntPtr hWnd, IntPtr hDC);
}

Now we're rocking! Through the magic of the unsafe keyword, the [DllImport] attribute and the [StructLayout] attribute, we have entered the twilight zone! We control the horizontal and we control the vertical. We also can crash and burn just a little easier than we used to. It is a shame really, since it wouldn't take much to make safe managed calls for these same functions. But you know, if you want to make a cake, you've got to break a few eggs.

I could stop here. Personally, I figure any developer could make the necessary inferences to lead to a working editor. But if you knew anything about this topic, it is likely you would have stopped reading long ago! So let's continue on our road to discovery and see how we can merge the world of managed text and the world of unmanaged rendering!

I'm going to leave out the descriptions of the above keyword and attributes since they're fairly self-explanatory and easy to look up. I could write a whole different article on managed/unmanaged interactions. At least, the Microsoft dudes have made some of these translations as easy as they should be!
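One detail in those declarations is worth a quick sketch: the color values passed to SetTextColor and friends are Win32 COLORREF values, laid out as 0x00BBGGRR, which is why the SetRGB helper puts red in the low byte. The ColorRefDemo class below is a hypothetical name used only to show the packing and unpacking; it assumes nothing beyond that layout.

```csharp
using System;

public static class ColorRefDemo {
    // Pack red, green and blue into a Win32 COLORREF (0x00BBGGRR).
    public static uint ToColorRef(byte r, byte g, byte b) {
        return (uint)( r | ( g << 8 ) | ( b << 16 ) );
    }

    // Unpack a COLORREF for debugging; returns "r,g,b".
    public static string Describe(uint uiColor) {
        return ( uiColor & 0xFF ) + "," +
               ( ( uiColor >> 8 ) & 0xFF ) + "," +
               ( ( uiColor >> 16 ) & 0xFF );
    }
}
```

Note that System.Drawing.Color.ToArgb() returns 0xAARRGGBB, so its result should not be handed to these GDI calls as-is; System.Drawing.ColorTranslator.ToWin32() is the managed conversion if you start from a Color.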

One More Piece

So we have a splattering of APIs. We have our hands on unsafe code yet we are filled with good intent! The next step is getting the measurement of a bit of text on a character by character basis as we originally set out to do. When I worked for a large corporate software sweatshop, I used to ask interview questions along these lines... How can I determine if any one of a set of characters, say, 'a', 'r', and 't', exists in some arbitrary string? About 50% of the time, I would get an answer something like this:

// This is an O(n * m) solution: n text characters times m test characters.
foreach( char cChar in "an arbitrary string" ) {
    foreach( char cTest in "art" ) {
        if( cChar == cTest )
            return( true );
    }
}
return( false );

The big O complexity of the above program is the product of the two string lengths, O(n * m), which approaches n squared when the set of test characters is about as long as the text being searched. Can I work up a faster answer? Of course, the answer is yes.

// Somewhere globally we define this array and initialize it once.
bool[] rgAscii = new bool[256]; // C# initializes every element to false.
rgAscii['a'] = true; // Now, set the letter 'a' (code 97) true.
// and so on for 'r' and 't'....

// We call this code for each new "arbitrary string".
foreach( char cChar in "an arbitrary string" ) {
    if( cChar < rgAscii.Length && rgAscii[cChar] ) // The "trick" is here!
        return( true );
}
return( false );

The trick is to give up a little space for a decrease in time. If the lookup characters don't change often, we get a net win. If you answered with the second program first you were 80% on your way to a thumbs up in my interview! More subtle, is the trick of looking up the character by its index in the array of booleans. Say what you will, but many a programmer could not come up with this indexing solution even with heavy hinting! It's one of those little programming things you learn to do over time. If you figured it out so soon in your programming experience, kudos to you, you're hired!
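For anyone who wants to try the table trick outside of pseudo code, here is one way it might be packaged as compilable C#. The class and method names are my own for this sketch, and characters above code 255 are simply treated as "not in the set".

```csharp
using System;

public static class CharSetLookup {
    // Build the table once: one flag per 8-bit character code.
    public static bool[] BuildTable(string strSet) {
        bool[] rgTable = new bool[256]; // zero-initialized, so all false
        foreach( char cChar in strSet ) {
            if( cChar < rgTable.Length )
                rgTable[cChar] = true;
        }
        return rgTable;
    }

    // One pass over the text, one array index per character.
    public static bool ContainsAny(string strText, bool[] rgTable) {
        foreach( char cChar in strText ) {
            // Characters outside the table cannot be in the set.
            if( cChar < rgTable.Length && rgTable[cChar] )
                return true;
        }
        return false;
    }
}
```

Build the table once with CharSetLookup.BuildTable( "art" ), then hand it to ContainsAny() for every new string. The cost per string is proportional to its length alone.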

Now, these last two programs were pseudo C# code functions and I'm not going to nit over syntax, etc. But they show the core of the trick we'll use to measure our text for our multi-color text function. To put our brilliant plan into action, we first need an IntPtr to the window's device context (DC). On the System.Windows.Forms.Control class, use the Handle property to get an IntPtr to the window handle. And with that, do something like this in a class:

/// <summary>
/// Depending on whether we start getting the focus or not we might need to
/// initialize ourselves from the window handle instead of the DC in the paint event.
/// </summary>
/// <param name="hWnd">IntPtr to a window handle.</param>
/// <seealso cref="InitFromDC"/>
protected void InitFromWnd( IntPtr hWnd )
{
    IntPtr hDC      = User32.GetDC( hWnd );
    IntPtr hFont    = this.Font.ToHfont(); // ToHfont() creates a GDI object we own.
    IntPtr hFontOld = Gdi32.SelectObject( hDC, hFont );

    InitFromDC( hDC );

    Gdi32.SelectObject( hDC, hFontOld );
    Gdi32.DeleteObject( hFont );           // Free the HFONT so we don't leak it.
    User32.ReleaseDC( hWnd, hDC );
}

AbcFloat[] _rgAbcWidths;

/// <summary>
/// Get the ABC widths for ANSI. Also, walk each line in the buffer and
/// create an edit line for it. There's probably a nice code page way of
/// initializing widths I need.
/// </summary>
/// <param name="hDC">A IntPtr pointing to the display context handle.</param>
/// <seealso cref="InitFromWnd"/>
protected void InitFromDC(IntPtr hDC)
{
    if( _rgAbcWidths == null ) {
        _rgAbcWidths = new AbcFloat[256];

        Gdi32.GetCharABCWidthsFloat( hDC, 0, 255, _rgAbcWidths );
    }
}

Now we have everything we need to measure a string... eh? Are you thinking something about exceptions? I definitely have a mind about exception handling, but this article is about text handling. I'll give you my philosophy on handling exceptions in another article! We see the old Select/Release pattern of Win32 is back. Any program really handling exceptions would have to make sure this call was properly scoped so that we don't leak that DC.
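For the curious, the scoping mentioned above might look something like the following with try/finally. This is only a sketch under my own assumptions: DcScope and WithWindowDC are hypothetical names, the P/Invoke declarations are repeated so the block stands alone, and the caller is assumed to own (and later delete) the font handle.

```csharp
using System;
using System.Runtime.InteropServices;

public static class DcScope {
    [DllImport( "user32.dll" )]
    private static extern IntPtr GetDC(IntPtr hWnd);
    [DllImport( "user32.dll" )]
    private static extern int ReleaseDC(IntPtr hWnd, IntPtr hDC);
    [DllImport( "gdi32.dll" )]
    private static extern IntPtr SelectObject(IntPtr hDC, IntPtr hGdiObject);

    // Runs oWork with hFont selected into the window's DC, then restores
    // the old font and releases the DC even if oWork throws.
    public static void WithWindowDC(IntPtr hWnd, IntPtr hFont, Action<IntPtr> oWork) {
        IntPtr hDC = GetDC( hWnd );
        if( hDC == IntPtr.Zero )
            throw new InvalidOperationException( "GetDC returned a null DC." );
        try {
            IntPtr hFontOld = SelectObject( hDC, hFont );
            try {
                oWork( hDC );
            } finally {
                SelectObject( hDC, hFontOld );
            }
        } finally {
            ReleaseDC( hWnd, hDC );
        }
    }
}
```

Inside the control, a call such as DcScope.WithWindowDC( this.Handle, hFont, InitFromDC ) could then stand in for the body of InitFromWnd(), with hFont coming from this.Font.ToHfont() and being freed with DeleteObject once the call returns.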

Continuing on, let's start measuring text in our own super macho style. Using the _rgAbcWidths array we created and initialized above, we're ready to take this big step.

char[]  _rgLine;            // our line of text
float[] _rgCumulativeWidth; // our corresponding measurements we will make.

/// <summary>
/// Measure the width of the current string. Currently I pass the abc
/// widths for the character set I know I'm using. This won't scale up
/// if I allow other languages.
/// </summary>
/// <param name="rgAbcWidths">an AbcWidths for all the characters
/// we expect to encounter on this line.</param>
public void MeasureWidth(AbcFloat[] rgAbcWidths) {
    // Grow the cache (with a little slack) whenever the line outgrows it.
    if( _rgCumulativeWidth == null || _rgCumulativeWidth.Length < _rgLine.Length ) {
        _rgCumulativeWidth = new float[_rgLine.Length + 10];
    }

    float flSeed = 0;
    for( int i = 0; i < _rgLine.Length; ++i ) {
        char cChar = _rgLine[i];
        // Characters outside the measured table add no width instead of
        // throwing an IndexOutOfRangeException.
        if( cChar < rgAbcWidths.Length ) {
            flSeed += rgAbcWidths[cChar].flA +
                      rgAbcWidths[cChar].flB +
                      rgAbcWidths[cChar].flC;
        }
        _rgCumulativeWidth[i] = flSeed;
    }
}

As you can see, for every text array we might want to display, we have a corresponding CumulativeWidth array which marks the position every character will end up at. You might think it a drag to compute this array. But it doesn't happen often, only when the character array changes. In any editor, the calculation is only going to happen on one line at any one moment and even then the human typing will never notice your multi-gigahertz multi-core computing monster working on the problem as he or she types.

Since it is likely we'll have a variety of lines in our editor, it only makes sense that we would package this bit of code in a separate class from the code we used to generate the array of ABCWidths for all the ASCII characters. Later, I'll wrap it all up into a little demo set of classes that you can compile and run on your own.
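To show what a cumulative width array buys us, here is a small sketch of two lookups an editor is likely to want: where a character starts, and which character sits under a given pixel. It assumes element i holds the running total through character i, as MeasureWidth() computes it, and that the totals never decrease. WidthLookup, LeftEdge and HitTest are hypothetical names; the article's own CumulativeWidth() accessor is not shown, so LeftEdge is only a guess at one reasonable convention.

```csharp
using System;

public static class WidthLookup {
    // Left edge of character i: the running total just before it.
    public static float LeftEdge(float[] rgCumulative, int i) {
        return i <= 0 ? 0f : rgCumulative[i - 1];
    }

    // Index of the character whose cell contains pixel offset flX,
    // or -1 when flX is left of the line or at/past its end.
    // iCount is the number of valid entries (the array may have slack).
    public static int HitTest(float[] rgCumulative, int iCount, float flX) {
        if( flX < 0 || iCount <= 0 || flX >= rgCumulative[iCount - 1] )
            return -1;

        int iLow = 0, iHigh = iCount - 1;
        while( iLow < iHigh ) {          // binary search over the totals
            int iMid = ( iLow + iHigh ) / 2;
            if( rgCumulative[iMid] > flX )
                iHigh = iMid;
            else
                iLow = iMid + 1;
        }
        return iLow;
    }
}
```

With widths of 5, 3 and 7 pixels, the array holds 5, 8 and 15; the third character starts at 8, and a mouse click at x = 6 falls on the second character.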

One Step Beyond

So we've measured the string but what the heck for? If I were a better writer, I would probably have alluded to this final step sooner. But looking it over, this article really hasn't been too long up to this point, so I don't think I've kept you in suspense for too long. It's a big chunk of code but rather simple in what it does. So let's take a deep breath and dive right in!

/// <summary>
/// Render this line at the given position. Render only the elements that
/// match the current color. There is no clipping. Any line without parse
/// info WILL NOT be rendered.
/// </summary>
/// <param name="hDC">IntPtr to the DC.</param>
/// <param name="pntTopLeft">Topleft point to start at.</param>
/// <param name="iCurrentColor">The color currently in use. Only render
/// the element if the color indices match. Negative numbers indicate selections.
/// </param>
public void Render(IntPtr hDC, PointF pntTopLeft, int iCurrentColor)
{
    char[]                    rgText = _rgLine;
    IEnumerator<IMemoryRange> oEnum  = null;

    // It would be nice to re-use these enumerators.
    if( iCurrentColor > -1 && this.Elements != null )
        oEnum = this.Elements.GetEnumerator();

    if( oEnum != null ) {
        unsafe {
            fixed( char* pwcText = rgText ) {
                while( oEnum.MoveNext() ) {
                    IMemoryRange oElem      = oEnum.Current;
                    int          iMaxLength = rgText.Length - oElem.Offset;
                    int          iLength    = oElem.Length > iMaxLength ? iMaxLength : oElem.Length;

                    if( oElem.ColorIndex == iCurrentColor &&
                        oElem.Length > 0 &&
                        oElem.Offset < rgText.Length ) { // Little hack to deal with cr/lf issues.
                        Gdi32.TextOut( hDC,
                                       (Int32)( pntTopLeft.X + this.CumulativeWidth( oElem.Offset ) ),
                                       (Int32)( pntTopLeft.Y ),
                                       &pwcText[oElem.Offset],
                                       iLength );
                    }
                } // end while
            } // end fixed
        } // end unsafe
    }
} // End Render()

We'll stick this bit of code along with the MeasureWidth() method we wrote above. But what's going on? Well, as you know, you can only select one pen into a DC at a time. Text color works the same way: if you look at the functions we imported from GDI32, you'll see we are limited by SetTextColor, which sets the one text color the DC uses for every TextOut call until we change it again. The bottom line is that we can only render using one color at a time.

So why not just set the text color on a per character basis, perhaps calling our GDI32 API SetTextColor only when the color actually changes from one character to the next? Well, this involves a little bit of Windows trivia which I believe is as valid today as it was way back in 1985 or so when Windows 1.0 first debuted. It is expensive to change pens. Now, I've asked around a few friends and from what I'm hearing, this problem is still true. If so, and even if it only takes a few split seconds to change a color, given we want the best performance, we want to attempt to render all parts of the string that are colored with the same color all at once!

To do this, we need to know precisely where every character will get rendered so that in the end, our patchwork string will look just as naturally spaced as its boring monochromatic nephew. Yes, with our ABC widths measurements we have the means to achieve this very goal! It turns out that the code for this new way of rendering isn't even much uglier than the straight line by line way any normal person would expect to implement. Precise measurements allow us to place individual characters with the same precision as the built in TextOut() function. Of course, calling the function for each character might be onerous but as we can see from my Figure 1, it's not all that bad. Perhaps way back in the 4.77 MHz days of 16 bit Windows machines with 64K of memory, I might have had some trouble. But no more!

There is one little tidbit hanging out innocently in this method which I should spend a few moments explaining and that is the IMemoryRange interface being used. There are many ways we can represent the color information for a particular line. In my case, lurking behind these simple lines of code is a heavy hitter context free grammar parser that I wrote a few years back to demonstrate how CFGs totally destroy the simpler regular expression parsing, finite state automata used in 100% of the language parsers I've seen that don't deal with a real programming language. Yes, that too, is a different article in the making. Anyway, to access this parser data I created a small interface to the parsed unit so I can feed the parse units directly to my editor.
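The one-color-per-pass idea and the range interface can be sketched together without touching GDI at all. Everything below is an assumption for illustration: the IMemoryRange shown has only the three members that Render() reads (the real interface may well have more), SimpleRange stands in for a real parse unit, and Plan() just records the order in which TextOut calls would be issued.

```csharp
using System;
using System.Collections.Generic;

// Guessed shape: just the three members that Render() reads.
public interface IMemoryRange {
    int Offset     { get; } // first character of the element
    int Length     { get; } // number of characters
    int ColorIndex { get; } // index into the color table
}

// Hypothetical stand-in for a real parse unit.
public class SimpleRange : IMemoryRange {
    private readonly int _iOffset;
    private readonly int _iLength;
    private readonly int _iColor;

    public SimpleRange(int iOffset, int iLength, int iColor) {
        _iOffset = iOffset; _iLength = iLength; _iColor = iColor;
    }
    public int Offset     { get { return _iOffset; } }
    public int Length     { get { return _iLength; } }
    public int ColorIndex { get { return _iColor;  } }
}

public static class ColorPasses {
    // Lists the draw calls in the order they would be issued: every
    // element of color 0, then every element of color 1, and so on.
    // Each entry reads "color@x:text" and stands in for one TextOut().
    public static List<string> Plan(char[] rgLine, float[] rgCumulative,
                                    IList<IMemoryRange> rgElems, int iColorCount) {
        List<string> rgCalls = new List<string>();
        for( int iColor = 0; iColor < iColorCount; ++iColor ) {
            // A single SetTextColor() call would go here, once per pass.
            foreach( IMemoryRange oElem in rgElems ) {
                if( oElem.ColorIndex != iColor )
                    continue;
                // Start where the previous character ended.
                float flX = oElem.Offset == 0 ? 0f : rgCumulative[oElem.Offset - 1];
                string strText = new string( rgLine, oElem.Offset, oElem.Length );
                rgCalls.Add( iColor + "@" + (int)flX + ":" + strText );
            }
        }
        return rgCalls;
    }
}
```

For the line "int x;" with "int" and ";" in color 1 and " x" in color 0, the plan draws " x" first and then both color 1 pieces, each at its measured position, so the color only has to be set once per pass.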

Defence in Depth

Our "unsafe" construct is the text offsetting piece of code, &pwcText[oElem.Offset]. Given our text measurements, we still need to access the corresponding portion of the text array so that we can render it out. We could have done this just as easily with a call something like this on the Graphics class:

TryTextOutManaged( iX, iY, rgCharArray, oElem.Offset, iLength );

It is a call that would be as safe as can be. rgCharArray could be a char[] with its built in Length property. Any implementation could easily check the bounds and return false if there was any problem. Or it could be implemented so that it throws an exception.

But we are trail blazers. We don't get the luxury of a safe and protected world. So we have to attempt to build a safe construct that won't crash even if abused. Here, our safety bound is the length of the character array. No offset/length combination should make us try to read beyond the length of the character array. And we only want to read elements that represent valid parse data and not some old slop swimming around in an unused portion of our array. In any case, we don't want some weird hacker trick loading up code into our video memory or some other such black hat activity. Hopefully, these considerations will keep us safe.
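The safety bound described here can be pulled out into a tiny helper and tested on its own. The sketch below is hypothetical (RangeGuard and TryClamp are names I made up); it applies the same clamping as Render() and additionally rejects negative offsets, which the Render() shown earlier does not check.

```csharp
using System;

public static class RangeGuard {
    // Clamps (iOffset, iLength) to fit inside an array of iArrayLength
    // elements. Returns false when nothing valid is left to read.
    public static bool TryClamp(int iArrayLength, int iOffset, int iLength,
                                out int iSafeLength) {
        iSafeLength = 0;
        if( iOffset < 0 || iLength <= 0 || iOffset >= iArrayLength )
            return false;

        int iMaxLength = iArrayLength - iOffset; // at least 1 at this point
        iSafeLength = iLength > iMaxLength ? iMaxLength : iLength;
        return true;
    }
}
```

Only when TryClamp() returns true would the pointer &pwcText[iOffset] and iSafeLength be handed to TextOut.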

So until Microsoft reads my article and admits they need to change the Graphics class, we can take matters into our own hands and get what we want, right now!

The Last Step

All that remains is to wrap it all up into a call on paint that gets the job done...

/// <summary>
/// Our paint function
/// </summary>
/// <param name="oE"></param>
/// <remarks>I should probably capture exceptions so we don't leak
/// DC's or anything if some outer function decides to start catching
/// exceptions to try to keep the program running.</remarks>
protected override void OnPaint( PaintEventArgs oE )
{
    IntPtr hDC      = oE.Graphics.GetHdc();
    IntPtr hFont    = this.Font.ToHfont(); // We own this GDI font handle.
    IntPtr hFontOld = Gdi32.SelectObject( hDC, hFont );

    InitFromDC( hDC );

    // Render the lines one color at a time.
    for( int iColor = 0; iColor < _rgColors.Length; ++iColor ) {
        _pntTopLeft = new PointF( 10, 10 );

        Gdi32.SetTextColor( hDC, _rgColors[iColor] );

        RenderLines( hDC, _pntTopLeft, iColor );
    }

    // We're all done. Select the old font back, free ours, and release the DC.
    Gdi32.SelectObject( hDC, hFontOld );
    Gdi32.DeleteObject( hFont );
    oE.Graphics.ReleaseHdc( hDC );
}

RenderLines is a call which walks through all our line/cumulative width structures, calling the Render method we wrote previously. It's easy to imagine, but if you don't believe me, or you want to see it for real, just load up the project source I've included at the top of this article. If you have any problems getting the project to build, just remember to enable unsafe blocks in the project, and that should take care of any issues I noticed.

Wrap Up

Having been a programmer since the very beginning of the micro-computer era, I've watched programming go from a fun hobby back to the over produced sweatshop inducing activity it was back when computers were made out of vacuum tubes! The complexity of modern operating system environments is staggering! It's a shame. I still don't understand why I need more than 4 megabytes of RAM just to boot up my computer! But such as it is, we can still take control of things and make easy to use, powerful programs, to suit our needs. It just takes a little courage and inventiveness.

With that, I hope you enjoyed this little treatise I wrote. I would be pleased if this was the beginning of a long series of articles that remind you of those old BYTE columns like "Circuit Cellar" by Steve Ciarcia where you could actually build something useful out of simple electronic components! Or maybe something like the classic "Programming Windows" by Charles Petzold. When a few lines of code could make something wonderful happen!

So feel free to lift the code from this article. I would be pleased if you include a reference back to me at dragonaur2000@yahoo.com.
Remember this code comes with no warranties, expressed or implied! Hobby code tends to involve quite a bit of crashing! Only you can determine the suitability of any piece of code in your own application.

About the Author

Sean Johnson, aka Dragonaur, is a mild mannered programmer by day and a mild mannered cartoonist wannabe by night! He remembers the day when drawing circles and text on the screen was easy and hopes programming as a hobby is never destroyed by expensive or overly complicated software systems! You can see his cartooning endeavors at "Dragonaur" the comic.

History

17th April, 2007: This is 1.0! But remember C programs start at 0, crash and burn baby!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Dragonaur

Web Developer

United States

C, C++, Smalltalk, Pascal, HTML, ASPX, Win32, .Net All around Geek and Cartoonist.
