Additional information on the RTF specification

David Thielen, 11 Jan 2005
Information I have figured out about the RTF spec that is not documented by Microsoft.

Introduction

I have been through numerous battles trying to figure out the RTF format. This document is both a list of open questions I still have and a list of items I think I have figured out. If you can shed any more light on any of these, please shoot me an e-mail at dave@thielen.com.

Note: While I try to update this article here, the copy that is guaranteed to be the most recent is at http://dave.thielen.com/articles/The%20RTF%20Spec.htm.

RTF format questions

  1. \picw and \pich are defined as the picture width and height in pixels for a bitmap image. However, Word 2000 saves them as values that are 35.28 times the bitmap size.
  2. What is the relation between \trleft, \clwWidth, and \cellx? It seems that \trleft + \clwWidth == \cellx (first cell). But, that is not always true (it's not true in the RTF doc example).
  3. How does \trgaph relate to the \trpadd* tags that set cell padding?
  4. There is both row padding \trpadd* and cell padding \clpadd*. Which holds, and if it's the cell padding, why is the row padding there?
  5. The spec says \listoverrideformat will always be followed by a number. But, Word 2000 emits the tag with no number. And, how do you tell which level a value is overridden for?
  6. It appears that after the last \cell in a row, you must put “\pard \ql \li0\ri0\widctlpar\intbl” before doing the {\trowd…\row}, or Word will GPF. Exactly what is necessary here (and why)? Also, this comes after the paragraph row text – so what paragraph are these values assigned to?
  7. \slN: what units is N in? Twips?
  8. \trftsWidth says it is units for clwWidth. Isn’t it units for trwWidth?
  9. A " character is saved as \'94 - which is not Unicode for a quote - what is going on here?
  10. What is \faauto? It's used a lot but never explained.
  11. \trautofit does have a (usually small) effect even if clwWidth and trwWidth are set!
  12. The width of the last cell in a row seems to be set by \cellx, not clwWidth.
  13. A bullet character in a list is saved as a \u-3913 (0xf0b7). 0xb7 is Unicode for a bullet - but that is a much smaller bullet than the one Word displays. 0xf*** is for user-defined chars, so where is Word getting this from?
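Questions 9 and 13 both involve character escapes, and a little arithmetic narrows them down. The sketch below is only an illustration: the function names are made up, and it assumes the document's ANSI code page is Windows-1252 (what a \ansicpg1252 header would declare) and that the number after \u is a signed 16-bit value, so negative numbers stand for code points above 32767.

```python
def decode_hex_escape(escape: str, codepage: str = "cp1252") -> str:
    r"""Decode one RTF \'hh escape as a single byte in the given code page.

    The default code page is an assumption; a real reader would take it
    from the document header.
    """
    if not escape.startswith("\\'") or len(escape) != 4:
        raise ValueError(f"not a \\'hh escape: {escape!r}")
    return bytes([int(escape[2:], 16)]).decode(codepage)


def unicode_escape_to_code_point(n: int) -> int:
    r"""Map the signed 16-bit parameter of \uN to an unsigned code point."""
    return n + 0x10000 if n < 0 else n


quote = decode_hex_escape(r"\'94")
bullet = unicode_escape_to_code_point(-3913)

print(f"U+{ord(quote):04X}")  # U+201D, the right double quotation mark
print(hex(bullet))            # 0xf0b7
```

Under those assumptions, \'94 is a code page byte rather than a Unicode value, and it decodes to a curly closing quote. \u-3913 is simply 0xF0B7 written as a negative 16-bit number, which falls in the Unicode private use area (U+E000 to U+F8FF). Why Word picks that particular value is still a guess; one possibility is that it is character 0xB7 of a symbol font shifted up into the 0xF000 range.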

RTF format answers

  1. Word will write the normal style out as the first entry in the stylesheet with no style number. However, OpenOffice will write it out as the non-first entry, and will give it a style number.
  2. If a paragraph has \pard\plain followed directly by text instead of \pard\plain\fs24 followed by text, then Word 2000 will display the text as 10 point, even though the spec says that the default for \fs is 24 (the unit is half-points, so that is 12 point).
  3. What’s the difference between \line, \lbr, and \par? Well, \lbr3 == \line (not sure what \lbr0-2 means), and while it is a hard line break like a \par, it does not start a new paragraph, and therefore the next line follows the left indent, not the first indent.
  4. Does each cell have its own paragraph formatting attributes? It appears not, except for vertical alignment. Instead, there are standard paragraphs and character formatting within a cell.
  5. Why are there row border values if each cell has border values? I don’t know, but the cell border settings are what Word uses.
  6. \plain resets character formatting (bold, etc.) to nothing, at 12 point (or is it 10?) and in the document default font. The values in style Normal are not used. The only reset value that is not hardcoded across all documents is the font number. (Is there a list of what tags this resets?)
  7. \pard means reset all paragraph formatting to default values (mostly 0’s). It does not use settings in the Normal style or any document level settings – everything is set to a hardcoded default. (Is there a list of what tags this resets?)
  8. \s identifies the style for that paragraph but has no effect on the format of that paragraph. In other words, the \s tag changes nothing in the formatting of a paragraph; all formatting comes from the formatting tags appearing in that paragraph.
  9. In every RTF doc I have seen, \intbl precedes \itapN. The docs don’t say that’s required, but I have a feeling most RTF readers will blow up if this order is reversed.
  10. There is no documentation for \brdrnone – what is it? I assume no border, but it still takes up the border width with blank space.
  11. Word sometimes writes a table paragraph with no \trowd…\row. In that case, all you have is an \intbl, and you should use the table and cell settings from the previous paragraph. I have only seen this happen with outer tables, not nested tables.
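Answers 2 and 6 come down to what font size is in effect after a \plain. Here is a minimal sketch of that bookkeeping, not a real RTF reader: the function name and the plain_default_half_points parameter are hypothetical, \fs is treated as half-points, and groups, \'hh escapes, and escaped backslashes are ignored.

```python
import re

# A control word is a backslash, letters, an optional signed number,
# and an optional single space that acts as a delimiter.
CONTROL_WORD = re.compile(r"\\([a-z]+)(-?\d+)? ?")


def font_size_points(rtf: str, plain_default_half_points: int = 24) -> float:
    """Return the font size in points in effect at the end of the fragment.

    plain_default_half_points is the size that \\plain resets to: 24 per
    the spec, or 20 to mimic the 10 point that Word 2000 appears to use.
    """
    size = plain_default_half_points
    for match in CONTROL_WORD.finditer(rtf):
        word, param = match.group(1), match.group(2)
        if word == "plain":
            size = plain_default_half_points  # reset character formatting
        elif word == "fs" and param is not None:
            size = int(param)                 # explicit size in half-points
    return size / 2


print(font_size_points(r"\pard\plain\fs24 text"))  # 12.0
print(font_size_points(r"\pard\plain text", 20))   # 10.0
```

Making the reset size a parameter keeps the open question visible: the same fragment gives 12 point with the spec's default and 10 point with the value Word 2000 seems to apply.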

Article home - http://dave.thielen.com/articles/The%20RTF%20Spec.htm.

This article may be freely copied as long as it is copied in its entirety.

About the Author

David Thielen
Chief Technology Officer and founder, Windward Studios
United States

Last updated 11 Jan 2005
Article Copyright 2005 by David Thielen