Intent
Define a representation of data that facilitates a separation of physical and logical views of the data and its supporting metadata constructs.
Motivation
A common programming requirement is to load raw data from a physical data store, such as a database or an XML file, and to provide a "view" of the data that is logical to an application's needs. This logical view will often include metadata, such as the possible values or ranges that apply to a particular part of the data. It is not desirable to always implement each logical view of the data as its own class (or set of classes), because this promotes code duplication. Nor is it usually desirable for various portions of an application to share a single logical view, because this promotes overly complex designs that become tightly coupled to the implementation.
Consider a contact management application. In such an application, there is a need to store and display information that may include names, addresses, phone numbers and any other desirable data field. This information is typically stored in some form of RDBMS and accessed using SQL. When working with RDBMS software, it is usually necessary and desirable to structure the storage of data into various tables and to implement integrity constraints accordingly. This process of normalization is intended to improve insert/update performance and reduce redundancy. Unfortunately, it usually results in the logical view of the data becoming further separated from the physical one, requiring the application to "map" the physical structure into a more meaningful view. This mapping process may be very simple, but it can become quite complicated as the separation increases.
In addition to the mapping of raw data, a contact management application will need to maintain certain metadata. An example of this metadata is apparent when working with the "Name prefix" portion of the data. A name prefix will typically be one of a set of values such as Mr., Mrs., Ms., Miss, etc. This list of values will need to be available to various portions of the application, to facilitate user-interface specific functionality and to provide validation support. Different portions of the raw data will have different metadata associated with them. Metadata can include value lists, validation rules, application specific properties, etc.
We can solve this problem by designing a basic "Entity" class that encapsulates the raw data, binds the associated metadata to it, and provides a common interface for accessing and manipulating the raw data. This Entity class will hide the specifics of data persistence from the rest of the application and provide a basis for many data-centric controls. The ability to implement data-centric functionality decouples the data from the application and facilitates code reuse.
An added benefit of the Entity pattern is the ability to define multiple logical views of a single physical representation of data. This ability can be critical to applications requiring a high level of security or customization.
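The "Name prefix" example above can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation the article refers to: the class names `Entity` and `Field` come from the article, but every method and attribute name (`allowed_values`, `get`, `set`, `field`) is hypothetical.

```python
class Field:
    """One logical field plus the metadata bound to it (hypothetical layout)."""

    def __init__(self, name, allowed_values=()):
        self.name = name
        self.allowed_values = list(allowed_values)  # empty list means "any value"
        self.value = None

    def set(self, value):
        # Reject values outside the value list, when one is defined.
        if self.allowed_values and value not in self.allowed_values:
            raise ValueError(f"{value!r} is not allowed for {self.name}")
        self.value = value


class Entity:
    """Logical view: named fields behind one common interface."""

    def __init__(self, name, fields):
        self.name = name
        self._fields = {f.name: f for f in fields}

    def field(self, name):
        return self._fields[name]

    def get(self, name):
        return self._fields[name].value

    def set(self, name, value):
        self._fields[name].set(value)


contact = Entity("Contact", [
    Field("NamePrefix", allowed_values=["Mr.", "Mrs.", "Ms.", "Miss"]),
    Field("LastName"),
])
contact.set("NamePrefix", "Ms.")
contact.set("LastName", "Lovelace")
```

A user-interface control could read `contact.field("NamePrefix").allowed_values` to fill a drop-down, while the same list drives validation in `set`.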
Applicability
Use the Entity pattern when
- the physical representation of data diverges from the logical representation desired for an application.
- there is a need for more than one logical representation of the same physical representation.
- the physical representation of data is subject to change, while the logical view remains consistent. This often happens when there is a need to persist data to or from disparate physical formats.
- more than one data set needs to share a common set of functionality.
- there is an extensive amount of metadata associated with the view of the data.
- there is a need for transactional behavior within the logical view of the data.
Structure
Sorry guys, I don't have a good way to draw the structure for this, though I wish I did. So, let me describe it with as few words as possible.
- Entities contain:
  - One or more Fields
  - Zero or more Properties
  - Zero or more Rules
  - Zero or more Comments
- Fields contain:
  - Zero or more child Fields
  - Zero or more Values
  - Zero or more Properties
  - Zero or more Rules
  - Zero or more Comments
- Values contain:
  - Zero or more Properties
  - Zero or more Comments
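The containment rules listed above can be written down as plain data classes. This is a minimal sketch, assuming properties are name/value pairs and comments are strings; the attribute names and the `walk` helper are hypothetical and not taken from the article.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Value:
    """One possible value for a field."""
    data: Any
    properties: Dict[str, Any] = field(default_factory=dict)
    comments: List[str] = field(default_factory=list)


@dataclass
class Rule:
    """Placeholder for a validation rule; only its name is modelled here."""
    name: str


@dataclass
class Field:
    name: str
    children: List["Field"] = field(default_factory=list)
    values: List[Value] = field(default_factory=list)
    properties: Dict[str, Any] = field(default_factory=dict)
    rules: List[Rule] = field(default_factory=list)
    comments: List[str] = field(default_factory=list)


@dataclass
class Entity:
    name: str
    fields: List[Field]
    properties: Dict[str, Any] = field(default_factory=dict)
    rules: List[Rule] = field(default_factory=list)
    comments: List[str] = field(default_factory=list)

    def __post_init__(self):
        # "One or more Fields" is the only non-optional containment.
        if not self.fields:
            raise ValueError("an entity needs at least one field")


def walk(fields, depth=0):
    """Yield (depth, field name) for every field, parents before children."""
    for f in fields:
        yield depth, f.name
        yield from walk(f.children, depth + 1)


address = Field("Address", children=[Field("Street"), Field("City")])
prefix = Field("NamePrefix", values=[Value("Mr."), Value("Ms.")])
contact = Entity("Contact", fields=[prefix, address])
outline = list(walk(contact.fields))
```

The `walk` generator hints at how generic code can interrogate any entity without knowing what data it represents.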
Participants
- Entity - declares an interface for representing a logical view of data and implements support for basic functionality
- Field - implements support for type-specific (and potentially type-safe) in-memory data storage as well as logical-only, non-storage fields. Also implements support for collections of contained entity objects. A Field participant will typically have specialized derived types such as CollectionField, GroupField, TextField, 32BitSignedField, 64BitSignedField, DateField, TimeField, BlobField, etc.
- Value - declares an interface for representing one possible value for a given Field object
- Rule - declares an interface for rules that are independent of any domain or application, as well as for application specific rules when needed. A Rule participant will typically have specialized derived types such as MinLengthRule, MaxLengthRule, ValidCharsRule, ValidDateRangeRule, etc.
- Property - implements support for extensible application and domain specific configuration options at the Entity, Field and Value levels
- StorageEngine - declares an interface for persisting the logical view of data to and from its physical representation
- Comment - declares an interface for attaching comments to any element of the entity interface, including the Entity, Field, Value, Rule, Property and StorageEngine participants.
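To make the StorageEngine participant more concrete, here is a hedged sketch of one engine that persists any entity by asking it which fields exist. The method names (`save`, `load`, `field_names`) and the `JsonStorageEngine` class are assumptions made for illustration, and a dictionary stands in for the real physical store.

```python
import json
from abc import ABC, abstractmethod


class Entity:
    """Minimal logical view: field names mapped to values."""

    def __init__(self, name, field_names):
        self.name = name
        self._values = {n: None for n in field_names}

    def field_names(self):
        return list(self._values)

    def get(self, field_name):
        return self._values[field_name]

    def set(self, field_name, value):
        if field_name not in self._values:
            raise KeyError(field_name)
        self._values[field_name] = value


class StorageEngine(ABC):
    """Persists a logical view to and from one physical medium."""

    @abstractmethod
    def save(self, entity): ...

    @abstractmethod
    def load(self, entity): ...


class JsonStorageEngine(StorageEngine):
    """Stores each entity as a JSON string; a dict stands in for the file system."""

    def __init__(self):
        self.documents = {}

    def save(self, entity):
        # The engine interrogates the entity; there is no per-entity code.
        record = {n: entity.get(n) for n in entity.field_names()}
        self.documents[entity.name] = json.dumps(record)

    def load(self, entity):
        record = json.loads(self.documents[entity.name])
        for n in entity.field_names():
            entity.set(n, record.get(n))


engine = JsonStorageEngine()
original = Entity("Contact", ["FirstName", "LastName"])
original.set("FirstName", "Ada")
original.set("LastName", "Lovelace")
engine.save(original)

restored = Entity("Contact", ["FirstName", "LastName"])
engine.load(restored)
```

Because the engine depends only on the entity interface, a second engine for another medium could be swapped in without touching the entity or the application code.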
Consequences
The Entity pattern has the following benefits and liabilities:
- It isolates data persistence from the application. Most applications attempt to separate the data access layer from the presentation layer for good reason. Doing so decouples the disparate portions of the application, facilitating easier changes. The Entity pattern facilitates this decoupling implicitly by its design.
- It promotes sharing of data validation rules. It is not desirable to write redundant code. Because the Entity class provides a basic interface for accessing data in a logical format, rule objects can be created to implement specific rules such as range checking, value list limits, date formats, NULL values, etc. These rules can be written once and applied to all Entity objects regardless of the data they represent.
- It facilitates data-centric application configuration. Often, it is necessary for an application to behave differently based on what data it is acting upon. This modality is often implemented such that the implementation expects a very specific type of object for each mode. This often forces an unnecessary duplication of code or overly complex class hierarchies, because each type of object may need the same support with only minor differences. The Entity pattern facilitates data-driven configuration by providing application-defined properties at the Entity, Field and Value levels, which can be interpreted by an application at run-time to determine the desired behavior.
- It facilitates self-documenting data models. The Entity pattern facilitates self-documentation because the Entity class knows the logical view of the data and provides an interface for interrogating this view at runtime, much the same way that iterators provide a common accessor methodology. This capability can be important during the documentation phase of development, but can also prove invaluable at runtime, because dynamically added fields, values and rules can be detected and dealt with accordingly, thus facilitating features such as user-defined custom fields.
- It facilitates multiple physical representations of the data. It is often necessary to persist data to and from varying physical representations. This may include different RDBMS platforms, XML files, ASCII files, etc.
- It facilitates data security. By allowing the same physical data to be represented with multiple logical views, it is possible for an application to define custom views based on the role of the user or based on the component of the application that is accessing the data. These custom views need only provide the minimal number of necessary fields and features thus hiding the rest of the data.
- It can cause type-safety concerns. An implementation of this pattern may abstract the raw data in such a way that it does not provide type-safe access to the data. This may not be a significant disadvantage if the other benefits outweigh the type-safety concerns. An implementation of this pattern may support strong type-safety through code generation or generics, but such an implementation may be more complex or less flexible.
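The data security benefit listed above can be illustrated with two logical views over one physical record. This is a sketch under assumptions: the roles, the field names and the `VIEWS_BY_ROLE` mapping are all hypothetical, and a real system would more likely load the view definitions from configuration.

```python
class EntityView:
    """A logical view exposing only selected fields of one physical record."""

    def __init__(self, record, visible_fields):
        self._record = record
        self._visible = tuple(visible_fields)

    def field_names(self):
        return list(self._visible)

    def get(self, name):
        if name not in self._visible:
            raise KeyError(f"field {name!r} is not part of this view")
        return self._record[name]


# Hypothetical role-to-fields mapping, hard-coded only for the example.
VIEWS_BY_ROLE = {
    "receptionist": ["FirstName", "LastName", "Phone"],
    "accountant": ["FirstName", "LastName", "CreditLimit"],
}


def view_for(role, record):
    """Build the logical view that the given role is allowed to see."""
    return EntityView(record, VIEWS_BY_ROLE[role])


record = {"FirstName": "Ada", "LastName": "Lovelace",
          "Phone": "555-0100", "CreditLimit": 5000}
front_desk = view_for("receptionist", record)
accounts = view_for("accountant", record)
```

Both views read the same record, yet code holding `front_desk` has no way to reach `CreditLimit` through the view interface.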
Implementation
Here are some useful techniques for implementing the Entity pattern.
- The logical view of an entity should be configured, not programmed. An Entity object is defined by its properties, fields, values and rules. This definition can be represented through a set of database tables, an XML file or any other appropriate means. The definition should thoroughly define each aspect of the entity. This definition should be loaded at run-time to automatically configure the logical view of the data.
- Implement Entity-derived classes as generics. It may be desirable to adapt entity classes at compile-time, to support special domain-specific features. Though the base Entity class will not be a generic, the derived classes can be implemented to take an adapter object as a customization parameter. This adapter object may then be used across many entity implementations.
- Implement several runtime-customizable StorageEngine-derived classes. Instead of using one StorageEngine class for each distinct combination of logical and physical view, it is desirable to implement one StorageEngine class for each physical storage medium. The StorageEngine-derived classes can use the interface provided by the Entity class to determine the proper mapping of in-memory data to and from the physical format. This promotes a greater level of reuse by allowing numerous logical views to use a single shared StorageEngine class to persist their data.
- The internal representation of data within the entity object should be type-safe wherever possible. Doing so helps to ensure that data remains intact and that the integrity of the data cannot be easily compromised by unexpected occurrences.
- Support for NULL data should be dealt with here. Often, dealing with NULL data can be very frustrating in programming languages, because there is no explicit support, at the language level, for NULL values.
- Use a Class Factory pattern for creating Entity objects and StorageEngine objects, thus further decoupling the application from the data and storage needs.
- Implement a generic base class to represent behaviors and actions which can be taken based on the logical view of the data. The derived classes could potentially be implemented as generics, to further improve robustness and flexibility.
  An example of an action which can be applied across all Entity objects, regardless of their logical views, is an integrity checker. An integrity checker could interrogate the Entity object for all fields contained within it, execute the appropriate validation rules and report any inconsistencies.
  An additional example of a generic action might be a DocumentWriter class. This class would be responsible for interrogating the Entity object's logical view and producing a document (in HTML, PDF or any other format) which details the structure of the entity, the fields, value lists, rules and any domain-specific properties contained within it.
  There are many ways to use the self-documenting nature of entity objects to produce flexible and reusable data-driven behaviors. These behaviors could be used as needed by any application, component, or user, without concern for the actual data being acted upon. Domain-specific properties can be applied to the entity to further customize these behaviors.
- Implement support for instrumentation at the data level. All access to the raw data and metadata must pass through the Entity class. Instrumentation of this access can be easily localized to the Entity object.
  It may be desirable to log the data that is accessed, who accesses it and when they access it. Doing so with this pattern is quite easy because all of the data access passes through a common base class.
- Implement support for dynamic run-time customization of logical views. In our contact management application, it is often desirable to allow a user to add (at runtime) custom fields to the contact data objects. The Entity class should facilitate this runtime configuration through its exposed interface and provide a means for ensuring that the customized field is properly dealt with throughout the application.
  The custom fields should behave like all other fields. A property may be set to indicate that the field is user defined. However, in all other respects, the field behaves exactly the same.
  When this feature is present, it is even more important that the StorageEngine classes be implemented as discussed earlier. It is not practical to rewrite the persistence code whenever a user needs to add a custom field. If the storage engines are designed to interpret the logical view at run-time and automatically cope with the underlying physical view, then the application should be able to continue without interruption.
- Implement support for calculated fields. Most data sets include static and calculated field values. For example, a contact Entity may have a field named DOB to represent the date-of-birth of the contact. It would be redundant and unwise to also include a field named AGE to represent the age of the contact, because the age can be computed on-demand based on the date of birth. Instead, the AGE field should be included in the logical view as a calculated field.
  Calculated fields should behave exactly the same as normal fields. The only significant difference may be that they are read-only and that they are typically not persisted to physical storage.
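Several of the techniques above (shared rules, a generic integrity checker, a user-defined field added at run time, and the DOB/AGE calculated field) can be combined in one small sketch. It is an illustration only: `MaxLengthRule`, DOB and AGE are names from the article, while `CalculatedField`, `check_integrity`, `age_on`, the `USER_DEFINED` property and all method names are assumptions. The age is computed against a fixed date so the example stays deterministic.

```python
from datetime import date


class MaxLengthRule:
    def __init__(self, limit):
        self.limit = limit

    def check(self, value):
        """Return an error message, or None when the value passes."""
        if value is not None and len(value) > self.limit:
            return f"longer than {self.limit} characters"
        return None


class Field:
    def __init__(self, name, rules=(), properties=None):
        self.name = name
        self.rules = list(rules)
        self.properties = dict(properties or {})
        self.value = None
        self.read_only = False
        self.persisted = True

    def get(self, entity):
        return self.value


class CalculatedField(Field):
    """Read-only field whose value is derived from other fields on demand."""

    def __init__(self, name, compute):
        super().__init__(name)
        self._compute = compute
        self.read_only = True
        self.persisted = False  # a storage engine would skip this field

    def get(self, entity):
        return self._compute(entity)


class Entity:
    def __init__(self, name):
        self.name = name
        self.fields = {}

    def add_field(self, field):
        # Also usable at run time for user-defined custom fields.
        self.fields[field.name] = field

    def get(self, name):
        return self.fields[name].get(self)

    def set(self, name, value):
        field = self.fields[name]
        if field.read_only:
            raise AttributeError(f"{name} is read-only")
        field.value = value


def check_integrity(entity):
    """Run every rule of every field; return a list of (field name, message)."""
    problems = []
    for field in entity.fields.values():
        for rule in field.rules:
            message = rule.check(field.get(entity))
            if message:
                problems.append((field.name, message))
    return problems


def age_on(today):
    """Build an AGE calculation relative to a fixed reference date."""
    def compute(entity):
        dob = entity.get("DOB")
        had_birthday = (today.month, today.day) >= (dob.month, dob.day)
        return today.year - dob.year - (0 if had_birthday else 1)
    return compute


contact = Entity("Contact")
contact.add_field(Field("LastName", rules=[MaxLengthRule(5)]))
contact.add_field(Field("DOB"))
contact.add_field(CalculatedField("AGE", age_on(date(2024, 6, 1))))
contact.add_field(Field("Nickname", properties={"USER_DEFINED": True}))

contact.set("LastName", "Lovelace")
contact.set("DOB", date(1990, 12, 10))
```

Callers read `contact.get("AGE")` exactly as they read any other field, and `check_integrity` never needs to know which entity type it is inspecting.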
Complementary concepts
Here are some additional concepts which can be applied to this design pattern.
- An event interface that notifies subscribers of changes to the logical view, to the contents of the logical view or to the associated metadata could be added to the basic design without adversely affecting the conceptual layout of the pattern.
- Support for transactional behavior, covering changes to the contents of the entity as well as to the logical view of the data and the associated metadata, could be added to this design pattern.
- Change history (versioning) of both the logical views, including their metadata, and the entity contents could be handled as an integral part of the implementation of this design pattern.
- Support for logical derivation of one entity from another could be implemented.
- Instrumentation of actions taken against an entity (e.g., setting field values or changing metadata) could be provided as part of an implementation of this design pattern.
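Two of these concepts, change notification and transactional behavior, fit in a short sketch. It assumes a callback-based subscription and a snapshot-based rollback; `subscribe` and `transaction` are hypothetical names, and the rollback in this sketch deliberately does not notify subscribers.

```python
from contextlib import contextmanager


class Entity:
    def __init__(self, **values):
        self._values = dict(values)
        self._subscribers = []

    def subscribe(self, callback):
        """callback(field_name, old_value, new_value) runs after each change."""
        self._subscribers.append(callback)

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        old = self._values.get(name)
        self._values[name] = value
        for callback in self._subscribers:
            callback(name, old, value)

    @contextmanager
    def transaction(self):
        """Restore the previous field values if the block raises."""
        snapshot = dict(self._values)
        try:
            yield self
        except Exception:
            self._values = snapshot
            raise


changes = []
contact = Entity(LastName="Byron")
contact.subscribe(lambda name, old, new: changes.append((name, old, new)))
contact.set("LastName", "Lovelace")

try:
    with contact.transaction():
        contact.set("LastName", "King")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
```

After the failed block the entity holds its earlier value again, while `changes` still records both assignments, which is also the raw material a change-history feature would need.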
Sample code
None provided at this time. A future article may provide a full implementation of this design pattern.
Related patterns
- ClassFactory classes may be used to facilitate the construction of Entity objects and StorageEngine objects.
- Adapter classes may be utilized to provide type-safety or interface customization to the underlying Entity objects. This may be done to facilitate using Entity objects in an existing application.
- Flyweight classes may be used for Rules, Fields or even Entity objects, depending on the exact implementation taken. This may reduce the memory footprint of the Entity objects and as a result improve application performance and scalability.
- Interpreter classes are used for loading the configured entity definition.
- The Observer pattern can be implemented by providing a means for Entities to notify associated objects of changes to the Entity state.
- The Strategy pattern is used to provide external adaptability and behaviors for Entity objects.
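As one illustration of how the factory and the configured definition might work together, the sketch below registers entity types from a JSON string and creates entities by name. The JSON layout, `EntityFactory` and its method names are assumptions made for this example, not part of the article.

```python
import json


class Entity:
    def __init__(self, type_name, field_names):
        self.type_name = type_name
        self.values = {n: None for n in field_names}


class EntityFactory:
    """Creates entities from registered definitions, so callers never hard-code a layout."""

    def __init__(self):
        self._definitions = {}

    def register(self, type_name, field_names):
        self._definitions[type_name] = list(field_names)

    def load_definitions(self, json_text):
        """Register every entity type found in a JSON object of name -> field list."""
        for type_name, field_names in json.loads(json_text).items():
            self.register(type_name, field_names)

    def create(self, type_name):
        try:
            field_names = self._definitions[type_name]
        except KeyError:
            raise LookupError(f"unknown entity type: {type_name}") from None
        return Entity(type_name, field_names)


# Hypothetical definition text; it could equally come from a file or a database.
DEFINITIONS = '{"Contact": ["NamePrefix", "FirstName", "LastName"]}'

factory = EntityFactory()
factory.load_definitions(DEFINITIONS)
contact = factory.create("Contact")
```

Adding a new entity type then means editing the definition text rather than writing a new class.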
Matt, I just spent over a week thinking about and designing the same structure when I found this article! Disregarding the minor, unimportant differences, our concepts turned out to be the same. It means that now I am confident it is the right way forward! Thanks!
Hi! Thank you so much. Can you please provide an example? This would really help. I am thinking of an intelligent zoom application for a school project. Thanks!
I liked the article, would like to have seen some examples or UML though. Thanks!
Chris Lasater http://www.geocities.com/lasaterconsult
Some really impressive work, although, to me, it is not clear what exactly your pattern represents. I think this is because it actually encompasses multiple sub-patterns, and is thus difficult to explain.
I have found "Data Access Patterns" by Clifton Nock to be an excellent reference regarding this style of pattern. If I had to try and match your pattern to one of his, it would be "Active Domain Object". In simple terms, this refers to a type of object that handles its own persistence.
However, it is obvious that "Entity" is far more than that - perhaps too much to be a single pattern.
I really like it when CPians roll up their sleeves and start on something abstractly real! Great work. I look forward to checking out the implementation.
Cheers, Erick
This is one of the best articles I've read on CodeProject. I'm somewhat biased (in a positive way) because I've done some similar work on a recent project. It is along the lines of the Adaptive Object Framework that was referenced in an earlier message. It's amazing how something you think of as being particularly unique ends up being something others have also pursued in their own independent way.
In any case, great work on the article.
I found many examples on the internet, such as OJB, but these programs are written in Java or C#. I can't find any source code written in C++. Why? Because it is difficult to write.
44444
I haven't kept up in the past couple of years, but there have been several object-relational systems built in C++ (STYX used to be one). I have built a system in C++ that incorporates this design pattern. My system has worked out amazingly well (which is what drove me to write the article). The platform I developed is called Intelliframe and is the basis for the entire interactive research division at my current employer.
Unfortunately, I can't just give away the system's code or even binaries, but I can discuss the ideas and re-implement many of the concepts.
I plan to publish more articles on this subject over the next few months. Right now I have a series of articles I am working on that will ultimately incorporate this pattern, but that is still several weeks away.
Anyway, thanks for the interest.
Matt Gullett
I am confused: Do we need an Entity pattern to describe a concept of internal data representation? Or is Matt trying to propose something more than that?
As for the implementation of the Entity class: while reading this article I was thinking about XML, XSD and DOM. I understand that the answer would be that "XML is just another way to store data and we need a storage-independent way to treat data storage, etc." Still, an XML DOM implementation gives me a very flexible way to store an almost arbitrary data structure, and XSD gives me a very reasonable way to describe the data format. I doubt you can create something more general than that.
MK
Sorry for the confusion. Internal data representation is just the basis for everything else. Once you have that structured in the proper way, you open the door to having truly data-driven engines. This pattern tries to lay out just the beginning and a few possibilities.
An implementation of this pattern could use XML, XSD and DOM, but does not have to. There is no problem with these technologies, but in and of themselves they provide no engine for automatically dealing with persistence, validation, etc. Rules can be programmed in XSD, but not really configured or driven based on properties within the entity object itself. Persisting the XML to a database requires programming of some kind, not just setting a property. DOM provides easy access to the logical view of the data, but does not provide access control, change detection, etc. All of this is just the beginning, not the end.
Object-relational mapping is (in my mind) not the end-game either. What I want (and have) is the ability to have a system automatically discover new entity types and deal with them accordingly. Every type should benefit from a common set of functionality that is driven by the properties and configuration of that type, not by an external program (e.g., XSD). I do not want to write programs for each new type. This makes it possible to lower the bar on who can actually use and benefit from the technology. (How many users can write an XML document or create an XSD?)
Just so you know, this is something I have actually implemented, not just an idea. I have written a system called Intelliframe which incorporates this design pattern (as well as other features).
Thanks for the feedback. I hope I answered your questions, but if not, please reply back.
Thanks,
Matt Gullett
http://groups.msn.com/DotNetPersistence
There are a couple of frameworks on the list that have implemented this pattern. Different designs lead to different variations on the pattern, but there are working implementations nonetheless.
The article proposes some interesting stuff. In practice, I have found some of the proposed capabilities to be secondary concerns and thus I have not implemented them in my framework. Nonetheless, the suggested designs are pretty darned sound and it's good to see column space devoted to this important topic.
Cheers, Scott
Hi,
Your article got me thinking, and I am trying to implement a version in C#. For clarification, can you give a small example of a simple entity in text??
e.g.
Entity: Car (not again ) entityId: 1 name: "Car"
Field: fieldId: 1 name: "CarId" type: long
Field: fieldId: 2 name: "Model" type: string
Value: defines the available models for field 2 ??
Property: ???
How do I define a relation with another physical table like "Engine"? What is a property??
Thanks,
Jeroen.
One way to look at this is that you have an ENTITY TYPE named "Car".
The entity type of car is defined something like this:
ENTITY TYPE: Car
PROPERTIES:
-------------------
IS_A_MACHINE=YES
PERSIST_TO=Some DB string
PERSIST_TABLE=tblCars
FIELDS:
-------------------
MAKE
  Possible values
    (1) Chevy
      PROPERTIES:
        OWNED_BY=GM
        FIRST_PRODUCTION=19??
    (2) Ford
    (3) Honda, etc.
  PROPERTIES:
    PERSIST_TYPE=longint
    PRESENT_AS=DropDown
MODEL
  Possible values
    Case where Make=Chevy
      (1) Lumina
      (2) Camaro, etc.
    Case where Make=Ford
      etc.
  PROPERTIES:
    PERSIST_TYPE=DROPDOWN
STYLE
YEAR
etc. etc.
Much like C++ classes, entity records are instantiated from this entity type. So, they inherit (not the same as class inheritance) the properties of their entity type, are defined by that type and can then be interacted with as entities or as cars. Depending on the implementation, individual entity records can add properties to their entity type that are specific to that record.
Jeroen Prins wrote: How do I define a relation with another physical table like "Engine"?
One way to do this is by adding a field named Engine to the entity type definition and giving it properties which tell the system how to determine the available engines based on cases.
Something I did not discuss in the article is the need for an engine to manage entities and deal with issues of dynamic case selection. I plan to write more articles on this subject, but right now I am tied up with one of my other article series.
Some not-so-simple (and rather slow) Java UML CASE tools:
Visual Paradigm for UML - http://www.visual-paradigm.com
Poseidon for UML - http://www.gentleware.com
This was a great article. And just in time for me as well. I'm beginning a large programming project (just got the prototype approved) and the use of views of data is very important to me.
I'm fairly new to .NET programming, so I'll be looking forward to more design strategies from you in the future.
Great job and thanks for taking the time to post it.
Tim Richardson tim@databasecreators.com
I'm glad you found it useful. I have found these techniques to be extremely beneficial when developing applications that are heavily dependent on data, which sounds like what you are about to undertake.
Matt Gullett
This design pattern seems to be very interesting. Does anybody know where to find more information about it? A quick search on the web wasn't very helpful. I couldn't find any good links.