A software application can have the most amazing user interface ever designed, but this interface does not do much good for the future of the application if the data model has a flawed design. This is why it is imperative for the developer to to not cut corners or time on designing the data model and its database implementation. What I will outline in this post is a way of designing the data model and abstraction layer that I have used for some time and been quite pleased with its performance and maintainability. I once was a developer that did not have a concept for abstraction layers, so this resulted in me throwing database calls wherever I needed one. Some time ago I was called up to maturity by some very talented developers who taught me the principles of building my applications around a data model and building an abstraction layer to handle the storage and retrieval of data objects. If you are a fellow developer who has never heard of such things, today is your day, I am calling you into maturity as a software developer and will teach you these concepts as you read on.
Part 1 – The Data Model
When designing the data model for my projects, I like to have all my class definitions inherit a common base class. This enables me to mandate common attributes that all objects in the model should have. Such as a unique identifier (Recnum in example below), or a time stamp of when the object was created or last updated. In this base class I also define the methods needed to create a XML string from the object instance, or restore the object instance from a XML string. You will see how this is used later in the post.
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using System.Xml.Serialization;
namespace SomeApp.Core.Data
{
[Serializable]
public class StorableObject
{
private int _recnum = -1;
private DateTime _create_date;
private DateTime _update_date;
public StorableObject()
{
}
public int Recnum
{
get { return _recnum; }
set { _recnum = value; }
}
public DateTime CreateDate
{
get { return _create_date; }
set { _create_date = value; }
}
public DateTime UpdateDate
{
get { return _update_date; }
set { _update_date = value; }
}
public string GetXml()
{
string xml = "";
using (StringWriter writer = new StringWriter())
{
XmlSerializer serializer = new XmlSerializer(this.GetType());
serializer.Serialize(writer, this);
xml = writer.ToString();
}
return xml;
}
public static StorableObject GetObject(string xml, Type type)
{
StorableObject obj = null;
using (StringReader reader = new StringReader(xml))
{
XmlSerializer serializer = new XmlSerializer(type);
obj = (StorableObject)serializer.Deserialize(reader);
}
return obj;
}
}
}
In the following example you can see how a data model class definition is implemented using the StorableObject. Automatically the data object inherits the fields of StorableObject, and the necessary methods to serialize into XML and deserialize from XML. In this pattern, the developer would build a object with all the fields he would traditionally put in a table.
using System;
namespace SomeApp.Core.Data
{
[Serializable]
public class User : StorableObject
{
private string _email;
private string _first_name;
private string _last_name;
private string _password;
public User () : base()
{
}
public string Email
{
get { return _email; }
set { _email = value; }
}
public string FirstName
{
get { return _first_name; }
set { _first_name = value; }
}
public string LastName
{
get { return _last_name; }
set { _last_name = value; }
}
public string Password
{
get { return _password; }
set { _password = value; }
}
public string FullName
{
get { return _first_name + " " + _last_name; }
}
}
}
Part 2 – The Database
After defining the data model the developer will need to build the database tables to hold all this data. In the following example, I define the table “users” to store the User object. The “users_sequence” SQL snippet creates the sequence which produces the unique identifier which will be stored in the Recnum field.
CREATE SEQUENCE users_sequence
INCREMENT 1
MINVALUE 1
MAXVALUE 9223372036854775807
START 10
CACHE 1;
ALTER TABLE users_sequence OWNER TO postgres;
CREATE TABLE users
(
recnum integer NOT NULL DEFAULT nextval('users_sequence'::regclass),
email text NOT NULL,
"xml" text NOT NULL,
CONSTRAINT "Users_PK" PRIMARY KEY (recnum),
CONSTRAINT "Users_Email" UNIQUE (email)
)
WITH (
OIDS=FALSE
);
ALTER TABLE users OWNER TO postgres;

I do not create a database field for each field in my storage class, but instead I just create fields I need search indexes for and leave storing the rest of the class up to the “xml” field. There are pros and cons to this design that I can see.
Pros: The pros obviously are when storing large objects your tables and SQL statements remain small and simple. Also this design provides benefits for when scaling the data model. You can add a new field to the storage class and not have to touch the abstraction layer. The developer can create extensive intricate objects, and as long as each sub object is marked as Serializable, it will be maintained in the databases. Also the Serializer handles all the pesky work of type preservation. In the instance a new search index is needed, the developer simply creates just that field in the database and updates the abstraction layer.
Cons: I have yet to find the perfect design that offers the best benefits of performance and scalability. One tradeoff with this design is that each row consumes much more space than it would with a traditional database layout. This is because combining all the object’s values plus the necessary XML markup creates a rather long string. Where as traditionally each value would be in its own type specific column, thus removing the XML markup. Another tradeoff is for each index column there is a duplicate of this value in the “xml” field. If the abstraction layer pattern I outline below is followed, then these values should never be out of sync unless someone was to manually change the value in the database itself. So the real tradeoff is the additional amount of space that is used with the duplicate data. One workaround for this would be to set the [XmlIgnore] attribute on the index fields if this is a concern. In all, these are decisions that the designer has to make for himself when designing his application.
Part 3 – The Abstraction Layer
I always define my abstraction layers based on a interface. Then I keep my application coded in such a way that all it knows is that it is calling type IStorageController, never the implementation of IStorageController. This is a good pattern for whenever the application is interfacing with a third party component, such as a database provider. Because the developer never knows when the project requirements may change and one day you may be coding for MySql, then the next day you could be coding for PostgreSql. If this rule is properly followed, then you would only have to touch code in your implementation class, and not have to change any application code expect for where the implementation is instantiated. This saves many many headaches and keeps your application very scalable!
using System;
using System.Collections.Generic;
using System.Data;
namespace SomeApp.Core.Controllers
{
public interface IStorageController
{
void StoreUser(User user);
User GetUser(int recnum);
User GetUserByEmail(string email);
List<User> GetAllUsers();
void DeleteUser(User user);
}
}
In this example I have IStorageController implemented as PostgreSqlStorageController. All my database commands are using the wonderful Npgsql assembly. This is a wonderful project with the goal of building a Ado.net compliant data provider for PostgreSql.
using System;
using System.Collections.Generic;
using Npgsql;
using SomeApp.Core.Data;
namespace SomeApp.Core.Controllers
{
public class PostgreSqlStorageController : IStorageController
{
private NpgsqlConnection _connection;
public PostgreSqlStorageController ()
{
_connection = new NpgsqlConnection ("User ID=someuser;Password=somepass;Host=localhost;Port=5432;Database=somedb;");
_connection.Open ();
}
#region IStorageController implementation
public void Close ()
{
_connection.Close ();
}
public void StoreUser (User user)
{
using (NpgsqlCommand cmd = _connection.CreateCommand ())
{
if (user.Recnum >= 0)
{
user.UpdateDate = DateTime.Now;
cmd.CommandText = "UPDATE users SET email=@email, xml=@xml WHERE recnum=@recnum;";
cmd.Parameters.Add ("@recnum", user.Recnum);
}
else
{
user.CreateDate = DateTime.Now;
cmd.CommandText = "INSERT INTO users (email, xml) values(@email, @xml);";
}
cmd.Parameters.Add ("@email", user.Email);
cmd.Parameters.Add ("@xml", user.GetXml ());
cmd.ExecuteNonQuery ();
}
}
public User GetUserByEmail (string email)
{
User user = null;
using(NpgsqlCommand cmd = _connection.CreateCommand ())
{
cmd.CommandText = "SELECT * FROM users WHERE email=@email;";
cmd.Parameters.Add ("@email", email);
NpgsqlDataReader reader = cmd.ExecuteReader ();
if (reader.Read ())
{
user = (User)User.GetObject (reader["xml"].ToString (), typeof(User));
user.Recnum = Convert.ToInt32 (reader["recnum"]);
}
reader.Close ();
}
return user;
}
public User GetUser (int recnum)
{
User user = null;
using(NpgsqlCommand cmd = _connection.CreateCommand ())
{
cmd.CommandText = "SELECT * FROM users WHERE recnum=@recnum;";
cmd.Parameters.Add ("@recnum", recnum);
NpgsqlDataReader reader = cmd.ExecuteReader ();
if (reader.Read ())
{
user = (User)User.GetObject (reader["xml"].ToString (), typeof(User));
user.Recnum = Convert.ToInt32 (reader["recnum"]);
}
reader.Close ();
}
return user;
}
public void DeleteUser (User user)
{
using (NpgsqlCommand cmd = _connection.CreateCommand ())
{
cmd.CommandText = "DELETE FROM users WHERE recnum=@recnum;";
cmd.Parameters.Add ("@recnum", user.Recnum);
cmd.ExecuteNonQuery ();
}
}
public List<User> GetAllUsers ()
{
List<User> users = new List<User> ();
using(NpgsqlCommand cmd = _connection.CreateCommand ())
{
cmd.CommandText = "SELECT * FROM users;";
NpgsqlDataReader reader = cmd.ExecuteReader ();
User user;
while (reader.Read ())
{
user = (User)User.GetObject (reader["xml"].ToString (), typeof(User));
user.Recnum = Convert.ToInt32 (reader["recnum"]);
users.Add (user);
}
reader.Close ();
}
return users;
}
#endregion
}
}
In the above code snippet you can see several examples of how to store and retrieve records for this data model. You can follow this pattern or create your own, but there are several things I would like to point out with this. Notice each time I call the GetObject method on the storage class, passing in the XML string from the database and then cast the object to the return type and assign to the return variable. After each instance of this, I always set the corresponding fields in the object for each of my index columns. There should never be a instance where the value in these index columns would be out of sync with the corresponding value in the XML. But just in case I always like to make sure the value in the object being sent off into the application is the exact same value that the object selection was based on. There is one exception to this rule; because the
“recnum” field is auto-generated by the database, after the object is inserted, the Recnum field in the XML will not be populated until the object is again updated in the database. So it is imperative that after the object is casted in the return variable, that the Recnum field be set from the “recnum” column in the database.
I hope you enjoyed reading this and found it to be useful. Please feel free to comment or contact me with any questions or ideas to improve on this design. Thanks!