Thursday, July 23, 2009

Exploring the Repository Pattern (Part 2)

In the first portion of this article we explored what the Repository pattern is and provided a basic implementation showing how developers are using it to facilitate data access.  We created a basic class RepositoryBase<T, PK> and showed a basic example of using it.  Now we will explore this class and demonstrate how to take the Repository further and make it more useful.

What we created has one major, obvious drawback: if we want to write a custom query, how do we do so?  There is a lot of debate about how to extend the pattern.  For the most part I have seen the following schools of thought:

  • Query objects
  • Extension methods
  • Derived Model Repositories

Of these I find query objects to be the most fascinating; I will defer to my friend RossCode to explain them here.  Essentially, the idea is to create a new abstract base class called QueryBase<T> and use it to create objects that contain the criteria for performing these custom queries.  Using this we can create two methods (one for singles, one for sets) on our RepositoryBase<T, PK> like so:

/// <summary>
/// Get the single record that matches a custom criteria
/// </summary>
/// <param name="query">Query object to supply the criteria data</param>
/// <returns>The matching T, if any</returns>
protected T GetBy(QueryBase<T> query)
{
     return query.SatisfyingElementFrom(GetAll());
}

/// <summary>
/// Get a listing of records that matches a custom criteria
/// </summary>
/// <param name="query">Query object to supply the criteria data</param>
/// <returns>Sequence of matching T</returns>
protected IEnumerable<T> GetAll(QueryBase<T> query)
{
     return query.SatisfyingElementsFrom(GetAll());
}

What we have done is create a pair of methods that take an abstract type as their parameter, thus accepting any class that inherits from QueryBase<T>.  In reality, this approach is designed to be used with something like LINQ to NHibernate, which would allow these custom queries to benefit from the deferred execution found in many data related variants of LINQ.  In our example, however, we will be using standard LINQ to Objects, which means GetAll() loads every record before the query object filters them in memory.  I believe RossCode’s examples use LINQ to NHibernate.
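
To make the idea concrete, here is a minimal sketch of what such a query object could look like with LINQ to Objects.  The shape of QueryBase<T> is my assumption, inferred only from the two method names used above; the Criteria property and the Episode properties are hypothetical names for this illustration, and RossCode's actual implementation may well differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Episode
{
    public string Title { get; set; }
    public bool IsActive { get; set; }
}

// Assumed shape of the query-object base class: each concrete query
// supplies a predicate, and the base class applies it to a sequence.
public abstract class QueryBase<T>
{
    // The criteria a concrete query contributes (hypothetical member).
    protected abstract Func<T, bool> Criteria { get; }

    // All matching elements, filtered lazily as the caller enumerates.
    public IEnumerable<T> SatisfyingElementsFrom(IEnumerable<T> candidates)
    {
        return candidates.Where(Criteria);
    }

    // The first matching element, or default(T) when nothing matches.
    public T SatisfyingElementFrom(IEnumerable<T> candidates)
    {
        return candidates.FirstOrDefault(Criteria);
    }
}

// One class per custom query: episodes whose IsActive flag equals 'state'.
public class ActiveEpisodeQuery : QueryBase<Episode>
{
    private readonly bool _state;

    public ActiveEpisodeQuery(bool state)
    {
        _state = state;
    }

    protected override Func<Episode, bool> Criteria
    {
        get { return e => e.IsActive == _state; }
    }
}
```

A provider such as LINQ to NHibernate would more likely want the criteria expressed as an Expression<Func<T, bool>> applied to an IQueryable<T>, so that the filter can be translated to SQL instead of running in memory.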

So using these query objects we can define separate classes for each custom query we wish to make and then pass them to either GetBy or GetAll.  This has the advantage of allowing us to use only one class for the repository while still yielding custom results.  But wait, there is a slight problem with this method from a maintenance perspective.  Now we are dropping these query objects all over the application, and while this may be ok for a smaller application, in a larger application it could quickly turn into a mess.  From a Separation of Concerns perspective, we are now defining our queries outside the data access layer.  The caller should not need to know anything about how the query is generated, only that it can call the function and get the result.

Let's stick with the idea of Query Objects and address the root of this problem: we want to be able to make the custom calls as dumb as possible and not have query definition logic in the layer making the call.  One of the ways around this is to use extension methods (.NET 3.5).  Consider the following example:

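// Note: this only compiles if GetAll(QueryBase<T>) is public or internal,
// because extension methods cannot reach protected members.
public static List<Episode> GetEpisodes(
     this RepositoryBase<Episode, int> repo,
     bool state)
{
     // GetAll returns IEnumerable<Episode>, so materialize it (needs System.Linq)
     return repo.GetAll(new ActiveEpisodeQuery(state)).ToList();
}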

What this will do is attach a method (GetEpisodes) onto all instances of RepositoryBase<Episode, int>, thus allowing us to contain the query to a single function.  While this is not a bad approach, I do not like it because it feels awkward.  I would rather the functions be part of the class itself, and because an extension method is not a true member of the repository, I cannot find it by reflecting over the repository type should I need to later on.

My chosen way is to use derived repository classes, because:

  1. Base classes should always be declared abstract and thus not allowed to be instantiated
  2. Using a derived class allows the ability to hide more from the developer and make their paths less cluttered
  3. It builds on the OO principle of encapsulation and allows us to change things at will
  4. Using inheritance from the base class gives us the ability to hide things we may not need, thus creating a simpler API

The major drawback to this approach is that you end up having to create and maintain more classes, because now every model will need to have a specific Repository class.  A sample one could look like this:

public class EpisodeRepository :
     RepositoryBase<Episode, int>
{
     public List<Episode> GetActiveEpisodes()
     {
          // GetAll returns IEnumerable<Episode>, so materialize it (needs System.Linq)
          return GetAll(new ActiveEpisodeQuery(true)).ToList();
     }
}

Using this approach you can increase the amount of code that can be encapsulated inside the derived class and thus perform more complex operations on the repository objects, all while providing a simple API to the outside.
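
To show what the caller gains, here is a self-contained sketch of the derived-repository approach.  Everything in it is an assumption made for illustration: the base class is an in-memory stand-in (a list of rows instead of the NHibernate session), QueryBase<T> is reduced to a single abstract method, and CallerDemo is a hypothetical caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Episode
{
    public int Id { get; set; }
    public bool IsActive { get; set; }
}

// Reduced, assumed form of the query-object base class.
public abstract class QueryBase<T>
{
    public abstract IEnumerable<T> SatisfyingElementsFrom(IEnumerable<T> candidates);
}

public class ActiveEpisodeQuery : QueryBase<Episode>
{
    private readonly bool _state;
    public ActiveEpisodeQuery(bool state) { _state = state; }

    public override IEnumerable<Episode> SatisfyingElementsFrom(IEnumerable<Episode> candidates)
    {
        return candidates.Where(e => e.IsActive == _state);
    }
}

// In-memory stand-in for the session-backed base class.
public abstract class RepositoryBase<T, PK>
{
    private readonly List<T> _rows;

    protected RepositoryBase(IEnumerable<T> rows)
    {
        _rows = rows.ToList();
    }

    public IEnumerable<T> GetAll()
    {
        return _rows;
    }

    // Protected: only derived repositories can run query objects.
    protected IEnumerable<T> GetAll(QueryBase<T> query)
    {
        return query.SatisfyingElementsFrom(GetAll());
    }
}

public class EpisodeRepository : RepositoryBase<Episode, int>
{
    public EpisodeRepository(IEnumerable<Episode> rows) : base(rows) { }

    // The query object is an implementation detail of the repository.
    public List<Episode> GetActiveEpisodes()
    {
        return GetAll(new ActiveEpisodeQuery(true)).ToList();
    }
}

public static class CallerDemo
{
    // The calling layer never mentions a query object.
    public static int CountActive()
    {
        var repo = new EpisodeRepository(new[]
        {
            new Episode { Id = 1, IsActive = true },
            new Episode { Id = 2, IsActive = false },
            new Episode { Id = 3, IsActive = true }
        });
        return repo.GetActiveEpisodes().Count;
    }
}
```

Because GetAll(QueryBase<T>) stays protected, a caller holding an EpisodeRepository can only reach the named methods the repository chooses to expose.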

To conclude, we talked about the major shortcoming of the basic form of the Repository pattern: customization.  In any data driven application you will need to create custom queries that filter and translate data in different ways.  The use of query objects can greatly enhance the clarity of the underlying logic, but without an additional layer of abstraction they could pose great problems for code maintenance.  To mitigate this we analyzed two possible approaches: 1) use extension methods to extend specific instantiations of the repository, and 2) create derived repositories for models to provide these custom data manipulation methods.

It is my personal preference to use approach #2 as it enables you to achieve complex operations with a much simpler interface.  It also helps with code maintenance and readability as the code can describe to the developer what it is doing and what is involved in the operation.

Sunday, July 19, 2009

Exploring the Repository Pattern (Part 1)

In the software development world there exists the term “pattern”.  Like its counterparts in other disciplines, a pattern is a tried and true way to do something.  Developers are constantly seeking newer and more efficient ways to organize code and reduce the amount of foundation work they have to do.  One of the most common problems that developers use patterns to address is data access.

For developers, data access is one of the biggest bottlenecks in any data driven application.  There are many popular patterns for this, from Dependency Injection to Active Record to Repository.  Today I am going to explore the Repository pattern by using it to create a simple data access framework using Fluent NHibernate.  Please note that this article will not discuss Fluent NHibernate (FNH) mappings.

To start, it is necessary to understand how the Repository pattern works, both in general and specifically with data access.  First, as you might imagine, the Repository is a collection of objects that we are going to perform an action on.  This can be a static action or a polymorphic action where the type is abstracted and can vary from object to object.  For data access, this generally involves saving the Entities that exist in the repository.  In this sense you can think of the Repository as a specialized List.  To begin we have to define our repository, so let's think about what we will need:

  • Connection to our database (abstracted)
  • Way to add to the Repository
  • Way to remove from the Repository
  • Way to get sets of entities from the underlying connection
  • Way to get single entity instances from the underlying connection
  • Way to perform an operation on the entities within the repository

To make things consistent we will create a RepositoryBase<T> class to serve as the base for our derived repository classes.  There is a lot of discussion on whether to use an abstract class or interface to implement the Repository pattern. For the purposes of this demo I am going to use a base class in my approach, however, an interface would work in a similar fashion.  So the basic definition:

public class RepositoryBase<T> where T : EntityBase, new()
{
}

Some things to point out here.  You will see that we are using a generic class to provide a base for the inheriting classes.  The generic type must be a subtype of EntityBase, which indicates that it is an entity and has certain required properties that are inherent to all entities.  The new() constraint additionally requires the type to have a public parameterless constructor.

The next thing I am going to add is a connection to my database, the code is updated as such:

public class RepositoryBase<T> where T : EntityBase, new()
{
     public ISession Session
     {
          get { return SessionProvider.GetSession(); }
     }
}

The ISession type comes from NHibernate, the library that FNH configures, and is basically a connection to our database along with the mapping configuration for the entities.  We will not be discussing how this is set up in this post; for now it's enough to understand that Session is what allows us to communicate with NHibernate and our database.

So now we need to provide the ability to Add/Remove entities from our Repository, so we add the following methods:

// standard generic methods
/// <summary>
/// Add an entity to the Repository to be operated on
/// </summary>
/// <param name="entity">The entity to add</param>
public void Add(T entity)
{
     _itemList.Add(entity);
}

/// <summary>
/// Remove the provided entity from the repository listing
/// </summary>
/// <param name="entity">The entity to be removed</param>
public void Remove(T entity)
{
     _itemList.Remove(entity);
}

This code should be pretty self-explanatory.  The only thing to note is the _itemList private member variable of type List<T>, where T is the specified type for the RepositoryBase.

Next we need to decide how users of our Repository will READ from the database, both to get sets of entities and a single entity based on a condition.  Below are my implementations:

/// <summary>
/// Return an entity based on its primary key (int)
/// </summary>
/// <param name="id">primary key value</param>
/// <returns>Entity with the provided primary key</returns>
public T GetBy(int id)
{
     return Session.Get<T>(id);
}

/// <summary>
/// Return all records for the given Type table
/// </summary>
/// <returns>All entities of type T</returns>
public IEnumerable<T> GetAll()
{
     return Session.CreateCriteria(typeof(T)).List<T>();
}

One thing I would like to point out is that this version of the Repository makes an assumption that I have abstracted out in my final version.  If you look at the GetBy generic method, it takes an int as its parameter.  This is doing a lookup based on a primary key, which we are assuming is of type int.  In my final version I have updated the class definition to support a second generic type, which represents the type of the primary key in the database, thus giving the class the flexibility to deal with any primary key type.  All the calls that these methods are masking go through the NHibernate session.
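
As a rough sketch of that second generic parameter, the class below takes the key type as PK.  To keep the example runnable without NHibernate, it talks to a hypothetical IEntityLoader interface with a dictionary-backed implementation; both are stand-ins I made up for the single Session.Get<T>(id) call, and the Series entity is likewise only illustrative.

```csharp
using System;
using System.Collections.Generic;

public abstract class EntityBase { }

// Hypothetical entity used only for this illustration.
public class Series : EntityBase
{
    public string Name { get; set; }
}

// Hypothetical stand-in for the one session call the lookup needs.
public interface IEntityLoader
{
    T Get<T>(object id);
}

// Dictionary-backed loader so the sketch runs without a database.
public class DictionaryLoader : IEntityLoader
{
    private readonly Dictionary<object, object> _rows = new Dictionary<object, object>();

    public void Put(object id, object entity)
    {
        _rows[id] = entity;
    }

    public T Get<T>(object id)
    {
        object found;
        return _rows.TryGetValue(id, out found) ? (T)found : default(T);
    }
}

// PK is the primary key type, so GetBy is no longer tied to int.
public class RepositoryBase<T, PK> where T : EntityBase, new()
{
    private readonly IEntityLoader _loader;

    public RepositoryBase(IEntityLoader loader)
    {
        _loader = loader;
    }

    /// <summary>
    /// Return an entity based on its primary key, whatever its type
    /// </summary>
    public T GetBy(PK id)
    {
        return _loader.Get<T>(id);
    }
}
```

With this shape, new RepositoryBase<Series, Guid>(loader) and new RepositoryBase<Series, int>(loader) share the same lookup code, and passing the wrong key type becomes a compile-time error.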

The final thing we need to add is the method to perform the operation. However, before we do this I would like to show you the code for EntityBase:

public abstract class EntityBase
{
     public abstract bool IsNew { get; }
     public abstract bool CanDelete { get; }
     public abstract void SetDeleted(bool state);
     public bool IsDeleted { get; protected set; }
}

These are just some very simple members, most of which defer their implementations to the classes that derive from EntityBase.  Now here is a scaled down version of the CommitChanges method for RepositoryBase<T>.

public void CommitChanges()
{
     foreach (T item in _itemList)
     {
          if (item.IsDeleted)
          {
               Session.Delete(item);
          }
          else if (item.IsNew) // is a new entity and will invoke insert
          {
               Session.Save(item);
          }
          else // is an existing entity and will invoke update
          {
               Session.Update(item);
          }
     }

     _itemList.Clear(); // clear all the entities and reset internal operations
}

This is a great example of why I love polymorphism.  Because we ensure that T inherits from EntityBase, we can use the public members defined by EntityBase, since every concrete type that inherits from it MUST implement those members.

So there you have it, that is a basic Repository pattern base class.  Now let's explore how you would use this class.

In its most primitive form you could write code like the following (the using block assumes RepositoryBase<T> implements IDisposable, which the scaled down listings above do not show):
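
For illustration, here is one way a concrete entity might fill in those abstract members.  This is only a sketch under my own assumptions: the convention that an Id of 0 means "not yet saved", the rule that only saved entities can be deleted, and the CommitPlan helper (which mirrors the branching in CommitChanges without touching a session) are all hypothetical.

```csharp
using System;

public abstract class EntityBase
{
    public abstract bool IsNew { get; }
    public abstract bool CanDelete { get; }
    public abstract void SetDeleted(bool state);
    public bool IsDeleted { get; protected set; }
}

// Hypothetical entity: an Id of 0 is assumed to mean "never saved".
public class Series : EntityBase
{
    public int Id { get; set; }
    public string Name { get; set; }

    public override bool IsNew
    {
        get { return Id == 0; }
    }

    // Assumed rule: only entities that already exist can be deleted.
    public override bool CanDelete
    {
        get { return !IsNew; }
    }

    public override void SetDeleted(bool state)
    {
        if (state && !CanDelete)
        {
            throw new InvalidOperationException("An unsaved entity cannot be deleted.");
        }
        IsDeleted = state;
    }
}

// Hypothetical helper that mirrors the decision made in CommitChanges,
// returning the name of the operation instead of calling the session.
public static class CommitPlan
{
    public static string ActionFor(EntityBase item)
    {
        if (item.IsDeleted) return "delete";
        return item.IsNew ? "insert" : "update";
    }
}
```

The helper only sees EntityBase, yet it picks the right operation for any entity type, which is the same reason CommitChanges can be written once in the base repository.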

var series1 = new Series() { Name = "My Series" };
var series2 = new Series() { Name = "My Other Series" };

using (var repo = new RepositoryBase<Series>())
{
     repo.Add(series1);
     repo.Add(series2);
     repo.CommitChanges();
}

In the next section, I will continue exploring the Repository pattern including some drawbacks and how developers are overcoming them.

Tuesday, July 07, 2009

Off to New York

I am quite excited and nervous.  Recently I was chosen to work as an Embedded Expert for a company in Long Island doing a rewrite of a legacy application using .NET.  This is a big chance for me and I am very excited for the opportunity.  I look forward to being in New York, a state and city that I have never visited.

Interestingly enough, the opportunity should last for three months, which is the same amount of time I lived in Japan three years ago.  Though I personally think that this will be a bit easier, since at least I will still be in America where English is spoken more than Japanese.  I leave in August following my brother’s wedding.  Wish me luck; I will try to post about things as much as I can, but this is a fairly confidential project.