October 24, 2012

For The Love of Tesla, Encapsulate Third-Party Code!

I started working on a project earlier this year for a client. In the initial phase, we were tasked with upgrading an ASP.NET 1.1/SQL Server 2000 application to ASP.NET 4.0/SQL Server 2008. Everything was going well until I looked into upgrading the OR/M that was used by the application.

It turns out that the OR/M had been completely rewritten since the old version, so upgrading to the latest version of the OR/M would break the application. That didn't matter so much, however, because the old version of the OR/M was already breaking the application! Additionally, the old version of the OR/M was no longer supported by the developer (we couldn't even buy paid support for it if we wanted to), so we'd have no help if we tried to stick with the old version. We realized very quickly that we had no other options... regardless of whether we upgraded to the new version of the OR/M or replaced it with a different OR/M entirely, we were going to have to rewrite a significant chunk of the application's codebase.

It took over 80 man-hours to replace the old OR/M with a new one. It didn't have to be this way.

Switching OR/Ms was such an onerous task because the original developer had failed to encapsulate his use of the OR/M within the application. As a result, there were hundreds if not thousands of references to the OR/M's artifacts dispersed throughout the DAL project, BLL project, and ASP.NET web application. Had the original developer encapsulated the OR/M, switching to the new version (or another OR/M) would have saved us countless hours, because we'd only be making changes in the DAL project; the rest of the solution would have remained unchanged.

If the moral of this story isn't crystal clear, it's that you should - no, you must - encapsulate your data access code!

But let's step back and see if we can refactor the moral of the story. The OR/M issue I just told you about is a specific example of a more general problem: every place your application calls a bare third-party API directly is one more place you'll have to rewrite when that API changes, so switching costs grow with each call site. That will make it difficult if not impossible for you to convince your manager to let you switch from jQuery to MooTools, or EF to NHibernate, in 6 months. The situation becomes even worse if, let's say, you decide you want to switch from PHP to Ruby on the server side but you have mingled PHP into your HTML markup.

The fact is that while third-party libraries and frameworks make it extremely easy to get an application up and running quickly, simply dropping them into your codebase creates a tight coupling between your application code and the library's code, and you should know by now that's bad design.

So, whenever and wherever you use a third-party API, make sure you wrap it with your own API so that switching or upgrading the third-party API doesn't cripple your application or cost an arm and a leg in development time.
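Here is a minimal sketch of what that wrapping might look like, written in Python for brevity. Everything in it is an assumption made for illustration: the CustomerGateway interface, both implementations, and register_customer are hypothetical names, and the standard library's sqlite3 module stands in for "the third-party library".

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class CustomerGateway(ABC):
    """Our own API (hypothetical). Application code depends only on this."""

    @abstractmethod
    def add(self, name: str) -> int:
        """Store a customer and return its new id."""

    @abstractmethod
    def find_name(self, customer_id: int) -> Optional[str]:
        """Return the customer's name, or None if the id is unknown."""


class SqliteCustomerGateway(CustomerGateway):
    """The only class that touches the sqlite3 API."""

    def __init__(self, path: str = ":memory:") -> None:
        self.connection = sqlite3.connect(path)
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name: str) -> int:
        cursor = self.connection.execute(
            "INSERT INTO customer (name) VALUES (?)", (name,)
        )
        self.connection.commit()
        return cursor.lastrowid

    def find_name(self, customer_id: int) -> Optional[str]:
        row = self.connection.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


class InMemoryCustomerGateway(CustomerGateway):
    """A drop-in replacement, standing in for 'the new library'."""

    def __init__(self) -> None:
        self.names = {}

    def add(self, name: str) -> int:
        customer_id = len(self.names) + 1
        self.names[customer_id] = name
        return customer_id

    def find_name(self, customer_id: int) -> Optional[str]:
        return self.names.get(customer_id)


def register_customer(gateway: CustomerGateway, name: str) -> str:
    """Business logic: knows nothing about sqlite3 or any other library."""
    customer_id = gateway.add(name.strip())
    return f"Registered {gateway.find_name(customer_id)} as customer #{customer_id}"
```

Swapping SqliteCustomerGateway for InMemoryCustomerGateway requires no change to register_customer, which is the property we were missing in the OR/M story above.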

October 18, 2012

No More Repositories

The Repository Pattern is an implementation pattern that is used primarily to encapsulate the persistence medium of an application away from the application's business logic. The result of implementing the repository pattern correctly is that a change to the application's persistence store (from MS-SQL to MySQL, or more generally from an RDBMS to XML stored on the file system) should not necessitate a change in the application's business logic.

Let me first say that I am not categorically against the idea of the Repository Pattern. I think it's both important and useful to encapsulate data access, and I've certainly used the Repository Pattern quite a bit in my career. In principle, the Repository Pattern is a perfectly good solution to the problem of application logic depending directly on data access. In practice, however, successfully implementing the repository pattern can be difficult.

Let's start with the obvious: the data context. Where do you new up a data context for your repository? You could let your repositories manage their own data context instances, but that causes scoping issues and gets ugly fast. Okay, so then new up your context in your application logic and pass that into your repository instances... oh wait, nope, that breaks encapsulation, which is the whole point of using repositories. What to do? If you've had a course in software design you're probably jumping out of your seat screaming "ABSTRACT FACTORY PATTERN!!!" at the screen. As it turns out, that's the right way to go, though you have to realize that we've just jumped from one conceptual layer of abstraction away from our DAL to three. Yikes. Is it worth it? Yes, absolutely, if there's any chance whatsoever that your data store or ORM will change at any point in the future. Will most developers implementing repositories do this? Absolutely not.
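To make the factory idea concrete, here is a rough sketch in Python. All of the names (DataContext, DataContextFactory, the in-memory implementations, CustomerRepository) are hypothetical and chosen only for illustration; the in-memory context stands in for a real ORM context.

```python
from abc import ABC, abstractmethod


class DataContext(ABC):
    """Hypothetical unit-of-work style context, standing in for an ORM context."""

    @abstractmethod
    def add(self, entity: dict) -> None: ...

    @abstractmethod
    def save_changes(self) -> int: ...


class DataContextFactory(ABC):
    """Abstract factory: callers get a context without naming a concrete type."""

    @abstractmethod
    def create(self) -> DataContext: ...


class InMemoryDataContext(DataContext):
    def __init__(self, store: list) -> None:
        self.store = store
        self.pending = []

    def add(self, entity: dict) -> None:
        self.pending.append(entity)

    def save_changes(self) -> int:
        # Flush buffered entities to the shared store and report how many.
        saved = len(self.pending)
        self.store.extend(self.pending)
        self.pending = []
        return saved


class InMemoryDataContextFactory(DataContextFactory):
    def __init__(self) -> None:
        self.store = []

    def create(self) -> DataContext:
        return InMemoryDataContext(self.store)


class CustomerRepository:
    """Receives a factory, so it neither builds nor exposes a concrete context."""

    def __init__(self, context_factory: DataContextFactory) -> None:
        self.context_factory = context_factory

    def insert(self, name: str) -> int:
        context = self.context_factory.create()  # one context per operation
        context.add({"name": name})
        return context.save_changes()
```

The application logic only ever sees CustomerRepository, and the repository only ever sees the two abstract types, so the concrete context can change without either of them noticing. The cost is exactly the extra layers described above.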

The second issue with using the Repository Pattern is scalability. Most applications start out small and have few areas that require high levels of performance, and in these cases one will hardly ever consider a repository to be an impediment to scaling the application. But if the day should come that you need to create two separate data stores to split up read and write operations, then the Repository Pattern will fall woefully short. The reason for this inadequacy is pretty obvious: a repository object mixes reads and writes behind a single interface, so it does not follow the principle of Command-Query Separation.

So what's the solution? Well, for my projects that use EF as an ORM, my solution has been to ditch repositories in favor of individual generic Command and Query classes, along with a CommandProcessor class that manages units of work. This is called Command Query Responsibility Segregation, or CQRS for short. I like that I now have a very clean separation of commands and queries, which allows me to use different data contexts connected to different databases for queries and commands, and I've additionally reduced my code footprint by using generics.
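The shape of that design might look something like the following Python sketch. It is not my EF code; the contexts are plain in-memory stand-ins, and every class name other than Command, Query, and CommandProcessor is made up for the example.

```python
from abc import ABC, abstractmethod
from typing import Generic, List, TypeVar

TContext = TypeVar("TContext")
TResult = TypeVar("TResult")


class Command(ABC, Generic[TContext]):
    """Changes state and returns nothing."""

    @abstractmethod
    def execute(self, context: TContext) -> None: ...


class Query(ABC, Generic[TContext, TResult]):
    """Returns data and changes nothing."""

    @abstractmethod
    def run(self, context: TContext) -> TResult: ...


class WriteContext:
    """Hypothetical write-side context that buffers changes until commit()."""

    def __init__(self, table: List[str]) -> None:
        self.table = table
        self.pending = []

    def commit(self) -> None:
        self.table.extend(self.pending)
        self.pending = []

    def rollback(self) -> None:
        self.pending = []


class ReadContext:
    """Hypothetical read-side context; could point at a different database."""

    def __init__(self, table: List[str]) -> None:
        self.table = table


class AddCustomer(Command[WriteContext]):
    def __init__(self, name: str) -> None:
        self.name = name

    def execute(self, context: WriteContext) -> None:
        if not self.name:
            raise ValueError("name is required")
        context.pending.append(self.name)


class CustomerNames(Query[ReadContext, List[str]]):
    def run(self, context: ReadContext) -> List[str]:
        return sorted(context.table)


class CommandProcessor:
    """Runs a batch of commands as one unit of work: all commit or none do."""

    def __init__(self, context: WriteContext) -> None:
        self.context = context

    def process(self, commands: List[Command]) -> None:
        try:
            for command in commands:
                command.execute(self.context)
            self.context.commit()
        except Exception:
            self.context.rollback()
            raise
```

In a quick trial, both contexts can share one list, which keeps the example short. In a real split, the ReadContext would be pointed at the read store instead, and neither AddCustomer nor CustomerNames would have to change.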

I'm still working out details on managing the creation of data contexts, so I still haven't resolved that dependency. However, my current plan is to apply the Abstract Factory Pattern in conjunction with some interfaces and adapters to make a single data context interface that my client objects can call regardless of what the underlying query or command pattern is.

At the end of the day, repositories are not inherently bad or evil, and I don't think that everyone should go and get rid of them. Heck, I haven't even completely removed them from my code... yet. The problem, as I see it, is that repositories are difficult to implement with encapsulation in mind, or when performance is a critical consideration. That is why I am moving away from the Repository Pattern, and toward CQRS.

October 8, 2012

Remember Not to Forget: How to Horizontally Center an IMG Element with CSS

For whatever reason I'm always forgetting a lot of useful CSS and JavaScript stuff, and I end up spending 10-15 minutes on Google looking for said useful stuff over and over again. To avoid wasting those 10-15 minutes in the future I'm going to start putting those helpful things in my blog!

This is one I come across all the time: horizontally centering an IMG element with CSS.


October 3, 2012

CodeSense: Smart Naming Practices

What's in a name? If you're a developer, there's quite a lot in a name actually. The names you choose for your namespaces, classes, interfaces, methods, fields, parameters, etc. communicate what your application is, what it does, and how it is organized. Thus, it is important (if not absolutely necessary) that you use your CodeSense when choosing names.

Namespaces

Namespaces define the organizational structure for your code, and help to uniquely identify the classes that belong to your library. Here are some CodeSense tips for namespaces:

If you're releasing your library to the public, then your namespaces should always be prefixed with your application's domain name in reverse order, for example: com.brian-driscoll.myapp. The reason for this is to prevent namespace collisions with other libraries. You can of course follow this practice even if you're not planning to release your library to the public, but it is not strictly necessary.

Namespaces should be meaningful within the domain of your library or application. Avoid using overly generic namespaces like Utils.

Namespaces should create a meaningful hierarchy within your application. A rule of thumb is that if you find yourself creating similar namespaces within your application, you most likely can (and should) group those namespaces together into the same namespace. For example, if you have one namespace called Companies.Addresses and another called Customers.Addresses, you should create a single Addresses namespace instead.

Classes

Classes define the participants in your application's domain, and should be named in a logical way.

Class names should always be nouns.

Class names should be specific with respect to the object they define. If your class name is overly broad or vague, then you've likely created a monster (aka "God") object that needs to be broken down into several smaller classes.

For business entities, classes should be named according to the problem domain, e.g.: Company, Employee, Customer, etc. Avoid unnecessary clutter in business entity class names, such as EmployeeData or CustomerInfo.

Classes in the solution domain should be named according to the solution domain, e.g.: DatabaseConnectionFactory, PageEventHandler, etc.

Classes should be placed into appropriate namespaces, and similar classes should be put together into the same namespace.

Interfaces

Interfaces describe functionality that any class can implement.

Interface names should always be adjectives. For example, you should use a name like Sortable rather than a name like IOrderedCollection to describe the functionality that must be implemented to sort a collection of items.

Interface names should be specific about the functionality that they describe. If your interface name is overly broad or vague, then chances are your interface describes too many different kinds of methods and should be broken up into several interfaces instead.

Like classes, interfaces should also be put into appropriate namespaces, and similar interfaces should be put together into the same namespace.

Methods

Methods describe what an instance of your class actually does.

Method names should always be verbs.

Method names should generally not give an indication of what their return type is or what parameter types they expect. There are exceptions to this, such as methods that convert from one type to another (Convert.ToString()). However, a method name like GetEmployeeIdAsString() should be shortened to GetEmployeeId(); the declared return type of String already makes it obvious that we're getting a string back.

Unsurprisingly, method names should describe at a high level what the method actually does. So, if the method creates a customer record and sends a confirmation email, then it should probably be called CreateCustomerAndSendConfirmationEmail (I'll stop short of explaining why such a method is most likely poorly written...).

Method names should be consistent. For instance, if you use the term "Insert" across your business object class definitions for methods that create database records, then using the term "Add" or "Create" in another class for a method that creates database records will likely cause a bit of confusion about your API.

If you use a language that requires you to write your own accessor and mutator methods, then your accessor and mutator methods should be named getFoo() and setFoo(), respectively, where foo represents a field in your class.
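As a small illustration of the tips above, here is a sketch in Python. The class and method names are invented for the example, and the snake_case spelling is simply Python's convention; the same points apply to the PascalCase names used in the text.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EmployeeRecords:
    names: Dict[str, str] = field(default_factory=dict)

    # A verb, and the same verb every class uses for creating a record.
    def insert(self, employee_id: str, name: str) -> None:
        self.names[employee_id] = name

    # Not find_name_as_string: the return annotation already says str.
    def find_name(self, employee_id: str) -> str:
        return self.names[employee_id]


@dataclass
class CustomerRecords:
    names: Dict[str, str] = field(default_factory=dict)

    # "insert" again, rather than "add" or "create", to stay consistent.
    def insert(self, customer_id: str, name: str) -> None:
        self.names[customer_id] = name
```

A caller who has learned EmployeeRecords.insert can guess CustomerRecords.insert without opening the documentation, which is the payoff of consistent naming.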

Fields and Parameters

Fields and parameters represent the atomic data of your application's domain.

Field and parameter names should always be nouns.

Field and parameter names should not give an indication of type. In other words, STOP using names like strFirstName, strLastName, etc. Furthermore, field names should not be prefixed with anything, including underscores (_employeeId) or whatever else you might think of using.

Both field and parameter names should be meaningful with respect to the data that they represent. One- or two-letter names most likely aren't very descriptive and should be avoided.

Like method names, field and parameter names should also be consistent. If you use "id" for the identifier field across your business object class definitions, then using "ID" or "Identity" or "identifier" for an identifier field in another class will cause confusion. Similarly, if you have two methods that take a parameter representing a person's first name, then the parameter should have the same name in both methods.

Thanks for checking out this article on naming practices, which is part of a new series called CodeSense. CodeSense is about making code easier to read and write by developing common sense conventions that can be applied immediately in nearly any programming language. Please contact me at brian (at) brian-driscoll (dot) com if you have questions, comments, or requests for future CodeSense topics.