Structuring a .NET Core application – Part 1

Separation of Concern

Note, this series is very theoretical, and this first part especially, and might be for less experienced developers, but there should be something for everyone.

There are many ways to build an application, many right ways to build one and many wrong ways. I’m not the one to say the way you build your applications are wrong. I might be wrong; I might be right and so can you. It all depends on how you work best in your organization and under your circumstances.

With this post, I’m going to describe what I think is the best approach to application development anno 2020. Namely splitting up your code into separate blocks and services.

Previously, when I was less experienced as I am today, I used to build applications in large chunks. It’s easy. I need an application that takes X and splits it into A, B and C. A is going into a database, B and C is going to different external services.

Large methods

So, my application would have one large method, Save(X). That method would handle everything: Open a connection to the database, save A, close the database connection. Then, open a connection to External Service 1, post B, close the connection. And finally open a connection to External Service 2, post C and then close the connection.

It’s quite simple. Everything is there. My method does exactly as is advertised, (sort of), and it does get the job done.

Then, the next day, I need another method to only send B to External Service 1. Well, that’s easy. Open a connection to the service, post B and then close the connection again. Easy. Everything is fine, everything is working as expected and everyone is happy.

But one day, External Service 1, updates their API. There are new URL’s and a new authentication model. Everything in my application breaks. Who can fix it? I might, but it’s been a while since I’ve worked on that application, so my memory is a bit rusty. So, I try to remember all the places where I call the service and update to the new API. I might miss a few places, but those places are the ones not easily tested, so errors might only occur after a while. Had one of my colleagues gotten the job, the chance of failure would be very high!

This pattern of copying code from one place to another, is called duplicate code. In my experience, duplicate code occurs when you just need to add a feature or fix an error, quickly. Or, it might happen when the developer lacks the experience to foresee problems ahead of time. The latter certainly was the case for me for some time. Later it was the former.

If you need to copy and paste code, you’re doing something wrong.

Søren Spelling Lund, CPO, uCommerce

I’ve heard this quote many years ago, but it has stuck with me ever since, and I try to do my best not to copy and paste code. Well, we all copy and paste from Stack Overflow, but you know what I mean.

One of the problems with code duplication, besides the difficult maintainability, is that it, more often than not, results in huge methods and classes that attempts to do everything at once. Take our Save(X) example above. It does three things at once. It makes sense, since it has to split X into separate parts and send those parts to different services. But it has way too much responsibility. It has the responsibility to package and serialize the data into the formats each of the services requires. It also has the responsibility to open and close connections to each service and it has the responsibility to handle errors returned from each service. And most importantly, none of it can be reused. It has all that responsibility, and can only use it for that very specific task that is to save X.

Save(X) might look something like this:

public void Save(X){  
    // Split X into A, B and C  
    // Open the database connection  
    // Map A into a database entity  
    // Close the database connection  
  
    // Open a connection to External Service 1  
    // Serialize B into json  
    // Send json to External Service 1  
    // Close the connection to External Service 1  
  
    // Open a connection to External Service 2  
    // Serialize C into XML  
    // Send XML to External Service 2  
    // Close the connection to External Service 2  
}  

What we need, is to relieve Save(X) of some of its responsibilities. Save(X) should only have one responsibility: Save X.

This is called Separation of Concern.

We need to identify every small part of the application and separate them into small reusable snippets. Where each snippet has one responsibility. And only one.

Again, let’s take our Save(X) example from above.

It can be split into 3 parts, which again can be split into, at least, three parts each.

Separation of Concern

What we need to do, is to separate everything into small, easily reusable and maintainable snippets with as few responsibilities as possible. This is called Separation of Concern.

Database

Save(X), needs to store data into the database. We can separate that logic into a library that does that. The library has three responsibilities: Manage database connections, package/transform/serialize the data and store it in the database.

The first part is building a class that only has one concern: Database connection, let’s call it DatabaseConnection. It has two methods: Open () and Close (). This is a very simplistic setup and mostly theoretic, bear with me on this.

It might look like this:

class DatabaseConnection  
{  
    public OpenDatabaseConnection Open()  
    {  
        // Open the connection.  
    }  
  
    public void Close()  
    {  
        // Close the connection.  
    }  
}

Open () returns a database connection that can be used to send data to/from the database. Our Save(X) could just use this and be over with it. Save(X) no longer has the responsibility of knowing how to open a database connection or how to close it. But it still has the responsibility of knowing how the data is formatted and sent to the database.

Therefore, we need another class: UnitOfWork, (again, simplistic and theoretic. More details in upcoming posts).

This class has the responsibility to send data through an open database connection with a method called SaveData(). And since we’ve just made a class with the sole responsibility of maintaining the database connection, our UnitOfWork class can utilize this class. If we choose to make a new UnitOfWork, we can make use of the same database context class and not have any duplicate code between our two unit of works, other than calls to the Open() and Close() methods.

Our UnitOfWork class could look like this:

public class UnitOfWork  
{  
    private readonly DatabaseConnection databaseConnection;  
    public UnitOfWork(DatabaseConnection databaseConnection)  
    {  
        this.databaseConnection = databaseConnection;  
    }  
  
    public void SaveData(string tableName, object data)  
    {  
        this.databaseConnection.Open();  
        // Save the data    
        this.databaseConnection.Close();  
    }  
}  

We could add a third layer, called Repository, (I know, using Entity Framework, NHibernate and the like, the repository pattern is redundant). What this class does is package the data and send it to the database. And since we’ve just made a class that has the responsibility of sending data to the database, our Repository only has one concern: Package the data. And then send that data to our UnitOfWork:

public class Repository  
{  
    private readonly UnitOfWork unitOfWork;  
    public Repository(UnitOfWork unitOfWork)  
    {  
        this.unitOfWork = unitOfWork;  
    }  
  
	public void SaveA(AModel data)  
	{  
		// Map A into a database entity.    
	
		this.unitOfWork.SaveData("dto.A", mappedData);  
	}  
}  

We have now separated the concern of managing the database connection, mapping to a database entity and saving that entity to the database, away from our Save(X) method, it now looks like this:

public void Save(X){  
	// Split X into A, B and C  
		
	Repository.SaveA(A);  
	
	// Open a connection to External Service 1  
	// Serialize B into json  
	// Send json to External Service 1  
	// Close the connection to External Service 1  
	
	// Open a connection to External Service 2  
	// Serialize C into XML  
	// Send XML to External Service 2  
	// Close the connection to External Service 2  
}  

Much simpler, and we can now reuse Repository.SaveA() multiple places without thinking about changes to the database connection, database schema or future development.

Service clients

As with the database abstraction we did above, we can also split our service clients into separate parts with single responsibilities.

But first, lets break it down a bit.

We have, again, three responsibilities: Manage a connection to a service, serialize data and send data.

I feel lazy, so for the first part we’ll be using a build-in service called System.Net.Http.HttpClient. This service / client manages everything related to HTTP requests. Hence the name HttpClient. It handles opening and closing connections, so we don’t have to think about that. For now.

But we still need to serialize data before we can send it through the HttpClient.

We know we have to serialize at least two different kinds of data into, at least, two different kinds of string data, (JSON and XML). Let’s start by defining a reusable interface that does just that:

public interface IDataSerializer<TModel>  
{  
    System.Net.Http.HttpContent Serialize(TModel model);  
} 

This interface describes a service that converts a model of type TModel into an instance of HttpContent.

We can then create two services:

public class BSerializer : IDataSerializer<BModel>  
{  
    public System.Net.Http.HttpContent Serialize(BModel model)  
    {  
        /// serialize to json. ... omitted for brevity  
        return new System.Net.Http.StringContent(jsonString, System.Text.Encoding.UTF8, "application/json");  
    }  
}  
  
public class CSerializer : IDataSerializer<CModel>  
{  
    public System.Net.Http.HttpContent Serialize(CModel model)  
    {  
        /// serialize to xml. ... omitted for brevity  
        return new System.Net.Http.StringContent(jsonString, System.Text.Encoding.UTF8, "text/xml");  
    }  
}  

These two services each has one responsibity: Convert data to a format that can be sent to the service. Please note, these are very simplified examples.

We now handle the serialization part. What we need is a service that can send that data to our service endpoints. I like to call a service like that something like ServiceClient.

Our ServiceClient has two dependencies: HttpClient and an instance of IDataSerializer<TModel>. It also has a method called SendData that takes an instance of TModel and sends it through our HttpClient.

A such client could look something this:

class ServiceClient<TModel>  
{  
    private readonly System.Net.Http.HttpClient httpClient;  
    private readonly IDataSerializer<TModel> serializer;  
      
    public ServiceClient(  
        System.Net.Http.HttpClient httpClient,   
        IDataSerializer<TModel> serializer)  
    {  
        this.httpClient = httpClient;  
        this.serializer = serializer;  
    }  
      
    public System.Threading.Tasks.Task SendDataAsync(TModel model){  
        var serializedContent = this.serializer.Serialize(model);  
        return this.httpClient.PostAsync("path to service", serializedContent);  
    }  
}  

Again, very simplified. This client only handles a single model type and can only do HTTP Post requests. But then again, it’s for illustrative purpose only.

Please note, managing the HttpClient happens elsewhere. The serialization logic has also been moved elsewhere. The only thing this class does, is sending serialized data through the HttpClient.

Our Save(X) method above would then look something like this:

class SaveXample  
{  
	
	private readonly Repository repository;  
	private readonly ServiceClient<BModel> externalService1;  
	private readonly ServiceClient<CModel> externalService2;  
	
	public SaveXample(  
		Repository repository,  
		ServiceClient<BModel> externalService1,  
		ServiceClient<CModel> externalService2  
		)  
	{  
		this.repository = repository;  
		this.externalService1 = externalService1;  
		this.externalService2 = externalService2;  
	}  
	
	public async System.Threading.Tasks.Task Save(XModel X)  
	{  
		this.repository.SaveA(X.A);  
		await this.externalService1.SendDataAsync(X.B);  
		await this.externalService2.SendDataAsync(X.C);  
	}  
}  

As you can see, much simpler. The only responsibility this class now has, is to split X into A, B and C. Saving A to the database is handle elsewhere. B and C is being handled elsewhere also. Some of the logic that handles B and C is reused. And it’s extensible. We can, quite simple, add a new service that sends C data as a binary stream by only creating a new IDataSerializer class.

The next part will be much more in dept into how I would organize and build libraries in a .NET Core application. Including how to wire up services like the ones described in this post and use dependency injection efficiently and reusable across applications.

TL;DR

Splitting code into small libraries, small classes and small methods. Has a lot of benefits:

1: Easier to maintain

It’s easier to maintain 10-20 lines of code doing one thing, than it is to sieve through 100s of lines of code that does everything to fix a bug.

It’s also easier to make changes without impacting the entire application. You can mark methods obsolete; you can change dependencies, interfaces and schemas without impact.

2: Reusability

When splitting your application into smaller pieces, you can reuse those pieces multiple places. And fixing an error in one of those pieces, means fixing it everywhere.

Reusability also opens up for unit testing. Having separated everything into separate parts, you can mock-up all dependencies and only test essential code.

You can even reuse your code across different applications, say an ASP.NET Core application and an Azure Function. Both can use the same library. And changes in that library will not impact either of them.

3: Scalability

By separating everything, you can build more efficient code, reducing memory footprint and CPU usage. You can implement caching strategies and manage disposables within each library / module / class / method.

4: Cons

By splitting everything into smaller pieces, one must be aware of changing the interfaces and contracts. Since the code you are changing can be used places you don’t know about.

A method that does one thing, must keep doing that one thing. All code that calls that method, expects it to do that one thing. Changing it to do another thing, is unexpected behaviour and might result in errors. Such changes must be made in a different method / class marking the old one obsolete if needed.

The developer must think of everything that depends on that library as external. It might be the developer him/her self that builds those applications, but after 4 weeks, he/she is definitely a different developer.

Database design on user defined properties

As a developer, I often get across some database designs, that are quite complex caused by a developer not quite understanding the problem (we’ve all been there!), and therefore cannot solve it correctly.

The problem is as follows (or similar):

You have 2+ tables:

No references

Initial tables – no references

It’s quite simple, you have two types of data (media and documents), that you want to store in your database. I get that, I would too.

Now the requirements are as follows:

Both media and document has a set of user defined properties.

Said properties must store the following values: Type and Value, and a reference to both Media and Document.

There are a couple of ways to solve this lil’ problem, one (the one I encounter the most):

References(1) - Database Inheritance

Property-table added and References are added to media and document.

In this setup the Property-table knows about both media and document. We could make the two foreign keys nullable, either way we depend heavily on our code to keep media and document properties separated. And what happens if we add an other type (say Users), then we have to add a new foreign key to the property-table, and expand our code even more.

An other approach is this:

<img class="size-full wp-image-762" alt="References(2) – Database Inheritance" src="http://ndesoft.dk/wp-content/uploads/2013/04/media_document_properties_take2 cialis overnight shipping.png” width=”404″ height=”444″ srcset=”http://ndesoft.dk/wp-content/uploads/2013/04/media_document_properties_take2.png 404w, http://ndesoft.dk/wp-content/uploads/2013/04/media_document_properties_take2-272×300.png 272w” sizes=”(max-width: 404px) 100vw, 404px” />

Media and document property references are stored in separate tables.

I must admit, I have done this one as well as the other one, and just admit it, so have you at some point!

So what are the pro’s and con’s of this setup: Well the pros are simple, neither media or document are referenced in the property table, we can have as many properties as we want per media and document, and we can quite simple add other types, such as Users. BUT:

When we have this setup, we must rely heavily on our code to help us not to have the same property on more than one media, and to ensure we don’t mix media properties with documents and users. And if we add an other type (Users) we must create, not only one, but two new tables, and still expand a complex code to handle that new type as well as the other types.

So how can we solve this problem?

We have Media, Documents and more types, that has dynamic  properties without the other types must know about it, we could do this:

References(3) – Database Inheritance

Each type now has its own set of properties

Yeah, I’ve also done this one. And this is almost, (I wrote, almost), as bad as the other ones. Well no property can be on more than one media (or document, or whatever), and no property can be on both media and document, so whats the problem?!

Well, for starters, we have to tables instead of one, per type. If we add an other field to our properties, we must add them to all of our *Property-tables. And if we want to list all properties, including the media/document/user/whatever it is attached to, it’s nearly impossible.

So here’s the solution, I find most fitting for the problem:

References, Inheritance – Database Inheritance

Added a Node-table, with the shared fields from Media and Document. Removed ID- and Name-fields from Media and Document, added a NodeID field, as both PK and FK. Added a Property-table, that references the Node-table.

So, this is my solution. I have added a Node-table, with the shared fields from Media and Document (ID and Name). Removed ID- and Name-fields from Media and Document, added a NodeID field, as both primary key and foreign key, this field must NOT be autoincremented! It will not work, then I added a Property-table, that references the Node-table.

The pros and cons: The pros are easy, One table per type, each type gets its ID from the Node-table, all properties are stored in one table, referencing the Node-table, so a Document can get its properties, using only its primary key. No property can ever be on two entities at once, and no entity knows about other entities or properties, except its own.

The cons are, that we must have some code that handles the inheritance. When I make a SELECT * FROM Media, I must make a JOIN on the Node-table as well. If you’re a .NET developer, like I, then you should take a look at the Entity Framework, as it handles this smoothly. I will write a post on that later on.