Increasing observability of microservices

Software engineering is an evidence-based practice. We are always seeking facts and logically sequencing them to generate knowledge. And yet, sometimes we are completely oblivious to the state of our system in production. No amount of testing can guarantee 100% bug-free software. Things will go wrong. The real test of a system begins when it is deployed to production. Are we prepared for that?

Typical Enterprise Experience

How often are changes pushed to production without fully collecting data around the issue? We wait to make a change until someone complains. Then the investigation begins. We look at the logs. We "fix" the problem by making code changes, throwing more hardware at it, etc., and life moves on. How reactionary is this!

Then someone says, "We need monitoring!" That's usually driven by some compliance requirement. Tools get installed and gather information about CPU and RAM usage. People look at the spikes, complain, log a ticket, nothing happens, and everyone moves on.

Then there are alerts. Something goes wrong and a million of them fill up people's inboxes, even during a planned outage. They quickly become annoying. They get compared to a fire alarm but are anything but: unlike a fire alarm, they go off in the middle of the problem instead of at the first sign of smoke. What's the point of a loud alarm in the middle of a raging fire? People create rules to ignore them and move on.


Sometimes teams also gather a service's uptime using heartbeat reports, with some nice charts put up on a TV. Is that enough? Can a service returning 200 OK with a perfect heartbeat (i.e. it is up) actually perform its critical actions? I also hear about not making systems any slower than they are today, yet nobody can tell me what slow or fast means.
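
To make that concrete, a health endpoint can exercise a critical dependency instead of only reporting that the process is up. Here is a minimal, hypothetical sketch; the controller, routes, and MyDbContext are made up for illustration, and the real check should exercise whatever your service's critical action is:

[Route("health")]
public class HealthController : Controller
{
    // Shallow heartbeat: only proves the process is running.
    [HttpGet("ping")]
    public IActionResult Ping() => Ok("up");

    // Deeper check: proves a critical action (here, a database read) still works.
    [HttpGet("deep")]
    public async Task<IActionResult> Deep()
    {
        try
        {
            using (var db = new MyDbContext())
            {
                await db.Orders.AnyAsync(); // exercise a real dependency, not just the web process
            }
            return Ok("healthy");
        }
        catch (Exception ex)
        {
            return StatusCode(503, "unhealthy: " + ex.Message);
        }
    }
}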

All of this points toward a lack of value provided by typical monitoring in the enterprise. These systems were put in place because somebody wanted them, or they came in as part of some compliance requirement. They lack purpose. So, how do we fix this? How do we change our systems to make them more observable?

What is Observability?

Wikipedia says:

Observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.

Now, if we apply this to some of the scenarios we saw above, we can see how incomplete the external outputs tend to be. They don't scale very well either.

Applying to Microservices

In the distributed systems/microservices world, we are dealing with eventual consistency, where data may not be in an identical state everywhere at any given moment. We are dealing with services that change constantly, each at its own pace. Sometimes certain functionality is simply not available. All of this can make triaging problems tricky.

Increasing Observability

Let's see how to achieve this:

1) Come up with service level objectives and indicators: This helps us put together a clear and practical definition of what a healthy service looks like. Google's Site Reliability Engineering book has good advice on how to put these numbers together and how to avoid the pitfalls. Afterwards, we can gather actual numbers and build intelligence around them.

2) Log aggregation: Most software produces a lot of logs. These logs can be aggregated into one place, such as Kibana, Splunk, or Honeycomb. If we can make holistic sense of these logs, we are in good shape.

3) Traceable individual event lifecycles: This can be achieved by stamping events with correlation ids and their source of origin, and logging those values. Messaging headers can pass this information around; NServiceBus supports custom headers for exactly this (see the sketch after this list).

4) Business goal tracking: We need to gather users' activity in our application so that we can figure out the importance of features based on their usage. A/B testing can provide great insight here. Optimizely is one such tool that I have seen used effectively.

5) Host performance tracking: There is value in gathering information about the host running our service. However, to make it effective, we should produce enough log information that any anomaly can be tracked down to the piece of code that caused it. Otherwise, issues surfaced this way are hard to recreate and resolve.

6) Meaningful notifications: I'd create fewer but more meaningful notifications. My proposed principle is to generate a notification when something is about to go wrong, so that people can take action before it escalates.

7) Custom approach: Sometimes there is no getting away from writing a few custom lines of code somewhere to proactively capture the data you need. In some of my systems, I had to take this approach to collect what I needed. It worked out better than customers telling us about our problems.

8) Automation: If someone must manually hit a button, that's not good. It is not going to happen; it will be ignored quickly. We need to build a culture of automation.

9) Autonomy: The monitoring itself should neither bring the system down nor go down with the system. In the past, one of our application servers slowed down because the monitoring tool was consuming too many resources.
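
Here is a rough sketch of point 3 above using NServiceBus custom headers. The header names, the PlaceOrder message, and the endpointInstance variable are made up for illustration; only SendOptions.SetHeader and context.MessageHeaders are actual NServiceBus APIs:

// Sender: stamp a correlation id and origin on the outgoing message.
var options = new SendOptions();
options.SetHeader("MyApp.CorrelationId", Guid.NewGuid().ToString());
options.SetHeader("MyApp.SourceOfOrigin", "OrderWebsite");
await endpointInstance.Send(new PlaceOrder(), options);

// Receiver: read the same headers back in a handler and include them in every log line.
public class PlaceOrderHandler : IHandleMessages<PlaceOrder>
{
    public Task Handle(PlaceOrder message, IMessageHandlerContext context)
    {
        var correlationId = context.MessageHeaders["MyApp.CorrelationId"];
        var origin = context.MessageHeaders["MyApp.SourceOfOrigin"];
        Console.WriteLine($"Handling PlaceOrder {correlationId} originating from {origin}");
        return Task.CompletedTask;
    }
}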

Avoid these traps

Cargo cult: Blindly following practices that worked for one successful organization may seem to work out in the beginning, but it can backfire and introduce unnecessary complications. Next time you find yourself saying "Netflix did it that way" or "Amazon does it this way", stop and check whether it really applies to you. Some of those ideas are great and noteworthy, but they still need to be evaluated for validity in your situation. Every environment is different.

Another form of cargo cult is "We have always done it this way" or "This is our pattern." The fix is the same as above: always evaluate whether past lessons still apply or whether we can do a better job this time around.

Tool obsession: Whenever I have had conversations around monitoring, within a few minutes they turned into a battle of the different tools out there. Yes, the capabilities of tools can give us ideas, but before spending too much money on them, we need to get the basics right and evaluate whether those tools can actually help us.

Information I found useful

Sam Newman, in his Principles of Microservices talk, briefly touches on observability as one of the core principles of building microservices. Charity Majors goes into a lot more detail in her Monitoring is Dead talk. Her book Database Reliability Engineering also has a couple of good chapters relevant to monitoring. Google's SRE book is another good source, as mentioned above. Microsoft's best-practices guidance and the Practical Monitoring book take you deeper into monitoring.

Parting thoughts

With increased observability, when someone says our system is slow, we can back that up with information about how slow it is and data pointing to the cause. This can be incredibly powerful in fixing problems as they come, resulting in a more resilient system and a positive impact on the business.


Message store clean up strategy

In the previous post, we saw how to use a message store to make a receiver idempotent. Since we are using a table in SQL Server, it can fill up quickly if multiple types of messages are being stored. How do we keep it from filling up?

Instead of relying on a strategy outside our endpoint, I decided to use NServiceBus scheduling. I configured it to send a message every 24 hours. My send code looks like this:

await endpointInstance.ScheduleEvery(
    timeSpan: TimeSpan.FromHours(24),
    task: context =>
    {
        var message = new CleanupStoreMessage();
        return context.Send(message);
    })
.ConfigureAwait(false);

The CleanupStoreMessageHandler then picks it up. The handler looks back over the configured number of days and removes anything older from the store. That code looks like this:

 var messagesToRemove = dbContext.MyMessageStore.Where(
     m =>  m.Timestamp <= DateTime.UtcNow.AddDays(-10));

 dbContext.MyMessageStore.RemoveRange(messagesToRemove);
 dbContext.SaveChanges();
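
For context, that query sits inside the handler for CleanupStoreMessage. A minimal sketch of the whole handler might look like the following (MyDbContext is a placeholder for whichever EF context owns the store):

public class CleanupStoreMessageHandler : IHandleMessages<CleanupStoreMessage>
{
    public Task Handle(CleanupStoreMessage message, IMessageHandlerContext context)
    {
        using (var dbContext = new MyDbContext())
        {
            // Remove everything older than the retention window (10 days here).
            var messagesToRemove = dbContext.MyMessageStore.Where(
                m => m.Timestamp <= DateTime.UtcNow.AddDays(-10));

            dbContext.MyMessageStore.RemoveRange(messagesToRemove);
            dbContext.SaveChanges();
        }

        return Task.CompletedTask;
    }
}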

NServiceBus scheduling comes with several caveats. A relevant one is below:

Since the Scheduler is non-durable, if the process restarts, all scheduled tasks (that are created during the endpoint's startup) are recreated and given new identifiers.

To get around this, my strategy is to keep the TimeSpan for ScheduleEvery small enough that the task executes at least once a day. Since the query looks back a number of days from the day of execution, receiving more than one message is harmless: the later messages simply won't find any data to clean up, and that's fine. My query goes back 10 days and I am sending the CleanupStoreMessage every 24 hours. If I reduce the TimeSpan to 12 hours, I will get two messages per day, and the second one may not find anything to clean. This way my handler has a greater chance of executing, mitigating the limitation.

Another advantage is that I am cleaning up my own data without relying on a clever database archiving strategy or some other external process to clean it up. I am less likely to forget about it, and it requires less maintenance.


Idempotent receivers using message store

In microservices/distributed systems, messaging is a preferred form of integration. I covered some of the reasons in one of my previous posts. In these types of systems, whether we are doing pub-sub, event notification, or sending commands, it is a good practice to make the receiving endpoints idempotent. That simply means the endpoint can receive the same message multiple times without ill effects.

One way of ensuring this is through the design of the message body itself. For example, in an accounting system, instead of sending a message to deduct the withdrawn amount, we can send the resulting total. To explain further, if I had $100 in my account and I took $20 out, sending a message that sets the total to $80 is naturally more idempotent than sending a deduct-$20 message.
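
To make the difference concrete, here is a rough sketch of the two message shapes (the class and property names are made up for illustration):

// Naturally idempotent: handling this twice still leaves the balance at 80.
public class SetAccountBalance
{
    public Guid AccountId { get; set; }
    public decimal NewBalance { get; set; }   // e.g. 80
}

// Not idempotent on its own: handling this twice deducts 40 instead of 20.
public class DeductFromAccount
{
    public Guid AccountId { get; set; }
    public decimal Amount { get; set; }       // e.g. 20
}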

However, there are times when this is not possible. In the simple accounting system described above, to send a set-the-total-to-$80 message, the sender needs to know the current state of the destination. This brings in a whole new set of complexities. What if this is a pub-sub/event notification system? Are we going to keep track of everything happening in the subscribers? Even if this were a command, we would need to be sure of the current reality of the destination. That means increased coupling between the source and the destination. We are also increasing the importance of message ordering, and the bugs around that are never fun to resolve.
Inadvertently, we could be pushing this distributed system to be more consistent, and that is always challenging given the trade-offs described by the CAP theorem. I highly recommend reading everything around this topic in Designing Data-Intensive Applications.

So, it sounds easy to say "just send a set-the-total-to-$80 message", but reality is often a little more complex than that, especially when we need the system to be more available than consistent, or when we are dealing with an event notification system with many subscribers. In those situations, it can be more practical to send a deduct-by-$20 message. Now what?

In my system, I am working with NServiceBus using RabbitMQ as the transport. NServiceBus provides Outbox functionality. It looks good, but it comes with some caveats.

Because the Outbox uses a single (non-distributed) database transaction to store all data, the business data and Outbox storage must exist in the same database.

I can't guarantee this in my system. Also, what if I want to use a NoSQL store for this?

The Outbox feature works only for messages sent from NServiceBus message handlers.

This is very limiting. RabbitMQ is a very capable queuing system; it comes with easy-to-use clients, and enqueuing a message directly is common.

Enter the simple message store approach: we store the messages we have processed. In my case, the model looks like this:

public class MessageStore
{
    public Guid MessageId { get; set; }
    public Guid GenericIdentifier { get; set; }
    public DateTime TimeStamp { get; set; }
    public string MessageType { get; set; }
}  

I can use any type of data store for this. I used SQL Server.

Now, in my NServiceBus message handlers, I can access this store as needed. To check whether a message was already processed, I can do a simple lookup based on the message Id. Duplicate deliveries can happen for several reasons; for example, we can fetch a message from the RabbitMQ queue and process it, but the positive ack never reaches the cluster because of a network blip, so the message is delivered again.
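
That lookup can be a minimal sketch like the one below, run at the top of the handler (it assumes the handler has access to the dbContext and that message ids are GUIDs, as in the model above):

// Skip the message entirely if we have already recorded it as processed.
var incomingMessageId = Guid.Parse(context.MessageId);
var alreadyProcessed = dbContext.MessageStore
    .Any(m => m.MessageId == incomingMessageId);

if (alreadyProcessed)
{
    return; // duplicate delivery, nothing more to do (assumes an async Handle method)
}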

Often, especially in data-update scenarios, we want to discard a message if we have already processed a more recent update. This can happen if a message is going through retry logic and, before it comes around again, a more recent update arrives and gets processed. In that situation, I can do a check like this:

var topProcessedMessage = dbContext.MessageStore
    .Where(m => m.MessageType.Equals(nameof(MyMessage))
        && m.GenericIdentifier.Equals(message.MyRecordIdentifier))
    .OrderByDescending(m => m.TimeStamp)
    .FirstOrDefault();

I can compare the TimeStamp of that stored message against the incoming message and discard the incoming message if its timestamp is earlier than the stored one. Otherwise, I process the message and store the necessary values in the MessageStore.
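
Putting it together, the rest of the handler might look roughly like this (ApplyUpdate stands in for whatever work the handler actually does, and the incoming message is assumed to carry a Timestamp):

// Discard the incoming message if we already processed a more recent update for this record.
if (topProcessedMessage != null && message.Timestamp <= topProcessedMessage.TimeStamp)
{
    return; // stale update, safe to ignore
}

// Otherwise do the real work, then record the message as processed.
ApplyUpdate(message);

dbContext.MessageStore.Add(new MessageStore
{
    MessageId = Guid.Parse(context.MessageId),
    GenericIdentifier = message.MyRecordIdentifier,
    TimeStamp = message.Timestamp,
    MessageType = nameof(MyMessage)
});
dbContext.SaveChanges();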

This way I only worry about what's relevant for my subscriber/handler, and I can easily discard already-processed and stale messages. If multiple messages update the same entity, I don't have to clutter it with many different LastUpdated timestamp columns. The store can easily be expanded to hold serialized messages, checksums, etc. Since we are not adding this to the pipeline, the behavior stays optional rather than applying to all handlers by default.

The downside of this approach, of course, is that the store can quickly get out of hand in terms of the number of records. We will see how to handle that situation in the next post.


Why should you avoid get calls across microservices?

In the previous posts in this series, we saw what microservices are and how to start the journey from a monolithic application towards broken-out services.

In the second post, I talked about not having get calls across microservices. On that, I received some questions. Here's one:

"Nice article on microservices. But I did not get the reasoning behind 'You shouldn't make a query/get call to another microservice.'"

My reply to that is below:

As a service, you should have your own store for the data you need. If you don't, what happens when the service you're making get calls to goes down for an extended period of time? You are blocked. The function that depends on it now fails. What if you have multiple such calls? You have just multiplied your reasons to fail by that factor.

To explain this further, too many blocking get calls can result in a system that stalls completely. What if the individual calls have different SLAs? Do we block the entire request on the slowest call?

One of the projects I worked on in the past had these types of calls spread throughout the system. In the worst cases, the HTTP requests took well over a minute to return. Is that acceptable?

To counter this, the team went towards caching entire relational data stores. Pulling all that information down became very tricky. The data sync happened once a month, and the sync job itself took days to finish, pushing the changes even further back. I have seen many project teams go towards this type of solution, eventually run into the scale problem, and then settle on syncing the data using messaging.

So, what is the trade-off? I explained it like this:

The data duplication that may come with this is an acceptable trade off for scale. In other words, you are giving up some consistency and moving towards eventual consistency for scale and availability.

One of the teams I worked with did a good job differentiating the types of messages. Some changes had to be pushed through at a higher priority, some required a little more processing, and so on. We built different routes for those messages so that we could maintain our SLAs. The data was still cached, but we were able to get rid of the sync jobs that tried to sync all the data at once. Messaging enables many different patterns; I recommend the Enterprise Integration Patterns book, which goes deep into them.

I used this example afterwards:

For a banking app, you can say there is $100 in your account as of the last sync time, instead of showing $0 because you can't hit that service. People are much more likely to freak out in the latter scenario than the former. This method scales too, since we have reduced the consistency level to eventual. It sounds radical, but you are trading strong consistency for scale and user experience. I hope that clears things up for you. I think I should write another post on this :)

I went on to express the points below:

Another reason is the network. You can't take it for granted. In any of these situations, you don't want your customers to be affected by this implementation detail. The way to avoid that is by building your own store beforehand and keeping it synced through messaging.

You don't really have a distributed system if it is not resilient to network loss. This needs to be considered on day one. Your customers are not going to care that your network failed; they want your app to work.

If your system needs too much cross-service chatter, it could be a sign of wrong service boundaries. It means the services are too fine-grained and could be suffering from the nanoservices anti-pattern. It is a subtle problem; the SOA Patterns book goes into the details. Some of the services may need to be merged.
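
As a rough sketch of what "build your own store and keep it synced through messaging" can look like in an NServiceBus subscriber, a handler can maintain the local copy from events instead of making a get call at request time (ProductPriceChanged, LocalProductStore, and Product are made-up names):

public class ProductPriceChangedHandler : IHandleMessages<ProductPriceChanged>
{
    public async Task Handle(ProductPriceChanged message, IMessageHandlerContext context)
    {
        using (var db = new LocalProductStore())
        {
            // Upsert the local copy so reads never need a cross-service get call.
            var product = db.Products.SingleOrDefault(p => p.Id == message.ProductId);
            if (product == null)
            {
                db.Products.Add(new Product { Id = message.ProductId, Price = message.NewPrice });
            }
            else
            {
                product.Price = message.NewPrice;
            }

            await db.SaveChangesAsync();
        }
    }
}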

The Netflix approach

Netflix goes the hybrid route: they hit the local cache store first, and if it is not available, they fall back on cross-boundary calls. Josh Evans explains this very eloquently in this talk.

To summarize, we are giving up some consistency for autonomy. We can ease the consistency problem with data pumps or sync jobs, but autonomy has to remain as strong as possible. Without autonomy, you won't realize any of the benefits of moving towards a microservices architecture.


Sending JSON message natively to RabbitMQ for NServiceBus subscribers

I was working on putting together some sample code to demonstrate what data synchronization services might look like. I will cover the details and purpose of this in a later post. The TL;DR version: a trigger fires when a new row is inserted into a table and inserts a row into the NServiceBus queue table in the same database; a handler then picks up that message and puts it into RabbitMQ directly.

I followed the NServiceBus documentation for sending a message natively to RabbitMQ. Their example uses XML; in my case, I was working with JSON. After pushing the message onto the queue, I ran into a problem on the subscriber side. Here's the error:

 2017-08-29 15:45:48.269 ERROR NServiceBus.RecoverabilityExecutor Moving message 'd38b59e7-6680-416e-a679-d9a016972240' to the error queue 'error' because processing failed due to an exception:
NServiceBus.MessageDeserializationException: An error occurred while attempting to extract logical messages from transport message d38b59e7-6680-416e-a679-d9a016972240 ---> System.Exception: Could not find metadata for 'Newtonsoft.Json.Linq.JObject'.
Ensure the following:
1. 'Newtonsoft.Json.Linq.JObject' is included in initial scanning. 
2. 'Newtonsoft.Json.Linq.JObject' implements either 'IMessage', 'IEvent' or 'ICommand' or alternatively, if you don't want to implement an interface, you can use 'Unobtrusive Mode'.
 at NServiceBus.Unicast.Messages.MessageMetadataRegistry.GetMessageMetadata(Type messageType)
at NServiceBus.Pipeline.LogicalMessageFactory.Create(Type messageType, Object message)
at NServiceBus.DeserializeLogicalMessagesConnector.Extract(IncomingMessage physicalMessage)
at NServiceBus.DeserializeLogicalMessagesConnector.ExtractWithExceptionHandling(IncomingMessage message)
--- End of inner exception stack trace ---
at NServiceBus.DeserializeLogicalMessagesConnector.ExtractWithExceptionHandling(IncomingMessage message)
at NServiceBus.DeserializeLogicalMessagesConnector.<Invoke>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---

This was happening even before my breakpoint in the message handler was hit. I had Newtonsoft.Json referenced correctly. It was getting scanned as well. So, what was happening?

After feverishly scratching my head (and losing a little more hair) and some Stack Overflow searches, I started to look at my message on the queue. It looked like this:

{
    "FirstName": "Jane",
    "LastName": "Doe",
    "Id": "44ECBBFD-85B6-47C8-9E08-3DCF2633862C"
}

It looks like perfectly valid JSON. But when NServiceBus picked it up off the queue, it deserialized it as a JObject. Why? Because NServiceBus couldn't figure out the message type: the type information was not part of the message. The $type property has to be the first property, so the JSON should look like this:

{
    "$type": "Messages.NewUser",
    "FirstName": "Jane",
    "LastName": "Doe",
    "Id": "44ECBBFD-85B6-47C8-9E08-3DCF2633862C"
}

There are multiple options, such as using a JsonConverter or adding the type property to the message class. I went with a simpler route:

 var jObjectFromMessage = JObject.FromObject(message);
 var typeName = typeof(NewUser).FullName;
 jObjectFromMessage.AddFirst(new JProperty("$type", typeName));
 var serializedMessage = jObjectFromMessage.ToString();

And that worked! The complete handler code looks like below:

public async Task Handle(NewUser message, IMessageHandlerContext context)
{
    var factory = new ConnectionFactory
    {
        HostName = "localhost",
        UserName = "guest",
        Password = "guest"
    };

    await Task.Run(() =>
    {
        using (var connection = factory.CreateConnection())
        {
            using (var channel = connection.CreateModel())
            {
                var properties = channel.CreateBasicProperties();
                properties.MessageId = Guid.NewGuid().ToString();

                var jObjectFromMessage = JObject.FromObject(message);
                var typeName = typeof(NewUser).FullName;
                jObjectFromMessage.AddFirst(new JProperty("$type", typeName));
                var serializedMessage = jObjectFromMessage.ToString();
                var messageBytes = Encoding.UTF8.GetBytes(serializedMessage);

                channel.QueueDeclare("SyncUsers.RabbitMqEndpoint", true, autoDelete: false, exclusive: false);
                channel.BasicPublish(string.Empty, "SyncUsers.RabbitMqEndpoint", false, properties, messageBytes);
            }
        }
    });
}

That's it. I hope this helps if you are trying to send JSON messages directly to RabbitMQ for NServiceBus subscribers.

UPDATE 09/04/2017

Simon Cropp sent me a note on Twitter about the enclosed message types serialization header. I tried it out and that worked too. I removed jObjectFromMessage.AddFirst(new JProperty("$type", typeName)); and added the NServiceBus.EnclosedMessageTypes header to my message like this:

   var typeName = typeof(NewUser).FullName;
   var properties = channel.CreateBasicProperties();
   properties.MessageId = Guid.NewGuid().ToString();
   properties.Headers =  new Dictionary<string, object> {{"NServiceBus.EnclosedMessageTypes", typeName}};

This could be useful if you don't have control over the serialization.


Tackling a monolith - where to begin?

In the previous post, we saw what microservices are. We also looked at how they differ from the big monolithic apps.

In the startup/early days, you did a File => New Project in Visual Studio or your favorite IDE, designed a perfectly normalized SQL database, and brought in an ORM on top of it for reasons. It worked great for a while. You brought in teams of people to do heads-down development. The app has grown in size along with the user base and the org. With that, you are running into problems you didn't anticipate. The application feels slower, complaints about timeouts are coming in, and bug reports are increasing. The backlog of new features is starting to look bigger. The tech debt is out of control. Engineering teams are frustrated at best because they know there are problems, but their hands are tied. They have pushed some of the processing into scheduled jobs, but now those jobs are running into problems too.

The app is feeling like a giant monolith. The business functions are vertical in nature, but the supporting IT teams and solutions are cross-cutting. This has increased the coupling and complexity of the system. Even small changes require a huge app deployment lasting hours, which scares everyone. Delivery to production takes longer than ever, adding to the frustration. The current infrastructure has been pushed beyond its limits, and the question of scaling further is very real.

Breaking this app up makes sense based on all the material out there, but where to begin? How do we create microservices out of this while delivering the features that keep the business viable? Breaking up the monolith seems like a necessary step, but how do we justify the cost? How do we convince the business that this is not just another refactoring effort from the engineering team? Very real questions, and a very common scenario these days. Where do you begin when El Capitan of a monolith is staring you right in the face? I have been there. Let's take a look at some of the ways.

Take clues from refactoring practices

Over time, every codebase gains some tech debt. There are times when you deliberately punt on problems. Technology changes, the nature of the problems is never static, and yesterday's solution becomes obsolete. One approach I've taken is to refactor code just a little bit at a time, only around the area that your new feature code is going to touch. Of course, you need to coordinate this with your pull request reviewers so they aren't surprised by more code changes than necessary. This makes your code a little better today than it was yesterday. Since it is incremental and scoped to a small area, it can become the team's habit in a short amount of time. In contrast, I was never able to sell big-bang refactoring to the business or project management. Coincidentally, I found this supporting post from Steve Smith essentially stating the same thing. So, when it comes to tackling a monolith, it is essential to take an incremental approach instead of a big-bang one.

Business Alignment

Another step that can be done in parallel is to really understand the business domain from the perspective of bounded contexts. We can start isolating business processes so that they don't leave their boundaries. This is not a trivial task; it often involves many different people from the org and deep knowledge of the existing processes. If the app is constructed in vertical slices instead of layers, as shown by Jimmy Bogard in this talk, our app is ahead of the game.

One reason we want to do this is to understand the misalignment that horizontal layering has created throughout the organization in terms of teams and functions. Cross-cutting services for everything may seem appealing at first because of code reuse, upfront time savings, and cost, but they insidiously increase tight coupling. That happens because the reason for change is not considered. Over time, the gap widens through the natural progression of business processes against the horizontal structure of the supporting teams and applications. This leads to too much cross-communication between services. If those services are disjointed, the system can end up incredibly chatty.

Another big purpose behind this is to get organizational buy-in for this type of project. It is important for the business to understand why vertical slicing is necessary, how much it currently costs to deliver a piece of code, how this can directly improve delivery, and the possibilities it opens up from the end user's perspective. It can also show them places to improve efficiency by cutting down redundancies.

After identifying these areas, we want to establish the communication patterns between the different pieces and start putting service level agreements around them. For example, when a user buys an item on an ecommerce site, do we show them a spinner until it ships, or is it OK to send them an email saying we will send updates as the order goes through the different steps of processing?

Choosing the area

It is very important to have monitoring in place for a variety of reasons. In this case, we can generate a heat map to identify which parts of the app are heavily used compared to others. This can help us assign weights to these areas. The highly used and slowest-performing areas can be a good starting place.

Another thing to consider is how many new features are coming down the pipe and whether the system is still under active development. If there is too much happening at once, changing the architecture may delay delivery; the real answer depends on the type of changes and how much time we can buy from the business. If the system is no longer under active development, it could be a candidate for retirement, or be isolated enough for near-future retirement, and is therefore not a good candidate for breaking up.

Remote procedure calls

This is another area that gets in the way of scaling apps. You don't want to create an accidental DDoS attack on someone's service because you spawned hundreds of container instances that call it directly in-process. Also, the network shouldn't be taken for granted. This is a good area to start analyzing right away. Eliminating read RPC calls may not be a trivial task, but identifying them is an important step. Our app is not going to scale very well if we hardcode the calls to other services. Service discovery with Consul can help here; we will see the details on Consul in later posts.
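
Just as a taste of what that looks like, a minimal lookup against Consul's HTTP health API might resemble the sketch below. It assumes a local Consul agent on the default port 8500 and a made-up service named "orders"; Newtonsoft.Json.Linq is used for parsing since it already appears elsewhere in these posts:

// Ask the local Consul agent for healthy instances of a service instead of hardcoding a host.
using (var http = new HttpClient())
{
    var json = await http.GetStringAsync(
        "http://localhost:8500/v1/health/service/orders?passing=true");

    // The response is a JSON array; each entry carries Service.Address and Service.Port.
    var instances = JArray.Parse(json);
    var first = instances.First;
    var address = (string)first["Service"]["Address"];
    var port = (int)first["Service"]["Port"];

    Console.WriteLine($"Calling the orders service at {address}:{port}");
}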

Shared relational databases

One of the most problematic and yet most common integration patterns is the shared relational database. It creates coupling that becomes very hard to remove. Once we start introducing foreign keys, it becomes even harder to scale. Is it fair to say the degree of normalization is inversely proportional to the ability to scale the database? Even if some magical indexing strategy makes the calls very responsive, its maintenance is cumbersome at best. I recommend this talk by Pat Helland; he has also written papers in this area. Knowing this, we can start identifying some of the most heavily used tables, their relationships, and how applications use them. This sort of analysis can give teams ideas for reducing dependencies on these tables and setting them up for a complete break-up in the future.

Containers

Before jumping too far into the container world, you will have to increase organizational awareness and education around containers. This is especially true in Windows environments, since most containers come from the Linux world. How do we leverage them? One place to try them out is in your build and test systems, as mentioned in this .NET Rocks episode. The risk there is lower, and a faster build and test pipeline helps deliver the software quicker.

Parting words

All of these can be done in parallel, with the single goal of identifying areas that could get in the way of breaking up the monolith. After identification, teams can slowly start making changes in those areas for some immediate cheap wins. I hope this helps.
