Writing Effective Commit Messages
Committing code is an integral part of life as a software developer. In the last eight months alone, I’ve committed nearly 800 times to my work repository. Yet, throughout my professional career, commit commenting style has been rarely discussed. It’s easy to brush away the task of describing your work with ineffective messages like, “Fixed it”. Even worse is to commit without a message. What is a better way to go about writing commit messages?
Knowing Your Audience
Commit messages, like other forms of writing, require you to know your audience. Your audience is anyone on your team, including yourself, at the present time and at any time in the future. To further complicate matters, a majority of the people reviewing your commit are likely unfamiliar with certain regions of the codebase. Given these considerations, how can you structure your messages effectively? I structure my messages by asking two questions:
- What have I done?
- Why did I do it?
The response to the first question typically falls into one of three categories: (1) adding/updating/removing a feature, (2) fixing a bug, or (3) refactoring existing logic. To be effective, this response must be a single succinct and expressive sentence. The response to the second question is the opportunity to elaborate on the change’s intent and context. Until recently, this is a question that I did not ask myself. This led to situations where I could easily understand what had changed in a commit, but it would be difficult or impossible to understand why I had made a change. You do not want to end up in this position while tracking down bugs. Help your future self and others by explaining the rationale for the change. I have found my intent typically revolves around one of two broad categories: (1) business decisions or (2) side-effects.
Example Commit Messages
Let’s look at a few trading-related examples from my recent work to see how this works in practice. Each example is composed of a single sentence explaining what was done, followed by a newline and additional elaboration about why a change was made.
Prevent account balance updates out of chronological order.
Enforce that the previous account balance timestamp must match the preceding trading day’s timestamp. This defends against updating the account balance when trades have not been loaded for all prior trading days.
This example falls into the bug fix category. Notice how an action verb begins the summary sentence explaining what was done. This pattern continues throughout all the examples. Although it might be straightforward to understand why out-of-order updates are a bad idea, consider reading this commit among dozens of other commits. Explaining the change’s context reduces the cognitive load required to understand this change.
Warn about stale market data after two minutes.
Previously, the system waited ten minutes to warn in order to avoid issuing many warnings during intraday trading breaks. Since all trading breaks are now modeled within MarketHours, there is no reason to keep the warning threshold at ten minutes.
Here is an example of updating a feature with an extremely straightforward change that carries significant context. You can imagine this change’s diff consists of changing a ‘10’ to a ‘2’. But why was this value changed at this time? Without the second part of this commit message, you need to remember why the value was set to ten minutes and what changed in the system to allow you to change the value to two minutes. Now, at a bare minimum, you know that a change in the modeling of trading breaks is why this change is now possible. Again, consider a commit log with only the first sentence and one with the entire message. Which would you find more helpful?
Default to requiring 1+ trades before committing transactions to database.
This change prevents the account balance from being updated when there are zero trades available due to a technical error. An override flag is provided in the event that it is not an error to have zero trades.
To drive home the point that there are often external reasons for a change, the example above shows another straightforward change. Without reading why the change was made, what would you guess is the reason for the change? Could it be a performance issue? Perhaps business requirements have changed? It’s unclear to me. The second part of this message suggests that the change is due to business requirements. The context indicates that it is likely a technical error (e.g. a remote service was unavailable) to have zero trades available when updating an account balance.
Let’s presume you discovered a bug in this change. Without having any context provided in the commit message, you might assume there is no need for this one or more trade requirement. You may ultimately end up reverting this entire change without reworking the logic to satisfy the business requirement. Now, you ‘fixed’ the bug, but are failing to fulfill your business requirements.
Practice, Practice, Practice
I hope your takeaway from this post is a realization that adding more context to your commit messages will likely simplify your life and your team members’ lives in the future. As you practice answering the two questions when writing your commit message, you may find situations where added context is unnecessary. That’s fine. Not all changes require in-depth explanations. Identifying the right level of detail becomes easier with practice. As you become better at identifying when more detail is needed, you will discover that reviewing your commit messages can be informative rather than mysterious.
Preparing for Live Tests
When you Can’t Drop the Database
Like high-level requirements, live testing goals appear deceptively simple. “Exercise the trading API implementation with my trading partner” is a single statement that to the trained eye reveals numerous scenarios. Ideally, you already considered most scenarios and wrote automated tests to verify functionality. But, unlike sandbox testing, when something goes wrong in a live testing scenario, clearing a database probably won’t fix anything.
Consider the situation I found myself in as I was testing my trading API implementation. I am on a conference call with my trading partners preparing to manually submit trades. I submit a trade that should not be filled when I am told, “That order shouldn’t have been filled, but you were executed. Close your position immediately!” I need to act quickly. Real money is on the line and people from another company are awaiting my response.
I admit this is a dramatic example. In live testing scenarios, I tend to be a bit tense and nervous. Even with a suite of automated tests, I still fear something will go wrong. I don’t do my best thinking under the gun, so I prepare plans for handling unexpected outcomes in advance. This helps keep me calm, and in the event something goes wrong, I will hopefully have a fallback available.
How to Prepare
Reflecting on the different areas of preparation led me to the following categories: communication, flexibility, and instrumentation. Let me detail questions I answer in each category to determine readiness.
Communication
Have I explicitly stated the test scope?
While the purpose of a test may seem obvious to you, the third party may be unaware of special cases you wish to exercise. Express your test cases up-front to avoid discovering during the test that a subsequent test is needed for a special case.
Has the third party confirmed that it can take corrective action in the event of X?
X is a disaster scenario where you lose control to fix the situation. For example, if an order erroneously executes and you lose network connectivity, it is imperative that the third party can close a position on your behalf.
Flexibility
How easily can my test harness test each known scenario?
Using the trading example, it is helpful to be able to interactively specify all order properties on the fly. A flexible test harness becomes increasingly important when live tests are hard to coordinate.
In the event of an emergency, what is the fastest way to deploy a new build?
Ideally this option is rarely or never used. But, in the event that you need to build a kludge for testing purposes, knowing the deployment process will minimize delays.
Instrumentation
When I perform test case X, what information will the third-party want me to confirm?
Asking this question of each test scenario is a good way to ensure all the requisite information will be on-hand.
If X goes wrong, what information would help me debug?
When a test scenario goes awry, it is extremely discomforting to discover a piece of relevant information is not visible.
Are log levels configured appropriately?
This sounds trivial, but I’ve missed this item before and nullified all my efforts to view my debug statements.
And a useful tip for testing on remote machines: use Screen! This way when you lose network connectivity at the worst possible moment, your foreground process continues running and you can reconnect and resume as if nothing happened.
I learned some of these lessons the hard way. Learn from my experiences so you don’t fall prey to the same mistakes!
How do you Represent a Price?
Source code shown in examples is available on GitHub.
If you answered, “double” or “BigDecimal”, I think you can do better. These data types capture the value of the price, but what do they say about its units? Absolutely nothing! I believe this is a big problem. How can you prevent a price expressed in U.S. Dollars from being added to a price expressed in Euros? Naming conventions, like priceInDollars, provide zero enforcement of a policy and result in unwieldy variable names. Ideally, the solution to this problem will provide compile-time safety. Compile-time safety ensures that there will never be an operation that violates dimensional analysis. But, how can we do it?
Case Classes
One approach is to construct a case class for each unit of a price. A trivial example may look like:
Case classes representing prices
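The original listing is not reproduced here, so the following is only a minimal sketch of what the case class approach might look like. The names UsdPrice and EurPrice and the choice of operations are assumptions made for illustration.

```scala
object CaseClassPrices {
  // One wrapper class per unit (hypothetical names). Each class must
  // re-implement every arithmetic operation it wants to support.
  final case class UsdPrice(value: BigDecimal) {
    def +(other: UsdPrice): UsdPrice = UsdPrice(value + other.value)
    def -(other: UsdPrice): UsdPrice = UsdPrice(value - other.value)
  }

  final case class EurPrice(value: BigDecimal) {
    def +(other: EurPrice): EurPrice = EurPrice(value + other.value)
    def -(other: EurPrice): EurPrice = EurPrice(value - other.value)
  }

  // Same-unit arithmetic compiles.
  val total: UsdPrice = UsdPrice(BigDecimal(5)) + UsdPrice(BigDecimal(3))

  // Mixing units is rejected by the compiler, because UsdPrice.+ only
  // accepts another UsdPrice:
  // val broken = UsdPrice(BigDecimal(5)) + EurPrice(BigDecimal(5))
}
```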
This approach provides the compile-time safety we want, but it is unwieldy for several reasons. Its usage requires an invocation of the ‘value’ member, which adds noise to the code. Each new unit of price requires creating a new class and then implementing every operation (e.g. addition, subtraction, etc.). Fortunately, in Scala, it is possible to leverage language features to come closer to the goal without so much overhead.
Tagged Types
In Scala, there is a concept of ‘tagging’ a type to add context to its meaning that is enforced at compile-time. I originally came across this concept in Miles Sabin’s gist, which I use for the examples below. Abstractly, tagged types make use of Scala mixins and structural typing to enable one type, U, to be attached to a type T, such that the API of type T is still accessible. If we are representing a price with tagged types, then T is BigDecimal and type U is the unit of the price.
This is a powerful concept because defining a new unit of price (e.g. U.S. Dollars or Euros) will only involve defining a new type. Using this new type to ‘tag’ a BigDecimal exposes all of the operations (i.e. the API) of BigDecimal. Let’s take a look at a concrete example:
Tagged types representing prices
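The original listing is not reproduced here. As a hedged sketch, the tagging machinery below follows the commonly cited shape of the tagged-type encoding for Scala 2, while the calculateProfit signature (including the quantity parameter) and the Usd and Eur traits are assumptions made for illustration.

```scala
object TaggedPrices {
  // Tagging machinery: T @@ U is a T that also carries the phantom tag U.
  type Tagged[U] = { type Tag = U }
  type @@[T, U] = T with Tagged[U]

  class Tagger[U] {
    def apply[T](t: T): T @@ U = t.asInstanceOf[T @@ U]
  }
  def tag[U]: Tagger[U] = new Tagger[U]

  // Defining a new unit of price only requires a new type.
  trait Usd
  trait Eur

  object ProfitCalculator {
    // Hypothetical signature: both prices must carry the same unit tag.
    def calculateProfit(
        buy: BigDecimal @@ Usd,
        sell: BigDecimal @@ Usd,
        quantity: Int): BigDecimal @@ Usd =
      // The BigDecimal API is used directly; there is no 'value' member.
      tag[Usd]((sell - buy) * BigDecimal(quantity))
  }

  val profit: BigDecimal @@ Usd =
    ProfitCalculator.calculateProfit(
      tag[Usd](BigDecimal(10)), tag[Usd](BigDecimal(12)), 5)

  // Mixing units fails to compile because BigDecimal @@ Eur does not
  // conform to BigDecimal @@ Usd:
  // ProfitCalculator.calculateProfit(
  //   tag[Usd](BigDecimal(10)), tag[Eur](BigDecimal(12)), 5)
}
```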
The example defines a ProfitCalculator to demonstrate how the compiler enforces the type safety. ProfitCalculator is able to calculate profits only with prices of the same unit. As is shown in the commented out code, mixing units of price will cause compilation to fail. Using tagged types, it is possible to perform arithmetic because the underlying referenced value is a BigDecimal. And since the definition of a unit of price is a type instead of a class that references a value, there are no awkward ‘value’ references.
In addition to removing noise from the code, I find this approach valuable because it increases one’s ability to reason about a program. Explicitly referencing the types in the argument list of ProfitCalculator.calculateProfit() makes it transparent to the API consumer what values are acceptable. In comparison, the case class approach wraps the BigDecimal value, which makes it more challenging to understand the price’s data type.
Units of Measurement
In my opinion, applying tagged types to the representation of a price is a substantial improvement to simply representing it with a BigDecimal. However, one should not mistake tagged types for units of measurement. From the compiler’s point of view, the following is valid:
tag[Usd](BigDecimal(5)) + tag[Eur](BigDecimal(5))
The plus operator acts on a BigDecimal, so here tagged types cannot defend against arithmetic that violates dimensional analysis. The only way to prevent this type of mistake is to implement a system for units of measurement.
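To make the limitation concrete, here is a small self-contained sketch, assuming the same Scala 2 tagging encoding and hypothetical Usd and Eur tags. It shows that the mixed-unit sum type checks, and that its static type is a plain, untagged BigDecimal.

```scala
object MixedUnits {
  type Tagged[U] = { type Tag = U }
  type @@[T, U] = T with Tagged[U]

  class Tagger[U] {
    def apply[T](t: T): T @@ U = t.asInstanceOf[T @@ U]
  }
  def tag[U]: Tagger[U] = new Tagger[U]

  trait Usd
  trait Eur

  // BigDecimal.+ accepts any BigDecimal, tagged or not, and returns an
  // untagged BigDecimal, so the unit information is silently dropped.
  val mixed: BigDecimal = tag[Usd](BigDecimal(5)) + tag[Eur](BigDecimal(5))
}
```

The tag only protects the boundaries where a tagged type is explicitly required, such as a method parameter. It does not follow values through arithmetic.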
This is a topic that I will be exploring. In principle, I believe representing units in code is a good idea. However, in practice, I think it needs to be done in such a way that there is minimal overhead to write code expressing units. Two references on the topic I have found are Scala macros and ScalaQuantity. If you are interested in seeing additional examples with tagged types, see this excellent blog, Practical Uses for Unboxed Tagged Types.
Refactoring for Maintainability and Extensibility
The Merits of Object-Oriented Design
Check out GitHub for the full source code used in this post.
Although I have shown that object-oriented design is not the only way to organize code, I would like to share two object-oriented principles that I believe are worth rigorously applying: (1) single responsibility principle (SRP), and (2) open/closed principle (OCP). When applied judiciously, these principles lead you to writing loosely coupled code that narrows the scope of each class/function. In my experience, this is the only kind of code that can be maintained in a large project. Limiting the responsibilities of a code segment allows you to effectively reason about its logic without concerning yourself with the complexities of the entire code base. As is often the case in software development, proper abstraction is a core concept to SRP and OCP.
In Dire Need of Refactoring
Let’s explore an example that exposes the weaknesses of complecting code and then take a look at how to make it better. Consider a financial system processing trades that applies special (legal) rules to modify the execution price of a trade before it is reported to the trader. For simplicity, the execution price is only modified when:
- The symbol associated with a trade matches a configurable symbol.
- The volume associated with a trade is less than or equal to a configurable volume threshold.
The execution price modifications are reflected in the reported price. With these two rules in mind, consider one implementation of a TradeExecutionProcessor:
Complecting trade execution processor
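The original listing is not reproduced here. The sketch below is a hypothetical reconstruction based only on the two rules described above; the types TradeRequest and ExecutedTrade, the constructor parameters, and the additive markups are all assumptions.

```scala
object ComplectingExample {
  // Hypothetical domain types.
  final case class TradeRequest(requestId: Long, symbol: String, volume: Long)
  final case class ExecutedTrade(
      request: TradeRequest,
      executedPrice: BigDecimal,
      reportedPrice: BigDecimal)

  class ComplectingTradeExecutionProcessor(
      lookup: Long => Option[TradeRequest],
      configuredSymbol: String,
      symbolMarkup: BigDecimal,
      volumeThreshold: Long,
      volumeMarkup: BigDecimal) {

    def process(requestId: Long, executedPrice: BigDecimal): Option[ExecutedTrade] =
      lookup(requestId).map { request =>
        // Rule definition and rule application are tangled together here.
        var reported = executedPrice
        if (request.symbol == configuredSymbol) reported += symbolMarkup
        if (request.volume <= volumeThreshold) reported += volumeMarkup
        ExecutedTrade(request, executedPrice, reported)
      }
  }
}
```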
Well, it works, but is it great? The answer is a resounding “No!” Here are questions I would raise if I encountered this approach to the problem:
- How will this code be unit tested?
As written, it is impossible to test individual rules in isolation. To exercise the volume price modification, one must always also consider the symbol price modification. This unit test will become increasingly harder to maintain as rules are added. Eventually, when it is too complicated, the unit test will just not be updated.
- How will this code be extended?
Inserting an additional rule requires modifying the internals of the TradeExecutionProcessor. The lack of separation between what rules are available and how they are applied limits one’s ability to reason about rules in isolation. Another way of expressing this is that ComplectingTradeExecutionProcessor exhibits tight coupling between rule definition and rule application.
- How will rule ordering be changed?
The structure of this logic implies ordering. Symbol-based price modification must occur before volume-based modifications. From a business perspective, is this true? In this case, switching the order of rule application yields the same result. This logic does a poor job of explicitly expressing this notion. However, for the moment, let’s assume that rule application order matters. In this scenario, if a change in rule ordering is required, one must modify the internals of ComplectingTradeExecutionProcessor and risk breaking unit tests and other functionality.
How Did it Happen?
In my experience, I’ve (unfortunately) encountered numerous analogs to ComplectingTradeExecutionProcessor. Worse yet, I’m sure I’ve been involved in the fabrication of these maintainability and extensibility nightmares. Bad code often begins as OK code that iteratively morphs through successive (unexpected) feature requests. Here’s how it might have happened:
- You break ground on an exciting new task: processing trade executions. The super simple process is to look up a trade request provided a request ID and then set the executed price to the given executed price.
- Your business realizes it can make money by occasionally modifying the execution price in its favor. Your product manager requests that the price sent to the trader is changed when the trade matches a certain symbol, let’s say, EURUSD. You are also informed that it is imperative for the business to maintain a record of the initially received price and the modified price. Armed with the awesome power of Scalaz, you make quick work of this story by adding a reported price property to ExecutedTrade and by introducing the symbol-based rule shown in ComplectingTradeExecutionProcessor. Here is where this code begins breaking SRP. While it is true that the symbol rule is part of trade execution processing, it should not be the responsibility of this function to define the rules to be applied. Similarly, I expect you would find it strange if there was a SQL statement to find the requested trade embedded in this logic.
- It turns out that traders submitting small volume trades are ruining the profit margins of your business. Your business decides to rein in costs by charging traders making small trades more. And now we come full circle to the current picture. While this picture is not too scary, I think it is easy to imagine how this process continues.
Let’s Make it Better
The ideal solution to this problem will make it easy to answer the questions posed earlier. It will apply SRP and OCP to lead us to a solution that is extensible and maintainable. Given the ComplectingTradeExecutionProcessor, I would refactor it in the following ways.
Extensible trade execution processor
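The original listing is not reproduced here. The following is a minimal sketch of one way such a refactoring could look, assuming hypothetical domain types and modeling each rule as a function from a trade request to a price adjustment. The rule names and the additive adjustments are illustrative assumptions.

```scala
object ExtensibleExample {
  // Hypothetical domain types.
  final case class TradeRequest(requestId: Long, symbol: String, volume: Long)
  final case class ExecutedTrade(
      request: TradeRequest,
      executedPrice: BigDecimal,
      reportedPrice: BigDecimal)

  // A rule maps a trade request to the adjustment it contributes.
  type PriceMutator = TradeRequest => BigDecimal

  def symbolRule(symbol: String, markup: BigDecimal): PriceMutator =
    request => if (request.symbol == symbol) markup else BigDecimal(0)

  def smallVolumeRule(threshold: Long, markup: BigDecimal): PriceMutator =
    request => if (request.volume <= threshold) markup else BigDecimal(0)

  // The processor knows how to apply rules, not which rules exist.
  class ExtensibleTradeExecutionProcessor(
      lookup: Long => Option[TradeRequest],
      priceMutators: Set[PriceMutator]) {

    def process(requestId: Long, executedPrice: BigDecimal): Option[ExecutedTrade] =
      lookup(requestId).map { request =>
        // Sum over an iterator: mapping the Set directly would collapse
        // two rules that happen to produce the same adjustment.
        val adjustment = priceMutators.iterator.map(rule => rule(request)).sum
        ExecutedTrade(request, executedPrice, executedPrice + adjustment)
      }
  }
}
```

With this shape, each rule can be exercised by calling the function on a single TradeRequest, and the processor can be tested with stub rules.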
Let’s analyze how this implementation allays our earlier concerns:
- How will this code be unit tested?
The ExtensibleTradeExecutionProcessor unit test will exclusively focus on ensuring that a set of rules is applied to the executed price and that the resulting reported price calculation is mathematically correct. There will be unit tests for each rule that are free from the side-effects of other rules. Successfully separating which rules are applied from how the rules are applied fulfills the desire to write code that applies the SRP.
- How will this code be extended?
Unlike before, inserting an additional rule no longer requires modifying the internals of ExtensibleTradeExecutionProcessor. This is a successful application of the OCP. As a developer, this should be a welcome relief because rules may be added or removed without fear of breaking the reported calculation. An additional benefit of this approach is that since each rule is now a function, each rule has an easily identifiable name. The function names should use the same vocabulary as the business, which makes it simpler to review rule implementation with project stakeholders.
- How will rule ordering be changed?
Rule application is now a configuration concern when the application is instantiated, instead of being embedded in the TradeExecutionProcessor implementation. The current implementation clearly indicates that rule ordering does not matter because the price mutators are a set. Should requirements change, price mutators can be changed to a list to denote that rule ordering matters. In this case, it is still a configuration concern to order rule application. This approach pays dividends when rule application ordering is changed: as a developer, you can configure rule ordering without fear of breaking unit tests or the reported price calculation.
I hope that the refactored solution struck you as being simple. Simplicity is often identified with software that obeys SRP and OCP. Study the unit tests to see how this new approach to trade execution processing removes complexity. Then, go find some complecting code in your own project and make it simpler!
A Taste of Monadic Design
The Imperative Way
All code associated with this post is available on GitHub.
I recently came across Scala code trying to map external IDs to entities. The nature of this problem is financial trading. This particular code focused on translating a newly received order from a client into an internal (resolved) version of the order. The code read imperatively, meaning it was written in a Java style. Take a look to see what I mean.
Imperative style order resolution
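The original listing is not reproduced here. The sketch below is a hypothetical reconstruction that exhibits the two smells discussed next; the domain types, the lookup functions, and the quantity check are assumptions made for illustration.

```scala
object ImperativeResolution {
  // Hypothetical domain types.
  final case class Symbol(name: String)
  final case class MarketTaker(id: Long)
  final case class TradingAccount(id: Long, owner: MarketTaker)
  final case class NewOrder(
      symbolName: String, marketTakerId: Long, accountId: Long, quantity: Long)
  final case class ResolvedNewOrder(
      symbol: Symbol, marketTaker: MarketTaker,
      account: TradingAccount, quantity: Long)

  class ImperativeOrderResolver(
      findSymbol: String => Option[Symbol],
      findMarketTaker: Long => Option[MarketTaker],
      findAccount: (MarketTaker, Long) => Option[TradingAccount]) {

    def resolve(order: NewOrder): Either[String, ResolvedNewOrder] = {
      val symbol = findSymbol(order.symbolName)
      if (symbol.isEmpty) {
        return Left(s"Unknown symbol: ${order.symbolName}")
      }
      val marketTaker = findMarketTaker(order.marketTakerId)
      if (marketTaker.isEmpty) {
        return Left(s"Unknown market taker: ${order.marketTakerId}")
      }
      // Option.get is only safe because of the isEmpty check above.
      val account = findAccount(marketTaker.get, order.accountId)
      if (account.isEmpty) {
        return Left(s"Unknown account: ${order.accountId}")
      }
      if (order.quantity <= 0) {
        return Left("Quantity must be positive")
      }
      Right(ResolvedNewOrder(symbol.get, marketTaker.get, account.get, order.quantity))
    }
  }
}
```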
(My apologies, I’m trying to get WordPress plugin support so I can embed gists!)
On the plus side, this function returns an Either instead of throwing exceptions. However, there are two significant code smells present:
- Branching that leads to multiple return statements
- Invocation of Option.get to return an Option’s value
The code flow is disjointed and it is not written idiomatically. Can we do better? Definitely. One of the interesting aspects of writing Scala (or other functional languages) is that there are opportunities to learn concepts that can fundamentally alter your programming style.
Enter Monadic Design
Until just a few months ago, ‘monad’ was not part of my vocabulary. Although I’ve learned a bit about monads and the larger related subject of category theory, I’m by no means an expert. For those interested in digging deeper into the subject, I recommend this talk by Dan Rosen and these two StackOverflow posts.
Without requiring a formal understanding of monads, applicative functors, etc, I hope to show that you can write more maintainable and expressive code. There are two keys to understanding how we can improve the existing logic:
- Most of the NewOrder properties need to be conditionally translated (i.e. mapped) to another form. For example, a reference to a Symbol is required, but only if the symbol name is valid.
- Some of the translated NewOrder properties are required for future computations. For example, an account ID cannot be mapped to a TradingAccount without first having a reference to a MarketTaker.
With this in mind, check out the refactored version of the order resolution code. Rather than focusing on the details of the solution, focus on the overall code structure.
Monadic style order resolution
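The original listing is not reproduced here. As a hedged sketch of the same structure, the version below swaps Scalaz’s Validation for the standard library’s right-biased Either (Scala 2.12 or later) so that it runs without extra dependencies. The domain types, lookup functions, and the single validation generator are assumptions made for illustration.

```scala
object MonadicResolution {
  // Hypothetical domain types.
  final case class Symbol(name: String)
  final case class MarketTaker(id: Long)
  final case class TradingAccount(id: Long, owner: MarketTaker)
  final case class NewOrder(
      symbolName: String, marketTakerId: Long, accountId: Long, quantity: Long)
  final case class ResolvedNewOrder(
      symbol: Symbol, marketTaker: MarketTaker,
      account: TradingAccount, quantity: Long)

  class MonadicOrderResolver(
      findSymbol: String => Option[Symbol],
      findMarketTaker: Long => Option[MarketTaker],
      findAccount: (MarketTaker, Long) => Option[TradingAccount]) {

    def resolve(order: NewOrder): Either[String, ResolvedNewOrder] =
      for {
        // Each Option is converted to an Either that names the failure.
        symbol <- findSymbol(order.symbolName)
          .toRight(s"Unknown symbol: ${order.symbolName}")
        marketTaker <- findMarketTaker(order.marketTakerId)
          .toRight(s"Unknown market taker: ${order.marketTakerId}")
        // marketTaker is in scope for the generators that follow.
        account <- findAccount(marketTaker, order.accountId)
          .toRight(s"Unknown account: ${order.accountId}")
        // A pure check whose result is bound to an unused value.
        _ <- Either.cond(order.quantity > 0, (), "Quantity must be positive")
      } yield ResolvedNewOrder(symbol, marketTaker, account, order.quantity)
  }
}
```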
A lot has changed, but how do these changes relate to the fundamental issues outlined earlier? Let’s explore the major concepts at play to better understand what is happening.
- All of the “questions” asked (i.e. Does a Symbol with the provided name exist?) return Option or are otherwise converted to Option. Option expresses that there may or may not be a returned value.
- Each Option is translated into a Validation in order to express what should happen if the Option is a None. In this example, the Failure type is a String that explains why resolution failed.
- The result of each operation is bound to a value using the for-comprehension syntax. The bound value has useful properties.
- The bound value is of the contained Success type. This means that the first generator is of type Symbol, instead of Validation[String, Symbol] [1]. The binding provided by the for-comprehension is useful because it eliminates the branching statements as well as the awkward Option.get invocations, which satisfies the first key to improvement.
- Each bound value is in scope for all operations that follow. This property enables a Success value to be provided to future computations within the same scope, which satisfies the second key to improvement.
- Since the bound value represents a Success, this implies that if the result of a computation is a Failure, no additional computations will occur.
- The yield expression is the equivalent of a map invocation that returns an instance of Success with type ResolvedNewOrder.
- The entire for-comprehension produces a Validation[String, ResolvedNewOrder] that will either be a String Failure or a ResolvedNewOrder Success.
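The binding and short-circuiting behavior listed above can be seen on a smaller scale in the following sketch. It uses the standard library’s Either (Scala 2.12 or later) rather than Validation, and the parse helper is a hypothetical stand-in for a lookup that may fail.

```scala
import scala.util.Try

object ForComprehensionDesugaring {
  // Hypothetical fallible operation: Option converted to Either.
  def parse(s: String): Either[String, Int] =
    Try(s.toInt).toOption.toRight(s"Not a number: $s")

  // x and y are bound to the success type (Int), not to Either[String, Int].
  def sumFor(a: String, b: String): Either[String, Int] =
    for {
      x <- parse(a)
      y <- parse(b)
    } yield x + y

  // The same computation written with flatMap and map. The final
  // generator plus yield becomes map; earlier generators become flatMap.
  def sumDesugared(a: String, b: String): Either[String, Int] =
    parse(a).flatMap(x => parse(b).map(y => x + y))
}
```

If the first parse fails, the function passed to flatMap is never invoked, which is why no further computations occur after a failure.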
Another way of gleaning insight into how this is working is to check out the unit test that runs the same tests for both styles of resolvers. Take some time to review both sets of code in order to better understand how the solution was migrated from an imperative to a monadic approach [2].
Even if you do not follow 100% of the changes, my hope is that you at least see the benefit in exploring monadic behavior to solve problems more expressively than in an imperative style.
Is There Room for Improvement?
The refactored solution satisfies all of the initial goals, but I think it leaves room for further improvement. I want to focus on the three generator expressions that are bound to an unused value (i.e. _). This is a code smell because this example’s use case for a for-comprehension is to provide locally scoped values, but the last three generators are not providing any used values.
I believe the last three generators are concerned with validation rather than resolution. As an exercise for the interested reader, I propose separating resolution and validation. This is an opportunity to showcase composing Validations because when performing validation, it is desirable to see all failures, rather than just the first failure. If you are looking for inspiration for how to attack this problem, I recommend reviewing Chris Marshall’s (amusing) tale of three nightclubs and Debasish Ghosh’s post on composable domain models.
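As a starting point for that exercise, here is a rough sketch of failure accumulation using only the standard library; it is not the Scalaz Validation composition the referenced posts describe. The Order type, the individual checks, and the validate helper are all hypothetical.

```scala
object AccumulatingValidation {
  // Hypothetical order fields to validate.
  final case class Order(quantity: Long, limitPrice: BigDecimal, timeInForce: String)

  // A check either passes or explains why it failed.
  type Check = Order => Either[String, Unit]

  val positiveQuantity: Check =
    o => Either.cond(o.quantity > 0, (), "Quantity must be positive")
  val positivePrice: Check =
    o => Either.cond(o.limitPrice > 0, (), "Limit price must be positive")
  val knownTimeInForce: Check =
    o => Either.cond(Set("DAY", "GTC").contains(o.timeInForce), (),
      s"Unknown time in force: ${o.timeInForce}")

  // Runs every check and reports all failures, not just the first.
  def validate(order: Order, checks: List[Check]): Either[List[String], Order] = {
    val failures = checks.map(check => check(order)).collect { case Left(msg) => msg }
    if (failures.isEmpty) Right(order) else Left(failures)
  }
}
```

Because the checks are independent of one another, they can all run even when an earlier one fails, which is the property that distinguishes validation from resolution here.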
And last, but not least, I would like to express my gratitude to my colleague, Dave Stevens. He first introduced the use of Validation and monadic design to me and he has been a great functional programming mentor to me. Thanks Dave!
[1] In this scenario, the for-comprehension generator is syntactic sugar for a flatMap expression. flatMap is invoked on Validation, which accepts one argument of type A => Validation[EE, B], where A is the Success type, EE is the Failure type, and B is the resulting Success type. Given this definition, it follows that the bound value is of type Symbol (i.e. the Success type).
[2] It should be made clear that a Validation is not a monad. Instead, it is an applicative functor. As I understand it, the difference between these two concepts is that a monad lets each computation depend on the result of the previous one and stops at the first failure, whereas an applicative functor combines independent computations, which is what allows a Validation to accumulate every failure.
Do You Know if Your Distributed System Always Works Correctly?
From Hero to Zero
You’ve spent months developing features for your distributed, fault-tolerant system. Your system dynamically balances load, handles new components entering the cluster, and chugs along when parts of the cluster fail. Production deployment is just a couple of weeks away. And then, it happens. You realize the system works most of the time, instead of all of the time.
Maybe it’s a Bad Build
“How is this possible?” I asked myself while staring at the output of my failed integration tests, which were running in a clustered environment for the first time. The tests had been consistently passing in a single instance environment, so why would they fail in a clustered environment? I realized that I had probably uncovered a large problem, but rather than assume the worst, I began with denial. I told my team lead, “Maybe it’s a bad build.” We all know it probably wasn’t a bad build, but remember, it was the denial stage. I re-ran the integration tests and they passed. Whew! It was a bad build. Another two builds passed all the integration tests, and then the same set of tests failed again. Tests failing once might be a coincidence, but not twice. How did we get here?
The Cost of Waiting
Like most teams, my team has a Hudson build server running unit tests and zero tolerance for breaking the build. While the unit tests have been continuously running since day one (over a year ago), my team did not have a set of automated integration tests until just a few months ago. At first, the integration tests were run against a non-clustered environment. Although this setup did not mimic a true production environment (i.e. multiple instances of each component), it is valuable to have a minimally complete system setup as a first step to ensure correctness. Many days of debugging later, the integration tests consistently passed against the single instance environment.
Only in the last few weeks did my team have a story to continuously run integration tests against an environment that emulates production. Unsurprisingly, the distributed system functions differently in a distributed environment. Suddenly, tests that once passed now failed. When were broken changes introduced? Was the logic always broken? Since so much code was written before continuously testing against a production-like environment, it becomes far more challenging to identify the source of the problem. The lack of proper integration testing is a form of technical debt. Like all forms of debt, it must be paid off, and the longer you wait, the more expensive it becomes.
Staying out of Debt
My recent experiences have shown me the true cost of debugging distributed system bugs. It’s extremely expensive and it’s complex. Below are my thoughts on how to approach integration testing to avoid going into a lot of technical debt. These thoughts come from working on a distributed system, but should be considered general principles of system design.
Include Integration Testing in the Cost of a Story
Pricing out the cost of integration testing makes it clear to stakeholders how long it will take to have a working feature. If developers don’t include the cost of integration testing a feature, then the debt still exists, but it’s just temporarily hidden from view. There is no way to avoid proper integration testing, so don’t fudge the numbers.
In my case, since my team lacked a proper set of tools to repeatedly integration test in an automated manner, integration testing was effectively brushed aside because it was practically impossible to do (simple scenarios took approximately an hour to set up and execute). If integration testing were included in the price of each story, each story would have become exorbitantly expensive, which leads to the next item.
Automate Integration Test Execution
It must be a priority early in the development of a project to have some form of automated integration testing. Just as it is a top priority to immediately set up a continuous integration server to run unit tests, the same should be true for integration tests. If the end-to-end tests are not run automatically, they won’t be run at all. And if end-to-end tests are not running, how do you know if your system is working?
After the successful completion of each project build (i.e. unit tests pass), my team has configured Hudson to run integration tests against a non-clustered environment. This is a first pass to ensure that the most recent commits have not broken end-to-end functionality. Then, integration tests are run against a replica of the production environment each night. At most, one day’s worth of changes can be committed without discovering errors.
Simplify Writing Integration Tests
Now that tests are run automatically, there must be tests written to exercise new functionality. As with most things, the more complex something is, the less likely someone is to do it. Consider the pain of writing unit tests without a great mocking framework like Mockito. If writing integration tests is complicated, then new tests will be hard to write and existing tests will be harder to maintain. The cost of your stories will still be expensive.
Drawing on Mockito’s architecture for inspiration, my team has developed an integration test framework with a straightforward DSL for defining tests (this is a great Scala use case!). The framework loads and creates all of its data based on the user-specified environment host, which provides the flexibility to run all integration tests locally and remotely via configuration. Now that my team is equipped with a powerful and simple-to-use framework for end-to-end testing, integration testing is easy enough to do that it is part of estimating every user story.
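To illustrate the idea of selecting the target environment through configuration, here is a minimal Java sketch. It is not my team’s framework: the TestEnvironment class and the "it.host" and "it.port" system properties are hypothetical names chosen for this example, and the defaults are assumptions.

```java
import java.net.URI;

// Hypothetical helper: describes the environment the integration tests run against.
final class TestEnvironment {
    private final String host;
    private final int port;

    TestEnvironment(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Reads the assumed "it.host" and "it.port" system properties,
    // falling back to a local environment when they are not set.
    static TestEnvironment fromSystemProperties() {
        String host = System.getProperty("it.host", "localhost");
        int port = Integer.parseInt(System.getProperty("it.port", "8080"));
        return new TestEnvironment(host, port);
    }

    // Builds the address of a resource on the selected environment.
    URI endpoint(String path) {
        return URI.create("http://" + host + ":" + port + path);
    }

    public static void main(String[] args) {
        // Run with -Dit.host=<remote host> to point the same tests at another environment.
        System.out.println(fromSystemProperties().endpoint("/orders"));
    }
}
```

With this shape, the same test code can run locally by default and against a remote environment by changing only a JVM argument.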
Keep Integration Test Data Independent
It is very alluring to share data among tests because if multiple tests need the same data, why bother recreating it? Resist the urge! Sure, two tests need the same data today, but what about two weeks from now? Once data are shared, tests become fragile because they depend on each other. It will not be a nightmare on day one, but once there is a sufficient number of tests, it will be nearly impossible to ensure that the shared data work across tests. Believe me, I’ve tried. As a bonus, if you avoid sharing the data, you are able to easily focus on the last two items.
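One simple way to keep data independent is to have every test create its own uniquely named records. The sketch below assumes a hypothetical TestData helper; the "account" prefix is only an example.

```java
import java.util.UUID;

// Hypothetical helper: gives each test its own identifiers so no two tests touch the same record.
final class TestData {
    private TestData() {}

    // Returns the prefix followed by a random UUID, so every call yields a distinct name.
    static String uniqueName(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Two tests asking for an "account" get two different accounts.
        System.out.println(uniqueName("account"));
        System.out.println(uniqueName("account"));
    }
}
```

Because no identifier is reused, one test cannot change data that another test is reading.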
Run Integration Tests on Multiple Environments
Inevitably, integration tests will not always pass. If you run integration tests against multiple setups, it will simplify identifying the source of the problem. The system my team is building is horizontally scalable, meaning additional instances of a component can be added to handle load. With this type of system, consider executing integration tests against a minimally complete environment (i.e. one instance of each type of component) as well as environments where there are multiple instances of each component.
Make Integration Tests as Fast as Possible
A short feedback loop is essential to tracking down the source of errors. Two ways to attack the speed problem are to focus on the execution of a single test and to focus on the execution of all tests.
First, attempt to replace all Thread.sleep() calls with either a signaling mechanism (e.g. CountDownLatch) or an event-driven abstraction (e.g. Futures). Not only will tests execute faster because they no longer wait out the entire sleep duration, but you will also have removed a code smell.
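As a minimal sketch of the CountDownLatch approach: the test thread blocks on the latch with an upper bound and resumes as soon as the event arrives. The awaitEvent method and the background thread are stand-ins invented for this example; in a real test, a listener on the system under test would call countDown().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class LatchWaitExample {
    // Returns true if the simulated event arrives before the timeout elapses.
    static boolean awaitEvent(long eventDelayMillis, long timeoutMillis) {
        CountDownLatch received = new CountDownLatch(1);

        // Stand-in for the system under test: signals the latch once its work is done.
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(eventDelayMillis); // simulated processing time only
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            received.countDown();
        });
        producer.setDaemon(true);
        producer.start();

        try {
            // Wakes up as soon as countDown() is called; the timeout is only an upper bound.
            return received.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean arrived = awaitEvent(50, 5_000);
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("arrived=" + arrived + " after about " + elapsedMillis + " ms");
    }
}
```

A fixed Thread.sleep(5000) would always cost five seconds; here the wait ends shortly after the event fires, and the test fails cleanly if it never does.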
Once you have replaced as many Thread.sleep() calls as possible, consider executing tests concurrently. If you have maintained independent data for each test, it should be trivial to parallelize test execution with an Executor (or Actors if you’re an Akka fan).
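Here is a rough sketch of concurrent execution with an ExecutorService. It assumes each test can be expressed as a Callable<Boolean> that returns true on success; ParallelTestRunner and runAll are hypothetical names, not part of any test framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelTestRunner {
    // Runs every test on a fixed-size thread pool and returns the number that passed.
    static int runAll(List<Callable<Boolean>> tests, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int passed = 0;
            // invokeAll blocks until every submitted test has completed.
            for (Future<Boolean> result : pool.invokeAll(tests)) {
                try {
                    if (Boolean.TRUE.equals(result.get())) {
                        passed++;
                    }
                } catch (ExecutionException e) {
                    // A test that throws is counted as a failure.
                }
            }
            return passed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 0;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Callable<Boolean>> tests = new ArrayList<>();
        tests.add(() -> true);   // a passing test
        tests.add(() -> false);  // a failing test
        System.out.println(runAll(tests, 2) + " of " + tests.size() + " passed");
    }
}
```

This only works safely when tests do not share data; otherwise the order in which the pool schedules them can change the results.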
Verify Side-Effects
When writing integration tests, it is often simpler to consider the system under test to be a black box. While it is important to verify end users are seeing correct data, it is equally important to verify internal side-effects throughout test execution.
Let’s consider the highly simplified representation of my team’s financial trading system shown below, in order to find side-effects that need to be verified. In this system, market data (e.g. the price of the currency pair EUR/USD) and trades (requests to buy/sell currency pairs) enter the system at one end and exit at the other.
What side-effects might be present?
- Market data (or a lack thereof) may trigger alerts in the system to notify end users of a bad system state.
- Added market data volume (i.e. load) might cause resource redistribution in a cluster.
- Orders have a lifecycle consisting of multiple states that need to be properly represented in the database.
- Completed orders are published to other internal systems (e.g. reporting).
Verifying the state of an order in the database, knowing alerts are raised at proper times, and ensuring the cluster works as designed gives you the confidence to say, “Yes, my system is always working.”
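A side-effect such as an order reaching a given state usually happens asynchronously, so the check has to wait for it. The sketch below polls a condition until it holds or a deadline passes. Everything in it is illustrative: the eventually helper, the in-memory map standing in for the orders table, and the "NEW" and "FILLED" state names are assumptions, not details of my team’s system.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class SideEffectCheck {
    // Polls the condition until it is true or the timeout elapses; returns whether it became true.
    static boolean eventually(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis); // short pause between polls, not a fixed wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for the orders table; a real test would query the database.
        Map<String, String> orderStates = new ConcurrentHashMap<>();
        orderStates.put("order-1", "NEW");

        // Stand-in for the system under test moving the order through its lifecycle.
        new Thread(() -> orderStates.put("order-1", "FILLED")).start();

        boolean filled = eventually(() -> "FILLED".equals(orderStates.get("order-1")), 2_000, 10);
        System.out.println("order reached FILLED: " + filled);
    }
}
```

The same helper can wrap a check that an alert was raised or that a completed order was published to a downstream system.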
How have you dealt with exercising complex systems end-to-end? I’m always interested in learning more to make my life easier! In an upcoming post, I will share the lessons I’ve learned while debugging a distributed system.
