Software Engineering Principles
In my career in software engineering I’ve found there are a number of conventions that enable teams to deliver good software. Having principles like these in place gives an organisation a framework for judging new technologies and tools, and empowers team members to advocate for best practices in their teams. In this post I will set out ten technical principles that help teams deliver high-quality applications.
First of all, some caveats:
These principles are mainly for engineering teams building software applications such as back-end web servers, APIs, web front-ends and data processing systems that are usually hosted in a cloud. Some of these principles may not be applicable if you’re building embedded software, operating systems or AI. Also, depending on the problem you’re trying to solve, there may be exceptions to some of these rules.
With those caveats in mind, the following is what I consider to be good software engineering principles:
- Keep code simple by enforcing separation of concerns and other practices
- Keep infrastructure simple with basic components and microservices
- No code silos
- Automated code checking
- Regular feedback on code
- Simple CI/CD pipeline
- Developers own infrastructure but don’t manage it
- Easy to find logs, metrics and alerts
- Simple documentation
- Automate everything
In the rest of this article I will explain each of these in turn:
1. Keep code simple
We can keep our code simple by following this set of practices:
1a. Enforce separation of concerns in code through modularity and composability
Separation of concerns in code is a standard best practice in software engineering. In the following example we can see a good separation of concerns. There are separate function calls for reading from the db and transforming the data into a response. Our handler composes these functions together before returning some data that can be composed with other modules.
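A minimal sketch of that well-separated handler (function and module names are hypothetical, with an in-memory object standing in for the database):

```javascript
// Data access: the only place that knows how storage works
// (an in-memory object stands in for a real database).
const users = { 1: { id: 1, name: 'Ada', password: 'secret' } };
function readUser(id) {
  return users[id];
}

// Transformation: a pure function with no knowledge of storage or HTTP.
function toResponseBody(user) {
  return { id: user.id, name: user.name }; // strip internal fields
}

// Handler: composes the two and returns plain data,
// so it can itself be composed with other modules.
function getUserHandler(id) {
  return toResponseBody(readUser(id));
}
```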
In the next example there is a poor separation of concerns. The logic for reading from the db and transforming the data hasn’t been modularised. This handler also has responsibility for returning a response to the client so it can’t be composed with anything else.
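A sketch of the poorly separated version (again with hypothetical names): everything happens inline, and because the handler writes the response itself it returns nothing that other modules can reuse.

```javascript
// Anti-pattern: one handler does storage access, transformation and the
// HTTP response in a single lump, so none of it can be reused.
const users = { 1: { id: 1, name: 'Ada', password: 'secret' } };

function getUserHandler(req, res) {
  const user = users[req.params.id];             // storage logic inlined
  const body = { id: user.id, name: user.name }; // transformation inlined
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify(body));                 // responds directly: nothing to compose
}
```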
Separation of concerns is important for the following reasons:
- The code becomes more readable as each module or function is only doing one thing
- Code can be more easily reused as each module has a simple input and output
- It’s easier to change underlying implementations. Because we have modularised the `db` logic in the example above, we can change it from a Postgres to a DynamoDB implementation without having to update any other part of the codebase
1b. Always be explicit and avoid global state
It should always be obvious where code originates. When modules local to the codebase are used it must be clear where those modules are located (this is usually handled through a module system). External dependencies that code relies on need to be made explicit typically through a package manager. Code becomes harder to understand when it’s unclear where modules and dependencies come from.
In the following example there is a set of dependencies listed in a `package.json`, but in the code the `toJSON` dependency is used without being listed. The `db` module is also used in the code as a global variable, and it’s unclear where it comes from.
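A sketch of the contrast (the module names and the global-injecting framework are hypothetical):

```javascript
// Implicit style (avoid): `db` is injected as a global by some hypothetical
// framework, so a reader can't tell which module provides it.
globalThis.db = { find: (id) => ({ id, name: 'Ada' }) };
function getUserImplicit(id) {
  return db.find(id); // where does `db` come from? not from this file
}

// Explicit style: the dependency is a visible parameter (or an explicit
// require/import at the top of the file), so its origin is obvious.
function getUserExplicit(database, id) {
  return database.find(id);
}
const userDb = { find: (id) => ({ id, name: 'Ada' }) };
```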
We should always strive to be explicit and avoid global state because:
- Code is harder to understand when it’s unclear where modules and dependencies originate
- When engineers are not explicit in code it reinforces knowledge silos. Other engineers will often have to rely on being told by the original authors how modules and dependencies are used
- IDEs are optimised for understanding modules and dependencies when they are explicit. Developer productivity is harmed when this is not the case
1c. No magic!
It is often tempting to reach for tools like Object Relational Mappers (ORMs) and large application frameworks, such as Spring for Java, when building software. Proponents of these tools argue that they improve developer productivity by making it easy to solve commonly occurring problems (inserting and querying data in a database, or handling API requests, for example).
But these tools often provide constructs that obfuscate the interface between the engineer’s code and the system with which they’re integrating. I call this “magic”!
In the example below a user is created in a SQL database but it’s not clear how this is done or what query was used. We have to trust in the ORM’s magic for this to work.
The next example shows an alternative. The library that is used for interacting with the SQL database provides a simple API. It’s clear to engineers how the user is being created and there is no magic involved.
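Sketched in JavaScript (the ORM call is hypothetical, and the client below is a stub standing in for a thin driver such as node-postgres):

```javascript
// Magic (avoid): a hypothetical ORM hides the SQL entirely.
//   const user = new User({ name: 'Ada' });
//   await user.save(); // what query ran? which table? we must trust the ORM

// Explicit: a thin client makes the query visible in our own code.
// `client` stands in for a real driver such as node-postgres's Client.
function createUser(client, name) {
  return client.query('INSERT INTO users (name) VALUES ($1)', [name]);
}

// Stub client for illustration: records the queries it receives.
const executed = [];
const stubClient = {
  query: (sql, params) => {
    executed.push({ sql, params });
    return { rowCount: 1 };
  },
};
```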
Large application frameworks also suffer from magic. The following example shows an HTTP request handler using Spring for Java. There are a number of Java annotations that configure the handler, but it’s not clear what they do without consulting the documentation. Again we need to trust in the framework’s magic.
An alternative is to use Vertx for Java. Vertx provides simple APIs for handling HTTP requests. In the next example it’s immediately clear how the request is being handled, without needing to consult any documentation.
Avoiding magic in code is desirable because:
- Magic often obfuscates how things work. Engineers should always be clear what code is doing as soon as they look at it
- ORMs and large frameworks require a lot of buy-in from engineers. Everyone in the team needs to be familiar with the tool and understand the documentation. When teams can depend on simpler tools and abstractions it is easier to onboard new engineers who are already comfortable with idioms in the programming language
- Whilst it’s true that ORMs and large frameworks save developers from having to write similar code (such as inserting and retrieving data from a database) this comes at a cost of flexibility. As the tool enforces particular ways of implementing an application, it can be difficult to solve specific problems or problems the tool’s authors haven’t considered
1d. Use functional programming concepts
Functional programming concepts like pure functions, immutable data, referential transparency and avoiding side effects make code cleaner and easier to understand. Code becomes more composable and programmer intent is explicit. Without these concepts our software is harder to reason about.
The following example is rather contrived, but an engineer looking at this code needs to understand what `BaseHandler` does and how `transform` will modify the object:
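A sketch of what such contrived code might look like (the class names come from the description above; the details are hypothetical):

```javascript
// To predict what transform returns, the reader must go and read
// BaseHandler, and must notice that the input object is mutated in place.
class BaseHandler {
  constructor() {
    this.meta = { handledBy: 'BaseHandler' };
  }
}

class UserHandler extends BaseHandler {
  transform(user) {
    user.name = user.name.trim(); // hidden mutation of the input
    user.meta = this.meta;        // hidden dependency on the parent class
    return user;
  }
}
```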
We can achieve the same thing in a much simpler way by using functional programming concepts. In this example data is kept immutable and the function is pure. The intent of the programmer is much clearer as we do not have to concern ourselves with inheritance or how data is being changed:
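A functional sketch of the same behaviour: no inheritance, and the input object is never mutated.

```javascript
// Pure function: the same input always gives the same output,
// and the caller's object is left untouched.
function transformUser(user) {
  return {
    ...user,
    name: user.name.trim(),
    meta: { handledBy: 'transformUser' },
  };
}
```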
It’s often the case in our applications that we need to perform a side effect, whether that’s returning an API response or updating a row in a table. We can still keep our code mostly pure and immutable by moving these side effects to the edges of our application. In this example, the entry point to our API composes a couple of pure functions before performing the side effect of returning a response:
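A sketch of this shape (the handler and response object are hypothetical, Node-style): the core is pure, and the single side effect sits at the entry point.

```javascript
// Pure core: easy to test, no I/O.
const validate = (input) => ({ ...input, valid: typeof input.name === 'string' });
const toResponse = (user) =>
  user.valid
    ? { status: 200, body: JSON.stringify({ name: user.name }) }
    : { status: 400, body: JSON.stringify({ error: 'invalid user' }) };

// Impure edge: the only side effect is writing the HTTP response.
function handle(req, res) {
  const { status, body } = toResponse(validate(req.body));
  res.writeHead(status, { 'Content-Type': 'application/json' });
  res.end(body);
}
```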
2. Keep infrastructure simple
We can keep infrastructure simple by following these principles:
2a. Start with basic infrastructure components
Unless we know our system will need to handle a lot of throughput from day one, we should start with very basic components. Serverless technologies and managed databases enable us to build simple APIs with basic storage without having to run any of our own servers. If we need more control over how our applications run then using simple servers through AWS EC2 or Digital Ocean droplets is preferable. We’re not going to need Kubernetes and Kafka from day one.
2b. Enforce separation of concerns through decoupling of services and apps
When our system becomes more complex, or we know it will have to handle a lot of throughput, we need to consider how to build or modify the system to accommodate this. An important principle here is that, just as with code, we separate the different concerns of our system into separate applications, i.e. microservices. This has the following benefits:
- Services can be more easily developed, tested and deployed when they are only responsible for a single domain. When they have multiple responsibilities defects are more likely to creep in as changes to the service can have unintended effects
- Services can be scaled independently of each other meaning they are able to handle higher throughput whilst keeping costs low
- Services can be replaced with newer applications that perform better, save money or take advantage of more modern software engineering techniques
A microservice architecture can be designed so that applications communicate directly with each other. But this also means we have tight coupling between services. If service A depends on service B and service B goes down then service A will also fail. A better approach is to decouple applications through message queues or event streams.
In this architecture services B and C will continue to operate if the event stream or service A goes down. They just won’t receive any new data.
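A toy in-memory sketch of the decoupled shape (a real system would use something like SQS or Kinesis): service A publishes to a stream without knowing who consumes, and services B and C subscribe independently.

```javascript
// Minimal in-memory event stream standing in for a real queue/stream.
function createStream() {
  const subscribers = [];
  return {
    subscribe: (handler) => subscribers.push(handler),
    publish: (event) => subscribers.forEach((h) => h(event)),
  };
}

const stream = createStream();
const seenByB = [];
const seenByC = [];
stream.subscribe((e) => seenByB.push(e)); // service B
stream.subscribe((e) => seenByC.push(e)); // service C

// Service A publishes without any direct coupling to B or C.
stream.publish({ type: 'user-created', id: 1 });
```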
2c. Have interfaces in front of data storage systems e.g. a REST or RPC API
In a microservice architecture, if one service needs to access another service’s data or send messages to a message queue, we should put an interface such as a REST or RPC API in front of the data storage system. Having multiple services access the same database or event stream directly is problematic because it makes it much harder to swap the underlying data storage system for something else. If we decide at a later date to use DynamoDB rather than Postgres, or AWS Kinesis rather than SQS, we would need to update every service in our system to use the new implementation. Schema changes would likewise ripple across the system and need to be thoroughly tested everywhere. With a single interface that all services use, we can be much more agile.
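One way to sketch such an interface (names hypothetical; the client below is a stub): every service talks to the store’s API, so swapping Postgres for DynamoDB means changing only this one module.

```javascript
// The only module that knows the storage implementation.
function makeUserStore(client) {
  return {
    getUser: (id) => client.query('SELECT id, name FROM users WHERE id = $1', [id]),
    createUser: (name) => client.query('INSERT INTO users (name) VALUES ($1)', [name]),
  };
}

// Swapping storage means swapping the client passed in here;
// callers of getUser/createUser never change.
const log = [];
const stubClient = {
  query: (sql, params) => {
    log.push(sql);
    return { rows: [] };
  },
};
const userStore = makeUserStore(stubClient);
```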
2d. Use containers and virtualisation
If we do need to run our own servers then we should favour deploying applications with containers or a virtualisation technology. This is preferable because:
- The environment the service requires to run is defined in code, such as a Dockerfile
- The dependencies the service requires are explicit in the code
- Application environments are reproducible in case of failure or deploying with another cloud provider
- Services can be deployed on-premises or with multiple cloud providers as the service is decoupled from the environment
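As a sketch, a minimal Dockerfile for a Node service makes the runtime environment and dependencies explicit (the image tag and commands are illustrative):

```dockerfile
# Runtime environment pinned in code.
FROM node:20-alpine
WORKDIR /app
# Dependencies are explicit and installed reproducibly from the lockfile.
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```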
2e. Write infrastructure as code
Writing infrastructure as code spreads knowledge throughout the team about how the system has been configured and deployed. Without infrastructure as code this knowledge is only held by the team members who’ve done the manual setup. It also gives ownership to engineers over how their application needs to run and what resources it requires. Monitoring services becomes simpler when engineers know how they’ve been deployed.
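For example, a small Terraform fragment (resource name and values illustrative) records exactly what the service needs, instead of leaving it in someone’s memory of a console session:

```hcl
# The queue this service consumes, captured in code.
resource "aws_sqs_queue" "user_events" {
  name                      = "user-events"
  message_retention_seconds = 86400
}
```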
3. No code silos
When writing an application it should never be the case that only one person is authoring the entire codebase. Having multiple developers on a project leads to better code as the best ideas come about through debate and teamwork. With a single engineer there is a big risk that knowledge about the software is lost if that person leaves or moves onto a different project.
4. Automated code checking
There should always be automated checks of code including:
- Automated tests of the code, e.g. written test-first (TDD)
- Code coverage so we know how much of our code is covered by tests
- Code style rules that can be automatically applied to a project e.g. Prettier or Google’s Java Formatter
- Tools that check for potential bugs and insecure dependencies e.g. Snyk
This makes our code much more reliable. Tests help to catch bugs and edge cases before we deploy to production. Enforcing code style rules ensures consistency in a project and keeps code readable. Automated code checking helps us to manage change when new developers join the team or new features need to be built.
5. Regular code feedback
Code should never be deployed without being reviewed by another member of the team, because research shows that code review:
- Helps to catch up to 70% of bugs, many of which would not be caught by regular QA or testing
- Creates positive impacts for security
- Pushes engineers to write smaller, more self-contained commits
- Improves code quality, communication, and understanding
6. Simple CI/CD pipeline
A good pipeline for CI/CD should:
- Run automated code checks when a developer wants to merge to master
- Make it obvious to the developer when a code check has failed and the reason why
- Block merges without an approved code review
- Run automated code checks again when the code is merged to master
- Auto-deploy to a CI or Staging environment if the checks are successful
- Enable production deployments with one click or command
- Only deploy the service in question
- Enable deployments at any time
- Be defined in code and be reproducible
- BONUS: enable canary deployments, segmented deployments & automatic rollback
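Such a pipeline can be sketched as a GitHub Actions workflow (job names, commands and the deploy script are illustrative of the steps above, not a prescribed setup):

```yaml
name: ci
on:
  pull_request:         # checks run before merge to master
  push:
    branches: [master]  # and run again after merge
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run lint   # style rules
      - run: npm test       # automated tests and coverage
  deploy-staging:
    needs: checks           # blocked until checks pass
    if: github.ref == 'refs/heads/master'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh staging   # hypothetical one-command deploy
```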
It’s important that code changes are tested in an environment before they are deployed to production so that any errors or bugs are caught. Engineers should be empowered to deploy their applications whenever required as this enables teams to deliver features more quickly. They should also be confident that the deployment pipeline works reliably and is resilient in case of any errors.
7. Developers own infrastructure but don’t manage it
Software engineers should write the code for the infrastructure their service requires using a tool like Terraform. They can easily deploy this infrastructure and modify it as needed. They monitor the infrastructure and are alerted when problems occur. But they do not manage the infrastructure. The underlying servers or hardware that power the infrastructure should be managed by someone else. This could be a cloud provider such as AWS or a team of site reliability engineers.
Empowering engineers to own infrastructure has numerous benefits. They have more knowledge over how their application is deployed which helps when diagnosing problems with the service. There are fewer bottlenecks when updating infrastructure as engineers do not need to wait for another team to make these changes. It also spreads knowledge throughout the team about how an organisation’s infrastructure is configured as any engineer can consult the code.
8. Easy to find logs, metrics, tracing and alerts
Software engineers should always have:
- Metrics to understand how the system is behaving e.g. throughput, CPU, latency
- Logs to debug problems when they occur
- Tracing information to understand in which part of the system bottlenecks may be occurring
- Immediate alerts when their service is down or not functioning correctly
- One obvious place to find all of this information
- Code that defines the metrics they want to see for their service e.g. AWS Cloudwatch dashboards
9. Simple documentation
Documentation for software should answer the following questions:
- What is the main purpose of the service?
- How do I build the app?
- How do I run it?
- How do I run the automated code checks?
- Where is the CI/CD pipeline?
- How is the service deployed to production?
- Where are the logs and metrics?
- Where does this service fit in as part of the broader architecture of the system?
Do not write pages and pages of documentation; the code itself should always be the ultimate source of truth.
10. Automate everything!
- Script your CI/CD pipeline
- Script your infrastructure creation
- Script your deployment process
- Script the creation of metrics and logs dashboards
- Script the alerts that notify you if your service is down
- Script all the random stuff that doesn’t need to be done manually
You can read some further discussion about this article on Twitter.