A Requirements Document

Oct 05 2020
In his book, The Practice of System and Network Administration, Tom Limoncelli explains the importance of writing down the requirements for the system you are building. I think his advice applies equally well to software projects. Having a list of clear requirements written down is probably one of the most important things you can do before starting a project.

Requirements are a list of what the service will be able to do. Requirements should list desired functionality, features, and capabilities. Focus on the end goal: what the system will enable people to achieve. The list of requirements guides all the other steps: design, engineering, implementation, testing, and so on. 

Requirements are written down. They are not simply agreed to in a verbal discussion, tracked on a dry-erase board in your office, or kept in your head. Writing them down in a shared requirements document has many benefits.

Transparency: Unlike your brain or the dry-erase board in your office, everyone can read the requirements document.

Fewer gaps: Writing down requirements reduces errors caused by a request being misheard or forgotten. People can verify that their request was recorded properly.

Fewer misunderstandings: Two people often think they have verbal agreement when they do not. Seeing the decision in writing verifies that everyone is agreeing to the same thing.

Buy-in and approval: Written requirements provide the ability to get buy-in and management approval in an accountable way. People know what they are agreeing to. People can’t claim they didn’t sign off on the requirements if they’ve literally signed their name on a printout. They may claim they didn’t understand the document or that circumstances and budgets have changed, but at least you have baseline agreement.

Fixed scope: A formal requirements document provides a mechanism for handling additional feature requests that arrive after the agreement has been reached, preventing feature-creep. New requirements need to be formally agreed to, documented as a scope change, and subjected to buy-in and management approval before they can be added to the scope.

Accountability: Accountability is a two-way street. Customers can point to the document when a requirement you’ve agreed to turns out to be missing or incomplete. Documentation of requirements also helps delineate features versus bugs. A bug is a requirement that is incorrectly implemented. A feature request is a new requirement that wasn’t previously agreed to. While a bug should be fixed, a feature request requires approval, resource allocation, and possibly budget approval.
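
The bug-versus-feature distinction can be made mechanical once the requirements are written down. Here is a minimal sketch in Python. The `RequirementsDoc` class and the requirement identifiers are hypothetical names made up for illustration, not something from the book.

```python
from dataclasses import dataclass, field


@dataclass
class RequirementsDoc:
    """Signed-off requirements, keyed by a short identifier (hypothetical)."""
    approved: dict = field(default_factory=dict)

    def classify(self, requirement_id: str) -> str:
        # A complaint about an agreed requirement that is wrongly implemented is a bug.
        # Anything not in the approved list is a new requirement: a feature
        # request that needs a scope change and approval.
        return "bug" if requirement_id in self.approved else "feature request"


doc = RequirementsDoc(approved={
    "read-mail-on-phone": "Users can read their email from their smartphone",
})

print(doc.classify("read-mail-on-phone"))  # bug
print(doc.classify("shared-calendar"))     # feature request
```

The point is not the code itself but that the lookup is only possible because the list exists in a form everyone can check.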

A requirements document does not need to be hundreds of pages long. A bullet list may be sufficient for small projects. The essence of business writing is brevity. A short, concise document is easier to read and understand.

Define terminology early in the document. Ontology is the system of terms and definitions that define the system and its parts. Often during a heated debate one realizes that everyone is using the same words but meaning different things. Getting agreement to the ontology that will be used is very important.

The requirements should focus on the list of features, stated from the perspective of what the customer should be able to accomplish using business terms, not technical terms. Record the “what,” not the “how.”

This also means not prescribing particular technology. For example, when an email system is being designed, users might request that it be compatible with the IMAP protocol (RFC 3501). They should, instead, be stating their needs at a higher level of abstraction. Ask them why they need such a specific thing. For example, maybe they need to be able to read their email from their smartphone and they know that their phone supports IMAP.

Conversely, specifying IMAP support may under-specify the feature. Imagine the user’s surprise when the IMAP support is available but his or her particular smartphone is unable to access email. The feature is complete as specified—IMAP is supported—but the user is unable to read email. Requesting that this problem be fixed would be a feature request, not a bug. It would be rejected much to the surprise and dismay of the customer. Technically speaking, the product works as requested.

It is this kind of situation that makes perfect sense to a technical person but is frustrating and unreasonable to a typical user. This is one reason why users view IT departments as difficult to deal with. They’re being told they got what they asked for, but in their mind that’s not true: They can’t read email from their phone. Phrasing requirements at the right level of abstraction is one way that we can prevent this problem.

Infrastructure as Code

Sep 24 2020
In an IaC environment, we don’t make changes to systems directly. Instead, we update the code and data that are used to create our environments.

When we administer systems this way, the code we use to control our infrastructure is not just part of our infrastructure, it is our infrastructure. It describes the network connections between machines, the machines themselves, and the applications that run on the machines.

To create a new machine, we update our machine-processable description and let the automation create it. If we do not like that change, we can revert to a previous version of that machine-processable description. We fix our infrastructure as a developer fixes a bug: We make a change to code and test it in a test environment; if that is successful, we push the change to production.
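
To make the "update the description, let automation do the rest" idea concrete, here is a rough sketch in Python. It assumes the description is a plain dictionary of machine names to specs. The `plan_changes` function and the machine names are my own illustration, not the API of any real tool.

```python
def plan_changes(desired: dict, actual: dict) -> list:
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Version 1 of the description, as it might be stored in version control.
v1 = {"web1": {"size": "small"}}
# Version 2 adds a machine.
v2 = {"web1": {"size": "small"}, "web2": {"size": "small"}}

print(plan_changes(desired=v2, actual=v1))  # [('create', 'web2')]
# Reverting is just applying the older description again.
print(plan_changes(desired=v1, actual=v2))  # [('delete', 'web2')]
```

Real tools do much more, but the core loop is the same: compare the description with reality and act on the difference.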

When we manage our infrastructure as code, every change is made by updating machine-processable definition files that, when processed, create machines, delete machines, configure systems, create files, start processes, and so on. We gain the ability to track changes, know the history of our infrastructure, and identify who made which change. Our infrastructure is parameterized so that we can build the same environment with different quantities, names, and locations. System administration can reap the benefits that software engineers have enjoyed for decades.
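
Parameterization is easier to see in code. Here is a small Python sketch, assuming an environment is one database machine plus some number of web machines. The function name, roles, and locations are made up for the example.

```python
def build_environment(name: str, location: str, web_count: int) -> list:
    """Expand a few parameters into a list of machine definitions."""
    machines = [
        {"name": f"{name}-web{i}", "role": "web", "location": location}
        for i in range(1, web_count + 1)
    ]
    machines.append({"name": f"{name}-db1", "role": "db", "location": location})
    return machines


# The same definition, built with different quantities, names, and locations.
test_env = build_environment("test", "us-east", web_count=1)
prod_env = build_environment("prod", "eu-west", web_count=4)

print([m["name"] for m in test_env])  # ['test-web1', 'test-db1']
print(len(prod_env))                  # 5
```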


IaC pays off in three areas:
  • Reduced Cost: Manual labor is reduced or eliminated. Automation is a workforce multiplier: It permits one person to do the work of many.
  • Improved Speed: Not only can tasks be done faster, but they can also be done in parallel. They can be done without waiting for a person to be available to do the work. 
  • Reduced Risk:  Security risk is reduced because we can prove compliance. We can reduce the risk of errors and bugs by applying software engineering techniques like unit tests and version control. 

The practice rests on a few principles:
  • Make all changes via configuration files, such as a Dockerfile or PowerShell scripts.
  • Document systems and processes in code to make it the source of truth.
  • Use a version control system to track changes in the configuration files.
It can be intimidating to get started with IaC, especially in a preexisting environment. The best strategy is to start small. Automate one thing on one machine to get comfortable. Then manage more aspects of the system over time and build toward an environment where all changes are made via the configuration management (CM) system.

Pets vs. Cattle

Sep 22 2020
Currently I am reading The Practice of System and Network Administration, by Thomas Limoncelli. In this book, I came across the concept of pets vs. cattle when it comes to managing your IT infrastructure. It's one of those ideas that changes the way you look at your servers. In a nutshell, pets are the highly customized machines and cattle are the generic machines.

Cattle-like systems give us the ability to grow and shrink our system’s scale. In cloud computing a typical architecture pattern has many web server replicas behind a load balancer. Suppose each machine can handle 500 simultaneous users. More replicas are added as more capacity is needed.
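
The scaling arithmetic is simple enough to write down. Here is a quick Python sketch using the 500-users-per-machine figure from the example. Keeping at least one replica running is my own assumption.

```python
import math

USERS_PER_REPLICA = 500  # capacity figure from the example above


def replicas_needed(simultaneous_users: int) -> int:
    """Smallest number of replicas that can serve the given load."""
    # Assumption: always keep at least one replica, even with no users.
    return max(1, math.ceil(simultaneous_users / USERS_PER_REPLICA))


print(replicas_needed(400))   # 1
print(replicas_needed(1200))  # 3
print(replicas_needed(2000))  # 4
```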

Another way of describing pets is to note that they contain a lot of irreproducible state. Cattle are stateless, or contain only reproducible state. State is, essentially, data or information. That information may be data files, configuration, or status. In a web-based application, there is the application itself plus the database that is used to store the user’s data. That database is state.

The more state a machine holds, the more irreplaceable it is—that is, the more pet-like it is. Cattle are generic because we can rebuild one easily thanks to the fact that cattle contain no state, or only state that can be copied from elsewhere.

We can turn pets into cattle by isolating the state. Imagine a typical web application running entirely on a single machine. The machine includes the Apache HTTP server, the application software, a MySQL database server, and the data that the database is storing. This is the architecture used by many small web-based applications.
The problem with this architecture is that the single machine stores both the software and the state. It is a pet. If the server goes down, our application goes down and the data may be lost with it.

We can improve the situation by separating out the database. We can move the MySQL database software and the data it stores to another machine. The web server is now cattle-like because it can be reproduced easily by simply installing the software and configuring it to point to the database on the other machine.
This process is also called decoupling state. The all-in-one design tightly couples the application to the data. The last design decouples the data from the software entirely. This decoupling permits us to scale the service better. For example, the web server can be replicated to add more capacity.

In such systems we are no longer concerned with the uptime of any particular machine, as long as it contains only the stateless application code. If one machine fails, the autoscaler will build a new one. If a machine gets sick, we delete it and let the autoscaler do its job.
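
What the autoscaler does can be sketched in a few lines of Python. This is a toy model, not how any particular cloud provider implements it. `new_replica` and `reconcile` are hypothetical names, and "building a machine" here just creates a dictionary.

```python
import itertools

_ids = itertools.count(1)


def new_replica() -> dict:
    """Build a fresh, stateless web replica (stand-in for real provisioning)."""
    return {"name": f"web-{next(_ids)}", "healthy": True}


def reconcile(replicas: list, target: int) -> list:
    """Drop unhealthy replicas, then build new ones up to the target count."""
    healthy = [r for r in replicas if r["healthy"]]
    while len(healthy) < target:
        healthy.append(new_replica())
    return healthy


fleet = reconcile([], target=3)
fleet[1]["healthy"] = False        # one machine "gets sick"
fleet = reconcile(fleet, target=3)
print([r["name"] for r in fleet])  # ['web-1', 'web-3', 'web-4']
```

Nobody tries to nurse web-2 back to health. Because it held no state, replacing it costs nothing but a rebuild.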


Sep 18 2020
In operational theory, the term bottleneck refers to the point in a system where Work-In-Progress (WIP) accumulates. 

A team that produces a software application has different members performing each task: writing the code, testing it, deploying it to a beta environment, and deploying it to the production environment. 

Watch what happens when a bug is reported. 
  • If it sits in the bug tracking system, untouched and unassigned to anyone, then the bottleneck is the triage process. 
  • If it gets assigned to someone but that person doesn’t work on the issue, the bottleneck is at the developer step. 
  • If the buggy code is fixed quickly but sits unused because it hasn’t been put into testing or production, then the bottleneck is in the testing or deployment phase.
Once the bottleneck is identified, it is important to focus on fixing the bottleneck itself. There may be problems and things we can improve all throughout the system, but directing effort toward anything other than optimizations at the bottleneck will not help the total throughput of the system. 
  • Optimizations prior to the bottleneck simply make WIP accumulate faster at the bottleneck. 
  • Optimizations after the bottleneck simply improve the part of the process that is starved for work.
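
A toy simulation makes the two bullet points easier to believe. This Python sketch assumes a three-stage pipeline with made-up daily rates in which testing is the bottleneck. The stage names and numbers are mine, not from the book.

```python
def simulate(rates: dict, days: int):
    """Run the pipeline for `days`; return (items finished, WIP per stage).

    The first stage produces new work at its own rate; every later stage
    pulls from the queue in front of it, up to its daily rate.
    """
    stages = list(rates)
    wip = {stage: 0 for stage in stages[1:]}  # work waiting in front of each later stage
    finished = 0
    for _ in range(days):
        flowing = rates[stages[0]]            # new work produced today
        for stage in stages[1:]:
            wip[stage] += flowing
            flowing = min(wip[stage], rates[stage])
            wip[stage] -= flowing
        finished += flowing
    return finished, wip


# Baseline: testing (2 items/day) is the bottleneck.
print(simulate({"code": 5, "test": 2, "deploy": 4}, days=20))   # (40, {'test': 60, 'deploy': 0})
# Faster coding: same throughput, WIP piles up faster at the bottleneck.
print(simulate({"code": 10, "test": 2, "deploy": 4}, days=20))  # (40, {'test': 160, 'deploy': 0})
# Faster deployment: same throughput, that stage was already starved for work.
print(simulate({"code": 5, "test": 2, "deploy": 8}, days=20))   # (40, {'test': 60, 'deploy': 0})
# Faster testing: throughput doubles.
print(simulate({"code": 5, "test": 4, "deploy": 4}, days=20))   # (80, {'test': 20, 'deploy': 0})
```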
From: The Practice of System and Network Administration: DevOps and Other Best Practices for Enterprise IT, Volume 1

Managing IIS with PowerShell

Sep 04 2020
With Windows 10 and Windows Server 2016, a new and simplified IISAdministration module was released side by side with the existing WebAdministration cmdlets. The new module contains simpler cmdlets and provides direct access to the ServerManager object. It also offers better pipeline support and scales better.

Though you can manage IIS with the IIS Manager GUI, using PowerShell gives you much more control, is much more convenient, and can be easily automated. Here's a brief overview of basic IIS management with PowerShell.

Create a directory for your site. I will name my site 'CityOverflow'.
New-Item -ItemType Directory -Name 'CityOverflow' -Path 'C:\Sites'
Create the index.html that will be served when you go to your site
New-Item -ItemType File -Name 'index.html' -Path 'C:\Sites\CityOverflow'
Add some html content to your landing page. 
Add-Content -Path 'C:\Sites\CityOverflow\index.html' -Value "<h1>Hello World</h1>"
So far, we haven't used any IIS-specific features. We have just created a directory for our site that contains an HTML file. You could do the same with your favourite editor. I just wanted to show the basic PowerShell commands that do it.

Now, let's create a new IIS site that will serve our CityOverflow website. 
New-IISSite -Name 'CityOverflow' -PhysicalPath 'C:\Sites\CityOverflow\' -BindingInformation '*:8000:'
Once the IIS site is created, run Get-IISSite to get information about that site. 

Name             ID   State      Physical Path                  Bindings
----             --   -----      -------------                  --------
CityOverflow     2    Started    C:\Sites\CityOverflow\         http *:8000:
That's it. If you navigate to localhost:8000 in your browser, you will see the web page that you just created. 

To stop a running site, use the Stop-IISSite cmdlet, providing the name of the site. 
Stop-IISSite -Name "CityOverflow"

Are you sure you want to perform this action?
Performing the operation "Stop-IISSite" on target "CityOverflow".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"): Y
To start the site again, run the Start-IISSite cmdlet.
Start-IISSite -Name "CityOverflow"
If you have more than one site, you can start or stop all of them by piping the output of Get-IISSite to either Start-IISSite or Stop-IISSite.
Get-IISSite | Start-IISSite
To remove a site, run Remove-IISSite. 
Remove-IISSite -Name 'CityOverflow'