Unit Testing Shell Scripts: Part Two
In the first post of the series, we wrote a test script to validate the functionality of a sample script by Vivek Gite. We saw that it’s not difficult to follow the conventional structure of a unit check, arrange-act-assert, using a shell language. We also saw that it’s straightforward to fake out or “mock” external dependencies and to assert the results of a function or a whole script using standard conditional statements. We also considered the benefits of using a unit testing framework, such as consistency, reusability, simplification, and understandability.
In this installment, we’ll use shunit2 to write unit checks for the same sample script we used in the first installment, and we’ll explain why neither BATS nor zunit was able to support the sample use case.
To reiterate, here’s the test script we came up with to check the functionality of diskusage.sh:
    #!/bin/bash

    shopt -s expand_aliases

    # Before all
    alias mail="echo 'mail' > mailsent;false"
    echo 'Test results for diskusage.sh' > test_results
    tcnt=0

    # It does nothing when disk usage is below 90%

    # Before (arrange)
    alias df="echo 'Filesystem Size Used Avail Use% Mounted on';echo '/dev/sda2 100G 89.0G 11.0G 89% /'"
    echo 'no mail' > mailsent

    # Run code under test (act)
    . ./diskusage.sh

    # Check result (assert)
    ((tcnt=tcnt+1))
    if [[ $(< mailsent) == 'mail' ]]; then
      echo "$tcnt. FAIL: Expected no mail to be sent for disk usage under 90%" >> test_results
    else
      echo "$tcnt. PASS: No action taken for disk usage under 90%" >> test_results
    fi

    # It sends an email notification when disk usage is at 90%

    alias df="echo 'Filesystem Size Used Avail Use% Mounted on';echo '/dev/sda1 100G 90.0G 10.0G 90% /'"
    echo 'no mail' > mailsent
    . ./diskusage.sh
    ((tcnt=tcnt+1))
    if [[ $(< mailsent) == 'mail' ]]; then
      echo "$tcnt. PASS: Notification was sent for disk usage of 90%" >> test_results
    else
      echo "$tcnt. FAIL: Disk usage was 90% but no notification was sent" >> test_results
    fi

    # After all
    unalias df
    unalias mail

    # Display test results
    cat test_results
Testing diskusage.sh with shunit2
Let’s see how our test script would look using shunit2. The main advantages of shunit2 are:
- support for multiple shell languages & platforms
- clean xUnit-style assertion syntax (assertEquals, assertTrue, and so on)
- support for skipping test cases
- support for skipping assertions
- support for mocking files (but not commands)
The ability to skip assertions and fail calls within a single test case is not characteristic of unit test frameworks for application languages. It’s helpful when testing shell scripts because many scripts are not modular; they perform a large number of steps in sequence, and you can’t selectively execute parts of the script for purposes of isolated testing.
The ability to skip specific assertions selectively gives you a mechanism to explore the possible behaviors of such a script, or to troubleshoot it when something goes wrong, without ripping it to shreds to get at specific functionality. When you have to break up a script to test parts of it, there’s a risk that the behavior will be different when the parts are reassembled into a single script.
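As a rough sketch of what skipping looks like in practice, the test case below uses shunit2’s startSkipping and endSkipping functions to bypass one assertion while leaving the rest of the test intact. The file name report.txt, the EXPLORE_ONLY switch, and the SHUNIT2_PATH variable are assumptions made for this example, not part of shunit2 or of diskusage.sh.

```shell
#!/bin/bash
# Sketch: skipping selected assertions inside one shunit2 test case.
# report.txt, EXPLORE_ONLY and SHUNIT2_PATH are hypothetical names.

test_reportIsWritten() {
  echo 'disk usage: 89%' > report.txt

  # This assertion always runs.
  assertTrue 'report file should exist' '[ -f report.txt ]'

  # While exploring, skip the stricter check instead of deleting it.
  if [ -n "${EXPLORE_ONLY:-}" ]; then
    startSkipping
  fi
  assertEquals 'report content' 'disk usage: 89%' "$(< report.txt)"
  endSkipping   # assertions after this point are checked again

  rm -f report.txt
}

# Assumption: shunit2 was saved next to this script; set SHUNIT2_PATH to override.
shunit2_path=${SHUNIT2_PATH:-./shunit2}
if [ -r "$shunit2_path" ]; then
  . "$shunit2_path"
else
  echo "shunit2 not found at $shunit2_path" >&2
fi
```

Running the script with EXPLORE_ONLY=1 in the environment should cause shunit2 to count the assertEquals call as skipped rather than passed or failed; without it, every assertion is checked.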
shunit2 has functions equivalent to the set up and tear down steps in a test script:
- Before all => oneTimeSetUp
- After all => oneTimeTearDown
- Before => setUp
- After => tearDown
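To make the mapping concrete, here is a minimal sketch in which each hook simply records that it was called. The log file name, the two test function names, and the SHUNIT2_PATH variable are made up for illustration; only the four hook names come from shunit2.

```shell
#!/bin/bash
# Sketch: where the shunit2 set up and tear down hooks sit around the tests.
# lifecycle.log, the test names and SHUNIT2_PATH are hypothetical.

LOG=lifecycle.log

oneTimeSetUp()    { echo 'oneTimeSetUp'    > "$LOG"; }    # "Before all"
setUp()           { echo 'setUp'           >> "$LOG"; }   # "Before" each test
tearDown()        { echo 'tearDown'        >> "$LOG"; }   # "After" each test
oneTimeTearDown() { echo 'oneTimeTearDown' >> "$LOG"; }   # "After all"

test_first()  { echo 'test_first'  >> "$LOG"; }
test_second() { echo 'test_second' >> "$LOG"; }

# Assumption: shunit2 was saved next to this script; set SHUNIT2_PATH to override.
shunit2_path=${SHUNIT2_PATH:-./shunit2}
if [ -r "$shunit2_path" ]; then
  . "$shunit2_path"
else
  echo "shunit2 not found at $shunit2_path" >&2
fi
```

After a run, the contents of lifecycle.log show the order in which shunit2 invoked the hooks and tests on your system.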
The first thing to do is “install” shunit2. “Install” is a pretty fancy word for downloading a single shell script. We’ll download it from the project wiki, which offers several releases of shunit2. At the time this post was written, the latest stable release was 2.1.7, available from this download page.
shunit2 displays test results, so we don’t need the code at the end of our script to display the test output, where the comment reads, “Display test results”. We’ll remove that.
Now, let’s make use of those set up and tear down functions. We’ll replace the code under the comment, “Before all”, with a function definition using the reserved name oneTimeSetUp, which shunit2 will recognize when it runs. We don’t need to number our test cases, as shunit2 will take care of that for us. That leaves us with this:
    oneTimeSetUp() {
      alias mail="echo 'mail' > mailsent;false"
    }
Doing the same for the “After all” code, we have:
    oneTimeTearDown() {
      unalias df
      unalias mail
    }
We have to source shunit2 to include it in our test script. The statement goes at the end of the test script.
. ./shunit2
Now, let’s replace our hard-coded conditional logic with shunit2 assertions. To do that, we have to make each of our test cases into a function whose name begins with “test”. We end up with this:
    #!/bin/bash

    shopt -s expand_aliases

    test_itDoesNotSendNotification_whenUsageIsBelowThreshold() {
      alias df="echo 'Filesystem Size Used Avail Use% Mounted on';echo '/dev/sda2 100G 89.0G 11.0G 89% /'"
      echo 'no mail' > mailsent
      . ./diskusage.sh
      assertTrue "It should not send a notification when disk usage is under 90%" \
        '[[ $(< mailsent) != "mail" ]]'
    }

    test_itSendsNotification_whenUsageIsAtOrAboveThreshold() {
      alias df="echo 'Filesystem Size Used Avail Use% Mounted on';echo '/dev/sda1 100G 90.0G 10.0G 90% /'"
      echo 'no mail' > mailsent
      . ./diskusage.sh
      assertTrue "It should send a notification when disk usage is at or above 90%" \
        '[[ $(< mailsent) == "mail" ]]'
    }

    oneTimeSetUp() {
      alias mail="echo 'mail' > mailsent;false"
    }

    oneTimeTearDown() {
      unalias df
      unalias mail
    }

    . ./shunit2
Comparing this with the original test script, you can see several advantages. First, the code is much simpler overall and easier to read. We don’t have hard-coded logic to give test cases unique numbers, and we don’t need busy-looking conditional statements to check the results of each case. The naming conventions for the shunit2 functions help clarify the intent of the code and ensure consistency.
The names of the individual test case functions reflect a style some application developers like to use. The string “test” is required by the testing framework. After that, the next segment of the function name states what we expect the code under test to do, and the following segment summarizes the preconditions in a human-readable way.
This naming convention is not a hard-and-fast requirement of unit testing. Any reasonable and consistent naming convention will be fine. The thing to avoid is random or haphazard names, as that will make it harder to understand the test suite.
No Luck with BATS or zunit
Shell scripts are widely used for system administration tasks and for system provisioning. As a natural consequence, many production shell scripts (a) depend on the current state of a running system, and/or (b) change the state of the target system by installing packages and/or changing configuration settings.
To test such a script in isolation at the unit level, in the same way as one would unit test application code, we need a way to define ‘fake’ or ‘mock’ system commands. To test our sample script, diskusage.sh, we use aliases to replace real system commands with fake ones. In our example, we mock the ‘df’ and ‘mail’ commands in this way.
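The alias technique can be shown in a small self-contained form. In the sketch below, check_disk.sh is a hypothetical stand-in for a script like diskusage.sh, and the fake df line is made-up data; the example assumes bash.

```shell
#!/bin/bash
# Sketch: faking a system command with an alias, then sourcing the code under test.
# check_disk.sh is a hypothetical stand-in for a script such as diskusage.sh.

shopt -s expand_aliases   # aliases are not expanded by default in non-interactive bash

workdir=$(mktemp -d)
cat > "$workdir/check_disk.sh" <<'EOF'
# Code under test: pick the Use% column from the last line printed by df.
usage=$(df | tail -n 1 | awk '{ print $5 }')
EOF

# Arrange: the fake df prints one fixed data line.
# Alias expansion is plain text substitution, so a single echo keeps the
# fake safe to use on the left-hand side of a pipeline.
alias df="echo '/dev/sda1 100G 90.0G 10.0G 90% /'"

# Act: source the script so it is read after the alias has been defined.
. "$workdir/check_disk.sh"

# Assert
if [[ $usage == '90%' ]]; then
  echo 'PASS: the script saw the fake df output'
else
  echo "FAIL: expected 90% but got $usage"
fi

# Clean up
unalias df
rm -rf "$workdir"
```

The important detail is the order of events: the alias is defined first, and the code under test is sourced afterwards, so the shell substitutes the fake command while reading the script.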
BATS and zunit are implemented in such a way that we were unable to use these aliased ‘mock’ commands with them. In response to a user question in December 2016, Suewon Bahng and Michael Diamond (separately) proposed different workarounds for the problem using BATS, neither of which supports our sample use case.
In a nutshell, the problem is that the tools seem to “swallow” alias definitions even when ‘shopt -s expand_aliases’ is specified. With the workarounds, the fake commands are visible only to the test script itself, not to the code under test. Therefore, these methods can’t inject fake system command output into the code under test to provide isolation.
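As background, the sketch below illustrates one general bash behavior that produces a similar symptom: an alias exists only in the shell that defined it, so code run in a separate bash process never sees it. This is offered as an illustration of how aliases behave, not as a diagnosis of BATS or zunit internals. The script name show_date.sh and the FAKE-DATE string are made up for the example.

```shell
#!/bin/bash
# Sketch: an alias reaches sourced code but not code run in a child process.
# show_date.sh and the FAKE-DATE string are hypothetical.

shopt -s expand_aliases

workdir=$(mktemp -d)
echo 'date' > "$workdir/show_date.sh"

alias date="echo 'FAKE-DATE'"

# Sourced: the script is read by a shell that knows the alias, so the fake runs.
sourced_output=$(. "$workdir/show_date.sh")

# Child process: a new bash starts without this shell's aliases,
# so the real date command runs instead.
child_output=$(bash "$workdir/show_date.sh")

echo "sourced: $sourced_output"
echo "child:   $child_output"

# Clean up
unalias date
rm -rf "$workdir"
```

Any test runner that executes the code under test in a way that does not re-read it in a shell where the alias is defined will show the same effect: the fake is visible to the test script but not to the script being tested.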
As a practical alternative exists in shunit2, we decided not to invest more time in BATS or zunit, and to move on. Should this problem be fixed in the future, we will gladly revisit either or both of these tools.
What’s Next?
So far, we’ve seen it’s possible to hand-roll unit checks for shell scripts, and we’ve seen some of the benefits of using a unit testing framework. We’ve learned that shunit2 is a solid choice for the purpose, and we’ve discovered a limitation in a couple of other frameworks that may make them less useful in enterprise environments, even if they work fine in other situations.
In the third installment, we’ll have a look at two side projects of mine, bash-spec and korn-spec. These frameworks take a “behavioral” approach. That means the assertions take the form “expect alpha to match beta”, as opposed to the more traditional forms, “assert equal alpha, beta” or “assert that beta is equal to alpha”. They also aim for a “fluent” calling style to make the test cases easier for humans to read.
Still to come: Pester (for PowerShell), ChefSpec (for Chef), and rspec-puppet (for Puppet).