Test Modes

One unique feature of Test::Nginx is that it allows running the same test suite in wildly different ways, or test modes, just by configuring some system environment variables. Different test modes have different focuses and may find different categories of bugs or performance issues in the applications being tested. The data-driven nature of the test framework makes it easy to add new test modes without changing the user test files at all. It is also possible to combine different test modes to form new (hybrid) test modes. The capability of running the same test suite in many different ways helps squeeze more value out of the tests we already have.

This section will walk through the various test modes supported by Test::Nginx::Socket and the system environment variables used to enable or control them.

Benchmark Mode

Test::Nginx has built-in support for performance testing or benchmarking. It can invoke external load testing tools like ab and weighttp to load the server as hard as possible for each test case.

To enable this benchmark testing mode, you can specify the TEST_NGINX_BENCHMARK system environment variable before running the prove command. For example,

export TEST_NGINX_BENCHMARK='2000 2'
prove t/foo.t

This will run all the test cases in t/foo.t in benchmark mode. In particular, the first number, 2000, in the environment variable value is the total number of requests used to flood the server, while the second number, 2, is the number of concurrent connections the client will use.

If the test case uses an HTTP 1.1 request (which is the default), then the test scaffold will invoke the weighttp tool. If it is an HTTP 1.0 request, then the test scaffold invokes the ab tool.

This test mode requires the unbuffer command-line utility from the expect package, as well as the ab and weighttp load testing tools. On Ubuntu/Debian systems, we can install most of the dependencies with the command

sudo apt-get install expect apache2-utils

You may need to build and install weighttp from source on Ubuntu/Debian yourself since there is no official Debian package for it.
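The exact build steps depend on the weighttp version you pick, so treat the following only as a rough sketch and consult the project's own documentation. The classic version can be compiled directly against the libev library, roughly like this:

sudo apt-get install libev-dev
git clone https://github.com/lighttpd/weighttp.git
cd weighttp
# compile the classic single-binary version against libev (adjust for newer versions)
gcc -O2 -o weighttp src/*.c -lev -lpthread
sudo install -m 0755 weighttp /usr/local/bin/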

On Mac OS X, on the other hand, we can use Homebrew to install these tools like this:

brew install expect weighttp

Now let’s consider the following example.

t/hello.t
use Test::Nginx::Socket 'no_plan';

run_tests();

__DATA__

=== TEST 1: hello world
--- config
    location = /hello {
        return 200 "hello world\n";
    }
--- request
    GET /hello
--- response_body
hello world

Then we run this test file in the benchmark mode, like this:

export TEST_NGINX_BENCHMARK='200000 2'
prove t/hello.t

The output should look like this:

t/hello.t .. TEST 1: hello world
weighttp -c2 -k -n200000 http://127.0.0.1:1984/hello
weighttp - a lightweight and simple webserver benchmarking tool

starting benchmark...
spawning thread #1: 2 concurrent requests, 200000 total requests
progress:  10% done
progress:  20% done
progress:  30% done
progress:  40% done
progress:  50% done
progress:  60% done
progress:  70% done
progress:  80% done
progress:  90% done
progress: 100% done

finished in 2 sec, 652 millisec and 752 microsec, 75393 req/s, 12218 kbyte/s
requests: 200000 total, 200000 started, 200000 done, 200000 succeeded, 0 failed, 0 errored
status codes: 200000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 33190005 bytes total, 30790005 bytes http, 2400000 bytes data
t/hello.t .. ok
All tests successful.
Files=1, Tests=2,  3 wallclock secs ( 0.01 usr  0.00 sys +  0.33 cusr  1.47 csys =  1.81 CPU)
Result: PASS

The most important line in this output is:

finished in 2 sec, 652 millisec and 752 microsec, 75393 req/s, 12218 kbyte/s

We can see that this test case can achieve 75393 requests per second and 12218 KB per second. Not bad for a single NGINX worker process!

It is also important to keep an eye on failed requests. We surely do not care about the performance of error pages. We can get the number of error responses by checking the following output lines:

requests: 200000 total, 200000 started, 200000 done, 200000 succeeded, 0 failed, 0 errored
status codes: 200000 2xx, 0 3xx, 0 4xx, 0 5xx

We are glad to see that all our requests succeeded in this run.

If we want to benchmark the performance of multiple NGINX worker processes so as to utilize multiple CPU cores, then we can add the following lines to the test file prologue, before the run_tests() call:

master_on();
workers(4);

This way we can have 4 NGINX worker processes sharing the load.
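For example, the prologue of our t/hello.t file would then look like this (the __DATA__ section below it stays unchanged):

use Test::Nginx::Socket 'no_plan';

master_on();
workers(4);

run_tests();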

Behind the scenes, the test scaffold assembles the weighttp command line from the test block specification. In this case, the command line looks like this:

weighttp -c2 -k -n200000 http://127.0.0.1:1984/hello

There exist complicated cases, however, where the test scaffold fails to derive the exact command line equivalent.

We can also enforce HTTP 1.0 requests in our test block by appending the "HTTP/1.0" string to the value of the --- request section:

--- request
    GET /hello HTTP/1.0

In this case, the test scaffold will invoke the ab tool to flood the server with the corresponding HTTP 1.0 requests. The output might look like this:

t/hello.t .. TEST 1: hello world
ab -r -d -S -c2 -k -n200000 http://127.0.0.1:1984/hello
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 20000 requests
Completed 40000 requests
Completed 60000 requests
Completed 80000 requests
Completed 100000 requests
Completed 120000 requests
Completed 140000 requests
Completed 160000 requests
Completed 180000 requests
Completed 200000 requests
Finished 200000 requests


Server Software:        openresty/1.9.15.1
Server Hostname:        127.0.0.1
Server Port:            1984

Document Path:          /hello
Document Length:        12 bytes

Concurrency Level:      2
Time taken for tests:   3.001 seconds
Complete requests:      200000
Failed requests:        0
Keep-Alive requests:    198000
Total transferred:      33190000 bytes
HTML transferred:       2400000 bytes
Requests per second:    66633.75 [#/sec] (mean)
Time per request:       0.030 [ms] (mean)
Time per request:       0.015 [ms] (mean, across all concurrent requests)
Transfer rate:          10798.70 [Kbytes/sec] received

Connection Times (ms)
              min   avg   max
Connect:        0     0    1
Processing:     0     0  132
Waiting:        0     0  132
Total:          0     0  132
t/hello.t .. ok
All tests successful.
Files=1, Tests=2,  4 wallclock secs ( 0.02 usr  0.00 sys +  0.51 cusr  1.39 csys =  1.92 CPU)
Result: PASS

The most important output lines, in this case, are

Failed requests:        0
Requests per second:    66633.75 [#/sec] (mean)
Transfer rate:          10798.70 [Kbytes/sec] received

Different hardware and operating systems may lead to very different results. Therefore, it generally does not make sense at all to directly compare numbers obtained from different machines and systems.

Clever users can write some external scripts to record and compare these numbers across different runs, so as to keep track of performance changes in the web server or application. Such comparison scripts must take into account any measurement errors and any disturbances from other processes running in the same system.
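As a small illustration (this helper snippet is purely hypothetical and not part of Test::Nginx), one could capture the benchmark output of each run and append the req/s figure to a history file for later comparison:

export TEST_NGINX_BENCHMARK='200000 2'
# keep the full benchmark output for inspection
prove t/hello.t 2>&1 | tee /tmp/bench.out
# record a timestamped req/s figure for this run
echo "$(date '+%F %T') $(grep -o '[0-9]* req/s' /tmp/bench.out | head -1)" >> bench-history.txt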

Performance benchmarking is a large topic, and we give it a more detailed treatment in a dedicated chapter.

HUP Reload Mode

By default, the test scaffold always starts a fresh instance of the NGINX server right before running each individual test block and stops the server right after the checks of the current test block are all done. This ensures that there are no side effects among test blocks, especially those running successively. But it is also desirable to ensure that everything works fine when the NGINX server merely reloads its configuration without a full server restart. Such configuration reloading is usually done by sending the HUP signal to the master process of NGINX, so we usually call it "HUP reload".

Note
On some non-UNIX-style operating systems like Microsoft Windows, there is no such thing as signals. On such platforms, NGINX users usually use the -s reload command-line option of the nginx executable to do the same thing. It should be noted, however, that the -s reload option has one side effect that can be annoying: it loads the nginx configuration twice instead of just once, which may incur unnecessary initialization overhead. Therefore, we should always use the HUP signal instead of -s reload whenever possible.

One example of an OpenResty feature that behaves differently upon HUP reload than upon server restart is the shared dictionary mechanism (lua_shared_dict), which does not wipe out any existing data in the shared memory storage during HUP reload. When testing this feature or application code relying on this feature, it is wise to test how it behaves upon HUP reload. We have seen in the past that some 3rd-party NGINX C modules dealing with shared memory had bugs across HUP reloads, like nasty memory leaks.

Test::Nginx has built-in support for the HUP reload test mode, which can be enabled by setting the TEST_NGINX_USE_HUP=1 environment variable:

export TEST_NGINX_USE_HUP=1

Then we can run our existing test suite as usual, but now the HUP signal is used by the test scaffold to reload the NGINX configuration specified by different test blocks. The NGINX server will only be shut down automatically when the test harness finishes running each test file.

Note
We can even avoid the automatic server shutdown upon test file completion by setting the TEST_NGINX_NO_CLEAN=1 environment variable. See the later section Manual Debugging Mode for more details.

UNIX signals like HUP are delivered asynchronously. Thus, there is a delay between the moment the test scaffold finishes sending the HUP signal to the NGINX server and the moment the NGINX server forks off a new worker process using the newly loaded configuration and starts accepting new connections with that worker. For this reason, there is a (small) chance that the request of a test block is served by an NGINX worker process still using the configuration specified by the previous test block. Although Test::Nginx tries hard to wait as long as it can with some simple heuristics, some test blocks may still see intermittent failures due to such configuration mismatches. Be prepared for such false positives when using the HUP reload testing mode. This is also one of the reasons why the HUP reload mode is not the default. We hope this issue can be further improved in the future.

Another limitation of the HUP reload mode is that HUP reloads only happen on test block boundaries. There are cases where it is desirable to issue a HUP reload in the middle of a test block. We can achieve that by using some custom Lua code in the test block to send a HUP signal ourselves, as in

-- read the PID of the NGINX master process from the pid file
-- generated by the test scaffold
local f = assert(io.open("t/servroot/logs/nginx.pid", "r"))
local master_pid = assert(f:read())
assert(f:close())
-- send the HUP signal to the master process to trigger a reload
assert(os.execute("kill -HUP " .. master_pid) == 0)

Valgrind Mode

One of the biggest enemies of web servers or web applications that are supposed to run 24x7 is memory issues. Memory issues include memory leaks, invalid memory reads (like reading beyond a buffer boundary), and invalid memory writes (like buffer overflows). In the case of memory leaks, the processes take up more and more memory in the system and eventually exhaust all the physical memory available, leading to unresponsive systems or triggering the system to start killing processes by force. Invalid memory accesses, on the other hand, can lead to process crashes (like segmentation faults) or, worse, to nondeterminism in the process's behavior (like giving out wrong results).

Valgrind is a powerful tool for programmers to detect a wide range of memory issues, including many memory leaks and many invalid memory accesses. It is usually used for debugging lower-level code like the OpenResty core (including the NGINX core), the Lua or LuaJIT VM, as well as Lua libraries involving C and/or FFI. Plain Lua code without FFI is considered "safe" and is not subject to most of these memory issues.

Note
Plain Lua code without FFI can still contain bugs that result in memory leaks, like inserting new keys into a globally shared Lua table without control or appending to a global Lua string indefinitely. Such memory leaks, however, cannot be detected by Valgrind since the memory involved is managed by Lua or LuaJIT’s garbage collector.

Test::Nginx provides a testing mode that automatically uses Valgrind to run the existing tests and checks whether there are any memory issues that can be caught by Valgrind. This test mode is called the "Valgrind mode". To enable this mode, just set the TEST_NGINX_USE_VALGRIND environment variable, as in

export TEST_NGINX_USE_VALGRIND=1

Then just run the test files as usual.

Let’s consider the following example.

=== TEST 1: C strlen()
--- config
    location = /t {
        content_by_lua_block {
            local ffi = require "ffi"
            local C = ffi.C

            if not pcall(function () return C.strlen end) then
                ffi.cdef[[
                    size_t strlen(const char *s);
                ]]
            end

            local buf = ffi.new("char[3]", {48, 49, 0})
            local len = tonumber(C.strlen(buf))
            ngx.say("strlen: ", len)
        }
    }
--- request
    GET /t
--- response_body
strlen: 2
--- no_error_log
[error]

Here we use the ffi.new API to allocate a C string buffer 3 bytes long and initialize the buffer with the bytes 48, 49, and 0 (in decimal ASCII codes). Then we call the standard C function strlen via the ffi.C API on our C string buffer.

It is worth noting that we need to declare the strlen function prototype via the ffi.cdef API first. Since we declare the C function in the request handler (content_by_lua_block), we should only declare it once instead of on every request. To achieve that, we use a Lua if statement to check whether the symbol strlen is already declared (when strlen is not declared or defined, the Lua expression C.strlen throws a Lua exception, which makes the pcall call fail).

This example contains no memory issues since we properly initialize our C string buffer by setting the null terminator character (\0) at the end of our C string. The C function strlen should correctly report back the length of the string, which is 2, without reading beyond our buffer boundary. Now we run this test file with the Valgrind mode enabled, using the default OpenResty installation’s nginx:

export TEST_NGINX_USE_VALGRIND=1
export PATH=/usr/local/openresty/nginx/sbin:$PATH

prove t/a.t

There should be a lot of output. The first few lines should look like this:

t/a.t .. TEST 1: C strlen()
==7366== Invalid read of size 4
==7366==    at 0x546AE31: str_fastcmp (lj_str.c:57)
==7366==    by 0x546AE31: lj_str_new (lj_str.c:166)
==7366==    by 0x547903C: lua_setfield (lj_api.c:903)
==7366==    by 0x4CAD18: ngx_http_lua_cache_store_code (ngx_http_lua_cache.c:119)
==7366==    by 0x4CAB25: ngx_http_lua_cache_loadbuffer (ngx_http_lua_cache.c:187)
==7366==    by 0x4CB61A: ngx_http_lua_content_handler_inline (ngx_http_lua_contentby.c:300)

Ouch! Valgrind reports an invalid memory read error. Fortunately, it is just a false positive due to an optimization inside the LuaJIT VM when it is trying to create a new Lua string. The LuaJIT code repository maintains a file named lj.supp that lists all the known Valgrind false positives, which can be used to suppress these messages. We can simply copy that file over and rename it to valgrind.suppress in the current working directory. Then Test::Nginx will automatically feed this valgrind.suppress file into Valgrind while running the tests in Valgrind mode. Let’s try that:

cp -i /path/to/luajit-2.0/src/lj.supp ./valgrind.suppress
prove t/a.t

This time, the test scaffold is calmed:

t/a.t .. TEST 1: C strlen()
t/a.t .. ok
All tests successful.
Files=1, Tests=3,  2 wallclock secs ( 0.01 usr  0.00 sys +  1.51 cusr  0.06 csys =  1.58 CPU)
Result: PASS

We might encounter other Valgrind false positives, like some of those in the NGINX core or the OpenSSL library. We can add those to the valgrind.suppress file as needed. The Test::Nginx test scaffold always outputs suppression rules that can be added directly to the suppression file. For the example above, the last few lines of the output look like below.

{
   <insert_a_suppression_name_here>
   Memcheck:Addr4
   fun:str_fastcmp
   fun:lj_str_new
   fun:lua_setfield
   fun:ngx_http_lua_cache_store_code
   fun:ngx_http_lua_cache_loadbuffer
   fun:ngx_http_lua_content_handler_inline
   fun:ngx_http_core_content_phase
   fun:ngx_http_core_run_phases
   fun:ngx_http_process_request
   fun:ngx_http_process_request_line
   fun:ngx_epoll_process_events
   fun:ngx_process_events_and_timers
   fun:ngx_single_process_cycle
   fun:main
}
t/a.t .. ok
All tests successful.
Files=1, Tests=3,  2 wallclock secs ( 0.01 usr  0.00 sys +  1.47 cusr  0.07 csys =  1.55 CPU)
Result: PASS

The suppression rule generated is the stuff between the curly braces (including the curly braces themselves):

{
   <insert_a_suppression_name_here>
   Memcheck:Addr4
   fun:str_fastcmp
   fun:lj_str_new
   fun:lua_setfield
   fun:ngx_http_lua_cache_store_code
   fun:ngx_http_lua_cache_loadbuffer
   fun:ngx_http_lua_content_handler_inline
   fun:ngx_http_core_content_phase
   fun:ngx_http_core_run_phases
   fun:ngx_http_process_request
   fun:ngx_http_process_request_line
   fun:ngx_epoll_process_events
   fun:ngx_process_events_and_timers
   fun:ngx_single_process_cycle
   fun:main
}

We could have simply copied and pasted this rule into the valgrind.suppress file. It is worth mentioning, however, that we can make this rule more general by excluding the C function frames belonging to the NGINX core and the ngx_lua module (near the bottom of the rule), since this false positive is related to LuaJIT only.
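For example, a more general version of the rule could keep only the innermost LuaJIT frames and drop the rest. This is only a sketch, relying on the fact that the listed frames in a Valgrind suppression only need to match the innermost frames of the reported call stack, so unlisted outer frames are unconstrained:

{
   luajit-lj_str_new-false-positive
   Memcheck:Addr4
   fun:str_fastcmp
   fun:lj_str_new
   fun:lua_setfield
}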

Let’s continue our experiment with our current example. Now we edit our test case and change the following line

local buf = ffi.new("char[3]", {48, 49, 0})

to

local buf = ffi.new("char[3]", {48, 49, 50})

That is, we replace the null character (with ASCII code 0) with a non-null character whose ASCII code is 50. This change makes our C string buffer lack any null terminator, and thus calling strlen on it results in memory reads beyond our buffer boundary.

Unfortunately, running this edited test file fails to yield any Valgrind error reports about this memory issue:

t/a.t .. TEST 1: C strlen()
t/a.t .. 1/?
#   Failed test 'TEST 1: C strlen() - response_body - response is expected (repeated req 0, req 0)'
#   at /home/agentzh/git/lua-nginx-module/../test-nginx/lib/Test/Nginx/Socket.pm line 1346.
#          got: "strlen: 4\x{0a}"
#       length: 10
#     expected: "strlen: 2\x{0a}"
#       length: 10
#     strings begin to differ at char 9 (line 1 column 9)
# Looks like you failed 1 test of 3.

The response body check fails as expected. This time strlen returns 4, which is larger than our buffer size, 3. This is a clear indication of a memory buffer over-read. So why does Valgrind fail to catch it?

To answer this question, we need some knowledge about how LuaJIT allocates memory. By default, LuaJIT uses its own memory allocator atop the system allocator (usually provided by the standard C library). For performance reasons, LuaJIT pre-allocates larger memory blocks than actually requested. Because Valgrind has no knowledge of LuaJIT’s own allocator, nor of Lua user-level buffer boundaries, it can be cheated and get confused.

To remove this limitation, we can force LuaJIT to use the system allocator instead of its own. To achieve this, we need to build LuaJIT with special compilation options like below.

make CCDEBUG=-g XCFLAGS='-DLUAJIT_USE_VALGRIND -DLUAJIT_USE_SYSMALLOC'

The most important option is -DLUAJIT_USE_SYSMALLOC, which forces LuaJIT to use the system allocator. The other options are important for debugging as well: the CCDEBUG=-g option enables debug symbols in the LuaJIT binary, while -DLUAJIT_USE_VALGRIND enables some other special collaboration with Valgrind inside the LuaJIT VM.

If we are using the OpenResty bundle, we can simply build another special version of OpenResty like below:

./configure \
    --prefix=/opt/openresty-valgrind \
    --with-luajit-xcflags='-DLUAJIT_USE_VALGRIND -DLUAJIT_USE_SYSMALLOC' \
    --with-debug \
    -j4
make -j4
sudo make install

This will build and install a special debug version of OpenResty for Valgrind checks to the file system location /opt/openresty-valgrind.

Note
There are some other special LuaJIT build options that can further help us, like -DLUA_USE_APICHECK and -DLUA_USE_ASSERT. But they are beyond the scope of our current example.

Now let’s try running our previous buggy example with this special OpenResty and Valgrind:

export TEST_NGINX_USE_VALGRIND=1
export PATH=/opt/openresty-valgrind/nginx/sbin:$PATH

prove t/a.t

This time Valgrind succeeds in catching the memory bug!

t/a.t .. TEST 1: C strlen()
==8128== Invalid read of size 1
==8128==    at 0x4C2BC34: strlen (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8128==    by 0x5467217: lj_vm_ffi_call (in /opt/luajit21sysm/lib/libluajit-5.1.so.2.1.0)
==8128==    by 0x54B5DE7: lj_ccall_func (lj_ccall.c:1136)
==8128==    by 0x54CAD45: lj_cf_ffi_meta___call (lib_ffi.c:230)
==8128==    by 0x5465147: lj_BC_FUNCC (in /opt/luajit21sysm/lib/libluajit-5.1.so.2.1.0)
==8128==    by 0x4C72BC: ngx_http_lua_run_thread (ngx_http_lua_util.c:1015)
==8128==    by 0x4CB039: ngx_http_lua_content_by_chunk (ngx_http_lua_contentby.c:120)
...

We omit the rest of the output for brevity. Here Valgrind reports an invalid read of one byte of memory in the C function strlen, which is exactly what we’d expect. Mission accomplished!

Note
LuaJIT built with the system allocator should be used with Valgrind only. On computer architectures like x86_64, such a LuaJIT build may not even start up.

From this example, we can see how application-level memory allocation optimizations and management can compromise the effectiveness of Valgrind’s memory issue detection. Similarly, the NGINX core also comes with its own memory allocator based on "memory pools". Such memory pools tend to allocate page-sized memory blocks for small allocations and thus can also adversely affect Valgrind’s detection. OpenResty provides a patch for the NGINX core that disables the memory pool optimizations altogether. The easiest way to use the patch is to specify the --with-no-pool-patch option when running the ./configure script while building OpenResty.
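For example, the Valgrind-friendly OpenResty build shown earlier in this section can be extended with this option like so:

./configure \
    --prefix=/opt/openresty-valgrind \
    --with-no-pool-patch \
    --with-luajit-xcflags='-DLUAJIT_USE_VALGRIND -DLUAJIT_USE_SYSMALLOC' \
    --with-debug \
    -j4
make -j4
sudo make install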

Note
Since NGINX 1.9.13, NGINX provides a C macro, NGX_DEBUG_PALLOC, which, when set, can be used to achieve a similar effect to OpenResty’s "no-pool patch". Still, the "no-pool patch" is much more aggressive and thorough and can help find more potential memory problems in NGINX-related C code.
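Since NGX_DEBUG_PALLOC is just a C macro, one way to define it (a sketch; check your NGINX or OpenResty version's documentation) is through the C compiler options accepted by the ./configure script:

./configure --with-cc-opt='-DNGX_DEBUG_PALLOC=1'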

This Valgrind mode is used by OpenResty developers on a daily basis and has helped locate countless memory management bugs in the OpenResty C and Lua/FFI code base. Interestingly, this test mode has also located memory issues in the official NGINX core and the official LuaJIT core. Unlike analyzing core dumps, Valgrind can almost always find the first scene of a memory offense, and studying the memory error reports can usually lead to immediate code fixes.

As with all other tools, Valgrind has its own limitations and cannot find all memory issues, even when we carefully disable application-level memory allocators as demonstrated above. For example,

  1. memory issues on the C runtime stack cannot be caught by Valgrind (at least not by Valgrind's default memcheck tool).

  2. Also, memory leaks in application-level resource managers cannot be detected. For example, memory leaks in NGINX’s global memory pools won’t get detected since NGINX always destroys all its memory pools upon process termination. Similarly, an ever-growing Lua object managed by the Lua garbage collector (GC) won’t get caught either, since the Lua VM always frees all its GC-managed objects upon VM destruction.

Understanding the weaknesses of a tool is as important as understanding its strengths. We shall see an alternative approach in the next section for detecting leaks in application-level memory managers.

Note
Google’s AddressSanitizer tool can also be used to detect memory issues. Compared to Valgrind, it has the advantages of running much faster and of detecting memory issues on the C runtime stack as well. Unfortunately, it has its own limitations too. For example, it requires special C/C++ compiler options to rebuild all the related C code and C libraries for the best results. Also, it cannot find problems in dynamically generated machine code (like that from a Just-in-Time compiler) or hand-written assembly code (like LuaJIT’s Lua interpreter). Therefore, OpenResty developers use Valgrind much more often.
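For the record, a rough sketch of such a rebuild for OpenResty might look like the following; the prefix, flags, and their placement are assumptions to be adapted to your compiler and platform:

./configure \
    --prefix=/opt/openresty-asan \
    --with-cc-opt='-fsanitize=address -g -fno-omit-frame-pointer' \
    --with-ld-opt='-fsanitize=address' \
    --with-debug
make -j4
sudo make install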

Naive Memory Leak Check Mode

As we have seen in the previous section, Valgrind is great at detecting a wide range of memory leaks and invalid memory accesses. But Valgrind also suffers from limitations in detecting leaks in application-level memory managers such as garbage collectors (GC) and memory pools, which are quite common in reality. To see this, let’s consider the following simple example that leaks in LuaJIT’s GC-managed memory.

=== TEST 1:
--- config
    location = /t {
        content_by_lua_block {
            package.path = "/path/to/some/lib/?.lua;" .. package.path
            ngx.say("ok")
        }
    }
--- request
    GET /t
--- response_body
ok
--- no_error_log
[error]

This example demonstrates a common mistake made by many OpenResty beginners. The package.path field specifies the search paths used by the require builtin function for loading pure Lua modules. This string value is hooked up in the global Lua table package, which has the same lifetime as the current Lua virtual machine (VM) instance. Since Lua VM instances usually have the same lifetime as NGINX worker processes (unless the lua_code_cache directive is turned off in nginx.conf), prepending a new string to the value of package.path in a request handler like content_by_lua_block obviously results in a memory leak.

Unfortunately, Valgrind cannot find this leak at all, since the leak happens in the GC-managed memory inside the Lua VM: all such leaked memory always gets released upon GC destruction (or VM destruction) before the current process exits, which fools Valgrind into thinking that there are no leaks at all. Interested readers can try running this example in the "Valgrind test mode" explained in the previous section.

To address this limitation of Valgrind, Test::Nginx::Socket introduces a new test mode called the "naive memory leak check mode", or just the "check leak mode" for short. In this mode, the test scaffold does the following:

  1. loads the NGINX server with many instances of the test request specified in the test block, in a way similar to the "benchmark test mode" we discussed earlier,

  2. and at the same time, periodically polls and records the memory footprint of the NGINX worker process with the system command ps,

  3. and finally, analyzes the memory usage data points collected in step 2 by finding the slope (k) of the line that best fits those data points (see the sketch after this list).
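For reference, the best-fitting slope here is just the standard least-squares slope over the collected samples. Below is a minimal Lua sketch of this computation; Test::Nginx itself is written in Perl, so this only illustrates the math, not the actual implementation:

-- least-squares slope of samples y[1..n] taken at equally spaced
-- time points x = 1, 2, ..., n
local function slope(y)
    local n = #y
    local sx, sy, sxy, sxx = 0, 0, 0, 0
    for i = 1, n do
        sx = sx + i
        sy = sy + y[i]
        sxy = sxy + i * y[i]
        sxx = sxx + i * i
    end
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)
end

print(slope({3740, 3756, 3620, 3624, 4180}))  -- try it on a few of the samples below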

To make use of this mode, just set the TEST_NGINX_CHECK_LEAK=1 environment variable before running the existing test files, as in

export TEST_NGINX_CHECK_LEAK=1
prove t/a.t

Assuming the t/a.t test file contains the test block example given above, we should get an output similar to the following.

t/a.t .. TEST 1:
LeakTest: [3740 3756 3620 3624 4180 3808 4044 4240 4272 4888 3876 3520 4516
 4368 5216 4796 4420 4508 4068 5336 5220 3888 4196 4544 4100 3696 5028 5080
 4580 3936 5236 4308 5320 4748 5464 4032 5492 4996 4588 4932 4632 6388 5228
 5516 4680 5348 5420 5964 5436 5128 5720 6324 5700 4948 4312 6208 5192 5268
 5600 4144 6556 4248 5648 6612 4044 5408 5120 5120 5740 6048 6412 5636 6488
 5184 6036 5436 5808 4904 4980 6772 5148 7160 6576 6724 5024 6768 7264 5540
 5700 5284 5244 4512 5752 6752 6868 6064 4940 5636 6388 7468]
LeakTest: k=22.6
t/a.t .. ok
All tests successful.
Files=1, Tests=3,  6 wallclock secs ( 0.01 usr  0.01 sys +  0.61 cusr  1.68 csys =  2.31 CPU)
Result: PASS

The special output lines from this test mode have the prefix LeakTest:. The first such line lists all the data points for the memory footprint size in kilobytes (KB), collected every 0.02 seconds. The second line gives the slope (k) of the line that best fits these data points. In this case, k equals 22.6.

The slope of the line can usually serve as an indication of the speed of memory leaking: the larger the slope, the faster the leak. A two-digit slope here is very likely an indication of a memory leak. To be sure, we can plot these data points in a graph using the gnuplot tool.
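For instance, assuming we have saved the LeakTest numbers into a file named mem.dat, one value per line (the file name and the exact command below are just an illustration), a graph like the one below can be produced like this:

gnuplot -e "set terminal png; set output 'mem.png'; plot 'mem.dat' with linespoints title 'nginx worker footprint (KB)'"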

[Figure: plot of the memory footprint data points over time (image: check leak pkg path.png)]

There are quite some fluctuations in the graph. This is due to how the garbage collector normally behaves: it usually allocates page-sized or even larger memory blocks than actually requested for performance reasons, and it delays the release of unused memory blocks (due to the sweep phase of the collector, among other things). Still, it is clear that the memory usage is going up overall.

We can try enforcing a full garbage collection cycle upon the entry of our request handler, like this:

content_by_lua_block {
    collectgarbage()
    package.path = "/path/to/some/lib/?.lua;" .. package.path
    ngx.say("ok")
}

This way we can ensure that there is no memory garbage hanging around after the point where we call the Lua builtin function collectgarbage().

Now the output looks like this:

t/a.t .. TEST 1:
LeakTest: [2464 2464 2360 2464 2232 2520 2380 2536 2440 2320 2300 2464
 2576 2584 2540 2408 2608 2420 2596 2596 2332 2648 2660 2460 2680 2320
 2688 2616 2332 2628 2408 2728 2716 2380 2752 2360 2768 2376 2372 2376
 2732 2800 2808 2816 2464 2396 2668 2688 2848 2672 2412 2416 2536 2420
 2424 2632 2904 2668 2912 2564 2724 2448 2932 2944 2856 2960 2616 2672
 2976 2620 2984 2600 2808 2980 3004 2996 3236 3012 2724 3168 3072 3536
 3260 3412 3028 2700 2480 3188 2808 3536 2640 3056 2764 3052 3440 3308
 3064 2680 2828 3372]
LeakTest: k=7.4
t/a.t .. ok
All tests successful.
Files=1, Tests=3,  6 wallclock secs ( 0.02 usr  0.00 sys +  0.62 cusr  1.75 csys =  2.39 CPU)
Result: PASS

We can see that this time the slope of the best-fitting line is much smaller, but it is still noticeably larger than 0.

The line graph is now much smoother, as expected:

[Figure: plot of the memory footprint data points over time, with collectgarbage() at the start of each request (image: check leak pkg path gc.png)]

And we can see that the line is still going upward relatively steadily over time.

Large fluctuations and variations in the memory footprint may create noise in our data samples and even result in false positives. We already saw how big fluctuations may result in large slopes for the fitted line. It is usually a good idea to enforce full garbage collection cycles frequently to reduce such noise, at least in GC-managed memory. The collectgarbage() function, however, is quite expensive in terms of CPU resources and may hurt the overall performance very badly. Make sure you do not call it often (like on every request) in the "benchmark test mode" introduced above or in production applications.

In reality, this brute-force "check leak" test mode has helped catch quite a lot of real memory leaks in OpenResty’s test suites over the years. Most of those leaks got past the Valgrind test mode since they happened in GC-managed memory or in NGINX’s memory pools.

Note
The NGINX no-pool patch mentioned in the previous section does not help here since all the leaked memory blocks in the pools still get released before the process exits.

Nevertheless, there is one big drawback to this test mode. Unlike Valgrind, it cannot give any detailed information about the locations where the leaks (may) happen. All it reports are data samples and other metrics that verify just the existence of a leak (at least to some extent). We shall see in a later chapter how we can use "memory leak flame graphs" to overcome this limitation, even for leaks and big swings in GC-managed or pool-managed memory.

Mockeagain Mode

Manual Debugging Mode

SystemTap Mode
