Skip to content

Commit bcddef8

Browse files
authored
Merge pull request github#16131 from erik-krogh/cpp-path
C++: Improve the cpp/path-injection qhelp
2 parents e721399 + a51d24c commit bcddef8

File tree

7 files changed

+153
-40
lines changed

7 files changed

+153
-40
lines changed

cpp/ql/src/Security/CWE/CWE-022/TaintedPath.c

-22
This file was deleted.

cpp/ql/src/Security/CWE/CWE-022/TaintedPath.qhelp

+39-17
Original file line numberDiff line numberDiff line change
@@ -3,36 +3,57 @@
33
"qhelp.dtd">
44
<qhelp>
55
<overview>
6-
<p>Accessing paths controlled by users can allow an attacker to access unexpected resources. This
6+
<p>Accessing paths controlled by users can allow an attacker to access unexpected resources. This
77
can result in sensitive information being revealed or deleted, or an attacker being able to influence
88
behavior by modifying unexpected files.</p>
99

10-
<p>Paths that are naively constructed from data controlled by a user may contain unexpected special characters,
11-
such as "..". Such a path may potentially point to any directory on the filesystem.</p>
10+
<p>Paths that are naively constructed from data controlled by a user may be absolute paths, or may contain
11+
unexpected special characters such as "..". Such a path could point anywhere on the file system.</p>
1212

1313
</overview>
1414
<recommendation>
1515

16-
<p>Validate user input before using it to construct a filepath. Ideally, follow these rules:</p>
16+
<p>Validate user input before using it to construct a file path.</p>
1717

18-
<ul>
19-
<li>Do not allow more than a single "." character.</li>
20-
<li>Do not allow directory separators such as "/" or "\" (depending on the filesystem).</li>
21-
<li>Do not rely on simply replacing problematic sequences such as "../". For example, after applying this filter to
22-
".../...//" the resulting string would still be "../".</li>
23-
<li>Ideally use a whitelist of known good patterns.</li>
24-
</ul>
18+
<p>Common validation methods include checking that the normalized path is relative and does not contain
19+
any ".." components, or checking that the path is contained within a safe folder. The method you should use depends
20+
on how the path is used in the application, and whether the path should be a single path component.
21+
</p>
22+
23+
<p>If the path should be a single path component (such as a file name), you can check for the existence
24+
of any path separators ("/" or "\"), or ".." sequences in the input, and reject the input if any are found.
25+
</p>
26+
27+
<p>
28+
Note that removing "../" sequences is <i>not</i> sufficient, since the input could still contain a path separator
29+
followed by "..". For example, the input ".../...//" would still result in the string "../" if only "../" sequences
30+
are removed.
31+
</p>
32+
33+
<p>Finally, the simplest (but most restrictive) option is to use an allow list of safe patterns and make sure that
34+
the user input matches one of these patterns.</p>
2535

2636
</recommendation>
2737
<example>
2838

29-
<p>In this example, a username and file are read from the arguments to main and then used to access a file in the
30-
user's home directory. However, a malicious user could enter a filename which contains special
31-
characters. For example, the string "../../etc/passwd" will result in the code reading the file located at
32-
"/home/[user]/../../etc/passwd", which is the system's password file. This could potentially allow them to
33-
access all the system's passwords.</p>
39+
<p>In this example, a file name is read from a user and then used to access a file.
40+
However, a malicious user could enter a file name anywhere on the file system,
41+
such as "/etc/passwd" or "../../../etc/passwd".</p>
42+
43+
<sample src="examples/TaintedPath.c" />
44+
45+
<p>
46+
If the input should only be a file name, you can check that it doesn't contain any path separators or ".." sequences.
47+
</p>
48+
49+
<sample src="examples/TaintedPathNormalize.c" />
50+
51+
<p>
52+
If the input should be within a specific directory, you can check that the resolved path
53+
is still contained within that directory.
54+
</p>
3455

35-
<sample src="TaintedPath.c" />
56+
<sample src="examples/TaintedPathFolder.c" />
3657

3758
</example>
3859
<references>
@@ -41,6 +62,7 @@ access all the system's passwords.</p>
4162
OWASP:
4263
<a href="https://owasp.org/www-community/attacks/Path_Traversal">Path Traversal</a>.
4364
</li>
65+
<li>Linux man pages: <a href="https://man7.org/linux/man-pages/man3/realpath.3.html">realpath(3)</a>.</li>
4466

4567
</references>
4668
</qhelp>
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
int main(int argc, char** argv) {
2+
char *userAndFile = argv[2];
3+
4+
{
5+
char fileBuffer[PATH_MAX];
6+
snprintf(fileBuffer, sizeof(fileBuffer), "/home/%s", userAndFile);
7+
// BAD: a string from the user is used in a filename
8+
fopen(fileBuffer, "wb+");
9+
}
10+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
#include <stdio.h>
2+
#include <string.h>
3+
4+
int main(int argc, char** argv) {
5+
char *userAndFile = argv[2];
6+
const char *baseDir = "/home/user/public/";
7+
char fullPath[PATH_MAX];
8+
9+
// Attempt to concatenate the base directory and the user-supplied path
10+
snprintf(fullPath, sizeof(fullPath), "%s%s", baseDir, userAndFile);
11+
12+
// Resolve the absolute path, normalizing any ".." or "."
13+
char *resolvedPath = realpath(fullPath, NULL);
14+
if (resolvedPath == NULL) {
15+
perror("Error resolving path");
16+
return 1;
17+
}
18+
19+
// Check if the resolved path starts with the base directory
20+
if (strncmp(baseDir, resolvedPath, strlen(baseDir)) != 0) {
21+
free(resolvedPath);
22+
return 1;
23+
}
24+
25+
// GOOD: Path is within the intended directory
26+
FILE *file = fopen(resolvedPath, "wb+");
27+
free(resolvedPath);
28+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
#include <stdio.h>
2+
#include <string.h>
3+
4+
int main(int argc, char** argv) {
5+
char *fileName = argv[2];
6+
// Check for invalid sequences in the user input
7+
if (strstr(fileName , "..") || strchr(fileName , '/') || strchr(fileName , '\\')) {
8+
printf("Invalid filename.\n");
9+
return 1;
10+
}
11+
12+
char fileBuffer[PATH_MAX];
13+
snprintf(fileBuffer, sizeof(fileBuffer), "/home/user/files/%s", fileName);
14+
// GOOD: We know that the filename is safe and stays within the public folder
15+
FILE *file = fopen(fileBuffer, "wb+");
16+
}

cpp/ql/test/query-tests/Security/CWE/CWE-022/semmle/tests/TaintedPath.expected

+10
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,8 @@ edges
22
| test.c:8:27:8:30 | **argv | test.c:9:23:9:29 | *access to array | provenance | |
33
| test.c:8:27:8:30 | **argv | test.c:31:22:31:28 | *access to array | provenance | |
44
| test.c:8:27:8:30 | **argv | test.c:69:14:69:20 | *access to array | provenance | |
5+
| test.c:8:27:8:30 | **argv | test.c:80:25:80:31 | *access to array | provenance | |
6+
| test.c:8:27:8:30 | **argv | test.c:88:22:88:28 | *access to array | provenance | |
57
| test.c:9:23:9:29 | *access to array | test.c:17:11:17:18 | *fileName | provenance | TaintFunction |
68
| test.c:31:22:31:28 | *access to array | test.c:32:11:32:18 | *fileName | provenance | |
79
| test.c:37:17:37:24 | scanf output argument | test.c:38:11:38:18 | *fileName | provenance | |
@@ -11,6 +13,8 @@ edges
1113
| test.c:54:21:54:26 | *call to getenv | test.c:55:11:55:16 | *buffer | provenance | TaintFunction |
1214
| test.c:74:13:74:18 | read output argument | test.c:76:11:76:16 | *buffer | provenance | |
1315
| test.c:75:13:75:18 | read output argument | test.c:76:11:76:16 | *buffer | provenance | |
16+
| test.c:80:25:80:31 | *access to array | test.c:84:11:84:20 | *fileBuffer | provenance | TaintFunction |
17+
| test.c:88:22:88:28 | *access to array | test.c:98:24:98:33 | *fileBuffer | provenance | TaintFunction |
1418
nodes
1519
| test.c:8:27:8:30 | **argv | semmle.label | **argv |
1620
| test.c:9:23:9:29 | *access to array | semmle.label | *access to array |
@@ -30,6 +34,10 @@ nodes
3034
| test.c:74:13:74:18 | read output argument | semmle.label | read output argument |
3135
| test.c:75:13:75:18 | read output argument | semmle.label | read output argument |
3236
| test.c:76:11:76:16 | *buffer | semmle.label | *buffer |
37+
| test.c:80:25:80:31 | *access to array | semmle.label | *access to array |
38+
| test.c:84:11:84:20 | *fileBuffer | semmle.label | *fileBuffer |
39+
| test.c:88:22:88:28 | *access to array | semmle.label | *access to array |
40+
| test.c:98:24:98:33 | *fileBuffer | semmle.label | *fileBuffer |
3341
subpaths
3442
#select
3543
| test.c:17:11:17:18 | fileName | test.c:8:27:8:30 | **argv | test.c:17:11:17:18 | *fileName | This argument to a file access function is derived from $@ and then passed to fopen(filename). | test.c:8:27:8:30 | **argv | user input (a command-line argument) |
@@ -41,3 +49,5 @@ subpaths
4149
| test.c:69:14:69:20 | access to array | test.c:8:27:8:30 | **argv | test.c:69:14:69:20 | *access to array | This argument to a file access function is derived from $@ and then passed to readFile(fileName), which calls fopen(filename). | test.c:8:27:8:30 | **argv | user input (a command-line argument) |
4250
| test.c:76:11:76:16 | buffer | test.c:74:13:74:18 | read output argument | test.c:76:11:76:16 | *buffer | This argument to a file access function is derived from $@ and then passed to fopen(filename). | test.c:74:13:74:18 | read output argument | user input (buffer read by read) |
4351
| test.c:76:11:76:16 | buffer | test.c:75:13:75:18 | read output argument | test.c:76:11:76:16 | *buffer | This argument to a file access function is derived from $@ and then passed to fopen(filename). | test.c:75:13:75:18 | read output argument | user input (buffer read by read) |
52+
| test.c:84:11:84:20 | fileBuffer | test.c:8:27:8:30 | **argv | test.c:84:11:84:20 | *fileBuffer | This argument to a file access function is derived from $@ and then passed to fopen(filename). | test.c:8:27:8:30 | **argv | user input (a command-line argument) |
53+
| test.c:98:24:98:33 | fileBuffer | test.c:8:27:8:30 | **argv | test.c:98:24:98:33 | *fileBuffer | This argument to a file access function is derived from $@ and then passed to fopen(filename). | test.c:8:27:8:30 | **argv | user input (a command-line argument) |

cpp/ql/test/query-tests/Security/CWE/CWE-022/semmle/tests/test.c

+50-1
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
// Associated with CWE-022: Improper Limitation of a Pathname to a Restricted Directory. http://cwe.mitre.org/data/definitions/22.html
33

44
#include "stdlib.h"
5-
5+
#define PATH_MAX 4096
66
///// Test code /////
77

88
int main(int argc, char** argv) {
@@ -75,6 +75,55 @@ int main(int argc, char** argv) {
7575
read(0, buffer, 1024);
7676
fopen(buffer, "wb+"); // BAD [duplicated with both sources]
7777
}
78+
79+
{
80+
char *userAndFile = argv[2];
81+
char fileBuffer[PATH_MAX];
82+
snprintf(fileBuffer, sizeof(fileBuffer), "/home/%s", userAndFile);
83+
// BAD: a string from the user is used in a filename
84+
fopen(fileBuffer, "wb+");
85+
}
86+
87+
{
88+
char *fileName = argv[2];
89+
// Check for invalid sequences in the user input
90+
if (strstr(fileName , "..") || strchr(fileName , '/') || strchr(fileName , '\\')) {
91+
printf("Invalid filename.\n");
92+
return 1;
93+
}
94+
95+
char fileBuffer[PATH_MAX];
96+
snprintf(fileBuffer, sizeof(fileBuffer), "/home/user/files/%s", fileName);
97+
// GOOD: We know that the filename is safe and stays within the public folder. But we currently get an FP here.
98+
FILE *file = fopen(fileBuffer, "wb+");
99+
}
100+
101+
{
102+
char *userAndFile = argv[2];
103+
const char *baseDir = "/home/user/public/";
104+
char fullPath[PATH_MAX];
105+
106+
// Attempt to concatenate the base directory and the user-supplied path
107+
snprintf(fullPath, sizeof(fullPath), "%s%s", baseDir, userAndFile);
108+
109+
// Resolve the absolute path, normalizing any ".." or "."
110+
char *resolvedPath = realpath(fullPath, 0); // <- we're using `NULL` in the example, but 0 here to get it to compile. Same for next line.
111+
if (resolvedPath == 0) {
112+
perror("Error resolving path");
113+
return 1;
114+
}
115+
116+
// Check if the resolved path starts with the base directory
117+
if (strncmp(baseDir, resolvedPath, strlen(baseDir)) != 0) {
118+
free(resolvedPath);
119+
return 1;
120+
}
121+
122+
// GOOD: Path is within the intended directory
123+
FILE *file = fopen(resolvedPath, "wb+");
124+
free(resolvedPath);
125+
126+
}
78127
}
79128

80129
void readFile(char *fileName) {

0 commit comments

Comments
 (0)