Commit 322d6c1

add first post
1 parent d5c693e commit 322d6c1

39 files changed: +318 -36 lines changed
Binary file not shown.

Diff for: _posts/2018-08-29-sample-post.md

-34
This file was deleted.
@@ -0,0 +1,316 @@
---
layout: post
title: Hooking Internal Functions of ELF Binaries with LD_PRELOAD via Redirection to the PLT
tags: [hooking, reverse-engineering, instrumentation, debugging, ELF, LD_PRELOAD, redirect-to-PLT, AMD64, x86-64, PIE]
author-id: julian
---

It is well known that `LD_PRELOAD` can be used to override shared library
functions loaded at runtime by the dynamic linker [1]. What is less well known is
that *internal functions*, i.e. functions whose code lies within the `.text` section
of the binary, can also be hooked indirectly using a simple trick that relies on
`LD_PRELOAD`, even though these functions are not imported from dynamically linked
libraries (shared objects).

### Overview

The following will be discussed:
- a description of redirect-to-PLT
- use cases
- why redirect-to-PLT is not GOT/PLT hooking or infection
- a demonstration of the technique with a toy program

Prerequisites:
- basic familiarity with the following:
    - the ELF format
    - dynamic linking in Linux
    - `LD_PRELOAD`: what it is and how to use it

Tools:
- Keystone Engine
- Python 3
- GCC

# Introduction

When a function in the `.text` section is called, the instruction pointer jumps
to the address of the first instruction of that function. To hook such a function,
the call can instead be redirected to the entry in the Procedure Linkage Table
(PLT) of a shared library function, which will then be called in its place. This
shared library function can in turn be overridden via `LD_PRELOAD` to inject a
custom shared library function containing the code to be executed in place of the
hooked internal function. Crucially, even though the overridden library function
may also be called elsewhere in the program, under different conditions from the
hooked internal function, it is possible to control when this code executes.

Put simply, this technique extends the `LD_PRELOAD` technique so that it can be
used to override internal functions as well: the flow of execution detours from
code resident in the binary's `.text` section to code imported from an injected
shared library. It consists of a redirect and an override:

1. First, the call to the target internal function is replaced via patching
   with a call to a shared library function in the PLT.
2. Next, that particular shared library function is overridden with code from
   a custom shared library, and that shared library is loaded via `LD_PRELOAD`.

### Use Cases

Redirect-to-PLT may be useful when there is a need to insert debugging
instrumentation into internal functions, or to override an internal function,
and adding code to the binary itself is not desirable.
- Code may be added to a binary by adding a new segment or via segment padding
  infection techniques [3][4], but this is quite cumbersome for a few reasons:
    1. Adding code this way usually requires re-engineering the binary file
       to some extent: extending or adding segments, changing flags, updating
       information in the ELF header and the program header table to reflect
       changes made to the binary image, and so forth.
    2. Calling shared library functions from code added to the binary is rather
       complex, so system calls are typically made directly. This often
       necessitates writing code in assembly rather than C, or using both
       together.

  As a result of the restrictions imposed by this approach, it is not very
  flexible, and writing code this way tends to be comparatively slow and
  error-prone.

- However, if we want to analyze the behavior of an internal function (via debug
  `printf()` statements, for example) using the redirect-to-PLT trick, we can
  recreate the logic of the function in a shared library, add the desired
  modifications, patch the code to call that library instead of the chosen
  internal function, and then inject this shared library with `LD_PRELOAD`. The
  instrumented code in this shared library will then be executed instead of the
  original internal function code.
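
As a rough illustration of "recreate the logic, then add modifications", the sketch below rebuilds the body of the `detour_me()` function from the toy program later in this post and adds debug output. The names `detour_me_instrumented` and `detour_me_calls` are hypothetical; in an injected library this body would live inside the overridden library function rather than under its own name.

```c
#include <stdio.h>

/* Instrumentation state: how many times the replacement body has run. */
unsigned long detour_me_calls = 0;

/* Recreates the logic of the toy detour_me() and adds debug output.
   The function name is hypothetical; in the injected library this body
   would be placed inside the overridden shared library function. */
void detour_me_instrumented(void)
{
    detour_me_calls++;
    fprintf(stderr, "[debug] detour_me() entered, call #%lu\n", detour_me_calls);

    /* Original logic, copied from the internal function. */
    printf("Can ");
    printf("you ");
    printf("detour ");
    printf("this ");
    printf("function?\n");

    fprintf(stderr, "[debug] detour_me() returning\n");
}
```

Debug output goes to `stderr` here so that it does not interleave with whatever the program writes to `stdout`; that is a design choice of this sketch, not a requirement of the technique.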

### Hooking with redirect-to-PLT vs GOT/PLT hooking

It should be noted that even though this method relies on the PLT for redirection,
it is not related to GOT/PLT hooking [2], in which the GOT or PLT is overwritten
in order to override imported shared library functions in a similar vein to
`LD_PRELOAD`. This *redirect-to-PLT* trick is a hack to override *internal
functions* specifically; no changes are made to the GOT or the PLT.


# Overriding an Internal Function in a Toy Program

For the following program (`example_program1`), we want to hook the
`detour_me()` function:

```c
/*
   detour the detour_me(void) function via redirect-to-PLT to print
   a string of our choosing. For now, let us choose "I <3 LD_PRELOAD"
*/

#include <stdio.h>

void detour_me(void) {
    printf("Can ");
    printf("you ");
    printf("detour ");
    printf("this ");
    printf("function?\n");
}

int main(void) {
    printf("In main(), before detour_me()\n");
    detour_me();
    printf("In main(), after detour_me()\n");
}
```

The approach is as follows:
1. Select a suitable shared library function to override.
2. Patch the `CALL` to `detour_me()` so that it points to the PLT entry of the
   chosen shared library function.
3. Design the custom shared library to inject.
4. Use `LD_PRELOAD` to inject the shared library. In this case the hook will print
   "I <3 LD_PRELOAD".

**Before beginning, a copy of the original binary should be made. Here the copy
will be called `copy_to_patch`. Subsequent steps will involve this copy, not
the original binary.**

To select a suitable shared library function to override, we can examine which
shared library functions have entries in the PLT. One way of doing this is to use
`grep` to search through the disassembly of the binary output by `objdump`:

```shell
$ objdump -dj .text copy_to_patch | grep plt
 65e:   e8 0d ff ff ff          callq  570 <__cxa_finalize@plt>
 69a:   e8 c1 fe ff ff          callq  560 <printf@plt>
 6ab:   e8 b0 fe ff ff          callq  560 <printf@plt>
 6bc:   e8 9f fe ff ff          callq  560 <printf@plt>
 6cd:   e8 8e fe ff ff          callq  560 <printf@plt>
 6d9:   e8 72 fe ff ff          callq  550 <puts@plt>
 6ec:   e8 5f fe ff ff          callq  550 <puts@plt>
 6fd:   e8 4e fe ff ff          callq  550 <puts@plt>
```
Since this example program is trivial, we could override any of these, but here
`__cxa_finalize()` will be chosen since it illustrates the flexibility of this
approach and will also introduce an interesting challenge associated with using
this technique.

Next, the call to `detour_me()` needs to be patched to point to the entry in the
PLT for `__cxa_finalize()`. From the output above, it can be seen that the PLT
entry for `__cxa_finalize()` is at 0x570. `objdump` reports virtual addresses,
which this post uses as file offsets; that is valid only when the segment
containing `.plt` and `.text` is loaded at a virtual address equal to its file
offset, which can be checked with `readelf -l`. According to the disassembly
of `main()`, `detour_me()` is called at file offset 0x6f1:

```shell
00000000000006e1 <main>:
 6e1:   55                      push   %rbp
 6e2:   48 89 e5                mov    %rsp,%rbp
 6e5:   48 8d 3d ca 00 00 00    lea    0xca(%rip),%rdi        # 7b6 <_IO_stdin_used+0x26>
 6ec:   e8 5f fe ff ff          callq  550 <puts@plt>
 6f1:   e8 94 ff ff ff          callq  68a <detour_me>        <----------------
 6f6:   48 8d 3d d7 00 00 00    lea    0xd7(%rip),%rdi        # 7d4 <_IO_stdin_used+0x44>
 6fd:   e8 4e fe ff ff          callq  550 <puts@plt>
 702:   b8 00 00 00 00          mov    $0x0,%eax
 707:   5d                      pop    %rbp
 708:   c3                      retq
 709:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
```

Key pieces of information for patching:

- `main()` calls `detour_me()` at offset 0x6f1
- the PLT entry for `__cxa_finalize()` is at 0x570
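
The arithmetic behind this patch can be sketched as follows. This is an illustration in C, not the Keystone-based Python script linked in this post; the function names `decode_call_target`, `encode_call` and `patch_call` are hypothetical. A near `CALL` (opcode `0xE8`) takes a 32-bit little-endian displacement relative to the address of the *next* instruction, so the new operand is `0x570 - (0x6f1 + 5)`.

```c
#include <stdint.h>
#include <stdio.h>

/* Target of a 5-byte near CALL (opcode 0xE8) located at call_addr.
   The rel32 operand is relative to the address of the next instruction. */
uint64_t decode_call_target(uint64_t call_addr, const uint8_t insn[5])
{
    uint32_t rel = (uint32_t)insn[1] | (uint32_t)insn[2] << 8 |
                   (uint32_t)insn[3] << 16 | (uint32_t)insn[4] << 24;
    return call_addr + 5 + (int64_t)(int32_t)rel;
}

/* Encode a near CALL placed at call_addr so that it lands on target_addr. */
void encode_call(uint64_t call_addr, uint64_t target_addr, uint8_t out[5])
{
    uint32_t rel = (uint32_t)(target_addr - (call_addr + 5));
    out[0] = 0xE8;
    out[1] = (uint8_t)(rel);        /* little-endian displacement */
    out[2] = (uint8_t)(rel >> 8);
    out[3] = (uint8_t)(rel >> 16);
    out[4] = (uint8_t)(rel >> 24);
}

/* Overwrite the 5 bytes at file offset `off` in `path`; returns 0 on success.
   Assumption: file offset == virtual address for the patched instruction. */
int patch_call(const char *path, long off, uint64_t target_addr)
{
    uint8_t insn[5];
    FILE *f = fopen(path, "r+b");
    if (!f)
        return -1;
    encode_call((uint64_t)off, target_addr, insn);
    int ok = fseek(f, off, SEEK_SET) == 0 &&
             fwrite(insn, 1, sizeof insn, f) == sizeof insn;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With the values above, a call such as `patch_call("copy_to_patch", 0x6f1, 0x570)` would write the bytes `e8 7a fe ff ff` at offset 0x6f1.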

Python script to patch the copy of the example program:

<script src="https://gist.github.com/BinaryResearch/e70d29e2d3e36f9967fe7d0c64cb1841.js"></script>

After the patch is applied, `__cxa_finalize()` is called from `main()` instead of `detour_me()`:

```shell
00000000000006e1 <main>:
 6e1:   55                      push   %rbp
 6e2:   48 89 e5                mov    %rsp,%rbp
 6e5:   48 8d 3d ca 00 00 00    lea    0xca(%rip),%rdi        # 7b6 <_IO_stdin_used+0x26>
 6ec:   e8 5f fe ff ff          callq  550 <puts@plt>
 6f1:   e8 7a fe ff ff          callq  570 <__cxa_finalize@plt>        <--------------
 6f6:   48 8d 3d d7 00 00 00    lea    0xd7(%rip),%rdi        # 7d4 <_IO_stdin_used+0x44>
 6fd:   e8 4e fe ff ff          callq  550 <puts@plt>
 702:   b8 00 00 00 00          mov    $0x0,%eax
 707:   5d                      pop    %rbp
 708:   c3                      retq
 709:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
```

Now that the binary has been patched, it is time to write a shared library to
inject. Fortunately, in this case the logic of the program is very simple and
the library function chosen to be overridden can simply be substituted. We need
not concern ourselves with wrapping it.

Here is the code of the custom shared library to inject:

<script src="https://gist.github.com/BinaryResearch/14348015a7e62bd6619f68e04ef172ed.js"></script>
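
For readers without access to the embedded gist, a minimal sketch of what such a library could look like is given here. It is an assumption based on the behavior described in this post, not a copy of the author's file; the `detour_message` variable is hypothetical. It substitutes `__cxa_finalize()` outright and does not forward to the real implementation.

```c
#include <stdio.h>

/* Text printed in place of the hooked internal function's output. */
const char *const detour_message = "I <3 LD_PRELOAD";

/* Same name as the libc function, so that a library loaded via LD_PRELOAD
   resolves calls through the __cxa_finalize PLT entry to this definition.
   The argument is ignored in this simplest version. */
void __cxa_finalize(void *d)
{
    (void)d;
    puts(detour_message);
}
```

Because this definition runs on every call that resolves to `__cxa_finalize()`, not only the patched one in `main()`, the message can appear more than once per run.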

This is compiled with:

```shell
$ gcc -shared -fPIC -o override_cxa_finalize.so override_cxa_finalize.c
```

Now we are ready to inject the code!

```shell
$ LD_PRELOAD=$PWD/override_cxa_finalize.so ./copy_to_patch
In main(), before detour_me()
I <3 LD_PRELOAD
In main(), after detour_me()
I <3 LD_PRELOAD
I <3 LD_PRELOAD
```

It works, but there is a problem: `__cxa_finalize()` is called 3 times, whereas
in the original binary the function we want to hook, `detour_me()`, is called
only once. How can we ensure that the detour for `detour_me()` is executed **only**
when `__cxa_finalize()` is called from `main()`?

This is one of the main challenges associated with using a library function to
hook an internal function: depending on which library function is chosen, it may
be called an arbitrary number of times and under a variety of circumstances that
may be hard or impossible to predict or account for.

In this case, one possible solution is to take advantage of the fact that,
according to its prototype, `__cxa_finalize()` takes an argument, and that the
value of this argument varies across calls. The code overriding `detour_me()`
can be set to execute only for a particular value of the argument.

<script src="https://gist.github.com/BinaryResearch/4ebb6c2dca3f725ba05414e30c44598e.js"></script>
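
A sketch of the argument-based trigger, written as an assumption about how such a library might look rather than a copy of the gist: the trigger value `0x1` is simply the argument observed for the patched call in the run shown in this post. Since the patched `CALL` does not set up an argument, that value is whatever happened to be in the first argument register at that point, so it should be re-checked for any other binary or build. `TRIGGER_ARG` and `is_hooked_call` are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: argument value seen when the patched call in main() runs. */
#define TRIGGER_ARG ((uintptr_t)0x1)

/* Returns nonzero when the call looks like the redirected detour_me() call. */
int is_hooked_call(void *d)
{
    return (uintptr_t)d == TRIGGER_ARG;
}

void __cxa_finalize(void *d)
{
    printf("Argument to __cxa_finalize(): %p\n", d);
    if (is_hooked_call(d))
        puts("I <3 LD_PRELOAD");   /* code standing in for detour_me() */
}
```

Keeping the decision in a separate predicate makes it easy to swap in a different trigger condition without touching the override itself.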

This produces the desired behavior:

```shell
$ LD_PRELOAD=$PWD/override_cxa_finalize_A.so ./copy_to_patch
In main(), before detour_me()
Argument to __cxa_finalize(): 0x1
I <3 LD_PRELOAD
In main(), after detour_me()
Argument to __cxa_finalize(): 0x556fdf12d008
Argument to __cxa_finalize(): 0x7f22c14cb028
```

Another option is to count the number of times `__cxa_finalize()` is called, so
that the "I <3 LD_PRELOAD" message is printed only when `detour_me()` is being
hooked. Aside from the very first call to `__cxa_finalize()`, we do not want our
code for `detour_me()` to execute. Therefore, if the number of times
`__cxa_finalize()` has been called can be checked *within* `__cxa_finalize()`,
the code overriding `detour_me()` can be made to execute *only* upon the first
call and not otherwise.

This can be accomplished by using `setenv()` and `getenv()` within the injected
shared library to create, update and read an environment variable that keeps
track of the number of times `__cxa_finalize()` is called during program runtime:

<script src="https://gist.github.com/BinaryResearch/6721ab6e867e8837752000a64fa23dce.js"></script>
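
The counting approach might look roughly like the sketch below. It is an assumption about the shape of such a library, not the gist's contents; the variable name `CXA_FINALIZE_CALLS` and the helper `bump_call_count` are hypothetical. It also assumes the variable is not already set in the environment the program is launched from.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name of the environment variable holding the call count. */
#define COUNTER_VAR "CXA_FINALIZE_CALLS"

/* Read the current count from the environment, add one, store it back,
   and return the new value. A missing variable counts as zero. */
int bump_call_count(void)
{
    const char *cur = getenv(COUNTER_VAR);
    int n = (cur ? atoi(cur) : 0) + 1;
    char buf[16];

    snprintf(buf, sizeof buf, "%d", n);
    setenv(COUNTER_VAR, buf, 1);   /* 1 = overwrite any existing value */
    return n;
}

void __cxa_finalize(void *d)
{
    (void)d;
    int n = bump_call_count();

    printf("__cxa_finalize() called %d time%s!\n", n, n == 1 ? "" : "s");
    if (n == 1)
        puts("I <3 LD_PRELOAD");   /* only the first call is the redirected one */
}
```

A `static` counter inside the library would serve the same purpose within a single process; the environment variable is used here only because that is the mechanism the post describes.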

And inject the new library:
```shell
$ LD_PRELOAD=$PWD/override_cxa_finalize_B.so ./copy_to_patch
In main(), before detour_me()
__cxa_finalize() called 1 time!
I <3 LD_PRELOAD
In main(), after detour_me()
__cxa_finalize() called 2 times!
__cxa_finalize() called 3 times!
```

Once again, the code for `detour_me()` in the injected library is executed
only when `__cxa_finalize()` is called in `main()` in place of `detour_me()`.


# Conclusion

- By patching a call to an internal function so that it targets the PLT entry of
  a shared library function, that shared library function will be called instead
  of the internal function. The internal function is thus hooked by a shared
  library function.
- The shared library function that hooks the internal function can be overridden
  with a custom library via `LD_PRELOAD`.
- Since execution detours to the shared library function, there are few constraints
  on what can be executed instead of the code of the internal function. For example,
  unlike when inserting code into the binary itself, library calls can be made
  easily and space is not a constraint. There is no need to use code caves, look
  for `00` padding, extend segments, update variable relocations manually, etc.
- However, ensuring that the code overriding the internal function is executed
  only when that internal function is hooked by the shared library function may
  require coding triggers in the custom shared library, depending on which library
  function was chosen for the redirect, since the conditions under which it is
  called are specific to the program and its runtime.

In this post, a toy example was used to introduce this technique. In the next part,
it will be demonstrated how to use redirect-to-PLT to insert debugging
instrumentation into the internal functions of crackme programs.


### Links and References

1. [Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs](https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker-tricks-using-ld_preload-to-cheat-inject-features-and-investigate-programs/)
2. [SHARED LIBRARY CALL REDIRECTION VIA ELF PLT INFECTION](http://phrack.org/issues/56/7.html)
3. [Infecting the plt/got](https://lief.quarkslab.com/doc/latest/tutorials/05_elf_infect_plt_got.html)
4. [UNIX VIRUSES](https://www.win.tue.nl/~aeb/linux/hh/virus/unix-viruses.txt)

Diff for: _site/feed.xml

+1 -1
@@ -7,7 +7,7 @@
 <atom:link href="http://localhost:4000/feed.xml" rel="self" type="application/rss+xml"/>
 <link>http://localhost:4000/</link>
 <description>binary analysis, reverse engineering, security research, machine learning</description>
-<pubDate>Thu, 29 Aug 2019 02:40:06 +0800</pubDate>
+<pubDate>Mon, 02 Sep 2019 20:22:13 +0800</pubDate>
 <webfeeds:icon>http://localhost:4000/assets/img/julian-avatar.jpg</webfeeds:icon>

 <item>

Diff for: _site/tags/index.html

+1 -1
@@ -218,7 +218,7 @@ <h5 class="tag-title">
 </a>

 <div class="meta">
-August 29, 2019
+September 2, 2019
 </div>
 </h5>
