Buffer Overflow Vulnerability
Buffer
A buffer is a temporary storage area, usually in physical memory (RAM), used to hold data.


Consider the program shown in the left image, where a character buffer of length 5 is defined. Within the much larger memory available to the program, a small region of 5 bytes is assigned to the buffer, as shown in the image on the right.
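As a minimal sketch of such a declaration (the names buffer and buffer_size are assumptions, since the original program only appears as an image), the following C function reports how many bytes are reserved for a 5-character buffer:

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of bytes reserved for a
   character buffer of length 5. */
size_t buffer_size(void) {
    char buffer[5];          /* 5 contiguous bytes */
    return sizeof buffer;    /* sizeof(char) is 1 by definition, so this is 5 */
}
```

Note that a buffer like this can hold a C string of at most 4 characters, because the terminating null byte needs the fifth slot.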
Buffer Overflow
A buffer overflow occurs when more data is written to a buffer than it can hold, so that the excess overwrites adjacent memory addresses.
Demo (controlling local variables)
Let’s take an example of a basic authentication app that asks for a password and prints Authenticated! if the password is correct.
Without really knowing how the app works, let’s enter a random password.

It says Authentication Declined! since the password wasn’t correct. To test for a buffer overflow, we need to enter a much longer string of random data.

You must be wondering why it got authenticated and why there is a Segmentation fault. Let’s see a more detailed version of the app.

As you can see, there are three variables: auth, sys_pass, and usr_pass.
The auth variable determines whether the user is authenticated, depending on its value (initially 0). The usr_pass variable stores the password that the user enters, and the sys_pass variable holds the correct password.
The app works as follows: if usr_pass is equal to sys_pass, then auth becomes 1. If auth is not 0, the user is authenticated.
You can also see how the variables are stored in memory. The addresses are shown in hexadecimal, and the addresses of consecutive variables differ by 1 in the second-to-last digit, that is, by 0x10 = 16 bytes. Therefore, usr_pass and sys_pass are buffers of length 16.
To test for buffer overflow, a long password is entered as shown.

As you can see, the password entered into the usr_pass variable overflows into the sys_pass variable and then into the auth variable. Because auth is no longer 0, the app reports the user as authenticated.
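The effect seen in the demo can be reproduced in a well-defined way with a small model. The sketch below is an assumption-laden simulation, not the app itself: it places the three variables in a struct (the names login_vars and simulate_overflow are hypothetical) so that the unchecked copy stays inside one object, whereas the real layout on the stack is chosen by the compiler.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the demo's three variables laid out back to back.
   A struct is used so that the copy below stays inside one object and the
   behavior is well defined; real stack layout is chosen by the compiler. */
struct login_vars {
    char usr_pass[16];
    char sys_pass[16];
    int32_t auth;
};

/* Copies `len` bytes of input over the start of the model without checking
   it against sizeof usr_pass, and returns the resulting auth value.
   The copy is clamped to the size of the whole model. */
int32_t simulate_overflow(const char *input, size_t len) {
    struct login_vars v;
    memset(&v, 0, sizeof v);
    memcpy(v.sys_pass, "Secret", sizeof "Secret");

    if (len > sizeof v) {
        len = sizeof v;
    }
    memcpy(&v, input, len);   /* no check against sizeof v.usr_pass */
    return v.auth;
}
```

With an input of 16 bytes or fewer, auth stays 0. Once the input is long enough to run past both 16-byte buffers, its trailing bytes land in auth and make it non-zero.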
Note: C functions like strcpy() and strcat() do not check the size of the destination buffer and will write past its end if the source is too long, which is precisely what a buffer overflow is. strcmp() does not write anything, but it reads until it finds a terminating null byte, so it can read past the end of a buffer that is not null-terminated.
Refer to the code below for a better understanding.
#include <stdio.h>
#include <string.h>

int main(void) {
    int auth = 0;                    /* non-zero means authenticated */
    char sys_pass[16] = "Secret";    /* the correct password */
    char usr_pass[16];               /* the password the user enters */

    printf("Enter password: ");
    scanf("%s", usr_pass);           /* no length limit: this is the bug */

    if (strcmp(sys_pass, usr_pass) == 0) {
        auth = 1;
    }

    printf("usr_pass: %s\n", usr_pass);
    printf("sys_pass: %s\n", sys_pass);
    printf("auth: %d\n", auth);
    printf("sys_pass addr: %p\n", (void *)sys_pass);
    printf("auth addr: %p\n", (void *)&auth);

    if (auth) {
        printf("Authenticated!\n");
    } else {
        printf("Authentication declined!\n");
    }
    return 0;
}
Note: This is a simplified example meant only for understanding. You are unlikely to see exactly this situation in real life.
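One way to close the hole in the listing above is to give the %s conversion a field width. The sketch below uses sscanf on an in-memory line so that it can be tried without a terminal; the helper name parse_password is an assumption and not part of the original app.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the first whitespace-delimited word of
   `line` into `out`, which must have room for 16 bytes.
   Returns 1 if a word was read, 0 otherwise. */
int parse_password(const char *line, char out[16]) {
    out[0] = '\0';
    /* %15s stores at most 15 characters plus the terminating null byte. */
    return sscanf(line, "%15s", out) == 1;
}
```

An over-long input is simply truncated to 15 characters here, so it can no longer reach the neighboring variables. Whether to truncate or reject such input is a design choice for the application.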
Let’s dive a little deeper into the concepts now.
Division of Memory for a Running Process

Source: Techno Trick.
This is what the memory assigned to a process looks like. It is divided into sections such as the stack, the heap, and uninitialized data, each used for a different purpose.
You may read more about the memory layout here: Memory layout of a process.
This blog focuses on buffer overflow in the stack, so let’s look at that.
- Stack: A LIFO (last in, first out) data structure that computers use extensively, for example in memory management.
- The CPU has a number of registers, but we will only concern ourselves with EIP, EBP, and ESP.
- EBP: The base pointer. It points to the base of the current stack frame.
- ESP: The stack pointer. It points to the top of the stack.
- EIP: The instruction pointer. It contains the address of the next instruction to be executed.

Stack Layout

The above image shows what a stack looks like. It might look intimidating, but trust me, it isn’t.
Let’s see some important points related to the stack:
- A stack is filled from higher memory addresses to lower memory addresses.
- In a stack, all the variables are accessed relative to the EBP.
- In a program, every function call has its own stack frame.
- Everything is referenced from the EBP register.
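The first point can be observed, though not proven, with a short experiment. The sketch below compares the address of a local variable in a caller with one in a function it calls; the function names are hypothetical, and the result depends on the platform and compiler because the C standard does not define a stack direction.

```c
#include <stdint.h>

/* Compares the address of a local in this (inner) call with an address
   taken in the caller. Returns -1 if the inner local is lower, +1 otherwise. */
static int compare_with_caller(uintptr_t caller_addr) {
    volatile char marker = 0;
    return (uintptr_t)&marker < caller_addr ? -1 : 1;
}

/* Returns -1 if the stack appears to grow toward lower addresses,
   +1 if it appears to grow toward higher addresses.
   The call goes through a volatile function pointer to discourage inlining. */
int stack_growth_direction(void) {
    volatile char marker = 0;
    int (*volatile call)(uintptr_t) = compare_with_caller;
    return call((uintptr_t)&marker);
}
```

On a typical x86 build the function is expected to return -1, matching the "higher to lower" rule, but optimizations or instrumentation can change what is observed.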

Source: IT & Security Stuff.
Function parameters are stored above the EBP, that is, at higher addresses.
For example:
void foo(int a, int b, int c){
//Function body
}
Here, a, b, and c are the function parameters stored above the EBP.
- All the local variables of a function are stored below the EBP.
- The Old %ebp is the saved value of the EBP of the previous (calling) function. Since execution has to return to the calling function after the current function finishes, both the old EBP and the old EIP (the return address) need to be stored.
- The ESP register stores the address of the top of the stack. Because the stack grows toward lower addresses, this is the lowest address in use, which is why it is often drawn at the bottom of stack diagrams.
For example:
void foo(int a, int b, int c){
int x;
int y;
int z;
}
Here, x, y, and z are local variables of the function and are stored below the EBP.
Exploiting Buffer Overflow
It’s time to get into buffer overflow exploitation using the stack.
Before that, let’s try to understand how a stack is built for any function.
Let’s look at an example below:

The stack on the right is that of the function foo, as seen in the left image.
- Since a, b, and c are parameters passed to the function, they are stored above the EBP. Also, because the stack is filled from higher to lower memory and parameters are pushed from right to left, c is written first in memory, followed by b and a.
- x, y, and z are the local variables stored below the EBP.
- It is also required to store the Old EIP and Old EBP of the function main in the stack, to know where to return to after the function executes.
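The layout described above can be written down as a struct and checked with offsetof. This is only a model under stated assumptions: 32-bit x86, 4-byte values, parameters pushed right to left, and locals placed in the order z, y, x from the lowest address (real compilers are free to reorder or pad locals). The names foo_frame and EBP_OFFSET are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of foo's frame on 32-bit x86, listed from the lowest
   address to the highest. Every field is 4 bytes, so there is no padding. */
struct foo_frame {
    uint32_t z;        /* local variables: below the saved EBP */
    uint32_t y;
    uint32_t x;
    uint32_t old_ebp;  /* EBP points here while foo runs */
    uint32_t old_eip;  /* return address into main */
    uint32_t a;        /* parameters: above the saved EBP */
    uint32_t b;
    uint32_t c;        /* pushed first, so it has the highest address */
};

/* Offset of a field relative to where EBP points, in bytes. */
#define EBP_OFFSET(field) \
    ((long)offsetof(struct foo_frame, field) - \
     (long)offsetof(struct foo_frame, old_ebp))
```

In this model EBP_OFFSET(a) is 8 and EBP_OFFSET(x) is -4, which corresponds to the familiar [ebp+8] and [ebp-4] operands seen in 32-bit disassembly.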
Now, as shown in the previous demo, you could see how a buffer overflow took place using the local variables.

Source: Security Sift.
Imagine a situation where you overflow the variables x, y, and z in such a way that the old EIP is overwritten with the address of the memory where malicious code has been placed.
Refer to the image below for a better understanding.

Assume a buffer with a length of 500 is defined in a function. Now it is overflowed in such a way that it contains some random data, followed by the shellcode (malicious code), and then the return address, which points to the shellcode.
So, when the function returns, execution jumps to the address stored in the return address slot, and this is how our shellcode gets executed.
This is pretty much how a stack buffer overflow is exploited.
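To make the mechanics concrete without touching a real stack, here is a simulated frame in which an unchecked copy replaces the saved return address. Everything in it is an assumption for illustration: the struct sim_frame, the 8-byte buffer, and the placeholder addresses. It contains no shellcode and only shows which bytes of the input end up in the return address slot.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified frame: a small buffer followed by the saved
   EBP and the saved return address (32-bit values, no padding). */
struct sim_frame {
    char     buf[8];
    uint32_t old_ebp;
    uint32_t ret_addr;
};

/* Fills `out` (at least sizeof(struct sim_frame) bytes) with filler for
   buf and old_ebp, then the bytes of `new_ret`. Returns the length. */
size_t build_input(unsigned char *out, uint32_t new_ret) {
    size_t fill = offsetof(struct sim_frame, ret_addr);
    memset(out, 'A', fill);
    memcpy(out + fill, &new_ret, sizeof new_ret);
    return fill + sizeof new_ret;
}

/* Copies `len` input bytes over the frame without checking the size of
   buf (clamped to the model) and returns the resulting return address. */
uint32_t ret_after_copy(const unsigned char *in, size_t len) {
    struct sim_frame f = { "", 0x11111111u, 0x08048000u };
    if (len > sizeof f) {
        len = sizeof f;
    }
    memcpy(&f, in, len);
    return f.ret_addr;
}
```

An input of 8 bytes leaves the return address untouched, while the 16-byte input from build_input replaces it with the chosen value. In a real attack that value would be the address of the injected code.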
Watch this video for a more realistic idea of buffer overflow: Buffer Overflow Attack - Computerphile. The code used in the video is on GitHub.
Security Measures
- Use languages like Python, Java, or Ruby, in which the runtime manages memory for you and checks array bounds.
- In languages like C and C++, perform all the relevant checks and input validation before writing data to a buffer.
- Before using any external library, check it for known security vulnerabilities.
- Use source code analysis tools for static analysis against vulnerabilities.
- Use a non-executable stack: even if machine code is injected into the stack, it cannot be executed, because that region of memory is marked non-executable. This is done by setting the NX bit.
Note: Even after these measures are taken, it might still be possible to exploit a buffer overflow. These are layers of security that make exploitation harder, not guarantees.
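As an illustration of the "perform all the relevant checks" measure for C, here is a minimal sketch of a copy that refuses input that does not fit. The helper name checked_copy and its return convention are assumptions, not a standard library API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `src` into `dst` only if it fits, including
   the terminating null byte. Returns 0 on success, -1 if `src` is too long
   (in which case `dst` is left untouched). */
int checked_copy(char *dst, size_t dst_size, const char *src) {
    size_t needed = strlen(src) + 1;
    if (needed > dst_size) {
        return -1;
    }
    memcpy(dst, src, needed);
    return 0;
}
```

Rejecting over-long input instead of silently truncating it is a design choice. It makes the failure visible to the caller, which usually suits values like passwords.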
Translated from: https://medium.com/better-programming/an-introduction-to-buffer-overflow-vulnerability-760f23c21ebb