debugging - How does adding a run-time breakpoint in Visual Studio work?

1 Answer

深蓝 · Answer 1 · 2021-10-23T18:38:51+0000

This is actually a rather large and complicated topic, and it is also architecture-specific, so I'll only aim in this answer to provide a summary of the common approaches on the Intel (and compatible) x86 architecture.

The good news is that it is language-independent, so the debugger is going to work the same way whether it's debugging VB.NET, C#, or C++ code. This is true because all code is ultimately turned, whether statically (i.e., ahead of time, as with C++, or by a JIT compiler, as with .NET) or dynamically (e.g., via a run-time interpreter), into object code that can be natively executed by the processor. It is this native code that the debugger ultimately works on.

Furthermore, this isn't limited to Visual Studio. Its debugger certainly works in the way that I'll describe, but so does any other Windows debugger, like the Debugging Tools for Windows debuggers (WinDbg, KD, CDB, NTSD, etc.), GNU's GDB, IDA's debugger, the open-source x64dbg, and so on.

Let's start with a simple definition—what is a breakpoint? It's just a mechanism that allows execution to be paused so that you can conduct further analysis, whether that's examining the call stack, printing the values of variables, modifying the contents of memory or registers, or even modifying the code itself.

On the x86 architecture, there are several fundamental ways that breakpoints can be implemented. They can be divided into the two general categories of software breakpoints and hardware breakpoints.

Although a software breakpoint uses features of the processor itself, it is primarily implemented within software, hence the name. Specifically, interrupt #3 (the assembly language instruction INT 3) provides a breakpoint interrupt. This can be placed anywhere in the executable code, and when the CPU hits this instruction during execution, it will trap. The debugger can then catch this trap and do whatever it wants to do. If the program is not running under a debugger, then the operating system will handle the trap; the OS's default handler will simply terminate the program.

There are two possible encodings for the INT 3 instruction. Perhaps the most logical encoding is 0xCD 0x03, where 0xCD means INT and 0x03 specifies the "argument", or the number of the interrupt that is to be triggered. However, because breakpoints are so important, the designers at Intel also added a special-case representation for INT 3—the single-byte opcode 0xCC.
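To make the two encodings concrete, here is a minimal sketch that recognizes either form in a buffer of raw bytes. The function name `int3_length` is made up for illustration; only the byte values come from the text above.

```python
INT3_SHORT = bytes([0xCC])       # dedicated one-byte breakpoint opcode
INT3_LONG = bytes([0xCD, 0x03])  # generic INT opcode followed by the vector number 3


def int3_length(code: bytes, offset: int) -> int:
    """Return the length of the INT 3 encoding at offset, or 0 if there is none."""
    if code[offset:offset + 1] == INT3_SHORT:
        return 1
    if code[offset:offset + 2] == INT3_LONG:
        return 2
    return 0
```

Only the one-byte form is useful for patching, since it is guaranteed to fit over the first byte of any instruction.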

The nice thing about this being a one-byte instruction is that it can be inserted pretty much anywhere in a program without much difficulty. Conceptually, this is simple, but the way it actually works is somewhat tricky. Basically, there are two options:

If it's a fixed breakpoint, then the compiler can insert this INT instruction into the code when it is compiled. Then, every time you hit that point, it will execute that instruction and break.

In C/C++, a fixed breakpoint might be inserted via a call to the DebugBreak API function, with the __debugbreak intrinsic, or using inline assembly to insert an INT 3 instruction. In .NET code, you would use System.Diagnostics.Debugger.Break to emit a fixed breakpoint.

At runtime, a fixed breakpoint can be easily removed by replacing the one-byte INT instruction (0xCC) with a one-byte NOP instruction (0x90). NOP is the mnemonic for no-op: it just causes the processor to waste a cycle without doing anything.

But if it's a dynamic breakpoint, then things get more complicated. The debugger must modify the binary in-memory and insert the INT instruction. But where is it going to insert it? Even in a debugging build, a compiler cannot reasonably insert a NOP between every single instruction, and it doesn't know in advance where you might want to insert a breakpoint, so there won't be space to insert even a one-byte INT instruction at an arbitrary location in the code.

So what it does instead is insert the INT instruction (0xCC) at the requested location, writing over whatever instruction is currently there. If this is a one-byte instruction (such as an INC), then it is simply replaced by an INT. If this is a multi-byte instruction (most of them are), then only the first byte of that instruction is replaced by 0xCC. The original instruction then becomes invalid because it's been partially overwritten. But that's okay, because once the processor hits the INT instruction, it will trap and stop executing at precisely that point. The partial, corrupted, original instruction will not be hit.

Once the debugger catches the trap triggered by the INT instruction and "breaks" in, it undoes the in-memory modification, replacing the inserted 0xCC byte with the correct byte representation for the original instruction. Because the trap is reported after the one-byte INT has executed, the debugger also moves the instruction pointer back by one byte so that execution resumes at the start of the restored instruction. That way, when you resume execution from that point, the code is correct and you don't hit the same breakpoint over and over. (If the breakpoint is supposed to stay set, the debugger single-steps past the restored instruction and then writes the 0xCC back.)

Note that all of this modification happens to the current image of the binary executable stored in memory; it is patched directly in memory, without ever modifying the file on disk. (This is done using the ReadProcessMemory and WriteProcessMemory API functions, specifically designed for debuggers.)

Here is a small example function in machine code, showing both the raw bytes and the assembly-language mnemonics:
```
31 C0             xor  eax, eax     ; clear EAX register to 0
BA 02 00 00 00    mov  edx, 2       ; set EDX register to 2
01 D0             add  eax, edx     ; add EDX to EAX
C3                ret               ; return, with result in EAX
```
If we were to set a breakpoint on the line of source code that added the values (the ADD instruction in the disassembly), the first byte of the ADD instruction (0x01) would be replaced with 0xCC, leaving the remaining bytes as meaningless garbage:
```
31 C0             xor  eax, eax     ; clear EAX register to 0
BA 02 00 00 00    mov  edx, 2       ; set EDX register to 2
CC                int  3            ; BREAKPOINT!
D0                ???               ; meaningless garbage, never executed
C3                ret               ; also meaningless garbage from CPU's perspective
```
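The same patch-and-restore cycle can be sketched in a few lines of Python, using a `bytearray` as a stand-in for the debuggee's memory. This is only an illustration under that assumption: a real debugger performs these reads and writes on another process (via ReadProcessMemory and WriteProcessMemory, as mentioned above), and the names `set_breakpoint`, `on_breakpoint_hit`, and `saved` are made up here.

```python
INT3 = 0xCC

# Machine code from the listing above: xor eax,eax / mov edx,2 / add eax,edx / ret
code = bytearray.fromhex("31C0" "BA02000000" "01D0" "C3")
ADD_OFFSET = 7  # first byte of the ADD instruction (after 2 + 5 bytes)

saved = {}  # offset -> original byte, so that the patch can be undone


def set_breakpoint(mem: bytearray, offset: int) -> None:
    """Overwrite the first byte of the instruction at offset with INT 3."""
    if offset not in saved:
        saved[offset] = mem[offset]
        mem[offset] = INT3


def on_breakpoint_hit(mem: bytearray, ip_after_trap: int) -> int:
    """Restore the original byte and return the address to resume at.

    The trap is reported with the instruction pointer already past the
    one-byte INT 3, so the resume address is one byte earlier.
    """
    offset = ip_after_trap - 1
    mem[offset] = saved.pop(offset)
    return offset
```

Calling `set_breakpoint(code, ADD_OFFSET)` turns the bytes `01 D0` into `CC D0`, and `on_breakpoint_hit(code, 8)` puts the `01` back and returns 7, the offset of the original ADD.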

Hopefully you were able to follow all of that, because that is actually the simplest case. Software breakpoints are what you use most of the time. Many of the most commonly used features of a debugger are implemented using software breakpoints, including stepping over a call, executing all code up to a particular point, and running to the end of a function. Behind the scenes, all of these use a temporary software breakpoint that is automatically removed the first time that it is hit.
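As a rough sketch of how a temporary breakpoint differs from one you set yourself, consider a breakpoint table in which each entry carries a one-shot flag. Everything here is hypothetical (the `BreakpointTable` class, its method names, and the simplification that only the 5-byte `CALL rel32` form, opcode 0xE8, is handled); it is not how any particular debugger is written.

```python
from dataclasses import dataclass

INT3 = 0xCC
CALL_REL32 = 0xE8  # opcode of the 5-byte near CALL with a 32-bit displacement


@dataclass
class Breakpoint:
    original_byte: int
    temporary: bool


class BreakpointTable:
    def __init__(self, memory: bytearray):
        self.memory = memory
        self.active = {}  # address -> Breakpoint

    def add(self, address: int, temporary: bool = False) -> None:
        if address in self.active:
            return
        self.active[address] = Breakpoint(self.memory[address], temporary)
        self.memory[address] = INT3

    def remove(self, address: int) -> None:
        bp = self.active.pop(address)
        self.memory[address] = bp.original_byte

    def step_over_call(self, call_address: int) -> int:
        """Plant a one-shot breakpoint on the instruction following a CALL."""
        if self.memory[call_address] != CALL_REL32:
            raise ValueError("only CALL rel32 is handled in this sketch")
        return_address = call_address + 5
        self.add(return_address, temporary=True)
        return return_address

    def on_hit(self, address: int) -> None:
        """Discard a temporary breakpoint; user breakpoints stay in the table."""
        if self.active[address].temporary:
            self.remove(address)
```

For a breakpoint that stays set, a real debugger still has to restore the original byte long enough to execute it, which this sketch leaves out.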

However, there is a more complicated and more powerful way to set a breakpoint with the direct assistance of the processor. These are known as hardware breakpoints. The x86 architecture provides 6 special debug registers. (They are referred to as DR0 through DR7, suggesting a total of 8, but DR4 and DR5 are the same as DR6 and DR7, so there are actually only 6.) The first 4 debug registers (DR0 through DR3) store either a memory address or an I/O location, whose values can be set using a special form of the MOV instruction. DR6 (equivalent to DR4) is a status register that contains flags, and DR7 (equivalent to DR5) is a control register. When the control register is set accordingly, an attempt by the processor to access one of these four locations will cause a hardware breakpoint (specifically, an INT 1 interrupt will be raised), which can then be caught by a debugger. Again, the details are complicated and can be found in various places online or in Intel's technical manuals, but they are not necessary to gain a high-level understanding.
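For the curious, here is a small sketch of what "setting the control register accordingly" amounts to. The bit positions follow the DR7 and DR6 layout described in Intel's manuals (enable bits in the low byte, then a 2-bit access type and a 2-bit length per slot starting at bit 16), but the helper names `dr7_enable` and `dr6_triggered_slots` are made up, and the code only computes values; it does not touch any real register.

```python
# Access types (R/W field) and lengths (LEN field) used in DR7.
EXECUTE, WRITE, READ_WRITE = 0b00, 0b01, 0b11
LEN_1, LEN_2, LEN_4 = 0b00, 0b01, 0b11


def dr7_enable(dr7: int, slot: int, access: int, length: int) -> int:
    """Return dr7 with the breakpoint held in DR<slot> locally enabled."""
    if not 0 <= slot <= 3:
        raise ValueError("only DR0 through DR3 hold addresses")
    if access == EXECUTE and length != LEN_1:
        raise ValueError("execute breakpoints must use the 1-byte length")
    field_shift = 16 + slot * 4
    dr7 &= ~(0b1111 << field_shift)                 # clear the old R/W and LEN bits
    dr7 |= (access | (length << 2)) << field_shift  # R/W in the low 2 bits, LEN above
    dr7 |= 1 << (slot * 2)                          # local-enable bit for this slot
    return dr7


def dr6_triggered_slots(dr6: int) -> list:
    """Return which of DR0..DR3 the status register reports as hit (bits 0..3)."""
    return [slot for slot in range(4) if dr6 & (1 << slot)]
```

For example, a 4-byte write breakpoint in slot 0 gives `dr7_enable(0, 0, WRITE, LEN_4) == 0xD0001`.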

The nice thing about these special debug registers is that they provide a way to implement data breakpoints without needing to modify the code! However, there are two serious limitations. First, there are only four possible locations, so without a lot of cleverness, you are limited to four breakpoints. Second, the debug registers are privileged resources, and instructions that access and manipulate them can be executed only at ring 0 (essentially, kernel mode). Attempts to read or write these registers at any other privilege level (such as in ring 3, which is effectively user mode) will cause a general protection fault. Therefore, the Visual Studio debugger has to jump through some hoops to use these. I believe that it first suspends the thread and then calls the SetThreadContext API function (which causes a switch to kernel mode internally) to manipulate the contents of the registers. Finally, it resumes the thread. These debug registers are very powerful for setting read/write breakpoints for memory locations that contain data, as well as for setting execute breakpoints for memory locations that contain code.

However, if you need more than 4, or hit against some other limitation, then these hardware-provided debug registers won't work. The Visual Studio debugger has to have some other, more general way of implementing data breakpoints. This is, in fact, why having a large number of breakpoints can really slow down the execution of your program when running under the debugger.

There are various tricks here, and I know a lot less about exactly which ones are used by the different closed-source debuggers. You could almost certainly find out by reverse-engineering or even closer observation, and perhaps there is someone that knows more about this than me. But I'll briefly summarize a couple of the tricks I know about:

One trick for memory-access breakpoints is to use <a href="https://msdn.microsoft.com/en-us/library/win
