http://www.drpaulcarter.com/cs/debug.php#1.
1. How to Start
The most common types of mistakes when programming are:
- Programming without thinking
- Writing code in an unstructured manner
Let's take these in order
1.1 Thinking about Programming
When a real programmer (or programming team) is given a problem to solve,they do not immediately sit down at a terminal and start typingin code! They first design the program by thinking about the numerous waysthe problem's solution may be found.
One of the biggest myths of programming is that: The sooner I start codingthe sooner a working program will be produced. This is NOTtrue!! A program that is planned before coding will become a working program before an unplanned program. Yes, an unplanned program will betyped in and maybe compiled faster, but these are just the firststeps of creating a working program! Next comes the debugging stage.This is where the benefits of a planned program will appear. In the vastmajority of the time, a planned program will have less bugs than an unplannedprogram. In addition, planned programs are generally more structured thanunplanned programs. Thus, finding the bugs will be easier in the plannedprogram.
So how does one design a program before coding? There are severaldifferent techniques. One of the most common is called top-down design. Here an outline of the program is first created. (Essentially,one first looks at the general form of the main()
functionand then recursively works down to the lowest level functions.) There aremany references on how to write programs in this manner.
Top-down design divides the program into sub-tasks. Each sub-task is asmaller problem that must be solved. Problems are solved by using analgorithm. Functions (and data) in the program implement the algorithm.Before writing the actual code, take a very small problem and trace byhand how the your chosen algorithm would solve it. This serves severalpurposes:
- It checks out the algorithm to see if will actually work on the given problem. (If it does not work, you can immediately start looking for another algorithm. Note that if you had immediately starting coding you would probably not discover the algorithm would not work until many lines of code had been entered!)
- Makes sure that you understand how the algorithm actually works! (If you can not trace the algorithm by hand, you will not be able to write a program to do it!)
- Gives you the detail workings of a short, simple run of the algorithm that can be used later when debugging the code.
Only when you are confident that you understand how the entire program willlook should you start typing in code.
1.2 Structured Programming
When a program is structured, it is divided into sub-units that may be testedseparately. This is a great advantage when debugging! Code in asub-unit may be debugged separately from the rest of the program code. I findit useful to debug each sub-unit as it is written. If no debugging is performed until the entire program is written, debugging is much harder. Theentire source of the program must be searched for bugs. If sub-units aredebugged separately, the source code that must be searched for bugs is muchsmaller! Storing the sub-units of a program into separate source files can make it easier to debug them separately.
The ADT and OOP paradigms also divide programs into sub-units. Often withthese methods, the sub-units are even more independent than with normalstructured code. This has two main advantages:
- Sub-units are even easier to debug separately.
- Sub-units can often be reused in other programs. (Thus, a new program can use a previously debugged sub-unit from an earlier program!)
Sub-units are generally debugged separately by writing a small driverprogram. Driver programs set up data for the sub-task that the sub-unitis supposed to solve, calls the sub-unit to perform his sub-task, and thendisplays the results so that they can be checked.
Of course, all the following debugging methods can be used to debug a sub-unit, just as they can be used to debug the entire program. Again, theadvantage of sub-units are that they are only part of the program and soare easier to debug than the entire program at once!
2. Compiling Programs
The first step after typing in a program (or just a sub-unit of a program) is to compile the program.The compiling process converts the source code file you typed in into machinelanguage and is stored in an object file. This is known as compiling. With most systems, the object file is automaticallylinked with the system libraries. These libraries contain the codefor the functions that are part of the languages library. (For example, theC libraries contain the code for the printf()
function.)
2.1 Compiler Errors
Every language has syntax rules. These rules determine whichstatements are legal in the language and which are not. Compilerprograms are designed to enforce these rules. When a rule is broken,the compiler prints an error message and an object file is notcreated. Most compilers will continue scanning the source file afteran error and report other errors it finds. However, once an error hasbeen found, the compiler made be confused by later perfectly legalstatements and report them as errors. This brings us to the first ruleof compiler errors:
-
First Rule of Compiler Errors
- The first listed compiler error is always a true error; however, later errors may not be true errors.
Succeeding errors may disappear when the first error is removed. So, if later error messages are puzzling, ignore them. Fix the errors that you are sure are errors and re-compile. The puzzling errors may magically disappear when the other true errors are removed.
If you have many errors, the first error may scroll off the screen. Onesolution to this problem is to save the errors into a file using redirection.One problem is that errors are written to stderr
not stdout
which the > redirection operator uses. To redirectoutput to stderr
use the 2> operator. Here's an example:
$cc x.c 2>errors $more errors
The compiler is just a program that other humans created. Often the errormessages it displays are confusing (or just plain wrong!). Do notassume that the line that an error message refers to is always the line where the true error actually resides. The compiler scans source files from the top sequentially to the bottom. Sometimes an error is not detected by the compiler until many lines below where the actual error is.Often the compiler is too stupid to realize this and refers to the line in the source file where it realized something is wrong. The true error is earlier in the code. This brings us to the second rule of compiler errors:
-
Second Rule of Compiler Errors
- A compiler error may be caused by any source code line above the line referred to by the compiler; however, it can not be caused by a line below.
In C (and C++), do not forget that the #include
preprocessorstatement inserts the code of a header file into the source file. An errorin the header file, may cause a compiler error referencing a line in themain source file. Most systems allow the preprocessed code (that the C compiler actually compiles!) to be stored in a file. This allows you to seeexactly what is being compiled. This file will also show how each C macrowas expanded. This can be very helpful to discover the cause ofnormally very hard to find errors.
A useful technique for finding the cause of puzzling compiler errors is todelete (or comment out) preceding sections of code until the error disappears. When the error disappears, the last section removed must have caused the error.
The compiler can also display warnings. A warning is not a syntax error; however, it may be a logical error in the program.It marks a statement in your program that is legal, but is suspicious. Youshould treat warnings as errors unless you understand why the warning wasgenerated. Often compilers can be set to different warning levels. It isto your advantage to set this level as high as possible, to have the compiler give as many warnings as possible. Look at these warningsvery carefully!.
Remember that just because a program compiles with no errors or warnings doesnot mean that the program is correct! It only means that every lineof the program is syntactically correct. That is, the compiler understandswhat each statement says to do. The program may still have many logicalerrors! An English paper may be grammatically correct (i.e.,have nouns, verbs, etc. in the correct places), but be gibberish.
2.2 Linker Errors
The linker is a program that links object files (which contain the compiled machine code from a single source file) and libraries (which are files that are collections of object files) together to create an executable program. The linker matches up functions and global variablesused in object files to their definitions in other object files. The linkeruses the name (often the term symbol is used) of the function or global variable to perform the match.
The most common type of linker error is an unresolved symbol or name.This error occurs when a function or global variable is used, butthe linker can not find a match for the name. For example, on an IBM AIX system, the errormessage looks like this:
0706-317 ERROR: Unresolved or undefined symbols detected: Symbols in error (followed by references) are dumped to the load map. The -bloadmap:<filename> option will create a load map. .fun
This message means that a function (or global variable) named fun
(ignore the period) was referenced in the program, but never defined. There are two common causes of these errors:
-
Misspelling the name of the function
-
In the example, above there was a function named
func
. This is not a compiler error. Code in one source file can use functions defined in another. The compiler assumes that any function referenced, but not defined in the file that references it, will be defined in another file and linked. It is only at the link stage that this assumption can be checked. (Note that C++ compilers will usually generate compiler errors for this, since C++ requires prototypes for all referenced functions!)
The correct libraries or object files where not linked
-
The linker must know what libraries and object files are needed to form the executable program. The standard C libraries are automatically linked. UNIX systems, like the AIX system, do
not automatically link in the standard C math library! To link in the math library on the AIX system, use the
-lm
flag on the compile command. For example, to compile a C program that usessqrt
, type:
cc prog.c -lm
Remember that the#include
statement only inserts text into source files. It is a common myth that it also links in the appropriate library! The linker never sees this statement!
There are also bugs related to the linker. One difficult bug to uncover occurs when there are two definitions of a function or global variable. The linker will pick the first definition it finds and ignores the other.Some linkers will display a warning message when this occurs (The AIXlinker does not!)
Another bug related to linking occurs when a function is called with thewrong arguments. The linker only looks at the name of the function whenmatching. It does no argument checking. Here's an example:
File: x.c
int f( int x, int y) { return x + y; }
File: y.c
int main() { int s = f(3); return 0; }
These types of bugs can be prevented by using prototypes. For example, if theprototype:
int f( int, int);
is added to y.c the compiler will catch this error. Actually, thebest idea is to put the prototype in a header file and include it in bothx.c and y.c. Why use a header file? So that there is onlyone instance of the prototype that all files use. If a separate instance istyped into each source file, there is no guarantee that each instance is thesame. If there is only one instance, it can not be inconsistent with itself!Why include it in x.c (the file the function is defined in)? So that the compiler can check the prototype and ensure that it is consistentwith the function's definition. (Additional note: C++ uses a technique calledname mangling to catch these type of errors.)
3. Runtime Errors
A runtime error occurs when the program is running and usually results in the program aborting. On a UNIX/Linux system, an aborting program creates a coredump. A coredump is a binary file named core that contains information about the state of program when it aborted. Debuggerslike gdb and dbx can read this file and tellyou useful information about what the program was doing when it aborted. There are several types of runtime errors:
-
Illegal memory access
-
This is probably the most common type of error. Under UNIX/Linux, the program will coredump with the message
Segmentation fault(coredump).
Using Win95 or NT, programs will also abort. However, traditional DOS does not check for illegal memory accesses; the program continues running, but the results are unpredictable. The DOS Borland/Turbo C/C++ compilers will check for data written to the NULL address. However, the error message
NULL pointer assignment
is not displayed until the program terminates.
Division by zero
- All operating systems detect this error and abort the program.
4. Debugging Tools
Many methods of debugging a program compare the program's behavior withthe correct behavior in great detail. Usually the normal output of theprogram does not show the detail needed. Debugging tools allow youto examine the behavior of the in more detail.
4.1 The assert
Macro
The assert
macro is a quick and easy way to put debugging testsinto a C/C++ program. To use this macro, you must include the assert.h
header file near the top of your source file. Usingassert
is simple. The format is:
-
assert(boolean (or int) expression);
If the boolean expression evaluates to true (i.e., notzero), the assert
does nothing. However, if it evaluates tofalse (zero), assert
prints an error message and aborts theprogram. As an example, consider the following assert
:
-
assert( x != 0 );
If x is zero, the following will be displayed:
Assertion failed: x != 0, file err.c, line 6 Abnormal program termination
and the program will abort. Notice that the actual assertion, the name of thefile and the line number in the file are displayed.
The assert
macro is very useful for putting sanity checks intoprograms. These are conditions that should always be true if the program isworking correctly. It should not be used for user error checking (such aswhen the file a user requested to read does not exist). Normal if
statements should be used for these runtime errors.
Of course, in a commercial program, an assertion failure is not particularhelpful to an end user. Also, checking assertions will make the program runat least a little slower than without them. Fortunately, it is easy todisable the assert
macro without even removing it. If themacro NDEBUG
is defined (above the statement that includesassert.h
!), the assert
macro does absolutely nothing.If the assertions need to be enabled later, just remove the line that definesNDEBUG
. (If this technique is used, be sure that theassert
statements do not execute code needed for the program torun correctly. If NDEBUG
is defined, the code would notbe run!)
4.2 Print Statements
This time honored method of debugging involves inserting debugging print statements liberally in your program. The print statements should bedesigned to show both what code the program is executing and what valuescritical variables have.
4.3 Debuggers
The previous method of debugging by adding print statements has twodisadvantages:
- When new print statements are added, program must be recompiled.
- Information output is fixed and can not be changed as program is running.
Source-level debuggers provide a much easier way to trace the execution ofprograms. They allow one to:
- Look at the value of any variable as the program is running.
- Pause execution when program reaches any desired statement. (This position in the program is called a breakpoint).
- Single step statement by statement through a program.
I strongly recommend that you learn to use the debugger for whatever system you program on. A debugger can save lots of timewhen debugging your program!
4.4 Lint
The lint program checks C programs for a common list of bugs. Itscans your C source code and prints out a list of possible problems. Bewarned that lint is very picky! For example, the line:
-
printf("Hello, World");
will produce a warning message because printf
returns an integervalue that is not stored. The return value of printf
is oftenignored, but lint still produces an warning. There are severalways to make lint happy with this statement, one is:
-
(void) printf("Hello, World");
This says to ignore the return value.
4.5 Walk through
A walk through is a process of hand checking the logic of a program.The programmer sits down with someone else (best if another programmer, butanybody will do) and walks through the program for an example case. Oftenit is the programmer himself who finds the bug in the process of explaininghow the program is supposed to work and carefully looking at his code. However,it is easy for the programmer to "know" what the program should be doing and remain blind to what the program is actually doing.
Students need to be very careful using this approach with other students. Twostudents in the same class should not walk through a program together.
5. General Tips
Here are some general tips for debugging programs.
5.1 Finding Bugs
Before bugs are removed they must be discovered!
- Aggressively test programs!
- Start with small problems that can be easily checked by hand. (You should already have one of these worked out from the planning stage!)
- Test every feature of the program at least once! And is once really enough? Test features in different ways if possible.
- Do not forget to test trivial problems.
- Do not make invalid assumptions about input data.
5.2 Determining the Causes of Bugs
A bug can only be caused by the code in the program that has already executed.Be sure you do not waste time searching through code that has not run yet.A debugger or print statements can be used to determine which code has executedand which has not.
Do not fix bugs by mindlessly changing code until it seems to work. You needto figure out why one statement does work and another does not. You shouldhave a good reason for every line of code. "It doesn't work withoutthis line" is not a good reason!
If you are using C/C++, you might find my Common C Errors Page helpful.
Maintainer: Paul Carter ( email: pacman128@gmail.com )
Copyright © 2001 All Rights Resevered
Last Updated: Thu Sep 6 16:28:20 CDT 2001