I l@ve RuBoard Previous Section Next Section

17.5 Debugging a Binary Search

The binary search algorithm is fairly simple. You want to see whether a given number is in an ordered list. Check your number against the one in the middle of the list. If it is the number, you were lucky—stop. If your number is bigger, you might find it in the top half of the list. Try the middle of the top half. If it is smaller, try the bottom half. Keep trying and dividing the list in half until you find the number or the list gets down to a single number.

17.5.1 The First Bug, a Segmentation Fault

Example 17-2 uses a binary search to see whether a number can be found in the file numbers.dat.

Example 17-2. search/search0.cpp
 1: /********************************************************
 2:  * search -- Search a set of numbers.                   *
 3:  *                                                      *
 4:  * Usage:                                               *
 5:  *      search                                          *
 6:  *              you will be asked numbers to lookup     *
 7:  *                                                      *
 8:  * Files:                                               *
 9:  *      numbers.dat -- numbers 1 per line to search     *
10:  *                      (Numbers must be ordered)       *
11:  ********************************************************/
12: #include <iostream>
13: #include <fstream>
14: #include <cstdlib>
15: #include <cstdio>
16: #include <assert.h>
17: 
18: const int MAX_NUMBERS = 1000;   // Max numbers in file 
19: const char *const DATA_FILE = "numbers.dat";// File with numbers 
20: 
21: int data[MAX_NUMBERS];  // Array of numbers to search 
22: int max_count;          // Number of valid elements in data 
23: 
24: int main(  )
25: {
26:     std::ifstream in_file;      // Input file 
27:     int middle;         // Middle of our search range 
28:     int low, high;      // Upper/lower bound 
29:     int search;         // number to search for 
30: 
31:     in_file.open(DATA_FILE, std::ios::in);
32:     if (in_file.bad(  )) {
33:         std::cerr << "Error:Unable to open " << DATA_FILE << '\n';
34:         exit (8);
35:     }
36: 
37:     /*
38:      * Read in data 
39:      */
40: 
41:     max_count = 0;
42:     while (true) {
43:         char line[30];  // Line from the input file
44: 
45:         if (in_file.eof(  ))
46:             break;
47: 
48:         in_file.getline(line, sizeof(line));
49: 
50:         assert(max_count >= 0);
51:         assert(max_count < sizeof(data)/sizeof(data[0]));
52:         std::sscanf(line, "0", data[max_count]);
53:         if (data[max_count] == -1)
54:             break;
55: 
56:         ++max_count;
57:     }
58: 
59:     while (true) {
60:         std::cout << "Enter number to search for or -1 to quit:" ;
61:         std::cin >> search;
62: 
63:         if (search == -1)
64:             break;
65: 
66:         low = 0;
67:         high = max_count;
68: 
69:         while (true) {
70:             middle = (low + high) / 2;
71: 
72:             assert(middle >= 0);
73:             assert(middle < sizeof(data)/sizeof(data[0]));
74:             if (data[middle] == search) {
75:                 std::cout << "Found at index " << middle << '\n';
76:             }
77: 
78:             if (low == high) {
79:                 std::cout << "Not found\n";
80:                 break;
81:             }
82: 
83:             assert(middle >= 0);
84:             assert(middle < sizeof(data)/sizeof(data[0]));
85:             if (data[middle] < search)
86:                 low = middle;
87:             else
88:                 high = middle;
89:         }
90:    }
91:    return (0);
92: }

Here's our data file:

File: numbers.dat

4 
6 
14 
16 
17 
-1

When we run this program in Unix, the results are:

% search 
Segmentation fault (core dumped) 

When we run this program under Microsoft Windows, we get an application error (if we're lucky).

Either way this is not good. It means something went wrong in our program and the program tried to read memory that wasn't there. The debugger GDB can read this file and help us determine what happened:

% gdb search
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb) run
Starting program: /usr/sdo/search/search 

Program received signal SIGSEGV, Segmentation fault.
0xec46320 in number (  )
(gdb)

The debugger tells us we have been killed by a segmentation fault generated from the procedure number. But we don't have a procedure number! The routine must belong to the C++ library.

We now use the where command to find out which function called which function. This report is called a stack trace.

(gdb) where
#0  0xec46320 in number (  )
#1  0xec45cc2 in _doscan (  )
#2  0xec45b34 in sscanf (  )
#3  0x2400 in main (  ) at search.cpp:52
(gdb)

The current function is printed first, then the function that called it, and so on until we reach the outer function main. From this we see that number was called by _doscan, which was called by sscanf. We recognize sscanf as a library routine. The other functions must be subroutines called by sscanf. The last function that had control was the call of sscanf, which was made from line 52 of main.

Now we use the list command to take a look at the source for this line:

(gdb) list 52
45              if (in_file.eof(  ))
46                  break;
47      
48              in_file.getline(line, sizeof(line));
49      
50              assert(max_count >= 0);
51              assert(max_count < sizeof(data)/sizeof(data[0]));
52              sscanf(line, "%d", data[max_count]);
53              if (data[max_count] == -1)
54                  break;
55      
56              ++max_count;
(gdb) quit
The program is running.  Quit anyway (and kill it)? (y or n) Y

Line 52 caused the problem.

Another way of finding the problem is to single-step through the program until the error occurs. First list a section of the program to find a convenient place to put the breakpoint, and then start the execution and single-step process:

Script started on Mon Oct 31 10:07:19 1994
% gdb search
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb) list main
20      const char *const DATA_FILE = "numbers.dat"; // File with nums
21      
22      int data[MAX_NUMBERS];  // Array of numbers to search 
23      int max_count;          // Number of valid elements in data 
24      int main(  )
25      {
26          std::ifstream in_file;   // Input file 
27          int middle;         // Middle of our search range 
28          int low, high;      // Upper/lower bound 
29          int search;         // Number to search for 
(gdb) break main
Breakpoint 1 at 0x2318: file search.cpp, line 25.
(gdb) run
Starting program: /usr/sdo/search/search

Breakpoint 1, main (  ) at search.cpp:25
26          ifstream in_file;   // Input file 
(gdb) step
31          in_file.open(DATA_FILE, ios::in);
(gdb) step
32          if (in_file.bad(  )) {
(gdb) step
41          max_count = 0;
(gdb) step
45              if (in_file.eof(  ))
(gdb) step
48              in_file.getline(line, sizeof(line));
(gdb) step
50              assert(max_count >= 0);
(gdb) step
51              assert(max_count < sizeof(data)/sizeof(data[0]));
(gdb) step
52              sscanf(line, "%d", data[max_count]);
(gdb) step

Program received signal SIGSEGV, Segmentation fault.
0xec46320 in number (  )
(gdb) quit
The program is running.  Quit anyway (and kill it)? (y or n) y

This method, too, points at line 52 as the culprit. On inspection we notice that we forgot to put an ampersand (&) in front of the third parameter for std::sscanf. So we change line 52 from:

           std::sscanf(line, "%d", data[max_count]); 

to:

           std::sscanf(line, "%d", &data[max_count]); 

and try again.

You might wonder why we use the function sscanf when the line:

    in_file >> data[max_count];

performs the same function.

The answer is simple. We used sscanf to cause problems. Without the pointer error, we would have nothing to debug. The in_file statement is more reliable, and reliable code has no place in a chapter on debugging.

17.5.2 The Unintended Infinite Loop

Rather than fix the std::scanf call, we've decided to enter the 21st century and use C++ style I/O calls. The first number in our list is 4, so we try it. This time our output looks like:

Enter number to search for or -1 to quit: 4 
Found at index 0 
Found at index 0 
Not found 
Enter number to search for or -1 to quit: ^C

The program should find the number, let us know it's at index 0, and then ask for another number. Instead we get two found messages and one not found message. We know that everything is running smoothly up to the time we get the first found message. After that things go downhill.

Getting back into the debugger, we use the list command to locate the found message and put a breakpoint there:

% gdb search
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb) list 69,81
69              while (true) {
70                  middle = (low + high) / 2;
71
72                  assert(middle >= 0);
73                  assert(middle <sizeof(data)/sizeof(data[0]));      
74                  if (data[middle] == search) {
75                    std::cout << "Found at index " << middle << '\n';
76                  }
77      
78                  if (low == high) {
79                      std::cout << "Not found\n";
80                      break;
81                  }
(gdb) break 75
Breakpoint 1 at 0x249e: file search.cpp, line 71.
(gdb) run
Starting program: /usr/sdo/search/search 
Enter number to search for or -1 to quit: 4

Breakpoint 1, main (  ) at search.cpp:71
75                    std::cout << "Found at index " << middle << '\n';
(gdb) step
Found at index 0
78                  if (low == high) {
(gdb) step
83                  assert(middle >= 0);
(gdb) step
84                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
85                  if (data[middle] < search)
(gdb) step
88                      high = middle;
(gdb) step
70                  middle = (low + high) / 2;
(gdb) step
72                  assert(middle >= 0);
(gdb) step
73                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
74                  if (data[middle] == search) {
(gdb) step
75                    std::cout << "Found at index " << middle << '\n';
(gdb) step
Found at index 0
78                  if (low == high) {
(gdb) quit
The program is running.  Quit anyway (and kill it)? (y or n) y

The program doesn't exit the loop. Instead it continues with the search. Because the number has already been found, this search results in strange behavior. We are missing a break after the cout.

We need to change:

    if (data[middle] == search) { 
        std::cout << "Found at index " << middle << '\n'; 
    } 

to:

    if (data[middle] == search) { 
        std::cout << "Found at index " << middle << '\n'; 
        break; 
    } 

Making this fix, we try the program again:

% search  
Enter number to search for or -1 to quit: 4  
Found at index 0 
Enter number to search for or -1 to quit: 6  
Found at index 1 
Enter number to search for or -1 to quit: 3  
Not found 
Enter number to search for or -1 to quit: 5  
program runs forever (or until we abort it)  

We have a runaway program. This time, instead of setting a breakpoint, we just start running the program. After a few seconds pass and we believe that we are stuck in the infinite loop, we stop the program with a control-C (^C). Normally this would abort the program and return us to the shell prompt. Since we are running with the debugger, it returns control to GDB:

% gdb search
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (m68k-sun-sunos4.0.3), 
Copyright 1994 Free Software Foundation, Inc...
(gdb) run
Starting program: /usr/sdo/search/search 
Enter number to search for or -1 to quit: 5
^C
Program received signal SIGINT, Interrupt.
0x2500 in main (  ) at search.cpp:79
79                  if (data[middle] < search)

Now we can use the single-step command to step through the infinite loop, looking at key values along the way.

87                  if (data[middle] < search)
(gdb) print middle
$1 = 0
(gdb) print data[middle]
$2 = 4
(gdb) print search
$3 = 5
(gdb) step
88                      low = middle;
(gdb) step
71                  middle = (low + high) / 2;
(gdb) step
73                  assert(middle >= 0);
(gdb) step
74                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
75                  if (data[middle] == search) {
(gdb) step
80                  if (low == high) {
(gdb) step
85                  assert(middle >= 0);
(gdb) step
86                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
87                  if (data[middle] < search)
(gdb) step
88                      low = middle;
(gdb) step
71                  middle = (low + high) / 2;
(gdb) step
73                  assert(middle >= 0);
(gdb) step
74                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
75                  if (data[middle] == search) {
(gdb) step
80                  if (low == high) {
(gdb) step
85                  assert(middle >= 0);
(gdb) step
86                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
87                  if (data[middle] < search)
(gdb) step
88                      low = middle;
(gdb) step
71                  middle = (low + high) / 2;
(gdb) step
73                  assert(middle >= 0);
(gdb) step
74                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
75                  if (data[middle] == search) {
(gdb) step
80                  if (low == high) {
(gdb) step
85                  assert(middle >= 0);
(gdb) step
86                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
87                  if (data[middle] < search)
(gdb) step
88                      low = middle;
(gdb) step
71                  middle = (low + high) / 2;
(gdb) step
73                  assert(middle >= 0);
(gdb) step
74                  assert(middle <sizeof(data)/sizeof(data[0]));      
(gdb) step
75                  if (data[middle] == search) {
(gdb) print low
$5 = 0
(gdb) print middle
$6 = 0
(gdb) print high
$7 = 1
(gdb) print search
$8 = 5
(gdb) print data[0]
$9 = 4
(gdb) print data[1]
$10 = 6
(gdb) quit
The program is running.  Quit anyway (and kill it)? (y or n) y

The problem is that we have reached a point where the following is true:

low = 0  middle = 0  high = 1 

The item we are searching for falls exactly between elements 0 and 1. Our algorithm has an off-by-one error. Obviously the middle element does not match. If it did, we'd exit with a found at message. So there is no point including the middle element in our new search range. Our code to adjust the interval is:

            if (data[middle] < search) 
                low = middle; 
            else 
                high = middle; 

It should be:

            if (data[middle] < search) 
                low = middle + 1; 
            else 
                high = middle - 1; 

The full version of the corrected program is shown in Example 17-3.

Example 17-3. search/search4.cpp
 1: /********************************************************
 2:  * search -- Search a set of numbers.                   *
 3:  *                                                      *
 4:  * Usage:                                               *
 5:  *      search                                          *
 6:  *              you will be asked numbers to lookup     *
 7:  *                                                      *
 8:  * Files:                                               *
 9:  *      numbers.dat -- numbers 1 per line to search     *
10:  *                      (Numbers must be ordered)       *
11:  ********************************************************/
12: #include <iostream>
13: #include <fstream>
14: #include <cstdlib>
15: #include <cstdio>
16: #include <assert.h>
17: 
18: const int MAX_NUMBERS = 1000;   // Max numbers in file 
19: const char *const DATA_FILE = "numbers.dat";// File with numbers 
20: 
21: int data[MAX_NUMBERS];  // Array of numbers to search 
22: int max_count;          // Number of valid elements in data 
23: int main(  )
24: {
25:     std::ifstream in_file;      // Input file 
26:     int middle;         // Middle of our search range 
27:     int low, high;      // Upper/lower bound 
28:     int search;         // number to search for 
29: 
30:     in_file.open(DATA_FILE, std::ios::in);
31:     if (in_file.bad(  )) {
32:         std::cerr << "Error:Unable to open " << DATA_FILE << '\n';
33:         exit (8);
34:     }
35: 
36:     /*
37:      * Read in data 
38:      */
39: 
40:     max_count = 0;
41: 
42:     while (true) {
43:         char line[30];  // Line from the input file
44: 
45:         if (in_file.eof(  ))
46:             break;
47: 
48:         in_file.getline(line, sizeof(line));
49: 
50:         assert(max_count >= 0);
51:         assert(max_count < sizeof(data)/sizeof(data[0]));
52:         std::sscanf(line, "0", &data[max_count]);
53:         if (data[max_count] == -1)
54:             break;
55: 
56:         ++max_count;
57:     }
58: 
59:     while (true) {
60:         std::cout << "Enter number to search for or -1 to quit:" ;
61:         std::cin >> search;
62: 
63:         if (search == -1)
64:             break;
65: 
66:         low = 0;
67:         high = max_count;
68: 
69:         while (true) {
70:             if (low >= high) {
71:                 std::cout << "Not found\n";
72:                 break;
73:             }
74:             middle = (low + high) / 2;
75: 
76:             assert(middle >= 0);
77:             assert(middle < sizeof(data)/sizeof(data[0]));
78:             if (data[middle] == search) {
79:                 std::cout << "Found at index " << middle << '\n';
80:                 break;
81:             }
82: 
83:             assert(middle >= 0);
84:             assert(middle < sizeof(data)/sizeof(data[0]));
85:             if (data[middle] < search)
86:                 low = middle +1;
87:             else
88:                 high = middle -1;
89:         }
90:    }
91:    return (0);
92: }
    I l@ve RuBoard Previous Section Next Section