Processing SQL Statements

The purpose of connecting to the server is to conduct a conversation with it while the connection is open. This section shows how to communicate with the server to process statements. Each statement you execute involves the following steps:

1.	Construct the statement. The way you do this depends on the contents of the statementin particular, whether it contains binary data.
2.	Issue the statement by sending it to the server. The server will execute the statement and generate a result.
3.	Process the statement result. This depends on what type of statement you issued. For example, a `SELECT` statement returns rows of data for you to process. An `INSERT` statement does not.

Prior to MySQL 4.1, the client library included a single set of routines for statement execution. These are based on sending each statement as a string to the server and retrieving the results with all columns returned in string format. MySQL 4.1 introduces a binary protocol that allows non-string data values to be sent and returned in native format without conversion to and from string format.

This section discusses the original method for processing SQL statements. The section "Using Server-Side Prepared Statements" later in the chapter covers the newer binary protocol.

One factor to consider in constructing statements is which function to use for sending them to the server. The more general statement-issuing routine is mysql_real_query(). With this routine, you provide the statement as a counted string (a string plus a length). You must keep track of the length of your statement string and pass that to mysql_real_query(), along with the string itself. Because the statement is treated as a counted string rather than as a null-terminated string, it may contain anything, including binary data or null bytes.

The other statement-issuing function, mysql_query(), is more restrictive in what it allows in the statement string but often is easier to use. Any statement passed to mysql_query() should be a null-terminated string. This means the statement text cannot contain null bytes because those would cause it to be interpreted erroneously as shorter than it really is. Generally speaking, if your statement can contain arbitrary binary data, it might contain null bytes, so you shouldn't use mysql_query(). On the other hand, when you are working with null-terminated strings, you have the luxury of constructing statements using standard C library string functions that you're probably already familiar with, such as strcpy() and sprintf().

Another factor to consider in constructing statements is whether you need to perform any character-escaping operations. This is necessary if you want to construct statements using values that contain binary data or other troublesome characters, such as quotes or backslashes. This is discussed in "Working with Strings That Contain Special Characters."

A simple outline of statement handling looks like this:

if (mysql_query (conn, stmt_str) != 0)
{
    /* failure; report error */
}
else
{
    /* success; find out what effect the statement had */
}

mysql_query() and mysql_real_query() both return zero for statements that succeed and non-zero for failure. To say that a statement "succeeded" means the server accepted it as legal and was able to execute it. It does not indicate anything about the effect of the statement. For example, it does not indicate that a SELECT statement selected any rows or that a DELETE statement deleted any rows. Checking what effect the statement actually had involves additional processing.

A statement may fail for a variety of reasons. Common causes of failure include the following:

It contains a syntax error.
It's semantically illegalfor example, a statement that refers to a non-existent table.
You don't have sufficient privileges to access a table referred to by the statement.

Statements may be grouped into two broad categories: those that do not return a result set (a set of rows) and those that do. Statements such as INSERT, DELETE, and UPDATE fall into the "no result set returned" category. They don't return any rows, even for statements that modify your database. What you get back is a count of the number of rows affected.

Statements such as SELECT and SHOW fall into the "result set returned" category; after all, the purpose of issuing those statements is to get something back. In the MySQL C API, the result set returned by such statements is represented by the MYSQL_RES data type. This is a structure that contains the data values for the rows, and also metadata about the values (such as the column names and data value lengths). Is it legal for a result set to be empty (that is, to contain zero rows).

Handling Statements That Return No Result Set

To process a statement that does not return a result set, issue it with mysql_query() or mysql_real_query(). If the statement succeeds, you can find out how many rows were inserted, deleted, or updated by calling mysql_affected_rows().

The following example shows how to handle a statement that returns no result set:

if (mysql_query (conn, "INSERT INTO my_tbl SET name = 'My Name'") != 0)
{
    print_error (conn, "INSERT statement failed");
}
else
{
    printf ("INSERT statement succeeded: %lu rows affected\n",
                (unsigned long) mysql_affected_rows (conn));
}

Note how the result of mysql_affected_rows() is cast to unsigned long for printing. This function returns a value of type my_ulonglong, but attempting to print a value of that type directly does not work on some systems. (For example, I have observed it to work under FreeBSD but to fail under Solaris.) Casting the value to unsigned long and using a print format of %lu solves the problem. The same principle applies to any other functions that return my_ulonglong values, such as mysql_num_rows() and mysql_insert_id(). If you want your client programs to be portable across different systems, keep this in mind.

mysql_affected_rows() returns the number of rows affected by the statement, but the meaning of "rows affected" depends on the type of statement. For INSERT, REPLACE, or DELETE, it is the number of rows inserted, replaced, or deleted. For UPDATE, it is the number of rows updated, which means the number of rows that MySQL actually modified. MySQL does not update a row if its contents are the same as what you're updating it to. This means that although a row might be selected for updating (by the WHERE clause of the UPDATE statement), it might not actually be changed.

This meaning of "rows affected" for UPDATE actually is something of a controversial point because some people want it to mean "rows matched"that is, the number of rows selected for updating, even if the update operation doesn't actually change their values. If your application requires such a meaning, you can request that behavior when you connect to the server by passing a value of CLIENT_FOUND_ROWS in the flags parameter to mysql_real_connect().

Handling Statements That Return a Result Set

Statements that return data do so in the form of a result set that you retrieve after issuing the statement by calling mysql_query() or mysql_real_query(). It's important to realize that in MySQL, SELECT is not the only statement that returns rows. Statements such as SHOW, DESCRIBE, EXPLAIN, and CHECK TABLE do so as well. For all of these statements, you must perform additional row-handling processing after issuing the statement.

Handling a result set involves these steps:

1.	Generate the result set by calling`mysql_store_result()` or`mysql_use_result()` . These functions return a `MYSQL_RES` pointer for success or `NULL` for failure. Later, we'll go over the differences between `mysql_store_result()` and `mysql_use_result()`, as well as the conditions under which you would choose one over the other. For now, our examples use `mysql_store_result()`, which retrieves the rows from the server immediately and stores them in memory on the client side.
2.	Call`mysql_fetch_row()` for each row of the result set. This function returns a `MYSQL_ROW` value, or `NULL` when there are no more rows. A `MYSQL_ROW` value is a pointer to an array of strings representing the values for each column in the row. What you do with the row depends on your application. You might simply print the column values, perform some statistical calculation on them, or do something else altogether.
3.	When you are done with the result set, call`mysql_free_result()` to de-allocate the memory it uses. If you neglect to do this, your application will leak memory. It's especially important to dispose of result sets properly for long-running applications; otherwise, you will notice your system slowly being taken over by processes that consume ever-increasing amounts of system resources.

The following example outlines how to process a statement that returns a result set:

MYSQL_RES *res_set;

if (mysql_query (conn, "SHOW TABLES FROM sampdb") != 0)
    print_error (conn, "mysql_query() failed");
else
{
    res_set = mysql_store_result (conn);    /* generate result set */
    if (res_set == NULL)
            print_error (conn, "mysql_store_result() failed");
    else
    {
        /* process result set, and then deallocate it */
        process_result_set (conn, res_set);
        mysql_free_result (res_set);
    }
}

The example hides the details of result set processing within another function, process_result_set(), which we have not yet defined. Generally, operations that handle a result set are based on a loop that looks something like this:

MYSQL_ROW row;

while ((row = mysql_fetch_row (res_set)) != NULL)
{
    /* do something with row contents */
}

mysql_fetch_row() returns a MYSQL_ROW value, which is a pointer to an array of values. If the return value is assigned to a variable named row, each value within the row may be accessed as row[i], where i ranges from 0 to one less than the number of columns in the row. There are several important points about the MYSQL_ROW data type to note:

MYSQL_ROW is a pointer type, so you declare a variable of that type as MYSQL_ROW row, not as MYSQL_ROW *row.
Values for all data types, even numeric types, are returned in the MYSQL_ROW array as strings. If you want to treat a value as a number, you must convert the string yourself.
The strings in a MYSQL_ROW array are null-terminated. However, if a column can contain binary data, it might contain null bytes, so you should not treat the value as a null-terminated string. Get the column length to find out how long the column value is. (The section "Using Result Set Metadata" later in the chapter discusses how to determine column lengths.)
SQL NULL values are represented by C NULL pointers in the MYSQL_ROW array. Unless you have declared a column NOT NULL, you should always check whether values for that column are NULL, or your program may crash by attempting to dereference a NULL pointer.

What you do with each row depends on the purpose of your application. For purposes of illustration, let's just print each row as a set of column values separated by tabs. To do that, it's necessary to know how many column values rows contain. That information is returned by another client library function, mysql_num_fields().

Here's the code for process_result_set():

void
process_result_set (MYSQL *conn, MYSQL_RES *res_set)
{
MYSQL_ROW       row;
unsigned int    i;

    while ((row = mysql_fetch_row (res_set)) != NULL)
    {
        for (i = 0; i < mysql_num_fields (res_set); i++)
        {
            if (i > 0)
                fputc ('\t', stdout);
            printf ("%s", row[i] != NULL ? row[i] : "NULL");
        }
        fputc ('\n', stdout);
    }
    if (mysql_errno (conn) != 0)
        print_error (conn, "mysql_fetch_row() failed");
    else
        printf ("%lu rows returned\n",
                (unsigned long) mysql_num_rows (res_set));
}

process_result_set() displays the contents of each row in tab-delimited format (displaying NULL values as the word "NULL"), and then prints a count of the number of rows retrieved. That count is available by calling mysql_num_rows(). Like mysql_affected_rows(), mysql_num_rows() returns a my_ulonglong value, so you should cast its value to unsigned long and use a %lu format to print it. But note that unlike mysql_affected_rows(), which takes a connection handler argument, mysql_num_rows() takes a result set pointer as its argument.

The code that follows the loop includes an error test as a precautionary measure. If you create the result set with mysql_store_result(), a NULL return value from mysql_fetch_row() always means "no more rows." However, if you create the result set with mysql_use_result(), a NULL return value from mysql_fetch_row() can mean "no more rows" or that an error occurred. Because process_result_set() has no idea whether its caller used mysql_store_result() or mysql_use_result() to generate the result set, the error test allows it to detect errors properly either way.

The version of process_result_set() just shown takes a rather minimalist approach to printing column valuesone that has certain shortcomings. Suppose that you execute this query:

SELECT last_name, first_name, city, state FROM president
ORDER BY last_name, first_name

You will receive the following output, which is not so easy to read:

Adams   John    Braintree   MA
Adams   John Quincy Braintree   MA
Arthur  Chester A.  Fairfield   VT
Buchanan    James   Mercersburg PA
Bush    George H.W. Milton  MA
Bush    George W.   New Haven   CT
Carter  James E.    Plains  GA
...

We could make the output prettier by providing information such as column labels and making the values line up vertically. To do that, we need the labels, and we need to know the widest value in each column. That information is available, but not as part of the column data valuesit's part of the result set's metadata (data about the data). After we generalize our statement handler a bit, we'll write a nicer display formatter in the section "Using Result Set Metadata."

Printing Binary Data

Columns containing binary values that include null bytes will not print properly using the %s printf() format specifier. printf() expects a null-terminated string and will print the column value only up to the first null byte. For binary data, it's best to use a function that accepts a column length argument so that you can print the full value. For example, you could use fwrite().

A General Purpose Statement Handler

The preceding statement-handling examples were written using knowledge of whether the statement should return any data. That was possible because the statements were hardwired into the code: We used an INSERT statement, which does not return a result set, and a SHOW TABLES statement, which does.

However, you might not always know what kind of statement a given statement represents. For example, if you execute a statement that you read from the keyboard or from a file, it might be anything. You won't know ahead of time whether to expect it to return rows, or even whether it's legal. What then? You certainly don't want to try to parse the statement to determine what kind of statement it is. That's not as simple as it might seem. For example, it's not sufficient to check whether the first word is SELECT because the statement might begin with a comment, as follows:

/* comment */ SELECT ...

Fortunately, you don't have to know the statement type in advance to be able to handle it properly. The MySQL C API makes it possible to write a general purpose statement handler that correctly processes any kind of statement, whether it returns a result set, and whether it executes successfully or fails. Before writing the code for this handler, let's outline the procedure that it implements:

1.	Issue the statement. If it fails, we're done.
2.	If the statement succeeds, call `mysql_store_result()` to retrieve the rows from the server and create a result set.
3.	If `mysql_store_result()` succeeds, the statement returned a result set. Process the rows by calling `mysql_fetch_row()` until it returns `NULL`, and then free the result set.
4.	If `mysql_store_result()` fails, it could be that the statement does not return a result set, or that it should have but an error occurred while trying to retrieve the set. You can distinguish between these outcomes by passing the connection handler to `mysql_field_count()` and checking its return value: If `mysql_field_count()` returns 0, it means the statement returned no columns, and thus no result set. (This indicates that it was a statement such as `INSERT`, `DELETE`, or `UPDATE`.) If `mysql_field_count()` returns a non-zero value, it means that an error occurred, because the statement should have returned a result set but didn't. This can happen for various reasons. For example, the result set may have been so large that memory allocation failed, or a network outage between the client and the server may have occurred while fetching rows.

The following listing shows a function that processes any statement, given a connection handler and a null-terminated statement string:

void
process_statement (MYSQL *conn, char *stmt_str)
{
MYSQL_RES *res_set;

    if (mysql_query (conn, stmt_str) != 0)  /* the statement failed */
    {
        print_error (conn, "Could not execute statement");
        return;
    }

    /* the statement succeeded; determine whether it returned data */
    res_set = mysql_store_result (conn);
    if (res_set)            /* a result set was returned */
    {
        /* process rows and then free the result set */
        process_result_set (conn, res_set);
        mysql_free_result (res_set);
    }
    else                    /* no result set was returned */
    {
        /*
         * does the lack of a result set mean that the statement didn't
         * return one, or that it should have but an error occurred?
         */
        if (mysql_field_count (conn) == 0)
        {
            /*
             * statement generated no result set (it was not a SELECT,
             * SHOW, DESCRIBE, etc.); just report rows-affected value.
             */
            printf ("%lu rows affected\n",
                        (unsigned long) mysql_affected_rows (conn));
        }
        else    /* an error occurred */
        {
            print_error (conn, "Could not retrieve result set");
        }
    }
}

Alternative Approaches to Statement Processing

The version of process_statement() just shown has these three properties:

It uses mysql_query() to issue the statement.
It uses mysql_store_query() to retrieve the result set.
When no result set is obtained, it uses mysql_field_count() to distinguish occurrence of an error from a result set not being expected.

Alternative approaches are possible for all three of these aspects of statement handling:

You can use a counted statement string and mysql_real_query() rather than a null-terminated statement string and mysql_query().
You can create the result set by calling mysql_use_result() rather than mysql_store_result().
You can call mysql_error() or mysql_errno() rather than mysql_field_count() to determine whether result set retrieval failed or whether there was simply no set to retrieve.

Any or all of these approaches can be used instead of those used in process_statement(). Here is a process_real_statement() function that is analogous to process_statement() but that uses all three alternatives:

void
process_real_statement (MYSQL *conn, char *stmt_str, unsigned int len)
{
MYSQL_RES *res_set;

    if (mysql_real_query (conn, stmt_str, len) != 0) /* the statement failed */
    {
        print_error (conn, "Could not execute statement");
        return;
    }

    /* the statement succeeded; determine whether it returned data */
    res_set = mysql_use_result (conn);
    if (res_set)            /* a result set was returned */
    {
        /* process rows and then free the result set */
        process_result_set (conn, res_set);
        mysql_free_result (res_set);
    }
    else                    /* no result set was returned */
    {
        /*
         * does the lack of a result set mean that the statement didn't
         * return one, or that it should have but an error occurred?
         */
        if (mysql_errno (conn) == 0)
        {
            /*
             * statement generated no result set (it was not a SELECT,
             * SHOW, DESCRIBE, etc.); just report rows-affected value.
             */
            printf ("%lu rows affected\n",
                        (unsigned long) mysql_affected_rows (conn));
        }
        else    /* an error occurred */
        {
            print_error (conn, "Could not retrieve result set");
        }
    }
}

`mysql_store_result()` and `mysql_use_result()` Compared

The mysql_store_result() and mysql_use_result() functions are similar in that both take a connection handler argument and return a result set. However, the differences between them actually are quite extensive. The primary difference between the two functions lies in the way rows of the result set are retrieved from the server. mysql_store_result() retrieves all the rows immediately when you call it. mysql_use_result() initiates the retrieval but doesn't actually get any of the rows. These differing approaches to row retrieval give rise to all other differences between the two functions. This section compares them so that you'll know how to choose the one that's most appropriate for a given application.

When mysql_store_result() retrieves a result set from the server, it fetches the rows, allocates memory for them, and stores them in the client. Subsequent calls to mysql_fetch_row() never return an error because they simply pull a row out of a data structure that already holds the result set. Consequently, a NULL return from mysql_fetch_row() always means you've reached the end of the result set.

By contrast, mysql_use_result() doesn't retrieve any rows itself. Instead, it simply initiates a row-by-row retrieval, which you must complete yourself by calling mysql_fetch_row() for each row. In this case, although a NULL return from mysql_fetch_row() normally still means the end of the result set has been reached, it may mean instead that an error occurred while communicating with the server. You can distinguish the two outcomes by calling mysql_errno() or mysql_error().

mysql_store_result() has higher memory and processing requirements than does mysql_use_result() because the entire result set is maintained in the client. The overhead for memory allocation and data structure setup is greater, and a client that retrieves large result sets runs the risk of running out of memory. If you're going to retrieve a lot of rows in a single result set, you might want to use mysql_use_result() instead.

mysql_use_result() has lower memory requirements because only enough space to handle a single row at a time need be allocated. This can be faster because you're not setting up as complex a data structure for the result set. On the other hand, mysql_use_result() places a greater burden on the server, which must hold rows of the result set until the client sees fit to retrieve all of them. This makes mysql_use_result() a poor choice for certain types of clients:

Interactive clients that advance from row to row at the request of the user. (You don't want the server waiting to send the next row just because the user decides to take a coffee break.)
Clients that do a lot of processing between row retrievals.

In both of these types of situations, the client fails to retrieve all rows in the result set quickly. This ties up the server and can have a negative impact on other clients, particularly if you are using a storage engine like MyISAM that uses table locks: Tables from which you retrieve data are read-locked for the duration of the query. Other clients that are trying to update those tables will be blocked.

Offsetting the additional memory requirements incurred by mysql_store_result() are certain benefits of having access to the entire result set at once. All rows of the set are available, so you have random access into them: The mysql_data_seek(), mysql_row_seek(), and mysql_row_tell() functions allow you to access rows in any order you want. With mysql_use_result(), you can access rows only in the order in which they are retrieved by mysql_fetch_row(). If you intend to process rows in any order other than sequentially as they are returned from the server, you must use mysql_store_result() instead. For example, if you have an application that allows the user to browse back and forth among the rows selected by a query, you'd be best served by using mysql_store_result().

With mysql_store_result(), you have access to certain types of column information that are unavailable when you use mysql_use_result(). The number of rows in the result set is obtained by calling mysql_num_rows(). The maximum widths of the values in each column are stored in the max_width member of the MYSQL_FIELD column information structures. With mysql_use_result(), mysql_num_rows() doesn't return the correct value until you've fetched all the rows; similarly, max_width is unavailable because it can be calculated only after every row's data have been seen.

Because mysql_use_result() does less work than mysql_store_result(), it imposes a requirement that mysql_store_result() does not: The client must call mysql_fetch_row() for every row in the result set. If you fail to do this before issuing another statement, any remaining records in the current result set become part of the next statement's result set and an "out of sync" error occurs. (You can avoid this by calling mysql_free_result() before issuing the second statement. mysql_free_result() will fetch and discard any pending rows for you.) One implication of this processing model is that with mysql_use_result() you can work only with a single result set at a time.

Sync errors do not happen with mysql_store_result() because when that function returns, there are no rows yet to be fetched from the server. In fact, with mysql_store_result(), you need not call mysql_fetch_row() explicitly at all. This can sometimes be useful if all that you're interested in is whether you got a non-empty result, rather than what the result contains. For example, to find out whether a table mytbl exists, you can execute this statement:

SHOW TABLES LIKE 'mytbl'

If, after calling mysql_store_result(), the value of mysql_num_rows() is non-zero, the table exists. mysql_fetch_row() need not be called.

Result sets generated with mysql_store_result() should be freed with mysql_free_result() at some point, but this need not necessarily be done before issuing another statement. This means that you can generate multiple result sets and work with them simultaneously, in contrast to the "one result set at a time" constraint imposed when you're working with mysql_use_result().

If you want to provide maximum flexibility, give users the option of selecting either result set processing method. mysql and mysqldump are two programs that do this. They use mysql_store_result() by default but switch to mysql_use_result() if the --quick option is given.

Using Result Set Metadata

Result sets contain not only the column values for data rows but also information about the data. This information is called the result set "metadata," which includes:

The number of rows and columns in the result set, available by calling mysql_num_rows() and mysql_num_fields().
The length of each column value in the current row, available by calling mysql_fetch_lengths().
Information about each column, such as the column name and type, the maximum width of each column's values, and the table the column comes from. This information is stored in MYSQL_FIELD structures, which typically are obtained by calling mysql_fetch_field(). Appendix G, "C API Reference," describes the MYSQL_FIELD structure in detail and lists all functions that provide access to column information.

Metadata availability is partially dependent on your result set processing method. As indicated in the previous section, if you want to use the row count or maximum column length values, you must create the result set with mysql_store_result(), not with mysql_use_result().

Result set metadata is helpful for making decisions about how to process result set data:

Column names and widths are useful for producing nicely formatted output that has column titles and that lines up vertically.
You use the column count to determine how many times to iterate through a loop that processes successive column values for data rows.
You can use the row or column counts if you need to allocate data structures that depend on knowing the dimensions of the result set.
You can determine the data type of a column. This allows you to tell whether a column represents a number, whether it might contain binary data, and so forth.

Earlier, in the section "Handling Statements That Return a Result Set," we wrote a version of process_result_set() that printed columns from result set rows in tab- delimited format. That's good for certain purposes (such as when you want to import the data into a spreadsheet), but it's not a nice display format for visual inspection or for printouts. Recall that our earlier version of process_result_set() produced this output:

Adams   John    Braintree   MA
Adams   John Quincy Braintree   MA
Arthur  Chester A.  Fairfield   VT
Buchanan    James   Mercersburg PA
Bush    George H.W. Milton  MA
Bush    George W.   New Haven   CT
Carter  James E.    Plains  GA
...

Let's write a different version of process_result_set() that produces tabular output instead by titling and "boxing" each column. This version will display those same results in a format that's easier to interpret:

+------------+---------------+---------------------+-------+
| last_name  | first_name    | city                | state |
+------------+---------------+---------------------+-------+
| Adams      | John          | Braintree           | MA    |
| Adams      | John Quincy   | Braintree           | MA    |
| Arthur     | Chester A.    | Fairfield           | VT    |
| Buchanan   | James         | Mercersburg         | PA    |
| Bush       | George H.W.   | Milton              | MA    |
| Bush       | George W.     | New Haven           | CT    |
| Carter     | James E.      | Plains              | GA    |
...
+------------+---------------+---------------------+-------+

The general outline of the display algorithm is as follows:

1.	Determine the display width of each column.
2.	Print a row of boxed column labels (delimited by vertical bars and preceded and followed by rows of dashes).
3.	Print the values in each row of the result set, with each column boxed (delimited by vertical bars) and lined up vertically. In addition, print numbers right justified and print the word "NULL" for `NULL` values.
4.	At the end, print a count of the number of rows retrieved.

This exercise provides a good demonstration showing how to use result set metadata because it requires knowledge of quite a number of things about the result set other than just the values of the data contained in its rows.

You may be thinking to yourself, "Hmm, that description sounds suspiciously similar to the way mysql displays its output." Yes, it does, and you're welcome to compare the source for mysql to the code we end up with for process_result_set(). They're not the same, and you might find it instructive to compare the two approaches to the same problem.

First, it's necessary to determine the display width of each column. The following listing shows how to do this. Observe that the calculations are based entirely on the result set metadata, and make no reference whatsoever to the row values:

MYSQL_FIELD     *field;
unsigned long   col_len;
unsigned int    i;

/* determine column display widths -- requires result set to be */
/* generated with mysql_store_result(), not mysql_use_result() */
mysql_field_seek (res_set, 0);
for (i = 0; i < mysql_num_fields (res_set); i++)
{
    field = mysql_fetch_field (res_set);
    col_len = strlen (field->name);
    if (col_len < field->max_length)
        col_len = field->max_length;
    if (col_len < 4 && !IS_NOT_NULL (field->flags))
        col_len = 4;    /* 4 = length of the word "NULL" */
    field->max_length = col_len;    /* reset column info */
}

This code calculates column widths by iterating through the MYSQL_FIELD structures for the columns in the result set. We position to the first structure by calling mysql_field_seek(). Subsequent calls to mysql_fetch_field() return pointers to the structures for successive columns. The width of a column for display purposes is the maximum of three values, each of which depends on metadata in the column information structure:

The length of field->name, the column title.
field->max_length, the length of the longest data value in the column.
The length of the string "NULL" if the column can contain NULL values. field->flags indicates whether the column can contain NULL.

Notice that after the display width for a column is known, we assign that value to max_length, which is a member of a structure that we obtain from the client library. Is that allowable, or should the contents of the MYSQL_FIELD structure be considered read-only? Normally, I would say "read-only," but some of the client programs in the MySQL distribution change the max_length value in a similar way, so I assume that it's okay. (If you prefer an alternative approach that doesn't modify max_length, allocate an array of unsigned long values and store the calculated widths in that array.)

The display width calculations involve one caveat. Recall that max_length has no meaning when you create a result set using mysql_use_result(). Because we need max_length to determine the display width of the column values, proper operation of the algorithm requires that the result set be generated using mysql_store_result(). In programs that use mysql_use_result() rather than mysql_store_result(), one possible workaround is to use the length member of the MYSQL_FIELD structure, which tells you the maximum length that column values can be.

When we know the column widths, we're ready to print. Titles are easy to handle. For a given column, we simply use the column information structure pointed to by field and print the name member, using the width calculated earlier:

printf (" %-*s |", (int) field->max_length, field->name);

For the data, we loop through the rows in the result set, printing column values for the current row during each iteration. Printing column values from the row is a bit tricky because a value might be NULL, or it might represent a number (in which case we print it right justified). Column values are printed as follows, where row[i] holds the data value and field points to the column information:

if (row[i] == NULL)             /* print the word "NULL" */
    printf (" %-*s |", (int) field->max_length, "NULL");
else if (IS_NUM (field->type))  /* print value right-justified */
    printf (" %*s |", (int) field->max_length, row[i]);
else                            /* print value left-justified */
    printf (" %-*s |", (int) field->max_length, row[i]);

The value of the IS_NUM() macro is true if the column data type indicated by field->type is one of the numeric types, such as INT, FLOAT, or DECIMAL.

The final code to display the result set is as follows. Because we're printing lines of dashes multiple times, it's easier to write a print_dashes() function to do so rather than to repeat the dash-generation code several places:

void
print_dashes (MYSQL_RES *res_set)
{
MYSQL_FIELD     *field;
unsigned int    i, j;

    mysql_field_seek (res_set, 0);
    fputc ('+', stdout);
    for (i = 0; i < mysql_num_fields (res_set); i++)
    {
        field = mysql_fetch_field (res_set);
        for (j = 0; j < field->max_length + 2; j++)
            fputc ('-', stdout);
        fputc ('+', stdout);
    }
    fputc ('\n', stdout);
}

void
process_result_set (MYSQL *conn, MYSQL_RES *res_set)
{
MYSQL_ROW       row;
MYSQL_FIELD     *field;
unsigned long   col_len;
unsigned int    i;

    /* determine column display widths -- requires result set to be */
    /* generated with mysql_store_result(), not mysql_use_result() */
    mysql_field_seek (res_set, 0);
    for (i = 0; i < mysql_num_fields (res_set); i++)
    {
        field = mysql_fetch_field (res_set);
        col_len = strlen (field->name);
        if (col_len < field->max_length)
            col_len = field->max_length;
        if (col_len < 4 && !IS_NOT_NULL (field->flags))
            col_len = 4;    /* 4 = length of the word "NULL" */
        field->max_length = col_len;    /* reset column info */
    }

    print_dashes (res_set);
    fputc ('|', stdout);
    mysql_field_seek (res_set, 0);
    for (i = 0; i < mysql_num_fields (res_set); i++)
    {
        field = mysql_fetch_field (res_set);
        printf (" %-*s |", (int) field->max_length, field->name);
    }
    fputc ('\n', stdout);
    print_dashes (res_set);

    while ((row = mysql_fetch_row (res_set)) != NULL)
    {
        mysql_field_seek (res_set, 0);
        fputc ('|', stdout);
        for (i = 0; i < mysql_num_fields (res_set); i++)
        {
            field = mysql_fetch_field (res_set);
            if (row[i] == NULL)             /* print the word "NULL" */
                printf (" %-*s |", (int) field->max_length, "NULL");
            else if (IS_NUM (field->type))  /* print value right-justified */
                printf (" %*s |", (int) field->max_length, row[i]);
            else                            /* print value left-justified */
                printf (" %-*s |", (int) field->max_length, row[i]);
        }
        fputc ('\n', stdout);
    }
    print_dashes (res_set);
    printf ("%lu rows returned\n",
            (unsigned long) mysql_num_rows (res_set));
}

The MySQL client library provides several ways of accessing the column information structures. For example, the code in the preceding example accesses these structures several times using loops of the following general form:

mysql_field_seek (res_set, 0);
for (i = 0; i < mysql_num_fields (res_set); i++)
{
    field = mysql_fetch_field (res_set);
    ...
}

However, the mysql_field_seek() / mysql_fetch_field() combination is only one way of getting MYSQL_FIELD structures. See the entries for the mysql_fetch_fields() and mysql_fetch_field_direct() functions in Appendix G, for other ways of accessing column information structures.

Use the metadata Program to Display Result Set Metadata

The sampdb distribution contains the source for a program named metadata that you can compile and run to see what metadata various kinds of statements produce. It prompts for and executes SQL statements, but displays result set metadata rather than result set contents.

Encoding Special Characters and Binary Data

If a program executes statements entered by the user, you can assume either that those statements are legal or that the program can simply report an error to the user. For example, a user who wants to include a quote character within a quoted string must either double the quote or precede it by a backslash:

'O''Malley'
'O\'Malley'

Applications that construct their own statements must take the same precautions. This section describes how to handle quoting issues in string values and how to work with binary data.

Working with Strings That Contain Special Characters

If inserted literally into a statement, data values containing quotes, nulls, or backslashes can cause problems when you try to execute the statement. The following discussion describes the nature of the difficulty and how to solve it.

Suppose that you want to construct a SELECT statement based on the contents of the null-terminated string pointed to by the name_val variable:

char stmt_buf[1024];

sprintf (stmt_buf, "SELECT * FROM mytbl WHERE name='%s'", name_val);

If the value of name_val is something like O'Malley, Brian, the resulting statement is illegal because a quote appears inside a quoted string:

SELECT * FROM mytbl WHERE name='O'Malley, Brian'

You need to treat the quote specially so that the server doesn't interpret it as the end of the name. The standard SQL convention for doing this is to double the quote within the string. MySQL understands that convention, and also allows the quote to be preceded by a backslash, so you can write the statement using either of the following formats:

SELECT * FROM mytbl WHERE name='O''Malley, Brian'
SELECT * FROM mytbl WHERE name='O\'Malley, Brian'

To deal with this problem, use mysql_real_escape_string(), which encodes special characters to make them usable in quoted strings. Characters that mysql_real_escape_string() considers special are the null character, single quote, double quote, backslash, newline, carriage return, and Ctrl-Z. (The last one is special on Windows, where it often signifies end-of-file.)

When should you use mysql_real_escape_string()? The safest answer is "always." However, if you're sure of the format of your data and know that it's okayperhaps because you have performed some prior validation check on ityou need not encode it. For example, if you are working with strings that you know represent legal phone numbers consisting entirely of digits and dashes, you don't need to call mysql_real_escape_string(). Otherwise, you probably should.

mysql_real_escape_string() encodes problematic characters by turning them into two-character sequences that begin with a backslash. For example, a null byte becomes '\0', where the '0' is a printable ASCII zero, not a null. Backslash, single quote, and double quote become '\\', '\'', and `\"'.

To use mysql_real_escape_string(), invoke it like this:

to_len = mysql_real_escape_string (conn, to_str, from_str, from_len);

mysql_real_escape_string() encodes from_str and writes the result into to_str. It also adds a terminating null, which is convenient because you can use the resulting string with functions such as strcpy(), strlen(), or printf().

from_str points to a char buffer containing the string to be encoded. This string may contain anything, including binary data. to_str points to an existing char buffer where you want the encoded string to be written; do not pass an uninitialized or NULL pointer, expecting mysql_real_escape_string() to allocate space for you. The length of the buffer pointed to by to_str must be at least (from_len*2)+1 bytes long. (It's possible that every character in from_str will need encoding with two characters; the extra byte is for the terminating null.)

from_len and to_len are unsigned long values. from_len indicates the length of the data in from_str; it's necessary to provide the length because from_str may contain null bytes and cannot be treated as a null-terminated string. to_len, the return value from mysql_real_escape_string(), is the actual length of the resulting encoded string, not counting the terminating null.

When mysql_real_escape_string() returns, the encoded result in to_str can be treated as a null-terminated string because any nulls in from_str are encoded as the printable '\0' sequence.

To rewrite the SELECT-constructing code so that it works even for name values that contain quotes, we could do something like this:

char stmt_buf[1024], *p;

p = strcpy (stmt_buf, "SELECT * FROM mytbl WHERE name='");
p += strlen (p);
p += mysql_real_escape_string (conn, p, name_val, strlen (name_val));
*p++ = '\'';
*p = '\0';

Yes, that's ugly. To simplify the code a bit, at the cost of using a second buffer, do this instead:

char stmt_buf[1024], buf[1024];

(void) mysql_real_escape_string (conn, buf, name_val, strlen (name_val));
sprintf (stmt_buf, "SELECT * FROM mytbl WHERE name='%s'", buf);

It's important to make sure that the buffers you pass to mysql_real_escape_string() really exist. Consider the following example, which violates that principle:

char *from_str = "some string";
char *to_str;
unsigned long len;

len = mysql_real_escape_string (conn, to_str, from_str, strlen (from_str));

What's the problem? to_str must point to an existing buffer, and it doesn'tit's not initialized and may point to some random location. Don't pass an uninitialized pointer as the to_str argument to mysql_real_escape_string() unless you want it to stomp merrily all over some random piece of memory.

Working with Binary Data

Another problematic situation involves the use of arbitrary binary data in a statement. This happens, for example, in applications that store images in a database. Because a binary value may contain any character (including null bytes, quotes, or backslashes), it cannot be considered safe to put into a statement as is.

mysql_real_escape_string() is essential for working with binary data. This section shows how to do so, using image data read from a file. The discussion applies to any other form of binary data as well.

Suppose that you want to read images from files and store them in a table named picture, along with a unique identifier. The MEDIUMBLOB type is a good choice for binary values less than 16MB in size, so you could use a table specification like this:

CREATE TABLE picture
(
    pict_id     INT NOT NULL PRIMARY KEY,
    pict_data   MEDIUMBLOB
);

To actually get an image from a file into the picture table, the following function, load_image(), does the job, given an identifier number and a pointer to an open file containing the image data:

int
load_image (MYSQL *conn, int id, FILE *f)
{
char            stmt_buf[1024*1000], buf[1024*10], *p;
unsigned long   from_len;
int             status;

    /* begin creating an INSERT statement, adding the id value */
    sprintf (stmt_buf,
            "INSERT INTO picture (pict_id,pict_data) VALUES (%d,'",
            id);
    p = stmt_buf + strlen (stmt_buf);
    /* read data from file in chunks, encode each */
    /* chunk, and add to end of statement */
    while ((from_len = fread (buf, 1, sizeof (buf), f)) > 0)
    {
        /* don't overrun end of statement buffer! */
        if (p + (2*from_len) + 3 > stmt_buf + sizeof (stmt_buf))
        {
            print_error (NULL, "image is too big");
            return (1);
        }
        p += mysql_real_escape_string (conn, p, buf, from_len);
    }
    *p++ = '\'';
    *p++ = ')';
    status = mysql_real_query (conn, stmt_buf, (unsigned long) (p - stmt_buf));
    return (status);
}

load_image() doesn't allocate a very large statement buffer (1MB), so it works only for relatively small images. In a real-world application, you might allocate the buffer dynamically based on the size of the image file.

Getting an image value (or any binary value) back out of a database isn't nearly as much of a problem as putting it in to begin with. The data value is available in raw form in the MYSQL_ROW variable, and the length is available by calling mysql_fetch_lengths(). Just be sure to treat the value as a counted string, not as a null-terminated string.