Using Aggregate FunctionsIt is often necessary to summarize data without actually retrieving it all, and MySQL provides special functions for this purpose. Using these functions, MySQL queries are often used to retrieve data for analysis and reporting purposes. Examples of this type of retrieval are
In each of these examples, you want a summary of the data in a table, not the actual data itself. Therefore, returning the actual table data would be a waste of time and processing resources (not to mention bandwidth). To repeat, all you really want is the summary information. To facilitate this type of retrieval, MySQL features a set of aggregate functions, some of which are listed in Table 12.1. These functions enable you to perform all the types of retrieval just enumerated.
New Term Aggregate Functions Functions that operate on a set of rows to calculate and return a single value. The use of each of these functions is explained in the following sections. Note Standard Deviation A series of standard deviation aggregate functions are also supported by MySQL, but they are not covered in this book. The AVG() FunctionAVG() is used to return the average value of a specific column by counting both the number of rows in the table and the sum of their values. AVG() can be used to return the average value of all columns or of specific columns or rows. This first example uses AVG() to return the average price of all the products in the products table: • Input SELECT AVG(prod_price) AS avg_price FROM products; • Output +-----------+ | avg_price | +-----------+ | 16.133571 | +-----------+ • Analysis The previous SELECT statement returns a single value, avg_price, that contains the average price of all products in the products table. avg_price is an alias as explained in Chapter 10, "Creating Calculated Fields." AVG() can also be used to determine the average value of specific columns or rows. The following example returns the average price of products offered by a specific vendor: • Input SELECT AVG(prod_price) AS avg_price FROM products WHERE vend_id = 1003; • Output +-----------+ | avg_price | +-----------+ | 13.212857 | +-----------+ • Analysis This SELECT statement differs from the previous one only in that this one contains a WHERE clause. The WHERE clause filters only products with a vend_id of 1003, and, therefore, the value returned in avg_price is the average of just that vendor's products. Caution Individual Columns Only AVG() may only be used to determine the average of a specific numeric column, and that column name must be specified as the function parameter. To obtain the average value of multiple columns, multiple AVG() functions must be used. The COUNT() FunctionCOUNT() does just that: It counts. Using COUNT(), you can determine the number of rows in a table or the number of rows that match a specific criterion. COUNT() can be used two ways:
This first example returns the total number of customers in the customers table: • Input SELECT COUNT(*) AS num_cust FROM customers; • Output +----------+ | num_cust | +----------+ | 5 | +----------+ • Analysis In this example, COUNT(*) is used to count all rows, regardless of values. The count is returned in num_cust. The following example counts just the customers with an email address: • Input SELECT COUNT(cust_email) AS num_cust FROM customers; • Output +----------+ | num_cust | +----------+ | 3 | +----------+ • Analysis This SELECT statement uses COUNT(cust_email) to count only rows with a value in the cust_email column. In this example, cust_email is 3 (meaning that only three of the five customers have email addresses). Note NULL Values Column rows with NULL values in them are ignored by the COUNT() function if a column name is specified, but not if the asterisk (*) is used. The MAX() FunctionMAX() returns the highest value in a specified column. MAX() requires that the column name be specified, as seen here: • Input SELECT MAX(prod_price) AS max_price FROM products; • Output +-----------+ | max_price | +-----------+ | 55.00 | +-----------+ • Analysis Here MAX() returns the price of the most expensive item in products table. Tip Using MAX() with Non-Numeric Data Although MAX() is usually used to find the highest numeric or date values, MySQL allows it to be used to return the highest value in any column including textual columns. When used with textual data, MAX() returns the row that would be the last if the data were sorted by that column. The MIN() FunctionMIN() does the exact opposite of MAX(); it returns the lowest value in a specified column. Like MAX(), MIN() requires that the column name be specified, as seen here: • Input SELECT MIN(prod_price) AS min_price FROM products; • Output +-----------+ | min_price | +-----------+ | 2.50 | +-----------+ • Analysis Here MIN() returns the price of the least expensive item in products table. Tip Using MIN() with Non-Numeric Data As with the MAX() function, MySQL allows MIN() to be used to return the lowest value in any columns including textual columns. When used with textual data, MIN() returns the row that would be first if the data were sorted by that column. The SUM() FunctionSUM() is used to return the sum (total) of the values in a specific column. Here is an example to demonstrate this. The orderitems table contains the actual items in an order, and each item has an associated quantity. The total number of items ordered (the sum of all the quantity values) can be retrieved as follows: • Input SELECT SUM(quantity) AS items_ordered FROM orderitems WHERE order_num = 20005; • Output +---------------+ | items_ordered | +---------------+ | 19 | +---------------+ • Analysis The function SUM(quantity) returns the sum of all the item quantities in an order, and the WHERE clause ensures that just the right order items are included. SUM() can also be used to total calculated values. In this next example the total order amount is retrieved by totaling item_price*quantity for each item: • Input SELECT SUM(item_price*quantity) AS total_price FROM orderitems WHERE order_num = 20005; • Output +-------------+ | total_price | +-------------+ | 149.87 | +-------------+ • Analysis The function SUM(item_price*quantity) returns the sum of all the expanded prices in an order, and again the WHERE clause ensures that just the correct order items are included. Tip Performing Calculations on Multiple Columns All the aggregate functions can be used to perform calculations on multiple columns using the standard mathematical operators, as shown in the example. |