Using Different Join Types

So far, you have used only simple joins known as inner joins or equijoins. You'll now take a look at three additional join types: the self join, the natural join, and the outer join.

Self Joins

As mentioned earlier, one of the primary reasons to use table aliases is to be able to refer to the same table more than once in a single SELECT statement. An example will demonstrate this.

Suppose that a problem was found with a product (item id DTNTR), and you therefore wanted to know all of the products made by the same vendor so as to determine whether the problem applied to them, too. This query requires that you first find out which vendor creates item DTNTR, and then find which other products are made by the same vendor. The following is one way to approach this problem:

• Input

SELECT prod_id, prod_name
FROM products
WHERE vend_id = (SELECT vend_id
                 FROM products
                 WHERE prod_id = 'DTNTR');

• Output

+---------+----------------+
| prod_id | prod_name      |
+---------+----------------+
| DTNTR   | Detonator      |
| FB      | Bird seed      |
| FC      | Carrots        |
| SAFE    | Safe           |
| SLING   | Sling          |
| TNT1    | TNT (1 stick)  |
| TNT2    | TNT (5 sticks) |
+---------+----------------+

• Analysis

This first solution uses subqueries. The inner SELECT statement does a simple retrieval to return the vend_id of the vendor that makes item DTNTR. That ID is the one used in the WHERE clause of the outer query so all items produced by that vendor are retrieved. (You learned all about subqueries in Chapter 14, "Working with Subqueries." Refer to that chapter for more information.)
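If you want to watch the two steps of the subquery separately, the following is a minimal sketch you could run in a scratch database. The database name join_sketch, the cut-down products table, the vend_id values, and the OTHER1 row are all assumptions made up for illustration; they are not the book's sample data.

```sql
-- Hypothetical scratch database, so the real sample tables stay untouched.
CREATE DATABASE IF NOT EXISTS join_sketch;
USE join_sketch;

-- Cut-down products table (assumed column types).
DROP TABLE IF EXISTS products;
CREATE TABLE products (
  prod_id   CHAR(10)     NOT NULL PRIMARY KEY,
  vend_id   INT          NOT NULL,
  prod_name VARCHAR(255) NOT NULL
);

-- Invented rows: three items from one vendor, one item from another.
INSERT INTO products (prod_id, vend_id, prod_name) VALUES
  ('DTNTR',  1003, 'Detonator'),
  ('SAFE',   1003, 'Safe'),
  ('TNT1',   1003, 'TNT (1 stick)'),
  ('OTHER1', 1001, 'Unrelated item');

-- Step 1: the inner query on its own returns the vendor of item DTNTR.
SELECT vend_id
FROM products
WHERE prod_id = 'DTNTR';

-- Step 2: the full statement passes that value to the outer WHERE clause.
SELECT prod_id, prod_name
FROM products
WHERE vend_id = (SELECT vend_id
                 FROM products
                 WHERE prod_id = 'DTNTR')
ORDER BY prod_id;
```

With these invented rows, the second statement lists DTNTR, SAFE, and TNT1, and leaves out OTHER1 because it belongs to a different vendor.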
Now look at the same query using a join:

• Input

SELECT p1.prod_id, p1.prod_name
FROM products AS p1, products AS p2
WHERE p1.vend_id = p2.vend_id
  AND p2.prod_id = 'DTNTR';

• Output

+---------+----------------+
| prod_id | prod_name      |
+---------+----------------+
| DTNTR   | Detonator      |
| FB      | Bird seed      |
| FC      | Carrots        |
| SAFE    | Safe           |
| SLING   | Sling          |
| TNT1    | TNT (1 stick)  |
| TNT2    | TNT (5 sticks) |
+---------+----------------+

• Analysis

The two tables needed in this query are actually the same table, and so the products table appears in the FROM clause twice. Although this is perfectly legal, any references to table products would be ambiguous because MySQL could not know to which instance of the products table you are referring.

To resolve this problem, table aliases are used. The first occurrence of products has an alias of p1, and the second has an alias of p2. Now those aliases can be used as table names. The SELECT statement, for example, uses the p1 prefix to explicitly state the full name of the desired columns. If it did not, MySQL would return an error because there are two columns named prod_id and two named prod_name, and it cannot know which one you want (even though, in truth, they are one and the same). The WHERE clause first joins the tables (by matching vend_id in p1 to vend_id in p2), and then it filters the data by prod_id in the second table to return only the desired data.

Tip: Self Joins Instead of Subqueries
Self joins are often used to replace statements using subqueries that retrieve data from the same table as the outer statement. Although the end result is the same, sometimes these joins execute far more quickly than the equivalent subqueries do. It is usually worth experimenting with both to determine which performs better.

Natural Joins

Whenever tables are joined, at least one column appears in more than one table (the columns being joined).
Standard joins (the inner joins you learned about in the previous chapter) return all data, even multiple occurrences of the same column. A natural join simply eliminates those multiple occurrences so only one of each column is returned.

How does it do this? The answer is it doesn't; you do it. A natural join is a join in which you select only columns that are unique. This is typically done using a wildcard (SELECT *) for one table and explicit subsets of the columns for all other tables. The following is an example:

• Input

SELECT c.*, o.order_num, o.order_date,
       oi.prod_id, oi.quantity, oi.item_price
FROM customers AS c, orders AS o, orderitems AS oi
WHERE c.cust_id = o.cust_id
  AND oi.order_num = o.order_num
  AND prod_id = 'FB';

• Analysis

In this example, a wildcard is used for the first table only. All other columns are explicitly listed so no duplicate columns are retrieved.

The truth is, every inner join you have created thus far is actually a natural join, and you will probably never even need an inner join that is not a natural join.

Outer Joins

Most joins relate rows in one table with rows in another. But occasionally, you want to include rows that have no related rows. For example, you might use joins to accomplish the following tasks:
In each of these examples, the join includes table rows that have no associated rows in the related table. This type of join is called an outer join.

The following SELECT statement is a simple inner join. It retrieves a list of all customers and their orders:

• Input

SELECT customers.cust_id, orders.order_num
FROM customers INNER JOIN orders
  ON customers.cust_id = orders.cust_id;

Outer join syntax is similar. To retrieve a list of all customers, including those who have placed no orders, you can do the following:

• Input

SELECT customers.cust_id, orders.order_num
FROM customers LEFT OUTER JOIN orders
  ON customers.cust_id = orders.cust_id;

• Output

+---------+-----------+
| cust_id | order_num |
+---------+-----------+
|   10001 |     20005 |
|   10001 |     20009 |
|   10002 |      NULL |
|   10003 |     20006 |
|   10004 |     20007 |
|   10005 |     20008 |
+---------+-----------+

• Analysis

Like the inner join seen in the previous chapter, this SELECT statement uses the keywords OUTER JOIN to specify the join type (instead of specifying it in the WHERE clause). But unlike inner joins, which relate rows in both tables, outer joins also include rows with no related rows. When using OUTER JOIN syntax you must use the RIGHT or LEFT keywords to specify the table from which to include all rows (RIGHT for the one on the right of OUTER JOIN, and LEFT for the one on the left). The previous example uses LEFT OUTER JOIN to select all the rows from the table on the left in the FROM clause (the customers table). To select all the rows from the table on the right, you use a RIGHT OUTER JOIN as seen in this example:

• Input

SELECT customers.cust_id, orders.order_num
FROM customers RIGHT OUTER JOIN orders
  ON orders.cust_id = customers.cust_id;

Note: No *=
MySQL does not support the use of the simplified *= and =* syntax popularized by other DBMSs.

Tip: Outer Join Types
There are two basic forms of outer joins: the left outer join and the right outer join.
The only difference between them is the order of the tables they are relating. In other words, a left outer join can be turned into a right outer join simply by reversing the order of the tables in the FROM or WHERE clause. As such, the two types of outer join can be used interchangeably, and the decision about which one is used is based purely on convenience.
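To see that interchangeability for yourself, here is a minimal sketch that runs the same query both ways. It assumes a scratch database (the name join_sketch is hypothetical) and cut-down customers and orders tables holding only the ID columns; the rows mirror the cust_id and order_num values shown in the output earlier, and the ORDER BY clauses are added only so the two results are easy to compare.

```sql
-- Hypothetical scratch database, so the real sample tables stay untouched.
CREATE DATABASE IF NOT EXISTS join_sketch;
USE join_sketch;

-- Cut-down tables containing only the columns this comparison needs.
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (
  cust_id INT NOT NULL PRIMARY KEY
);
CREATE TABLE orders (
  order_num INT NOT NULL PRIMARY KEY,
  cust_id   INT NOT NULL
);

-- Customer 10002 deliberately has no orders.
INSERT INTO customers (cust_id) VALUES
  (10001), (10002), (10003), (10004), (10005);
INSERT INTO orders (order_num, cust_id) VALUES
  (20005, 10001), (20006, 10003), (20007, 10004),
  (20008, 10005), (20009, 10001);

-- Left outer join: keep every row from customers (the table on the left).
SELECT customers.cust_id, orders.order_num
FROM customers LEFT OUTER JOIN orders
  ON customers.cust_id = orders.cust_id
ORDER BY customers.cust_id, orders.order_num;

-- Same result as a right outer join, with the table order reversed
-- so that customers is now the table on the right.
SELECT customers.cust_id, orders.order_num
FROM orders RIGHT OUTER JOIN customers
  ON orders.cust_id = customers.cust_id
ORDER BY customers.cust_id, orders.order_num;
```

Both statements should return the same six rows, including customer 10002 with NULL in the order_num column.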