< Day Day Up > |
8.1 Writing Inner JoinsA join that identifies combinations of rows from two tables is called an inner join. Inner joins may be written using two different syntaxes. One syntax lists the tables to be joined separated by a comma. The other uses the INNER JOIN keywords. 8.1.1 Writing Inner Joins with the Comma OperatorA simple question you might ask about the information in the world database is, "What languages are spoken in each country?" That question has a trivial answer if you don't mind listing countries by code. Just select the information from the CountryLanguage table. Two of its columns list three-letter country codes and language names:
mysql> SELECT CountryCode, Language FROM CountryLanguage;
+-------------+------------------+
| CountryCode | Language |
+-------------+------------------+
| ABW | Dutch |
| ABW | English |
| ABW | Papiamento |
| ABW | Spanish |
| AFG | Balochi |
| AFG | Dari |
| AFG | Pashto |
| AFG | Turkmenian |
| AFG | Uzbek |
| AGO | Ambo |
| AGO | Chokwe |
| AGO | Kongo |
| AGO | Luchazi |
| AGO | Luimbe-nganguela |
| AGO | Luvale |
...
That result would be more meaningful and easier to understand if it displayed countries identified by full name. However, that cannot be done using just the CountryLanguage table, which contains country codes and not names. Country names are available in the world database, but they're stored in the Country table that contains both the three-letter codes and the names:
mysql> SELECT Code, Name FROM Country;
+------+----------------------+
| Code | Name |
+------+----------------------+
| AFG | Afghanistan |
| NLD | Netherlands |
| ANT | Netherlands Antilles |
| ALB | Albania |
| DZA | Algeria |
| ASM | American Samoa |
| AND | Andorra |
| AGO | Angola |
| AIA | Anguilla |
| ATG | Antigua and Barbuda |
| ARE | United Arab Emirates |
| ARG | Argentina |
| ARM | Armenia |
| ABW | Aruba |
| AUS | Australia |
...
A query to display languages and full country names can be written as a join that matches the country codes in the CountryLanguage table with those in the Country table. To do that, modify the CountryLanguage query in the following ways:
The query that results from these changes is as follows: mysql> SELECT Name, Language FROM CountryLanguage, Country -> WHERE CountryCode = Code; +----------------------+------------+ | Name | Language | +----------------------+------------+ | Afghanistan | Balochi | | Afghanistan | Dari | | Afghanistan | Pashto | | Afghanistan | Turkmenian | | Afghanistan | Uzbek | | Netherlands | Arabic | | Netherlands | Dutch | | Netherlands | Fries | | Netherlands | Turkish | | Netherlands Antilles | Dutch | | Netherlands Antilles | English | | Netherlands Antilles | Papiamento | | Albania | Albaniana | | Albania | Greek | | Albania | Macedonian | ... Essentially what this query does is treat Country as a lookup table. For any given country code in the CountryLanguage table, the query uses that code to find the corresponding row in the Country table and retrieves the country name from that row. Note several things about this query and the result that it produces:
Syntactically, the WHERE clause in a join is optional. However, it's usually necessary in practice to include a WHERE clause to keep the join from producing output far in excess of what you really want to see and to make sure that the output contains only information that's meaningful for the question you're asking. A join can produce every combination of rows from the two tables, which is in fact what you'll get from an unrestricted join that includes no WHERE clause. This is called a Cartesian product, and the number of rows in the result is the product of the number of rows in the individual tables. For example, the Country and CountryLanguage tables contain approximately 240 and 1,000 rows, respectively, so a Cartesian product between them produces about 240,000 rows. But much of such output is irrelevant because most of the combinations aren't meaningful. The following query shows what happens if you join records in the CountryLanguage and Country tables without a WHERE clause. The query displays the code from both tables to show that even nonmatching combinations are produced by an unrestricted join: mysql> SELECT Code, Name, CountryCode, Language -> FROM CountryLanguage, Country; +------+-------------+-------------+------------------+ | Code | Name | CountryCode | Language | +------+-------------+-------------+------------------+ | AFG | Afghanistan | ABW | Dutch | | AFG | Afghanistan | ABW | English | | AFG | Afghanistan | ABW | Papiamento | | AFG | Afghanistan | ABW | Spanish | | AFG | Afghanistan | AFG | Balochi | | AFG | Afghanistan | AFG | Dari | | AFG | Afghanistan | AFG | Pashto | | AFG | Afghanistan | AFG | Turkmenian | | AFG | Afghanistan | AFG | Uzbek | | AFG | Afghanistan | AGO | Ambo | | AFG | Afghanistan | AGO | Chokwe | | AFG | Afghanistan | AGO | Kongo | | AFG | Afghanistan | AGO | Luchazi | | AFG | Afghanistan | AGO | Luimbe-nganguela | | AFG | Afghanistan | AGO | Luvale | ... If you're using the mysql client program and want to guard against the possibility of generating huge result sets due to forgetting a WHERE clause, invoke the program with the --safe-updates option (which, despite its name, also affects output from joins). See section 3.2.8, "Using the --safe-updates Option," for more information. The WHERE clause for a join specifies how to match records in the joined tables and eliminates noncorresponding combinations of rows from the output. The WHERE clause also can include additional conditions to further restrict the output and answer more specific questions. Here are some examples:
Joins can use any of the constructs allowed for single-table SELECT statements. The following join uses the COUNT() function and a GROUP BY clause to summarize the number of languages spoken per country, and a HAVING clause to restrict the output to include only those countries where more than 10 languages are spoken: mysql> SELECT COUNT(*), Name -> FROM CountryLanguage, Country -> WHERE CountryCode = Code -> GROUP BY Name -> HAVING COUNT(*) > 10; +----------+--------------------+ | COUNT(*) | Name | +----------+--------------------+ | 12 | Canada | | 12 | China | | 12 | India | | 12 | Russian Federation | | 11 | South Africa | | 11 | Tanzania | | 12 | United States | +----------+--------------------+ 8.1.2 Writing Inner Joins with INNER JOINAs described in the previous section, one form of inner join syntax uses the comma operator to name the joined tables. Another inner join syntax uses the INNER JOIN keywords. With this syntax, those keywords replace the comma operator between table names in the FROM clause. Also, with INNER JOIN, the conditions that indicate how to perform record matching for the tables move from the WHERE clause to become part of the FROM clause. There are two syntaxes for specifying matching conditions with INNER JOIN queries:
|
< Day Day Up > |