What type of join is used when you have to combine each row from the first table with each row of the second table?
JoinsTo combine data from two tables we use the SQL JOIN command, which comes after the FROM command. Database tables are used to organize and group data by common characteristics or principles. The JOIN command combined with ON is used to combine fields from separate tables. The JOIN command on its own will result in a cross product, where each row in the first table is paired with each row in the second table. Usually this is not what is desired when combining two tables with data that is related in some way. For that, we need to tell the computer which columns provide the link between the two tables using the word ON. What we want is to join the data with the same species id. SELECT * FROM surveys JOIN species ON surveys.species_id = species.species_id; ON is like WHERE. It filters things out according to a test condition. We use the table.colname format to tell the manager what column in which table we are referring to. The output of the JOIN command will have columns from the first table plus the columns from the second table. For the above command, the output will be a table that has the following column names:
Alternatively, we can use the word USING, as a short-hand. USING only works on columns which share the same name. In this case we are telling the manager that we want to combine surveys with species and that the common column is species_id. SELECT * FROM surveys JOIN species USING (species_id); The output will only have one species_id column
We often won’t want all of the fields from both tables, so anywhere we would have used a field name in a non-join query, we can use table.colname. For example, what if we wanted information on when individuals of each species were captured, but instead of their species ID we wanted their actual species names. SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species FROM surveys JOIN species ON surveys.species_id = species.species_id;
Many databases, including SQLite, also support a join through the WHERE clause of a query. SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species FROM surveys, species WHERE surveys.species_id = species.species_id; For the remainder of this lesson, we’ll stick with the explicit use of the JOIN keyword for joining tables in SQL.
Different join typesWe can count the number of records returned by our original join query. SELECT COUNT(*) FROM surveys JOIN species USING (species_id); Notice that this number is smaller than the number of records present in the survey data. SELECT COUNT(*) FROM surveys; This is because, by default, SQL only returns records where the joining value is present in the joined columns of both tables (i.e. it takes the intersection of the two join columns). This joining behaviour is known as an INNER JOIN. In fact the JOIN command is simply shorthand for INNER JOIN and the two terms can be used interchangably as they will produce the same result. We can also tell the computer that we wish to keep all the records in the first table by using the command LEFT OUTER JOIN, or LEFT JOIN for short.
Remember: In SQL a NULL value in one table can never be joined to a NULL value in a second table because NULL is not equal to anything, not even itself. Combining joins with sorting and aggregationJoins can be combined with sorting, filtering, and aggregation. So, if we wanted average mass of the individuals on each different type of treatment, we could do something like SELECT plots.plot_type, AVG(surveys.weight) FROM surveys JOIN plots ON surveys.plot_id = plots.plot_id GROUP BY plots.plot_type;
Functions COALESCE and NULLIF and moreSQL includes numerous functions for manipulating data. You’ve already seen some of these being used for aggregation (SUM and COUNT) but there are functions that operate on individual values as well. Probably the most important of these are COALESCE and NULLIF. COALESCE allows us to specify a value to use in place of NULL. We can represent unknown sexes with 'U' instead of NULL: SELECT species_id, sex, COALESCE(sex, 'U') FROM surveys; The lone “sex” column is only included in the query above to illustrate where COALESCE has changed values; this isn’t a usage requirement.
COALESCE can be particularly useful in JOIN. When joining the species and surveys tables earlier, some results were excluded because the species_id was NULL in the surveys table. We can use COALESCE to include them again, re-writing the NULL to a valid joining value: SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species FROM surveys JOIN species ON COALESCE(surveys.species_id, 'AB') = species.species_id;
The inverse of COALESCE is NULLIF. This returns NULL if the first argument is equal to the second argument. If the two are not equal, the first argument is returned. This is useful for “nulling out” specific values. We can “null out” plot 7: SELECT species_id, plot_id, NULLIF(plot_id, 7) FROM surveys; Some more functions which are common to SQL databases are listed in the table below:
Finally, some useful functions which are particular to SQLite are listed in the table below:
As we saw before, aliases make things clearer, and are especially useful when joining tables. SELECT surv.year AS yr, surv.month AS mo, surv.day AS day, sp.genus AS gen, sp.species AS sp FROM surveys AS surv JOIN species AS sp ON surv.species_id = sp.species_id; To practice we have some optional challenges for you.
Which type of join returns the matching rows from the tables that are being joined?SQL outer join
On joining tables with a SQL inner join, the output returns only matching rows from both the tables. When using a SQL outer join, not only it will list the matching rows, it will also list the unmatched rows from the other tables.
Which join type combine rows from one or more tables?Cross joins
In a cross join, each row from one table is combined with each row from another table, resulting in what is called a cross product or a Cartesian product.
Which type of join returns rows from two table only if there is a match between column in both table?Outer joins are joins that return matched values and unmatched values from either or both tables. There are a few types of outer joins: LEFT JOIN returns only unmatched rows from the left table, as well as matched rows in both tables.
Which of the given join will combine rows from different tables if the join condition is true?The INNER JOIN keyword selects all rows from both the tables as long as the condition is satisfied. This keyword will create the result-set by combining all rows from both the tables where the condition satisfies i.e value of the common field will be the same.
|