Union SQL: Combining Result Sets

When working with databases, it’s common to retrieve data not just from a single query, but from multiple queries that need to be combined. This is especially useful when the data you’re analyzing is stored in different tables or when you want to bring together similar records that satisfy different conditions. In SQL, the UNION operator serves this exact purpose: it combines the result sets of two or more SELECT queries into a single output.

TL;DR:

SQL’s UNION operator combines the results of multiple SELECT statements into one result set and removes duplicate rows from the final output. Each SELECT statement must return the same number of columns, with compatible data types in matching positions. If you need to keep duplicates, use UNION ALL instead. Either way, the merging happens inside the database rather than in code that stitches results together afterwards.

What Is UNION in SQL?

The SQL UNION operator combines the result sets of two or more SELECT queries. This is particularly useful when different datasets contain similar kinds of information but are stored in separate tables or filtered by different criteria.

For example, consider two tables: North_Employees and South_Employees. If we want to create a master list of all unique employee names, UNION can help us merge these lists efficiently.

Syntax of UNION


SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;

Important points to remember:

  • Each SELECT statement must have the same number of columns.
  • Corresponding columns must have compatible data types.
  • Column names in the result are decided by the first SELECT clause.
  • By default, UNION removes duplicates. Use UNION ALL if you want to retain them.
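The rules above can be checked with a small runnable sketch. It uses Python's built-in sqlite3 module with an in-memory database purely for convenience; the sample rows and the South_Employees column names (full_name, dept) are made up for illustration, and other database engines may report column metadata differently.

```python
import sqlite3

# In-memory database with two hypothetical regional tables.
# The second table deliberately uses different column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE North_Employees (name TEXT, department TEXT);
    CREATE TABLE South_Employees (full_name TEXT, dept TEXT);
    INSERT INTO North_Employees VALUES ('Alice', 'Sales'), ('Bob', 'HR');
    INSERT INTO South_Employees VALUES ('Bob', 'HR'), ('Chen', 'IT');
""")

cursor = conn.execute("""
    SELECT name, department FROM North_Employees
    UNION
    SELECT full_name, dept FROM South_Employees
""")

# The result's column names are taken from the first SELECT.
column_names = [col[0] for col in cursor.description]

# Sort in Python only to get a stable order for display.
rows = sorted(cursor.fetchall())

print(column_names)
print(rows)
```

The ('Bob', 'HR') row exists in both tables but appears only once in the output, and the result is labeled name and department even though the second table calls those columns something else.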

UNION vs. UNION ALL

UNION removes duplicate rows from the combined result, including duplicates that come from within a single SELECT, while UNION ALL keeps every row. Which one to use depends on whether you need a distinct merged output.


-- Removes duplicates
SELECT name FROM North_Employees
UNION
SELECT name FROM South_Employees;

-- Keeps duplicates
SELECT name FROM North_Employees
UNION ALL
SELECT name FROM South_Employees;
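To see the difference on actual rows, here is a minimal sketch that runs both queries against an in-memory SQLite database through Python's sqlite3 module. The sample names are invented, and 'Bob' is deliberately entered twice in the North table and once in the South table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE North_Employees (name TEXT);
    CREATE TABLE South_Employees (name TEXT);
    INSERT INTO North_Employees VALUES ('Alice'), ('Bob'), ('Bob');
    INSERT INTO South_Employees VALUES ('Bob'), ('Chen');
""")

union_rows = conn.execute("""
    SELECT name FROM North_Employees
    UNION
    SELECT name FROM South_Employees
""").fetchall()

union_all_rows = conn.execute("""
    SELECT name FROM North_Employees
    UNION ALL
    SELECT name FROM South_Employees
""").fetchall()

# Flatten the one-column tuples and sort for a stable display order.
union_names = sorted(name for (name,) in union_rows)
union_all_names = sorted(name for (name,) in union_all_rows)

print(union_names)      # each name once
print(union_all_names)  # every stored row, repeats included
```

UNION returns three names, while UNION ALL returns all five stored rows, with 'Bob' appearing three times.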

Practical Uses of UNION

UNION is incredibly versatile. Here are some real-world examples where it plays a critical role:

  • Combining reports from different departments (Sales, Marketing, HR, etc.).
  • Consolidating historical data spread across multiple archived tables.
  • Pulling similar records based on different search conditions (e.g., active vs. retired users).
  • Merging user access logs from multiple applications or regions.

Case Study: Combining Sales Data

Imagine a business that stores online and in-store sales separately in the tables Online_Sales and Retail_Sales. Both tables have columns like OrderID, CustomerID, ProductID, and Amount.

If management wants a unified summary of all sales regardless of channel, you’d write:


SELECT OrderID, CustomerID, ProductID, Amount
FROM Online_Sales
UNION
SELECT OrderID, CustomerID, ProductID, Amount
FROM Retail_Sales;

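To include every transaction, even if some rows are identical across the two tables (usually the safer choice for sales totals, since UNION collapses two separate sales that happen to match on all four columns):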


SELECT OrderID, CustomerID, ProductID, Amount
FROM Online_Sales
UNION ALL
SELECT OrderID, CustomerID, ProductID, Amount
FROM Retail_Sales;
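The choice between the two operators matters once you aggregate. The sketch below is one way to check this, assuming made-up sales rows in an in-memory SQLite database, where one retail row happens to match an online row on all four columns. The Channel label in the last query is a hypothetical addition, not part of the original tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Online_Sales (OrderID INTEGER, CustomerID INTEGER, ProductID INTEGER, Amount REAL);
    CREATE TABLE Retail_Sales (OrderID INTEGER, CustomerID INTEGER, ProductID INTEGER, Amount REAL);
    INSERT INTO Online_Sales VALUES (1, 10, 100, 25.0), (2, 11, 101, 40.0);
    -- The first retail row matches an online row on all four columns.
    INSERT INTO Retail_Sales VALUES (1, 10, 100, 25.0), (3, 12, 102, 15.0);
""")

def total_amount(operator):
    """Sum Amount over the combined sales using UNION or UNION ALL."""
    query = f"""
        SELECT SUM(Amount) FROM (
            SELECT OrderID, CustomerID, ProductID, Amount FROM Online_Sales
            {operator}
            SELECT OrderID, CustomerID, ProductID, Amount FROM Retail_Sales
        )
    """
    return conn.execute(query).fetchone()[0]

total_union = total_amount("UNION")          # identical rows counted once
total_union_all = total_amount("UNION ALL")  # every row counted

# Adding a literal channel label keeps rows from different sources distinct.
labeled = conn.execute("""
    SELECT 'online' AS Channel, OrderID, Amount FROM Online_Sales
    UNION
    SELECT 'retail' AS Channel, OrderID, Amount FROM Retail_Sales
    ORDER BY Channel, OrderID
""").fetchall()

print(total_union, total_union_all)
print(labeled)
```

With this data the UNION total is lower than the UNION ALL total because the matching row is counted once. Tagging each SELECT with its source is a common way to keep both rows while still using UNION.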

What Happens with Mismatched Columns?

If you try to UNION two SELECT statements that return a different number of columns, the database will reject the query with an error. For example:


-- Errors out due to mismatch in column counts
SELECT name, age FROM Table_A
UNION
SELECT name FROM Table_B;

Even if the column count matches, make sure the data types of matching columns are compatible. Depending on the database engine, a mismatch either triggers an implicit conversion, which can change how values compare and sort, or raises an error.
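A common workaround for a column-count mismatch is to pad the shorter SELECT with NULL. The sketch below shows both the failure and the fix using SQLite through Python's sqlite3 module; the sample rows are invented, and the exact error text varies between database engines, so the code only checks that an error occurred.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_A (name TEXT, age INTEGER);
    CREATE TABLE Table_B (name TEXT);
    INSERT INTO Table_A VALUES ('Alice', 30);
    INSERT INTO Table_B VALUES ('Bob');
""")

# Two columns on the left, one on the right: the query is rejected.
error_message = None
try:
    conn.execute("SELECT name, age FROM Table_A UNION SELECT name FROM Table_B")
except sqlite3.Error as exc:
    error_message = str(exc)

# Padding the shorter SELECT with NULL makes the column counts match.
padded_rows = conn.execute("""
    SELECT name, age FROM Table_A
    UNION
    SELECT name, NULL AS age FROM Table_B
    ORDER BY name
""").fetchall()

print(error_message)
print(padded_rows)
```

Rows from Table_B come back with None (SQL NULL) in the age position, which makes the missing value explicit instead of hiding it.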

Ordering Your Combined Results

After combining queries using UNION, you can sort the final result set with ORDER BY. Place it after the last SELECT, and keep in mind that the sort applies to the entire combined output, not to individual SELECT statements.


SELECT name, department FROM North_Employees
UNION
SELECT name, department FROM South_Employees
ORDER BY name ASC;
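Here is a small sketch of that behavior, assuming invented employee rows in an in-memory SQLite database. It also tries an ORDER BY placed before the UNION, which SQLite rejects; other engines may handle that placement differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE North_Employees (name TEXT, department TEXT);
    CREATE TABLE South_Employees (name TEXT, department TEXT);
    INSERT INTO North_Employees VALUES ('Zoe', 'Sales'), ('Bob', 'HR');
    INSERT INTO South_Employees VALUES ('Alice', 'IT'), ('Mia', 'Sales');
""")

# ORDER BY after the last SELECT sorts the whole combined result.
ordered = conn.execute("""
    SELECT name, department FROM North_Employees
    UNION
    SELECT name, department FROM South_Employees
    ORDER BY name ASC
""").fetchall()

# ORDER BY attached to the first SELECT is not accepted by SQLite.
misplaced_error = None
try:
    conn.execute("""
        SELECT name, department FROM North_Employees ORDER BY name
        UNION
        SELECT name, department FROM South_Employees
    """)
except sqlite3.Error as exc:
    misplaced_error = str(exc)

print(ordered)
print(misplaced_error)
```

Rows from the two tables end up interleaved alphabetically, which confirms the sort runs over the merged set rather than per table.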

Advanced Use: UNION with Filters

You can apply WHERE clauses to each SELECT query independently before combining them via UNION. This gives you more control over what data gets pulled from each source.


SELECT name, salary FROM North_Employees
WHERE salary > 50000
UNION
SELECT name, salary FROM South_Employees
WHERE salary < 30000;

The final result includes employees from the North earning more than $50,000 and employees from the South earning less than $30,000, with any identical (name, salary) rows collapsed into one.
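The filtered query can be tried out as follows. This is a sketch with invented salaries in an in-memory SQLite database; the thresholds are passed as bound parameters rather than written into the SQL, which is a choice made here and not something the query requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE North_Employees (name TEXT, salary INTEGER);
    CREATE TABLE South_Employees (name TEXT, salary INTEGER);
    INSERT INTO North_Employees VALUES ('Alice', 72000), ('Bob', 48000);
    INSERT INTO South_Employees VALUES ('Chen', 28000), ('Dana', 55000);
""")

# Each SELECT applies its own WHERE clause before the rows are combined.
filtered = conn.execute(
    """
    SELECT name, salary FROM North_Employees
    WHERE salary > ?
    UNION
    SELECT name, salary FROM South_Employees
    WHERE salary < ?
    ORDER BY name
    """,
    (50000, 30000),
).fetchall()

print(filtered)
```

Bob and Dana are filtered out by their own table's condition, so only Alice (North, above 50,000) and Chen (South, below 30,000) remain.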

Potential Pitfalls and Performance

While UNION is powerful, it’s not without its caveats:

  • Performance Issues: Removing duplicates (the default behavior) can be computationally intensive for large datasets.
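  • Implicit Sorting: To eliminate duplicates, the engine typically has to sort or hash the combined rows, and that can slow things down. The output order is still not guaranteed unless you add ORDER BY.
  • Use of UNION ALL: If you don’t need duplicate elimination, UNION ALL is usually faster because it skips that step.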
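One rough way to gauge the cost on your own data is to time both operators. The sketch below does this against an in-memory SQLite database with two hypothetical log tables (Log_A and Log_B) filled with repeated user IDs. Timings depend heavily on the engine, indexes, and data volume, so treat the numbers as indicative only.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Log_A (user_id INTEGER)")
conn.execute("CREATE TABLE Log_B (user_id INTEGER)")

# 50,000 rows per table, but only 1,000 distinct user IDs.
conn.executemany("INSERT INTO Log_A VALUES (?)", ((i % 1000,) for i in range(50_000)))
conn.executemany("INSERT INTO Log_B VALUES (?)", ((i % 1000,) for i in range(50_000)))

def timed_count(operator):
    """Count the combined rows for the given operator and time the query."""
    query = f"""
        SELECT COUNT(*) FROM (
            SELECT user_id FROM Log_A
            {operator}
            SELECT user_id FROM Log_B
        )
    """
    start = time.perf_counter()
    count = conn.execute(query).fetchone()[0]
    return count, time.perf_counter() - start

union_count, union_seconds = timed_count("UNION")
union_all_count, union_all_seconds = timed_count("UNION ALL")

print(f"UNION:     {union_count} rows in {union_seconds:.4f}s")
print(f"UNION ALL: {union_all_count} rows in {union_all_seconds:.4f}s")
```

The row counts show how much work deduplication has to do here: UNION reduces 100,000 stored rows to the 1,000 distinct IDs, while UNION ALL passes all of them through.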

When to Use UNION vs. JOIN

It’s easy to confuse UNION with JOIN, but they serve different purposes. UNION combines result sets vertically, stacking rows on top of one another when the queries return structurally similar columns. JOIN combines tables horizontally, matching rows on a related key and placing their columns side by side.

Use UNION when:

  • You want to display results from similarly structured tables stacked as one.
  • You have similar records with different query constraints.

Use JOIN when:

  • You need to connect different attributes about an entity spread across multiple tables.
  • You want to expand the dataset horizontally by adding columns.
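The contrast in result shape is easy to see side by side. In this sketch the Departments table and its manager column are hypothetical, added only to give the JOIN something to match against; the data lives in an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE North_Employees (name TEXT, department TEXT);
    CREATE TABLE South_Employees (name TEXT, department TEXT);
    CREATE TABLE Departments (department TEXT, manager TEXT);
    INSERT INTO North_Employees VALUES ('Alice', 'Sales');
    INSERT INTO South_Employees VALUES ('Chen', 'IT');
    INSERT INTO Departments VALUES ('Sales', 'Maria'), ('IT', 'Omar');
""")

# UNION: same two columns, more rows (stacked vertically).
stacked = conn.execute("""
    SELECT name, department FROM North_Employees
    UNION
    SELECT name, department FROM South_Employees
    ORDER BY name
""").fetchall()

# JOIN: same employee row, more columns (extended horizontally).
joined = conn.execute("""
    SELECT e.name, e.department, d.manager
    FROM North_Employees AS e
    JOIN Departments AS d ON d.department = e.department
""").fetchall()

print(stacked)
print(joined)
```

The UNION result has two rows of two columns each, while the JOIN result has one row widened to three columns.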

Tips for Writing Effective UNION Queries

  • Always double-check that each SELECT returns the same number of columns and in the same order.
  • Use aliases to standardize column names across queries; the result takes its names from the first SELECT.
  • Pay attention to data types—implicit typecasting can lead to bugs or inaccurate results.
  • If performance is a concern, test queries with both UNION and UNION ALL to measure differences.
  • Use ORDER BY only on the final result set for sorting accuracy.

Conclusion

The SQL UNION operator is a critical feature for developers and data analysts who wrangle complex datasets. Its ability to combine various result sets into one clean output makes it invaluable in situations ranging from simple data merging to enterprise-grade reporting systems. Whether you’re unifying archival tables, building reports that span locations, or simplifying user logs from multiple domains, UNION and UNION ALL give you the flexibility to do it swiftly and accurately.

Next time you find yourself writing multiple SELECT statements and manually combining results outside SQL, think again—UNION might just be the cleanest, most efficient path forward.