In SQL, combining the results of multiple queries is a fundamental requirement, and the UNION and UNION ALL set operators exist for exactly this purpose.
Understanding the differences between UNION and UNION ALL is important for efficient data querying and management. This tutorial explains their key similarities, differences, and usage scenarios using companion datasets to help you optimize your SQL queries.
The Short Answer: SQL UNION vs. UNION ALL
The key difference is that UNION removes duplicate records, whereas UNION ALL includes all duplicates. This distinction not only changes the number of rows in the query result, but it also impacts performance.
Let's consider two sample tables, employees_2023 and employees_2024, representing employee records for two different years.
SELECT *
FROM employees_2023
| employee_id | name | department |
| --- | --- | --- |
| 1 | Alice | HR |
| 2 | Bob | IT |
| 3 | Charlie | Finance |
SELECT *
FROM employees_2024
| employee_id | name | department |
| --- | --- | --- |
| 2 | Bob | IT |
| 3 | Charlie | Finance |
| 4 | David | Marketing |
Now, let’s combine the rows of our two tables using UNION.
SELECT employee_id, name, department FROM employees_2023
UNION
SELECT employee_id, name, department FROM employees_2024;
| employee_id | name | department |
| --- | --- | --- |
| 1 | Alice | HR |
| 2 | Bob | IT |
| 3 | Charlie | Finance |
| 4 | David | Marketing |
In this result, duplicate records are removed. Now let’s compare this with the result we obtain after using UNION ALL.
SELECT employee_id, name, department FROM employees_2023
UNION ALL
SELECT employee_id, name, department FROM employees_2024;
| employee_id | name | department |
| --- | --- | --- |
| 1 | Alice | HR |
| 2 | Bob | IT |
| 3 | Charlie | Finance |
| 2 | Bob | IT |
| 3 | Charlie | Finance |
| 4 | David | Marketing |
In this result, all records are included, and duplicates are not removed.
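To make the row-count difference concrete, here is a minimal sketch that rebuilds the two sample tables and counts the rows each operator returns. The CREATE TABLE statements and column types are assumptions, since the tables above are shown only as query output; the data itself is taken from them.

```sql
-- Assumed definitions: the column types are a guess, only the rows come from the tables above.
CREATE TABLE employees_2023 (
    employee_id INT,
    name VARCHAR(50),
    department VARCHAR(50)
);
CREATE TABLE employees_2024 (
    employee_id INT,
    name VARCHAR(50),
    department VARCHAR(50)
);

INSERT INTO employees_2023 (employee_id, name, department) VALUES
    (1, 'Alice', 'HR'),
    (2, 'Bob', 'IT'),
    (3, 'Charlie', 'Finance');
INSERT INTO employees_2024 (employee_id, name, department) VALUES
    (2, 'Bob', 'IT'),
    (3, 'Charlie', 'Finance'),
    (4, 'David', 'Marketing');

-- UNION: Bob and Charlie exist in both tables but are counted once, so this returns 4.
SELECT COUNT(*) AS union_rows
FROM (
    SELECT employee_id, name, department FROM employees_2023
    UNION
    SELECT employee_id, name, department FROM employees_2024
) combined;

-- UNION ALL: every input row is kept, so this returns 6 (3 + 3).
SELECT COUNT(*) AS union_all_rows
FROM (
    SELECT employee_id, name, department FROM employees_2023
    UNION ALL
    SELECT employee_id, name, department FROM employees_2024
) combined;
```

The gap between the two counts (6 versus 4) is exactly the number of rows that appear in both tables.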
What is UNION in SQL?
The UNION operator combines the results of two or more SELECT queries into a single result set, removing duplicate rows. Each SELECT statement within the UNION must return the same number of columns, in the same order, and corresponding columns must have compatible data types. To understand how it works, let’s create two sample tables below.
CREATE TABLE sales_team (
employee_id INT,
employee_name VARCHAR(50)
);
INSERT INTO sales_team (employee_id, employee_name) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie');
CREATE TABLE support_team (
employee_id INT,
employee_name VARCHAR(50)
);
INSERT INTO support_team (employee_id, employee_name) VALUES
(3, 'Charlie'),
(4, 'David'),
(5, 'Eve');
The above code will create the two tables we’ll use in this tutorial, sales_team and support_team, respectively. You can view the tables using the code below.
SELECT *
FROM sales_team

Records in the sales_team table. Image by Author
SELECT *
FROM support_team

Records in the support_team table. Image by Author
These are small tables with three records each, and the column names are self-explanatory. Now that we have our two tables created, let’s apply the UNION function.
SELECT employee_id, employee_name
FROM sales_team
UNION
SELECT employee_id, employee_name
FROM support_team;

Output from the UNION function. Image by Author
You can see that the row for employee_id 3 (Charlie), which appears in both tables, is returned only once.
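Because the columns of each SELECT are matched by position rather than by name, the column names in the second query do not affect the output. In the engines we are aware of, the result takes its column names from the first SELECT. The sketch below assumes the sales_team and support_team tables created above; the aliases member_id and member_name are illustrative.

```sql
-- Assumes the sales_team and support_team tables created earlier in this tutorial.
SELECT employee_id AS member_id, employee_name AS member_name
FROM sales_team
UNION
SELECT employee_id, employee_name   -- matched to the first query by position
FROM support_team
ORDER BY member_id;                 -- one ORDER BY at the end sorts the combined result
```

Without the final ORDER BY, the order of the combined rows is not guaranteed, so it is worth adding one whenever the order matters.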
What Are the Use Cases for UNION in SQL?
Here are some common use cases for the UNION function.
- Combining Results From Different Tables: When you want to combine data from multiple tables and ensure no duplicate records, UNION is the go-to function.
- Handling Different Data Sources: UNION is useful when combining tables from different data sources.
- Removing Duplicates Across Queries: When you want to ensure the uniqueness of the combined result set, use UNION.
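One detail worth keeping in mind for the last use case: UNION de-duplicates the whole combined result, so repeated rows that come from a single input are collapsed as well. Here is a minimal sketch using two hypothetical tables, newsletter_signups and webinar_signups, which are not part of this tutorial's dataset.

```sql
-- Hypothetical tables, introduced only for this illustration.
CREATE TABLE newsletter_signups (email VARCHAR(100));
CREATE TABLE webinar_signups (email VARCHAR(100));

INSERT INTO newsletter_signups (email) VALUES
    ('alice@example.com'),
    ('alice@example.com'),   -- repeated within the same table
    ('bob@example.com');
INSERT INTO webinar_signups (email) VALUES
    ('bob@example.com'),     -- also present in the other table
    ('carol@example.com');

-- Returns three rows (alice, bob, carol): both kinds of duplicates are removed.
SELECT email FROM newsletter_signups
UNION
SELECT email FROM webinar_signups;
```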
What is UNION ALL in SQL?
The UNION ALL operator combines the results of two or more SELECT queries and keeps every row, including duplicates. It is generally faster than UNION because it skips the duplicate-removal step.
SELECT employee_id, employee_name
FROM sales_team
UNION ALL
SELECT employee_id, employee_name
FROM support_team;

Output from the UNION ALL function. Image by Author
You can see that, in this case, the duplicate entries are repeated and not removed.
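One way to see where the extra cost of UNION comes from: logically, UNION returns the same rows as applying DISTINCT to the UNION ALL result. The sketch below assumes the sales_team and support_team tables created earlier. It illustrates the semantics only and is not a claim about how any particular engine executes the query.

```sql
-- Assumes the sales_team and support_team tables created earlier in this tutorial.
SELECT DISTINCT employee_id, employee_name
FROM (
    SELECT employee_id, employee_name FROM sales_team
    UNION ALL
    SELECT employee_id, employee_name FROM support_team
) combined;
-- Five rows, the same set that UNION returns: (3, 'Charlie') appears once.
```

If you already know the inputs cannot overlap, UNION ALL gives the same answer as UNION without paying for that DISTINCT step.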
What Are the Use Cases for UNION ALL in SQL?
Here are some common use cases for the UNION ALL function.
- Combining Results With Duplicates: Use UNION ALL when you need to combine results from multiple queries and preserve all duplicate rows.
- Performance Considerations: UNION ALL is more time-efficient than the UNION function, because it doesn't require the additional step of removing duplicates.
- Aggregating Data From Different Periods: When aggregating data from different periods or sources, and you need to preserve the duplicate entries, UNION ALL is preferred.
- Reporting and Analysis: For reporting purposes where every record, including duplicates, is necessary, UNION ALL is suitable.
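For the aggregation and reporting cases, a common pattern is to tag each row with its origin before stacking the inputs, then aggregate over the combined set. The sketch below assumes the sales_team and support_team tables created earlier; the team label column and its values are illustrative additions.

```sql
-- Assumes the sales_team and support_team tables created earlier in this tutorial.
SELECT team, COUNT(*) AS headcount
FROM (
    SELECT 'sales' AS team, employee_id, employee_name FROM sales_team
    UNION ALL
    SELECT 'support' AS team, employee_id, employee_name FROM support_team
) combined
GROUP BY team;
-- Each team reports 3: Charlie is counted once in sales and once in support.
```

Once a label column is added, rows from different inputs are never identical, so UNION would not remove anything here and would only add the cost of checking.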
UNION and UNION ALL: Similarities and Differences
Let's create a summary table of the similarities and differences.
| Feature | UNION | UNION ALL |
| --- | --- | --- |
| Duplicate rows | Removed | Included |
| Performance | Slower | Faster |
| Use Case | When you need unique records | When you need all records |
| Result Size | Smaller | Larger |
UNION and UNION ALL Syntax Differences Across SQL Dialects
The basic syntax is the same across the major SQL dialects, with slight variations depending on the platform. On most platforms, such as SQL Server, Oracle, MySQL, PostgreSQL, or BigQuery, the query looks like the one shown below. One variation to be aware of is that BigQuery requires the explicit form UNION DISTINCT instead of a bare UNION.
SELECT employee_id, employee_name
FROM sales_team
UNION ALL
SELECT employee_id, employee_name
FROM support_team;
Basic commands in these SQL dialects remain the same, with some nuances and differences:
- Handling NULLs and Collations: Different SQL dialects may have different rules for handling NULL values and string collation.
- Performance Optimizations: The performance optimizations and execution efficiency might vary.
- Error Handling and Warnings: Different dialects may handle errors and warnings differently when using these functions.
- Platform-Specific Functions and Features: Some SQL dialects might offer additional functions or features that can be used in conjunction with UNION and UNION ALL for more complex scenarios.
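As one concrete example of NULL handling: in standard SQL, and in the widely used engines as far as we know, set operators treat two NULLs as duplicates of each other, even though NULL = NULL does not evaluate to true in a WHERE clause. The tables contacts_a and contacts_b below are hypothetical, and it is worth running the sketch on your own platform to confirm the behavior.

```sql
-- Hypothetical tables, introduced only for this illustration.
CREATE TABLE contacts_a (phone VARCHAR(20));
CREATE TABLE contacts_b (phone VARCHAR(20));

INSERT INTO contacts_a (phone) VALUES (NULL), ('555-0100');
INSERT INTO contacts_b (phone) VALUES (NULL), ('555-0100');

-- UNION: expected to return two rows, one NULL and one '555-0100'.
SELECT phone FROM contacts_a
UNION
SELECT phone FROM contacts_b;

-- UNION ALL: returns all four rows, including both NULLs.
SELECT phone FROM contacts_a
UNION ALL
SELECT phone FROM contacts_b;
```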
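The syntax is different in PySpark, which works with DataFrames in a distributed environment. Note that the DataFrame union method keeps duplicates and matches columns by position, so it behaves like UNION ALL; to get UNION semantics, follow it with distinct(), as in df1.union(df2).distinct(). The code below assumes df1 and df2 are the two DataFrames holding the tables we created above: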
df1.union(df2)
Final Thoughts
Knowing when to use UNION rather than UNION ALL is essential for managing data efficiently in SQL. UNION returns only distinct rows, whereas UNION ALL returns all of them, which affects both performance and result set size. Use this knowledge to choose the appropriate operator for your specific requirements.
For further learning, consider exploring the following sources:
- Introduction to SQL: Learn how to create and query relational databases using SQL.
- SQL Fundamentals: Gain the fundamental SQL skills you need to interact with and query your data.
- SQL Associate Certification: Demonstrate your SQL ability by extracting appropriate data from a database, and using it to answer common data questions.
- Intermediate SQL: With this hands-on course, you’ll learn everything you need to know to analyze data using your own SQL code today.

Frequently Asked Questions
What is the primary difference between SQL UNION and UNION ALL?
UNION removes duplicate records, while UNION ALL includes all duplicates.
Is there a performance difference between UNION and UNION ALL?
Yes, UNION is generally slower because it removes duplicates.
Are there specific use cases where UNION ALL is preferred over UNION?
UNION ALL is preferred when you need all records, including duplicates, such as in reporting and analysis.
When should I use SQL UNION instead of UNION ALL?
Use UNION when you need a combined result set without duplicates.
How do different SQL dialects affect the use of UNION and UNION ALL?
While the basic syntax is consistent, there might be slight variations in performance, handling of NULLs, and collation across SQL dialects.
