Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

I always find joy in understanding concepts at a deep and fundamental level. This stems from a love of mathematics and the mental journeys you can embark on. Being able to understand set theory, this fundamental branch of mathematics, is one of those journeys, providing the theoretical underpinnings of relational databases. This post explores how concepts from set theory, such as sets, unions, intersections, and differences are manifested in SQL operations and the relational model.
By thinking about tables as sets of tuples and SQL commands as set operations, I can illustrate the deep links between these fields. Through examples and analysis, it’s possible to demonstrate how understanding set theory enhances SQL proficiency. This is a high-level summary of the key concepts only and not a deep exposé of George Cantors career.
In set theory, a set is a collection of distinct elements. In SQL, a table is a collection of distinct rows (when considering set operations).
Set Theory:
SQL:
CREATE TABLE set_a (value INTEGER);
INSERT INTO set_a VALUES (1), (2), (3), (4), (5);
CREATE TABLE set_b (value INTEGER);
INSERT INTO set_b VALUES (4), (5), (6), (7), (8);Set Theory:
The union of two sets A and B, denoted:
contains all elements that are in A, in B, or in both.
Note: Pure set theory doesn’t have duplicates, but SQL’s UNION ALL operates on multisets (bags) where duplicates are allowed.
Set Theory:
The intersection of sets A and B, denoted:
contains only elements present in both sets.
SQL Intersect:
SELECT DISTINCT a.value
FROM set_a a
INNER JOIN set_b b ON a.value = b.value;
-- Result: 4, 5Set Theory:
The difference (or relative complement):
contains elements in A but not in B.
SQL Difference:
-- left join
SELECT a.value
FROM set_a a
LEFT JOIN set_b b ON a.value = b.value
WHERE b.value IS NULL;
-- Result: 1, 2, 3
--not in
SELECT value FROM set_b
WHERE value NOT IN (SELECT value FROM set_a);
-- Result: 6, 7, 8Set Theory:
The Cartesian product:
creates ordered pairs with all possible combinations.
SQL Cross Join:
-- Sample tables
CREATE TABLE colors (color VARCHAR(10));
INSERT INTO colors VALUES ('Red'), ('Blue');
CREATE TABLE sizes (size VARCHAR(10));
INSERT INTO sizes VALUES ('Small'), ('Large');
-- Cartesian product
SELECT c.color, s.size
FROM colors c
CROSS JOIN sizes s;
-- Result:
-- Red, Small
-- Red, Large
-- Blue, Small
-- Blue, LargeSet Theory:
means A is a subset of B (every element of A is in B).
SQL Check:
-- Check if all values in C exist in A
SELECT CASE
WHEN COUNT(*) = 0 THEN 'C is a subset of A'
ELSE 'C is NOT a subset of A'
END as result
FROM (
SELECT value FROM set_c
EXCEPT
SELECT value FROM set_a
) not_in_a;
-- use not exists
SELECT CASE
WHEN NOT EXISTS (
SELECT 1 FROM set_c c
WHERE NOT EXISTS (
SELECT 1 FROM set_a a WHERE a.value = c.value
)
) THEN 'C is a subset of A'
ELSE 'C is NOT a subset of A'
END as result;Set Theory:
The power set:
is the set of all subsets of A.
SQL Power Set:
WITH RECURSIVE power_set AS (
-- Start with empty set
SELECT ARRAY[]::INTEGER[] as subset, 0 as idx
UNION ALL
-- Add each element
SELECT
ps.subset || a.value,
ps.idx + 1
FROM power_set ps
CROSS JOIN (SELECT value, ROW_NUMBER() OVER (ORDER BY value) as rn FROM set_a) a
WHERE a.rn > ps.idx
)
SELECT subset FROM power_set;
-- Result: all possible subsetsSet Theory:
The complement of A with respect to universal set U, denoted:
contains all elements in U but not in A.$
SQL Compliment:
CREATE TABLE universal_set (value INTEGER);
INSERT INTO universal_set VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
-- Complement of A
SELECT value FROM universal_set
EXCEPT
SELECT value FROM set_a;
-- Result: 6, 7, 8, 9, 10Set Theory:
The symmetric difference:
contains elements in either A or B but not in both.
SQL Exammple:
-- Method 1: Union of differences
SELECT value FROM set_a
EXCEPT
SELECT value FROM set_b
UNION
SELECT value FROM set_b
EXCEPT
SELECT value FROM set_a;
-- Result: 1, 2, 3, 6, 7, 8
-- Method 2: Union minus intersection
SELECT value FROM (
SELECT value FROM set_a
UNION
SELECT value FROM set_b
) combined
EXCEPT
SELECT value FROM (
SELECT value FROM set_a
INTERSECT
SELECT value FROM set_b
) common;
-- Result: 1, 2, 3, 6, 7, 8Set Theory:
The cardinality:
is the number of elements in set A.
SQL Example:
-- Cardinality of A
SELECT COUNT(*) as cardinality FROM set_a;
-- Result: 5
-- Cardinality of intersection
SELECT COUNT(*) as cardinality FROM (
SELECT value FROM set_a
JOIN
SELECT value FROM set_b
) intersection;
-- Result: 1
-- Cardinality of union
SELECT COUNT(*) as cardinality FROM (
SELECT value FROM set_a
UNION
SELECT value FROM set_b
) union_set;
-- Result: 8Set Theory:
The empty set :
contains no elements.
Set Theory:
-- Query that returns empty set
SELECT value FROM set_a WHERE value > 100;
-- Result: (empty)
-- Check for empty set
SELECT CASE
WHEN COUNT(*) = 0 THEN 'Empty set'
ELSE 'Non-empty set'
END as result
FROM (SELECT value FROM set_a WHERE value > 100) empty_check;
-- Result: 'Empty set'De Morgan’s Laws describe how negation distributes over set operations:
First Law: The complement of a union equals the intersection of complements
Second Law: The complement of an intersection equals the union of complements
In simple terms: when you negate a combined set operation, you flip both the operation (union ↔ intersection) and negate each individual set.
Set Theory:
De Morgan’s Laws state:
SQL Example:
-- NOT (A OR B) = (NOT A) AND (NOT B)
-- Find values not in A and not in B
SELECT value FROM universal_set
WHERE value NOT IN (SELECT value FROM set_a)
AND value NOT IN (SELECT value FROM set_b);
-- Equivalent to:
SELECT value FROM universal_set
EXCEPT
(
SELECT value FROM set_a
UNION
SELECT value FROM set_b
);Set Theory: Sets have no duplicates by definition.
SQL: Tables can have duplicate rows (multisets/bags).
-- SQL allows duplicates in tables
INSERT INTO set_a VALUES (5), (5), (5);
-- Now set_a has multiple 5s
-- UNION removes duplicates (set semantics)
SELECT value FROM set_a UNION SELECT value FROM set_b;
-- UNION ALL keeps duplicates (bag semantics)
SELECT value FROM set_a UNION ALL SELECT value FROM set_b;Set Theory: No concept of NULL or unknown.
SQL: NULL represents unknown/missing values with special logic.
-- NULL in set operations
SELECT NULL as value
UNION
SELECT NULL as value;
-- Result: One NULL (duplicates removed)
-- NULL in comparisons
SELECT * FROM set_a WHERE value = NULL; -- Returns nothing
SELECT * FROM set_a WHERE value IS NULL; -- Correct waySet Theory: Sets are unordered.
SQL: Results can be ordered with ORDER BY (returns a list, not a set).
-- Set theory: {3, 1, 2} = {1, 2, 3}
-- SQL without ORDER BY: order is undefined
SELECT value FROM set_a; -- Order not guaranteed
-- SQL with ORDER BY: returns ordered list
SELECT value FROM set_a ORDER BY value; -- Guaranteed orderUnderstanding the set theory foundations of SQL helps you:
Remember: SQL is applied set theory. Every table is a set (or multiset), every query is a combination of set operations, and every JOIN is a form of Cartesian product with filtering. Master set theory, and you’ll master SQL.