Statements (PostgreSQL)

SELECT

LIKE and ILIKE

DISTINCT

LIMIT and OFFSET

LIMIT and OFFSET are used to return a portion of data from a resultset generated by a query; the LIMIT clause is used to limit the number of records in output and the OFFSET clause is used to provide PostgreSQL with the position in the resultset from which to start returning data.

(Ferrari and Pirozzi 2023, 111)

Subqueries

  • Subqueries return a set.

Subqueries as Expressions

Scalar Subquery

Results in a single row.

Multi-Row Subquery

Results in multiple rows, tipically used in the where clause with operators such as IN, NOT, EXISTS, ANY and ALL.

Correlated Subquery

A uncorrelated sub-query is a type of sub-query where the inner query doesn’t depend upon the outer query for its execution.

  SELECT
    foo,
    (SELECT
      AVG(...)
     FROM B
     WHERE B.foo = A.foo) AS bar
  FROM A;
Uncorrelated Subquery

A correlated sub-query is a type of query where the inner query depends upon the outcome of the outer query in order to perform its execution.

  SELECT
    foo,
    (SELECT
      AVG(...)
     FROM B) AS bar
  FROM A;

Subqueries as Tables

Semi-Joins & Subquery Elimination

A semi-join between two tables R and S returns rows from table R for which there is at least one row from table S with matching values in the joining columns.

(Dombrovskaya, Novikov, and Bailliekova 2021, 138)

JOIN

Cross Join

  SELECT <...>
  FROM table1, table2
  -- or
  SELECT <...>
  FROM table1 CROSS JOIN table2

Inner Join

Outter Join

  • FULL
  • LEFT
  • RIGHT

Lateral Join

A lateral join is a type of join in SQL that allows you to join a table with a subquery, where the subquery is run for each row of the main table. (Ferrari and Pirozzi 2023, 129)

Aggregate Functions

GROUP BY

Aggregate functions are used in conjunction with the group by clause. A group by clause splits a resultset into groups of rows and aggregate functions perform calculations on them. (Ferrari and Pirozzi 2023, 131)

Function Description
AVG()  
COUNT()  
MAX()  
MIN()  
SUM()  
ARRAY_AGG  
JSON_AGG or JSONB_AGG  
BOOL_AND(Predicate)  
BOOL_OR(Predicate)  
  SELECT category, COUNT(*)
  FROM posts
  GROUP BY category;
  -- Outputs:
   category | count
  ----------+-------
          3 | 1
          1 | 2

You can also filter the result of a GROUP BY by using HAVING:

  SELECT category, COUNT(*)
  FROM posts
  GROUP BY category
  HAVING COUNT(*) > 1;

and also pack multiple aggregations in a single SELECT statement:

  SELECT
    category,
    COUNT(*),
    COUNT(*) FILTER (WHERE replies > 2) as number_of_replies
  FROM posts
  GROUP BY category
  HAVING COUNT(*) > 1;

GROUPING SETS, CUBE, and ROLLUP

More complex grouping operations than those described above are possible using the concept of grouping sets. The data selected by the FROM and WHERE clauses is grouped separately by each specified grouping set, aggregates computed for each group just as for simple GROUP BY clauses, and then the results returned.

Combining Queries

UNION / UNION ALL

The UNION operator is used to combine the resultset of two or more SELECT statements. We can use the UNION statement only if the following rules are respected:

  • Each SELECT statement within UNION must have the same number of columns.
  • The columns must have similar data types.
  • The columns in each SELECT statement must be in the same order.

(Ferrari and Pirozzi 2023, 133)

  • UNION is an alias to UNION DISTINCT.

EXCEPT / INTERSECT

Upsert

In PostgreSQL, the UPSERT statement does not exist as in other DBMSes. An UPSERT statement is used when we want to insert a new record on top of the existing record or update an existing record.

The optional ON CONFLICT clause specifies an alternative action to raising a unique violation or exclusion constraint violation error:

  INSERT INTO table_name(c1, c2, ..., cn) VALUES(v1, v2, ..., vn)
  ON CONFLICT (...);

and the conflict action can be either:

  • DO NOTHING.
  • DO UPDATE + (SET and/or WHERE)
  INSERT INTO table_name(c1, c2, ..., cn) VALUES(v1, v2, ..., vn)
  ON CONFLICT
  DO UPDATE SET c1 = EXCLUDED.v1;

where EXCLUDED references the proposed new insertion row.

RETURNING

Reduces the need for an extra query as it enables you to output columns.

Inserts

Updates

Deletes

Common Table Expressions

A CTE, or a common table expression, is a temporary result taken from a SQL statement. This statement can contain SELECT, INSERT, UPDATE, or DELETE instructions. The lifetime of a CTE is equal to the lifetime of the query.

(Ferrari and Pirozzi 2023, 145)

CTEs in PostgreSQL 12+

Starting from PostgreSQL version 12, things have changed, and two new options have been introduced for the execution of a CTE, namely MATERIALIZED and NOT MATERIALIZED.

(Ferrari and Pirozzi 2023, 146)

Recursive

  WITH RECURSIVE fib AS (
    SELECT 
      0 as level,
      0 as n,
      1 as m
    UNION ALL
    SELECT
      level + 1 as level,
      m as n,
      (n + m) as m
    FROM fib WHERE n < 10
  )
  SELECT * FROM fib;
  -- Outputs:
   level | n  | m  
  -------+----+----
       0 |  0 |  1
       1 |  1 |  1
       2 |  1 |  2
       3 |  2 |  3
       4 |  3 |  5
       5 |  5 |  8
       6 |  8 | 13
       7 | 13 | 21

References:

Dombrovskaya, Henrietta, Boris Novikov, and Anna Bailliekova. 2021. Postgresql Query Optimization. Springer.
Ferrari, Luca, and Enrico Pirozzi. 2023. Learn Postgresql: Use, Manage, and Build Secure and Scalable Databases with Postgresql 16. 2nd ed. Packt Publishing Ltd.