Definitions (schema & SQL)

Materialized views are defined and managed through the MatViews::MatViewDefinition model. This page explains the definition schema, how we resolve schemas & quoting safely, and how to write production-ready SQL for views.

What a definition is

A definition is a persistent record that tells the engine:

  • what to name the materialized view,
  • the SQL that produces its rows,
  • how it should be refreshed (strategy and options),
  • what indexes (especially unique) it expects,
  • optional dependencies metadata for your own tooling.

Once a definition exists, you can create, refresh (regular/concurrent/swap), and delete the materialized view using services, jobs, or rake tasks.


Model schema

MatViews::MatViewDefinition has the following key attributes (columns):

Column Type Required Description
name string Unqualified relation name (e.g., mv_user_activity). Must be a simple SQL identifier (no dot).
sql text The SELECT statement used to build the view.
refresh_strategy enum One of regular, concurrent, swap. Dictates the service used by RefreshViewJob.
schedule_cron string   Optional: for your scheduler to read (engine does not schedule itself).
unique_index_columns jsonb[]   Columns expected to be uniquely indexed on the MV (e.g., ["user_id"]). Required for concurrent refresh.
dependencies jsonb[]   Free-form list (tables, views, other MVs) you consider inputs; purely informational.
last_refreshed_at datetime   Optional timestamp you can maintain in your app.

Naming rule: name must be a simple identifier (/^[A-Za-z_][A-Za-z0-9_]*$/). Do not put a schema in the name (no dots). The engine resolves the schema separately (see below).


Schema resolution & quoting

We always build a qualified relation like:

"schema"."relname"

Key points:

  • We determine the schema from the database session’s search_path.
    • Empty or invalid search_path → we fall back to public.
    • $user tokens are supported; if the user schema doesn’t exist we fall back to public.
    • Quoted tokens in search_path are handled.
  • The definition’s name must not include a dot; it’s treated as a bare identifier and validated (public.mv would be rejected).

  • Quoting:
    • Schema and relation parts are each identifier-quoted to avoid injection and case/keyword issues.
    • Internally we centralize this in helpers that produce a qualified_rel string suitable for SQL like CREATE MATERIALIZED VIEW, REFRESH MATERIALIZED VIEW, and DROP MATERIALIZED VIEW.

If you ever hand-roll SQL, never interpolate raw strings; always use the engine’s helpers (e.g., the service’s qualified_rel) or ActiveRecord quoting methods to build "schema"."rel" safely.


Refresh strategies (recap)

  • regularREFRESH MATERIALIZED VIEW (locks reads during refresh).
  • concurrentREFRESH MATERIALIZED VIEW CONCURRENTLY (keeps reads available) requires a unique index covering all rows.
  • swap — create a temp MV from the definition SQL, atomically swap names, recreate indexes/permissions; ideal for low downtime when concurrent is not feasible.

Choose in the definition:

refresh_strategy: :regular   # or :concurrent, :swap

Writing production-ready SQL

General guidelines

  • SELECT only: Your sql should be a single SELECT (CTEs are fine).
  • Deterministic output: Prefer immutable/strict functions; avoid volatile functions (random(), clock_timestamp()) unless you truly need them.
  • No SELECT *: List columns explicitly. It stabilizes indexes, consumers, and future changes.
  • Group/aggregate carefully: Every non-aggregated column must be in GROUP BY. Keep it minimal to avoid accidental row duplication.
  • Filter early: Push filters into the base CTE/joins rather than on top of an already massive result.
  • Use CTEs for clarity: They help readability and make swap/concurrent troubleshooting easier.
  • Indexes: If you plan to refresh concurrently, design a unique key in the result set and index it after creation:

    CREATE UNIQUE INDEX CONCURRENTLY idx_mv_user_activity_user_id
    ON "public"."mv_user_activity"(user_id);
    
  • Permissions: Grant read privileges to app roles as needed after creation.

Example: simple one-table view

MatViews::MatViewDefinition.create!(
  name: 'mv_users',
  sql: <<~SQL,
    SELECT u.id, u.email, u.created_at
    FROM users u
  SQL
  refresh_strategy: :regular
)

Example: two-table aggregate

MatViews::MatViewDefinition.create!(
  name: 'mv_user_accounts',
  sql: <<~SQL,
    WITH accounts_by_user AS (
      SELECT a.user_id, COUNT(*) AS accounts_count
      FROM accounts a
      GROUP BY a.user_id
    )
    SELECT u.id AS user_id,
           COALESCE(ab.accounts_count, 0) AS accounts_count
    FROM users u
    LEFT JOIN accounts_by_user ab ON ab.user_id = u.id
  SQL
  refresh_strategy: :concurrent,
  unique_index_columns: ['user_id']
)

Example: three- and four-table aggregates (pattern)

MatViews::MatViewDefinition.create!(
  name: 'mv_user_activity',
  sql: <<~SQL,
    WITH a AS (
      SELECT user_id, COUNT(*) AS accounts_count
      FROM accounts GROUP BY user_id
    ),
    e AS (
      SELECT user_id, COUNT(*) AS events_count
      FROM events GROUP BY user_id
    ),
    s AS (
      SELECT user_id, COUNT(*) AS sessions_count
      FROM sessions GROUP BY user_id
    )
    SELECT u.id AS user_id,
           COALESCE(a.accounts_count, 0)  AS accounts_count,
           COALESCE(e.events_count, 0)    AS events_count,
           COALESCE(s.sessions_count, 0)  AS sessions_count
    FROM users u
    LEFT JOIN a ON a.user_id = u.id
    LEFT JOIN e ON e.user_id = u.id
    LEFT JOIN s ON s.user_id = u.id
  SQL
  refresh_strategy: :concurrent,
  unique_index_columns: ['user_id']
)

Unique indexes & concurrent refresh

For concurrent refresh to succeed, Postgres requires a unique index that uniquely identifies every row in the MV.

  • Pick a stable, deterministic key from your result set (e.g., user_id, or a composite).
  • Create the index after the MV is created and before you attempt a concurrent refresh:

    CREATE UNIQUE INDEX CONCURRENTLY idx_mv_user_activity_user_id
    ON "public"."mv_user_activity"(user_id);
    
  • If you use the swap strategy, ensure the swap step recreates indexes on the new view before renaming or immediately after, per your service’s implementation.

Dependencies (optional metadata)

dependencies is a JSON array you can use to document upstream tables/views/MVs:

dependencies: [
  { "type": "table", "name": "users" },
  { "type": "table", "name": "accounts" },
  { "type": "mv",    "name": "mv_users" }
]

This has no functional effect in the engine today; it’s useful for your dashboards, CI checks, or doc generation.


Validations & safety checks

  • Name validation: name must be a simple identifier (no schema). If you need a different schema, change the DB search_path (the engine resolves and validates it).
  • Existence checks: Services verify whether the MV exists when appropriate (e.g., creation vs refresh vs delete with IF EXISTS).
  • Quoting: All SQL issued by services uses safe identifier quoting for schema and relation separately.
  • Row-count strategy (refresh): :estimated (fast, via reltuples) or :exact (via COUNT(*)).

Common pitfalls & how to avoid them

  • Using a dotted name in definition.name Don’t. The engine will reject public.mv_name. Use mv_name; schema is resolved from search_path.

  • Concurrent refresh without unique index You’ll get ERROR: cannot refresh materialized view concurrently without a unique index. Design your result set to include a unique key, then index it.

  • Volatile functions in SQL Avoid random(), clock_timestamp(), etc. They can cause non-repeatable results and surprise diffs.

  • Relying on ORDER BY without LIMIT ORDER BY in a view is usually wasted work unless it’s part of a limited/top-N pattern. Sort at read time if needed.

  • Forgetting permissions Grant SELECT to your app role after creation or swap:

    GRANT SELECT ON "public"."mv_user_activity" TO app_user;
    

Creating, refreshing, deleting

Use the services programmatically, or the rake tasks:

# Create
MatViews::Services::CreateView.new(defn, force: true).run

# Refresh (strategy chosen by definition.refresh_strategy)
MatViews::RefreshViewJob.perform_later(defn.id, row_count_strategy: :estimated)

# Delete
MatViews::DeleteViewJob.perform_later(defn.id, cascade: false)

All enqueue paths go through: MatViews::Jobs::Adapter.enqueue(job_class, queue: MatViews.configuration.job_queue, args: [...]) The adapter uses your configured backend (ActiveJob / Sidekiq / Resque). It doesn’t guess.


See also