GitHub - sergeyprokhorenko/6NF_DSL: DSL for Bitemporal Sixth Normal Form with UUIDv7

11 min read Original article ↗

DSL for Bitemporal Sixth Normal Form with UUIDv7

Table of Contents

  1. Introduction
  2. Approximate Correspondence of Terms
  3. Create Entities
  4. Create Reference
  5. Create Simple Attribute
  6. Create Attribute with Reference
  7. Create Struct of Attributes
  8. Create Relationship
  9. Attributes Snapshot
  10. Relationship Snapshot
  11. Table Normalization
  12. EBNF Grammar
  13. DSL Implementation Example of a Simple Accounting System

1. Introduction

Sixth Normal Form (6NF) plays a crucial role in data warehouses (DWH) by breaking data down into its smallest temporal parts, enabling accurate tracking of historical and bitemporal details — capturing not only when events actually occurred but also when they were recorded in the system. 6NF easily adapts to changes in data structure without modifying existing records and reduces the volume of data that needs to be processed during updates and queries.

Here is a new, concise, Excel-friendly and autogeneratable domain-specific language (DSL) for a bitemporal sixth normal form (6NF) data warehouse (DWH) with UUIDv7 primary keys, along with equivalent PostgreSQL 18 SQL code and EBNF.

This project is inspired by Anchor Modeling, Data Vault and Activity Schema.

The DSL addresses the challenge of handling large and complex 6NF data schemas that are difficult to visualize and maintain using both traditional modeling tools and Anchor Modeler. It also eliminates the need to generate SQL code using Python or understand the confusing SQL Server code generated by Anchor Modeler.

Artificial intelligence systems should preferably use the syntax of this DSL rather than the more general and universal SQL syntax, as DSLs are designed with clear, strict rules tailored specifically to domain tasks. This helps avoid ambiguity and errors.

The author has no opportunity to develop a compiler for this DSL, and relies on the community.

2. Approximate Correspondence of Terms

DSL ERD Anchor Modeling Data Vault
Entity Entity Anchor Hub
Reference - Knot -
Simple Attribute Attribute Attribute Satellite
Attribute with Reference - Knotted Attribute -
Struct of Attributes - - Satellite
Relationship Relationship Tie Link

3. Create Entities

-- DSL
CREATE ENTITIES <entity1>, <entity2>, etc.;

-- Equivalent PostgreSQL 18 SQL
CREATE TABLE <entity1> (
    id UUID PRIMARY KEY DEFAULT uuidv7()
);

CREATE TABLE <entity2> (
    id UUID PRIMARY KEY DEFAULT uuidv7()
);

etc.

4. Create Reference

Use a Reference with caution because it is not temporal. It is safer to use Entity and Simple Attribute.

-- DSL
CREATE REFERENCE <reference> <data_type>;

-- Equivalent PostgreSQL 18 SQL
CREATE TABLE <reference> (
    id UUID PRIMARY KEY DEFAULT uuidv7(),
    value <data_type> UNIQUE NOT NULL
);

5. Create Simple Attribute

-- DSL
ENTITY <entity> HAS ATTRIBUTE <attribute> <data_type>;

-- Equivalent PostgreSQL 18 SQL
CREATE TABLE <attribute> (
    entity_id UUID NOT NULL REFERENCES <entity>(id),
    value <data_type> NOT NULL,
    valid_from TIMESTAMPTZ DEFAULT NOW(),
    recorded_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (entity_id, valid_from, recorded_at)
);

6. Create Attribute with Reference

-- DSL
ENTITY <entity> HAS ATTRIBUTE <attribute> REFERENCE <reference>;

-- Equivalent PostgreSQL 18 SQL
CREATE TABLE <attribute> (
    entity_id UUID NOT NULL REFERENCES <entity>(id),
    reference_id UUID NOT NULL REFERENCES <reference>(id),
    valid_from TIMESTAMPTZ DEFAULT NOW(),
    recorded_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (entity_id, valid_from, recorded_at)
);

7. Create Struct of Attributes

Use a Struct of Attributes for input attributes that change simultaneously - such as document or message attributes - or for output attributes of Activity Stream or other normalized data mart. For large numbers of attributes, the jsonb data type is recommended.

-- DSL
CREATE STRUCT <struct> FOR ENTITY <entity> (
<attribute> <data_type>,
-- etc.
<attribute> REFERENCE <reference>
);

-- Equivalent PostgreSQL 18 SQL
CREATE TABLE <struct> (
    entity_id UUID NOT NULL REFERENCES <entity>(id), -- for example, event_id
    <attribute> <data_type> NOT NULL, -- for example, metadata from the source
    -- etc.
    <attribute> UUID NOT NULL REFERENCES <reference>(id),
    valid_from TIMESTAMPTZ DEFAULT NOW(),
    recorded_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (entity_id, valid_from, recorded_at)
);

8. Create Relationship

-- DSL
CREATE RELATIONSHIP <relationship> OF
    <entity_or_reference_1>, 
    <entity_or_reference_2>,
    -- etc.
    <entity_or_reference_n>;

-- Equivalent PostgreSQL 18 SQL
CREATE TABLE <relationship> (
    id UUID DEFAULT uuidv7() UNIQUE,
    /*It is not recommended to create external references to this auxiliary key (id) for implementing business logic.*/
    /*Use this key only for technical purposes: logging, API, data exchange, debugging, auditing, manual analysis*/
    <entity_or_reference_1_id> UUID NOT NULL REFERENCES <entity_or_reference_1>(id),
    --For example:     user_id UUID NOT NULL REFERENCES user(id),
    <entity_or_reference_2_id> UUID NOT NULL REFERENCES <entity_or_reference_2>(id),
    -- etc.
    <entity_or_reference_n_id> UUID NOT NULL REFERENCES <entity_or_reference_n>(id),
    valid_from TIMESTAMPTZ DEFAULT NOW(),
    recorded_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (
        <entity_or_reference_1_id>, 
        <entity_or_reference_2_id>,
        -- etc.
        <entity_or_reference_n_id>,
        valid_from,
        recorded_at
    )
);

9. Attributes Snapshot

The primary key entity_id is preserved. Structs of Attributes can be used as sources alongside Simple Attributes and Attributes with Reference.

-- DSL
SELECT <attributes> FROM ATTRIBUTES OF <entity> VALID AT <valid_at> LAST RECORDED BEFORE <last_recorded_before>;

-- Equivalent PostgreSQL 18 SQL
SELECT 
    <entity>.id,
    <attribute1>.value,
    <attribute2>.value,
    <attribute3>.value
FROM <entity>
LEFT JOIN LATERAL (
    SELECT value
    FROM <attribute1>
    WHERE <attribute1>.entity_id = <entity>.id
      AND <attribute1>.valid_from <= <valid_at>
      AND <attribute1>.recorded_at <= <last_recorded_before>
      AND NOT EXISTS (
          SELECT 1
          FROM <attribute1>
          WHERE <attribute1>.entity_id = <attribute1>.entity_id
            AND <attribute1>.valid_from > <attribute1>.valid_from
            AND <attribute1>.valid_from <= <valid_at>
            AND <attribute1>.recorded_at <= <last_recorded_before>
      )
    ORDER BY <attribute1>.valid_from DESC, <attribute1>.recorded_at DESC
    LIMIT 1
) ON true
LEFT JOIN LATERAL (
    SELECT value
    FROM <attribute2>
    WHERE <attribute2>.entity_id = <entity>.id
      AND <attribute2>.valid_from <= <valid_at>
      AND <attribute2>.recorded_at <= <last_recorded_before>
      AND NOT EXISTS (
          SELECT 1
          FROM <attribute2>
          WHERE <attribute2>.entity_id = <attribute2>.entity_id
            AND <attribute2>.valid_from > <attribute2>.valid_from
            AND <attribute2>.valid_from <= <valid_at>
            AND <attribute2>.recorded_at <= <last_recorded_before>
      )
    ORDER BY <attribute2>.valid_from DESC, <attribute2>.recorded_at DESC
    LIMIT 1
) ON true
LEFT JOIN LATERAL (
    SELECT value
    FROM <attribute3>
    WHERE <attribute3>.entity_id = <entity>.id
      AND <attribute3>.valid_from <= <valid_at>
      AND <attribute3>.recorded_at <= <last_recorded_before>
      AND NOT EXISTS (
          SELECT 1
          FROM <attribute3>
          WHERE <attribute3>.entity_id = <attribute3>.entity_id
            AND <attribute3>.valid_from > <attribute3>.valid_from
            AND <attribute3>.valid_from <= <valid_at>
            AND <attribute3>.recorded_at <= <last_recorded_before>
      )
    ORDER BY <attribute3>.valid_from DESC, <attribute3>.recorded_at DESC
    LIMIT 1
) ON true
ORDER BY <entity>.id;

10. Relationship Snapshot

-- DSL
SELECT <entities_and_references> FROM RELATIONSHIP <relationship> VALID AT <valid_at> LAST RECORDED BEFORE <last_recorded_before>;

-- Equivalent PostgreSQL 18 SQL
SELECT
    id,
    <entity_or_reference_1_id>,
    <entity_or_reference_2_id>,
    -- etc.
    <entity_or_reference_n_id>,
    valid_from,
    recorded_at
FROM (
    SELECT
        id,
        <entity_or_reference_1_id>,
        <entity_or_reference_2_id>,
        -- etc.
        <entity_or_reference_n_id>,
        valid_from,
        recorded_at,
        ROW_NUMBER() OVER (
            PARTITION BY
                <entity_or_reference_1_id>,
                <entity_or_reference_2_id>
                -- etc.
                <entity_or_reference_n_id>
            ORDER BY
                valid_from DESC,
                recorded_at DESC
        ) AS rn
    FROM <relationship>
    WHERE
        valid_from <= <valid_at>
        AND recorded_at <= <last_recorded_before>
) 
WHERE rn = 1
ORDER BY <relationship>.id;

11. Table Normalization

-- DSL
NORMALIZE
    INTO <entity1> (<attribute_11>, <attribute_12>, <attribute_13>) SELECT col1, col2, col3 FROM <source_table>
    INTO <entity2> (<attribute_21>, <attribute_22>) SELECT col4, col5 FROM <source_table>
    INTO <entity3> (<attribute_31>) SELECT col6 FROM <source_table>
    etc.
RELATIONSHIPS  
    <relationship_1>, -- OF <entity1>, <entity2>
    <relationship_2>, -- OF <entity2>, <entity3>
    etc.
VALID FROM <valid_from> -- Timestamp column in <source_table>
FROM <source_table>
WHERE <condition>;


-- Equivalent PostgreSQL 18 SQL

BEGIN;

-- Insert distinct records into entity1
INSERT INTO <entity1> (id, <attribute_11>, <attribute_12>, <attribute_13>, valid_from, recorded_at)
SELECT 
    uuidv7(),
    col1,
    col2,
    col3,
    valid_from,
    NOW()
FROM (
    SELECT DISTINCT col1, col2, col3, valid_from
    FROM <source_table>
    WHERE <condition>
) AS distinct_entity1;

-- Insert distinct records into entity2
INSERT INTO <entity2> (id, <attribute_21>, <attribute_22>, valid_from, recorded_at)
SELECT 
    uuidv7(),
    col4,
    col5,
    valid_from,
    NOW()
FROM (
    SELECT DISTINCT col4, col5, valid_from
    FROM <source_table>
    WHERE <condition>
) AS distinct_entity2;

-- Insert distinct records into entity3
INSERT INTO <entity3> (id, <attribute_31>, valid_from, recorded_at)
SELECT 
    uuidv7(),
    col6,
    valid_from,
    NOW()
FROM (
    SELECT DISTINCT col6, valid_from
    FROM <source_table>
    WHERE <condition>
) AS distinct_entity3;

-- Insert records into relationship_1 by joining source_table with entities
INSERT INTO <relationship_1> (id, <entity1_id>, <entity2_id>, valid_from, recorded_at)
SELECT
    uuidv7(),
    <entity1>.id,
    <entity2>.id,
    <source_table>.valid_from,
    NOW()
FROM <source_table>
JOIN <entity1> ON
    <entity1>.<attribute_11> = <source_table>.col1 AND
    <entity1>.<attribute_12> = <source_table>.col2 AND
    <entity1>.<attribute_13> = <source_table>.col3 AND
    <entity1>.valid_from = <source_table>.valid_from
JOIN <entity2> ON
    <entity2>.<attribute_21> = <source_table>.col4 AND
    <entity2>.<attribute_22> = <source_table>.col5 AND
    <entity2>.valid_from = <source_table>.valid_from
WHERE <condition>;

-- Insert records into relationship_2 by joining source_table with entities
INSERT INTO <relationship_2> (id, <entity2_id>, <entity3_id>, valid_from, recorded_at)
SELECT
    uuidv7(),
    <entity2>.id,
    <entity3>.id,
    <source_table>.valid_from,
    NOW()
FROM <source_table>
JOIN <entity2> ON
    <entity2>.<attribute_21> = <source_table>.col4 AND
    <entity2>.<attribute_22> = <source_table>.col5 AND
    <entity2>.valid_from = <source_table>.valid_from
JOIN <entity3> ON
    <entity3>.<attribute_31> = <source_table>.col6 AND
    <entity3>.valid_from = <source_table>.valid_from
WHERE <condition>;

COMMIT;

12. EBNF Grammar

Terminals: Basic Characters and Literals

letter
    : 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I' | 'J' | 'K' | 'L' | 'M'
    | 'N' | 'O' | 'P' | 'Q' | 'R' | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z'
    | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i' | 'j' | 'k' | 'l' | 'm'
    | 'n' | 'o' | 'p' | 'q' | 'r' | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z'
    ;

digit
    : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
    ;

string
    : '\'' [^']* '\''
    ;

timestamp
    : string
    ;

number
    : digit+
    ;

Identifier and Related Lists

identifier
    : letter (letter | digit | '_')*
    ;

identifier_list
    : identifier
    | identifier_list ',' identifier
    ;

selection
    : '*'
    | identifier_list
    ;

Data Types

data_type
    : 'UUID'
    | 'INT'
    | 'BIGINT'
    | 'TEXT'
    | 'VARCHAR' '(' number ')'
    | 'NUMERIC' '(' number ',' number ')'
    | 'TIMESTAMPTZ'
    ;

Create Entities

create_entities
    : 'CREATE' 'ENTITIES' identifier_list ';'
    ;

Create Reference

create_reference
    : 'CREATE' 'REFERENCE' identifier data_type ';'
    ;

Create Attribute

create_attribute
    : 'ENTITY' identifier 'HAS' 'ATTRIBUTE' identifier data_type ';'
    | 'ENTITY' identifier 'HAS' 'ATTRIBUTE' identifier 'REFERENCE' identifier ';'
    ;

Create Struct of Attributes

create_struct
    : 'CREATE' 'STRUCT' identifier 'FOR' 'ENTITY' identifier '(' struct_items ')' ';'
    ;

struct_items
    : struct_item
    | struct_items ',' struct_item
    ;

struct_item
    : identifier data_type
    | identifier 'REFERENCE' identifier
    ;

Create Relationship

create_relationship
    : 'CREATE' 'RELATIONSHIP' identifier 'OF' identifier_list ';'
    ;

Attributes Snapshot

select_attributes
    : 'SELECT' selection 'FROM' 'ATTRIBUTES' 'OF' identifier
      'VALID' 'AT' timestamp 'LAST' 'RECORDED' 'BEFORE' timestamp ';'
    ;

Relationship Snapshot

select_relationship
    : 'SELECT' selection 'FROM' 'RELATIONSHIP' identifier
      'VALID' 'AT' timestamp 'LAST' 'RECORDED' 'BEFORE' timestamp ';'
    ;

Table Normalization

normalize
    : 'NORMALIZE' into_clauses 'RELATIONSHIPS' identifier_list
      'VALID' 'FROM' identifier 'FROM' identifier where_clause ';'
    ;

into_clauses
    : into_clause
    | into_clauses into_clause
    ;

into_clause
    : 'INTO' identifier '(' identifier_list ')' 'SELECT' identifier_list 'FROM' identifier
    ;

where_clause
    : /* empty */
    | 'WHERE' condition
    ;

condition
    : expression
    ;

expression
    : term comp_op term
    ;

comp_op
    : '='
    | '<'
    | '>'
    | '<='
    | '>='
    | '<>'
    | '!='
    | 'LIKE'
    | 'ILIKE'
    ;

term
    : identifier
    | string
    | number
    ;

13. DSL Implementation Example of a Simple Accounting System

Create Entities

CREATE ENTITIES account, currency, document, counterparty;

Create References

CREATE REFERENCE amount NUMERIC;  -- positive/negative values indicate debit/credit
CREATE REFERENCE date TIMESTAMPTZ;
CREATE REFERENCE description TEXT;

Create Attributes

-- Currency attributes
ENTITY currency HAS ATTRIBUTE currency_code TEXT;
ENTITY currency HAS ATTRIBUTE currency_name TEXT;

-- Document attributes  
ENTITY document HAS ATTRIBUTE document_type TEXT;
ENTITY document HAS ATTRIBUTE document_status TEXT;
ENTITY document HAS ATTRIBUTE document_desc TEXT;

-- Account attributes
ENTITY account HAS ATTRIBUTE account_code TEXT;
ENTITY account HAS ATTRIBUTE account_name TEXT;
ENTITY account HAS ATTRIBUTE account_type TEXT;

-- Counterparty attributes
ENTITY counterparty HAS ATTRIBUTE counterparty_name TEXT;
ENTITY counterparty HAS ATTRIBUTE counterparty_code TEXT;
ENTITY counterparty HAS ATTRIBUTE counterparty_type TEXT;

Create Struct of Attributes

-- Document metadata
CREATE STRUCT document_metadata FOR ENTITY document (
    document_number TEXT,
    document_date DATE
);

Create Relationship: Accounting Entry

CREATE RELATIONSHIP entry OF
    currency,
    document, 
    account,
    counterparty
    amount,
    date,
    description;

Attributes Snapshots

-- Snapshot of attributes of currency
SELECT * FROM ATTRIBUTES OF currency 
VALID AT '2024-12-31' 
LAST RECORDED BEFORE '2025-01-01';

-- Snapshot of attributes of date
SELECT document_number, document_type, document_status 
FROM ATTRIBUTES OF document 
VALID AT '2024-06-30'
LAST RECORDED BEFORE '2024-07-01';

-- Snapshot of attributes of account
SELECT account_code, account_name, account_type
FROM ATTRIBUTES OF account
VALID AT '2024-12-31'
LAST RECORDED BEFORE '2025-01-01';

Relationship Snapshot

SELECT *
FROM RELATIONSHIP entry
VALID AT '2024-12-31'
LAST RECORDED BEFORE '2025-01-01';

Table Normalization

NORMALIZE
    INTO currency (currency_code, currency_name) 
    SELECT src_currency_code, src_currency_name FROM transaction
    
    INTO document (document_number, document_date, document_type, document_status, document_desc)
    SELECT src_document_number, src_document_date, src_doc_type, src_doc_status, src_document_desc FROM transaction
    
    INTO account (account_code, account_name, account_type)
    SELECT src_account_code, src_account_name, src_account_type FROM transaction
    
    INTO counterparty (counterparty_name, counterparty_code, counterparty_type)
    SELECT src_counterparty_name, src_counterparty_code, src_counterparty_type FROM transaction

RELATIONSHIPS
    entry  -- OF currency, document, account, counterparty, amount, date, description

VALID FROM transaction.date  -- It is assumed that such a column exists
FROM transaction
WHERE transaction.status = 'VALIDATED';