Data Connector model evolution policy

Core Principle: Limit Breaking Changes

We will do everything we can to minimize breaking changes: future updates should not break your Excel sheets or BI templates, or force you to rebuild them.

To achieve this, we will apply a few basic principles.

Rule 1 - Preserve Column Order in Tables

  • For a given table, if a new column needs to be added, it will always be added at the end of the table to avoid disrupting the order of existing data.
    • For example, if you have already imported a table into Excel with id, first name, last name, and task id, and we later add email, it will be added at the end of the table.
    • This may be less "readable" because the email won't come right after the name fields, but it preserves the column order of the imported data and prevents existing formulas in the Excel file from breaking.
    • Thus, all data references based on absolute positions (like column C, or the 5th column) remain intact.

V1

| Key | Type |
| --- | --- |
| id | |
| first_name | |
| last_name | |
| task_id | |

V2

| Key | Type |
| --- | --- |
| id | |
| first_name | |
| last_name | |
| task_id | |
| email | |
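
The rule can be checked mechanically: a new version is compatible when the old column list is a prefix of the new one. The sketch below is a minimal illustration in Python, not part of the Data Connector itself; the function name `is_append_only`, the snake_case column names, and the sample email address are assumptions made for the example.

```python
def is_append_only(old_columns, new_columns):
    """Return True if every old column keeps its original position."""
    return list(new_columns[:len(old_columns)]) == list(old_columns)


# Column lists for the example (names assumed for illustration).
v1 = ["id", "first_name", "last_name", "task_id"]
v2_appended = v1 + ["email"]                                        # follows Rule 1
v2_inserted = ["id", "first_name", "last_name", "email", "task_id"]  # violates Rule 1

compatible = is_append_only(v1, v2_appended)
incompatible = is_append_only(v1, v2_inserted)

# A positional reference (the 4th column, like column D in Excel)
# still points at task_id after the append. The email is a made-up value.
row_v1 = [1, "Paul", "B.", 10]
row_v2 = [1, "Paul", "B.", 10, "paul@example.com"]
same_value = row_v1[3] == row_v2[3]
```

With the appended layout, `compatible` and `same_value` are both true; with the inserted layout, the fourth column would hold the email instead of the task id, which is exactly the break the rule prevents.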

Rule 2 - Keep Row Count Unchanged for a Given Table (Despite Denormalization)

  • If a denormalization would increase the number of rows in a table:
    • The original table must keep its row count unchanged.
    • A second, fully denormalized table will be created instead.

Example (easier to follow than the rule stated in the abstract):

Task

| Key | Type |
| --- | --- |
| id | |
| description | |

Task

| Id | description |
| --- | --- |
| 10 | Task A |
| 22 | Task B |
| 30 | Task C |

User

| Key | Type |
| --- | --- |
| id | |
| first_name | |
| last_name | |
| task_id | |

User

| id | first_name | last_name | task_id |
| --- | --- | --- | --- |
| 1 | Paul | B. | 10 |
| 2 | Arthur | D. | 10 |

Task (V2)

| Id | description | user_first_name | user_last_name |
| --- | --- | --- | --- |
| 10 | Task A | Paul | B. |
| 10 | Task A | Arthur | D. |

Example

  • In this example, both Paul and Arthur are assigned to task "10".

  • If we wanted to "link" users to tasks directly in the Task table, we would end up with the Task (V2) table shown above.

  • That table contains two rows for the same task, because task 10 has two users assigned to it.

  • This kind of row multiplication is likely in practice, since having multiple technicians on the same task is quite common, and it would break several of your dashboards, indicators, charts, etc.

    • For example, if task 10 is delayed, it would now count as two delayed tasks instead of one.
  • However, such a V2 table might still be needed for other use cases.
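
To make the double counting concrete, here is a minimal sketch in plain Python. The task and user values mirror the example tables; the `delayed` flag and the helper `join_tasks_users` are hypothetical, added only for illustration.

```python
# Sample data mirroring the example; "delayed" is an assumed flag.
tasks = [
    {"id": 10, "description": "Task A", "delayed": True},
    {"id": 22, "description": "Task B", "delayed": False},
    {"id": 30, "description": "Task C", "delayed": False},
]
users = [
    {"id": 1, "first_name": "Paul", "last_name": "B.", "task_id": 10},
    {"id": 2, "first_name": "Arthur", "last_name": "D.", "task_id": 10},
]


def join_tasks_users(tasks, users):
    """Naive join: one output row per (task, user) pair."""
    return [
        {**t, "user_first_name": u["first_name"], "user_last_name": u["last_name"]}
        for t in tasks
        for u in users
        if u["task_id"] == t["id"]
    ]


joined = join_tasks_users(tasks, users)

# Counting delayed tasks by counting rows gives different answers.
delayed_before = sum(t["delayed"] for t in tasks)   # counts task 10 once
delayed_after = sum(r["delayed"] for r in joined)   # counts task 10 twice
```

Any indicator that counts rows in the task table would move from 1 to 2 delayed tasks here, even though nothing changed in the underlying data.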

Here’s how we’ll solve this issue:

Task (V2)

| Id | description | users |
| --- | --- | --- |
| 10 | Task A | Paul B., Arthur D. |

The list of technicians is presented in an aggregated form: the number of rows remains unchanged despite the additional information.
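
The aggregation can be sketched as follows. This is an illustration under assumptions, not the connector's actual implementation: the helper name `add_users_column`, the ", " separator, and the empty string for tasks without users are all choices made for the example.

```python
from collections import defaultdict

# Sample data mirroring the example tables.
tasks = [
    {"id": 10, "description": "Task A"},
    {"id": 22, "description": "Task B"},
    {"id": 30, "description": "Task C"},
]
users = [
    {"id": 1, "first_name": "Paul", "last_name": "B.", "task_id": 10},
    {"id": 2, "first_name": "Arthur", "last_name": "D.", "task_id": 10},
]


def add_users_column(tasks, users):
    """Append an aggregated 'users' column without changing the row count."""
    names_by_task = defaultdict(list)
    for u in users:
        names_by_task[u["task_id"]].append(f'{u["first_name"]} {u["last_name"]}')
    # One output row per input task; tasks without users get an empty string.
    return [
        {**t, "users": ", ".join(names_by_task.get(t["id"], []))}
        for t in tasks
    ]


task_v2 = add_users_column(tasks, users)
row_count_unchanged = len(task_v2) == len(tasks)
```

Because the new `users` key is added last in each row, the sketch also respects Rule 1: existing columns keep their positions.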

Task_denormalized

| Id | description | user_first_name | user_last_name |
| --- | --- | --- | --- |
| 10 | Task A | Paul | B. |
| 10 | Task A | Arthur | D. |

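A new, separate denormalized table is added for the use cases that need one row per task and user; the original Task table keeps its row count.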