An excellent read on Bitemporal Data: tedamoh.com/en/konferenzen/181-flashback-to-the-data-vault-user-group-in-may-22
Link Hashkeys should be generated using the Business Keys participating in the Link, and not the Hashkeys of the Hubs. Hashing a hash is just bad design.
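A minimal sketch of this rule in Python. The MD5 algorithm, the `||` delimiter, the trim and upper-case normalization, and the helper name `link_hashkey` are all assumptions made for illustration; they are not prescribed by the post.

```python
import hashlib


def link_hashkey(*business_keys, delimiter="||"):
    """Hash the Business Keys participating in the Link (hypothetical helper)."""
    # Normalize each Business Key so that formatting noise does not change the hash.
    normalized = [str(bk).strip().upper() for bk in business_keys]
    # Hash the delimited Business Keys directly, not the Hub Hashkeys.
    payload = delimiter.join(normalized).encode("utf-8")
    return hashlib.md5(payload).hexdigest().upper()


# Example: a Link between a Customer Hub and an Order Hub (made-up keys).
customer_bk = "C100"
order_bk = "O200"
print(link_hashkey(customer_bk, order_bk))
```

Because the input is the Business Keys themselves, the Link Hashkey can be computed straight from the staged record without first looking up or computing the Hub Hashkeys.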
A Satellite contains the descriptive attributes of a Business Key. As such, a Satellite HASHDIFF should be constructed using only the descriptive attributes of the Business Key. The Business Key itself should not be part of the Satellite HASHDIFF. Note: While it is a common practice to include the Business Keys in the SAT … Data Vault Anti-pattern: Including Business Keys in the SAT HASHDIFF
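The idea can be sketched as follows. The column names, the MD5 algorithm, the `||` delimiter, and the helper name `sat_hashdiff` are hypothetical choices for illustration only.

```python
import hashlib


def sat_hashdiff(record, descriptive_columns, delimiter="||"):
    """Build a HASHDIFF from the descriptive attributes only (hypothetical helper)."""
    # NULLs become empty strings; the Business Key column is never read here
    # because it is not listed in descriptive_columns.
    values = [
        "" if record.get(col) is None else str(record[col]).strip()
        for col in descriptive_columns
    ]
    return hashlib.md5(delimiter.join(values).encode("utf-8")).hexdigest().upper()


# Assumed layout: customer_id is the Business Key, the rest is descriptive.
DESCRIPTIVE_COLUMNS = ["name", "city"]

row_a = {"customer_id": "C100", "name": "Alice", "city": "London"}
row_b = {"customer_id": "C200", "name": "Alice", "city": "London"}

# Same descriptive attributes give the same HASHDIFF, whatever the Business Key.
print(sat_hashdiff(row_a, DESCRIPTIVE_COLUMNS) == sat_hashdiff(row_b, DESCRIPTIVE_COLUMNS))
```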
All new columns added to the SAT must be added to the end of the Satellite HASHDIFF. Dan Linstedt describes this as, “Columns that are NULL and at the end of the table are not added to the input of the hash function”. Basically this pattern prevents reloading of the entire dataset when there is … Data Vault Anti-pattern: Adding column to the middle of the hashdiff
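One way to picture the quoted rule is a hash input that drops trailing NULL columns, so that a column appended to the end does not change the HASHDIFF of rows where it is still NULL. This is only a sketch under assumptions: the `||` delimiter, MD5, the treatment of empty strings as NULL, and the helper names are illustrative and not taken from the post.

```python
import hashlib


def hashdiff_input(values, delimiter="||"):
    """Join column values, leaving out NULLs at the end of the column list."""
    parts = ["" if v is None else str(v).strip() for v in values]
    # Trailing NULLs (and, in this sketch, trailing empty strings) are not
    # added to the input of the hash function.
    while parts and parts[-1] == "":
        parts.pop()
    return delimiter.join(parts)


def hashdiff(values):
    return hashlib.md5(hashdiff_input(values).encode("utf-8")).hexdigest().upper()


old_row = ["Alice", "London"]           # before the new column existed
appended = ["Alice", "London", None]    # new column added at the end, still NULL
inserted = ["Alice", None, "London"]    # new column added in the middle

print(hashdiff(old_row) == hashdiff(appended))  # end of list: hash is stable
print(hashdiff(old_row) == hashdiff(inserted))  # middle: hash changes
```

With the column appended at the end, existing rows keep their HASHDIFF; with the column placed in the middle, every existing row would appear changed.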
When using a Change Data Capture (CDC) tool, it is possible to get multiple records for the same Business Key in a single micro-batch. Loading this data as-is will result in multiple records for the same Business Key for the same LOAD_DATE. This is an incorrect loading pattern. A regular Satellite should have only … Data Vault Anti-pattern: Using Multi-Active SAT to model data with multiple records for the same Business Key that arrive one Micro-batch
A Satellite, by definition, should have only one record per Business Key per Load Date. The Business Key (or the hash of the Business Key) + LOAD_DATE is the unique key for the record. BK + LOAD_DATE is the Primary Key of the Satellite. The exception is a Multi-Active Satellite where a Sequence Number is added to … Data Vault Anti-pattern: Having two or more records in a SAT for a Single Business Key with the same LOAD_DATE
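A small check for this rule might look like the sketch below. The row layout, the column names `hub_hashkey` and `load_date`, and the helper name `find_duplicate_sat_keys` are assumptions for illustration.

```python
from collections import Counter


def find_duplicate_sat_keys(rows):
    """Return (hub_hashkey, load_date) pairs that occur more than once."""
    counts = Counter((row["hub_hashkey"], row["load_date"]) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up Satellite rows; the first two violate the one-record-per-key rule.
sat_rows = [
    {"hub_hashkey": "AAA", "load_date": "2022-05-01T10:00:00", "city": "London"},
    {"hub_hashkey": "AAA", "load_date": "2022-05-01T10:00:00", "city": "Paris"},
    {"hub_hashkey": "BBB", "load_date": "2022-05-01T10:00:00", "city": "Berlin"},
]

for key in find_duplicate_sat_keys(sat_rows):
    print("Duplicate Satellite key:", key)
```

An empty result means every Business Key (or its hash) has at most one record per LOAD_DATE, as a regular Satellite requires.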
Non-historised Links represent data that cannot change or be deleted, e.g. Stock Trades, Medical Test results, etc. This data, once recorded, should not change. As such, there is no need to capture the End Date of these. If you find yourself having the need to add the End Date to a LINK table, then … Data Vault Anti-pattern: Adding Relationship Temporality to a Non-historised LINK
If your LINKs have Dependent Children, e.g. an Order Line Item, it is crucial that the Dependent Child is included in the LINK Hashkey. If the Dependent Child is not included in the generation of the LINK Hashkey, JOINs to the LINK table become difficult.
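To illustrate why the Dependent Child belongs in the hash input, here is a minimal sketch. The order and product keys, the MD5 algorithm, the `||` delimiter, and the helper name `link_hashkey_with_child` are hypothetical.

```python
import hashlib


def link_hashkey_with_child(business_keys, dependent_children=(), delimiter="||"):
    """Hash the Link's Business Keys together with any Dependent Child values."""
    parts = [str(p).strip().upper() for p in (*business_keys, *dependent_children)]
    return hashlib.md5(delimiter.join(parts).encode("utf-8")).hexdigest().upper()


# Two line items of the same order for the same product (made-up data).
line_1 = {"order_id": "O200", "product_id": "P7", "line_item": 1}
line_2 = {"order_id": "O200", "product_id": "P7", "line_item": 2}


def keys(row):
    return [row["order_id"], row["product_id"]]


# Without the Dependent Child, both line items collapse into one Link Hashkey.
print(link_hashkey_with_child(keys(line_1)) == link_hashkey_with_child(keys(line_2)))

# With the Dependent Child, each line item gets its own Link Hashkey.
print(
    link_hashkey_with_child(keys(line_1), [line_1["line_item"]])
    == link_hashkey_with_child(keys(line_2), [line_2["line_item"]])
)
```

When the line item is part of the hash input, a single equality join on the Link Hashkey identifies one line item; otherwise the join also needs an extra predicate on the Dependent Child column.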
Transactional Data that does not change, e.g. sensor data, stock trades, call center call data logs, medical test results, event logs, etc., should reside in a Non-historized Link (NHL), aka Transaction Link. There is no point in using a Historized Link to store data that cannot change. All of the attributes of the Transaction … Data Vault Anti-pattern: Using Historized Links to store Transactional data that does not change
While it is tempting to implement Business Rules at the Infomart Level, that is not where the Business Rules should reside. They should reside in the Business Vault. This enables historisation of the Business Rules and introduces auditability. When the Business Rule changes, with historisation it is possible to go back in time and analyze the … Data Vault Anti-pattern: Implementing Business Rules at the Infomart Level