At times you need to model Many-to-Many relationship that changes over time in Data Vault. For e.g., you may have HUB_CUSTOMER and HUB_ADDRESS. Several Customer may be associated with the same Address, and also multiple Addresses can be associated with a single Customer. And this can change over time. For e.g.
Day 1:
HUB_CUSTOMER_HASHKEY | HUB_ADDRESS_HASHKEY |
---|---|
111 | aaa |
111 | bbb |
111 | ccc |
Day 2:
HUB_CUSTOMER_HASHKEY | HUB_ADDRESS_HASHKEY |
---|---|
111 | aaa |
111 | bbb |
222 | ccc |
Day 3:
HUB_CUSTOMER_HASHKEY | HUB_ADDRESS_HASHKEY |
---|---|
111 | aaa |
111 | bbb |
222 | ccc |
111 | ccc |
LINK_CUSTOMER_ADDRESS will only capture the first occurrence of the relationship, but will not capture the changing nature of the relations. LINK_CUSTOMER_ADDRESS will look as follows:
LINK_CUSTOMER_ADDRESS_HK | HUB_CUSTOMER_HK | HUB_ADDRESS_HK | LOAD_DATE |
---|---|---|---|
1qaz | 111 | aaa | Day 1 |
2wsx | 111 | bbb | Day 1 |
3edc | 111 | ccc | Day 1 |
4rfv | 222 | ccc | Day 2 |
Note that the Link is only capturing the first appearance of the relationship, but nothing about the effectivity of the relationship. This is where you need the Record Tracking Satellite. A Record Tracking Satellite is a narrow table that keeps tracks of LINK Hashkeys or SAT HASHKEY for each load. It basically captures all the HASHKEYs that are valid for that load. The Record Tracking Satellite for the above Many-to-Many relationship that changes over time will look as following
LINK_CUSTOMER_ADDRESS_HK | LOAD_DATE |
---|---|
1qaz | Day 1 |
2wsx | Day 1 |
3edc | Day 1 |
1qaz | Day 2 |
2wsx | Day 2 |
4rfv | Day 2 |
1qaz | Day 3 |
2wsx | Day 3 |
4rfv | Day 3 |
3edc | Day 3 |
This way you can capture the effectivity of Many-to-Many relationships that change over time.
Data Vault Record Tracking Satellite hanging from a Link |