Data Storage Standards¶
S3 layer¶
Blotout creates the following six buckets in the AWS account for the purpose of proper data management:
Bucket Name | Description |
---|---|
b-[org_name]-[env]-landing | Raw Layer - All the landing/raw data will be first pushed into this bucket |
b-[org_name]-[env]-stg | Staging Layer - All the ELT data will be onboarded to this layer |
b-[org_name]-[env]-processed | Processed Layer - All the processed data or reporting models will be pushed into this bucket |
b-[org_name]-[env]-emr | EMR bucket to maintain Bootstrap files and EMR Logs |
b-[org_name]-[env]-athena-logs | Bucket to store the temporary output generated by Athena |
b-[org_name]-[env]-outbound | Bucket to store the data moving out of the lake |
Note
[org_name] refers to the organization name and [env] refers to the environment type (for example, prod or sandbox).
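The naming pattern in the table above can be sketched in Python. This is only an illustration: the helper `bucket_names` and the `LAYERS` tuple are hypothetical names introduced here, and the values `foo` and `prod` are example inputs, not real buckets.

```python
# Suffixes taken from the bucket table above.
LAYERS = ("landing", "stg", "processed", "emr", "athena-logs", "outbound")


def bucket_names(org_name: str, env: str) -> list[str]:
    """Return the six bucket names for an organization and environment.

    Hypothetical helper: it only formats strings following the
    b-[org_name]-[env]-<layer> pattern and does not call AWS.
    """
    return [f"b-{org_name}-{env}-{layer}" for layer in LAYERS]


if __name__ == "__main__":
    for name in bucket_names("foo", "prod"):
        print(name)
```

For an organization `foo` in the `prod` environment, the first name produced is `b-foo-prod-landing`.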
Schema Standards¶
Source Type | Athena Schema Name |
---|---|
ELT Sources | [source_name]_[env] |
Clickstream, DBT-generated, and reporting models | [org_name]_[env] |
Example¶
Suppose the organization name is _foo_ and the env is _prod_, and the organization has enabled two ELT pipelines, Shopify and Postgres. The following schemas are then created, and the respective data is onboarded under them:
- DBT and clickstream tables are maintained under foo_prod
- Postgres tables are maintained under postgres_prod
- Shopify tables are maintained under shopify_prod
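The schema naming rules above can be expressed as a short Python sketch. The function `athena_schemas` and the dictionary key `org` are hypothetical, and lowercasing the source name is an assumption inferred from the example (Shopify becomes shopify_prod); it is not a documented rule.

```python
def athena_schemas(org_name: str, env: str, elt_sources: list[str]) -> dict[str, str]:
    """Map each data source to the Athena schema name it would use.

    Hypothetical helper: it only builds names and does not query Athena.
    """
    # DBT, clickstream, and reporting models share the organization schema.
    schemas = {"org": f"{org_name}_{env}"}
    for source in elt_sources:
        # Assumption: source names are lowercased, as in the example above.
        schemas[source] = f"{source.lower()}_{env}"
    return schemas


if __name__ == "__main__":
    for source, schema in athena_schemas("foo", "prod", ["Shopify", "Postgres"]).items():
        print(f"{source}: {schema}")
```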
Table Description¶
Table Name | Table Type | Description |
---|---|---|
view_core_events | Online | Contains flattened, near-real-time events generated from the website that capture user behaviour |
view_users | Online + Offline | Contains a single, uniform view of each user. The system automatically unifies CRM profiles coming from multiple channels. |
unified_events | Online + Offline | Single source of truth that unifies all time-series events with stitched IDs; this data is used for segmentation |
view_id_graph | Online + Offline | Maintains the ID graph between your cookies and Map IDs and associates them with the Global ID |