Sample data · SimEDW

Amazing Ecommerce.

An Amazon-scale, omnichannel ecommerce demo warehouse: 45 vendor source systems and 79 bronze tables in a single ClickHouse database, with realistic trends and a live event stream.

45
Vendor source systems
79
Bronze tables
5 yrs
Rolling history
60s
Live stream cycle

Company profile

Fictional company notice: "Amazing" is an entirely fictitious company created solely for demonstration purposes. It is not affiliated with, endorsed by, or intended to represent Amazon.com, Inc. or any real company.

Amazing is a global omnichannel retailer selling tens of millions of products across its own first-party inventory, a third-party seller marketplace, and a Prime subscription program. Customers shop via the web storefront, mobile app, and physical retail locations; orders are fulfilled from a network of fulfillment centers and drop-ship suppliers worldwide.

The business spans advertising, financial services, and a B2B wholesale arm alongside its consumer commerce operations — generating roughly 2,800 order headers and 4,200 lines on a typical weekday, with a Prime Day demand surge (July 11–13) across retail, web, marketing, and marketplace channels.

Sample entities: The table below shows representative business entities and their source systems; the complete EDW includes additional entities, sub-types, and specialized tables (e.g., returns, subscriptions, recommendations, logistics tracking) across all 79 bronze tables.

EntitySource systemExample tables
Customer (CRM)salesforceaccount · contact · address
Prime / subscriptionzuoraprime_membership · subscription_event
SKU / productakeneo_pimsku · category · brand
Order (OMS)manhattan_omsorder_header · order_line · return_authorization
Warehouse (WMS)manhattan_wmsstock_balance · movement
Paymentsstripepayment · refund
GL / COAsap_s4gl_account · journal_line
Support / VoCzendesk · qualtricsticket · nps_response

Business value chain

Capability layers left-to-right; arrows are primary flows.

Sample view: This illustrates the primary value chain across the organization; the full generated EDW includes all 45 source systems with cross-functional integrations and secondary data flows not shown here.

%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
  subgraph cust["Customer & growth"]
    direction TB
    SF[salesforce]
    ZU[zuora Prime]
    HS[hubspot]
    GA[analytics]
    SF ~~~ ZU
    ZU ~~~ HS
    HS ~~~ GA
  end
  subgraph product["Product"]
    direction TB
    PIM[akeneo_pim]
    ADS[amazon_ads]
    PIM ~~~ ADS
  end
  subgraph commerce["Commerce"]
    direction TB
    OMS[manhattan_oms]
    STR[stripe]
    SC[seller_central]
    OMS ~~~ STR
    STR ~~~ SC
  end
  subgraph supply["Supply chain"]
    direction TB
    ARIBA[sap_ariba]
    WMS[manhattan_wms]
    OTM[oracle_otm]
    ARIBA ~~~ WMS
    WMS ~~~ OTM
  end
  subgraph corp["Corporate"]
    direction TB
    WD[workday]
    S4[sap_s4]
    WD ~~~ S4
  end
  cust --> product --> commerce --> supply --> corp
          

Functional org by department

Enterprise HQ and direct-report functions with representative source systems.

Sample org structure: This shows the primary reporting lines; the full Amazing EC organization includes additional teams, sub-departments, and matrix roles across all 45 vendor integrations.

flowchart TB
  CEO[Amazing EC]
  CEO --> COM[Commerce]
  CEO --> MKT[Marketing]
  CEO --> SC[Supply chain]
  CEO --> FIN[Finance]
  CEO --> HR[HR]
  CEO --> OPS[Operations]
  CEO --> LEG[Legal & security]
          

EDW maturity stages

Catalog stage groups tables from foundation through corporate.

Sample breakdown shown: This is a representative slice of the full 79-table inventory distributed across maturity layers; the actual distribution reflects the complete data lineage from foundation through corporate gold tables.

flowchart LR
  F["foundation
~21 tables"] C["commercial
~11 tables"] R["revenue
~15 tables"] O["operations
~14 tables"] CO["corporate
~18 tables"] F --> C --> R --> O F -.-> CO R -.-> CO

Core business entities

Logical relationships and anchor keys. Bronze values differ by source until VibeBI silver conforms them.

Sample diagram: This entity relationship diagram shows the core logical model; the full generated EDW contains 79 bronze tables across 45 vendor systems with thousands of additional detail tables and lineage branches.

erDiagram
  CUSTOMER ||--o{ ORDER : places
  CUSTOMER ||--o{ SUBSCRIPTION : prime
  CUSTOMER ||--o{ PAYMENT : pays
  PRODUCT ||--o{ ORDER_LINE : contains
  ORDER ||--|{ ORDER_LINE : lines
  ORDER ||--o{ RETURN : may
  ORDER ||--o{ SHIPMENT : fulfills
  SELLER ||--o{ ORDER_LINE : fulfills
  SUPPLIER ||--o{ PO : issues
  LOCATION ||--o{ INVENTORY : stocks
          

Source system landscape

45 vendor prefixes, 79 bronze tables.

Sample systems shown: The diagram below illustrates representative vendor systems; the complete generated dataset includes all 45 systems with their full table lineage and integration points across 5-year history and live streams.

Each table follows the {source_system_}__{entity} naming convention (e.g. salesforce__account, manhattan_oms__order_header), preserving the vendor-native column shape and foreign keys from each product's real export schema. VibeBI infers master-data conformance when it promotes records to silver and gold.

flowchart TB
  subgraph foundation["Foundation"]
    direction TB
    f1["salesforce · workday
akeneo_pim · zuora"] end subgraph commercial["Commercial"] direction TB c1["hubspot · marketo
amazon_ads · cpq"] end subgraph revenue["Revenue"] direction TB r1["manhattan_oms · stripe
oracle_billing"] end subgraph operations["Operations"] direction TB o1["manhattan_wms · sap_ariba
oracle_otm"] end subgraph corporate["Corporate"] direction TB co1["sap_s4 · adp
servicenow · okta"] end foundation --> commercial --> revenue --> operations foundation --> corporate

End-to-end data flow

SimEDW bronze in ClickHouse; VibeBI semantic → silver → gold on the same instance.

Simplified architecture: This diagram shows the primary data path; the actual pipeline includes all 45 upstream systems feeding into the 79 bronze tables, with branching silver and gold lineage for each business capability.

flowchart LR
  subgraph upstream["Upstream apps"]
    direction TB
    CRM["CRM · OMS · WMS"]
    ERP["ERP · Ads · POS"]
    CRM ~~~ ERP
  end
  subgraph simedw["SimEDW"]
    direction TB
    GEN["generate.py"]
    CHB[("ClickHouse amazing
79 bronze tables")] end subgraph vibebi["VibeBI"] direction TB SEM[("semantic")] SIL[("silver views")] GLD[("gold views")] end DASH["Dashboards & agents"] ERP --> GEN --> CHB CHB --> SEM --> SIL --> GLD --> DASH

Live bronze cycle

./live-amazing — 60s run_cycle loop.

Live frequency: Each cycle runs every 60 seconds, appending ~2,800 order headers, ~3,200 warehouse movements, and other high-frequency transaction and event tables (commerce realtime, payment, shipment, and warehouse streams); snapshots refresh once per calendar day.

flowchart LR
  START([tick]) --> RET[Retention purge]
  RET --> SCHEMA[ensure_schema]
  SCHEMA --> SNAP{Snapshot today?}
  SNAP -->|yes| REF[Reload snapshots]
  REF --> STR[Stream txn + event]
  SNAP -->|no| STR
  STR --> WAIT[Sleep] --> START
          

Live event graph

Stateful commerce chain: checkout, fulfillment, delivery, and service recovery.

Core order lifecycle: This shows the primary happy-path events; the full event graph includes edge cases, cancellations, replacements, partial fulfillment, and cross-channel scenarios tracked across live event streams.

flowchart LR
  OC[order_created] --> PO{payment?}
  PO -->|ok| PA[payment_authorized]
  PO -->|fail| PF[payment_failed]
  PA --> INV[invoice_created]
  PA --> PICK[pick_started]
  PICK --> PACK[pack_completed]
  PACK --> SHIP[shipment_created]
  SHIP --> DEL[delivered]
  DEL -.-> REVW[review_submitted]
  DEL -.-> RET[return_requested]
  RET --> RFD[refund_issued]
          

Bronze table categories

Same lineage envelope on every table; category drives live behavior.

Sample categories: All 79 bronze tables follow this three-category pattern (snapshot, transaction, event) with consistent metadata and state tracking; the diagram illustrates the pattern used across the full warehouse.

flowchart LR
  subgraph snap["snapshot"]
    direction TB
    S1["Master / dimension"]
    S2[Daily full refresh]
  end
  subgraph txn["transaction"]
    direction TB
    T1["Orders · payments · POs"]
    T2[Append each tick]
  end
  subgraph evt["event"]
    direction TB
    E1["Sessions · movements"]
    E2[Append each tick]
  end
  BRONZE[("amazing bronze row")]
  snap --> BRONZE
  txn --> BRONZE
  evt --> BRONZE