
Managing data, whether for migrations or operational updates, has long required taking raw data and parsing or processing it for both OLTP (online transaction processing) and OLAP (online analytical processing) workloads. Medallion data processes are most effectively achieved in the cloud, where AWS, Azure, Databricks and other platforms provide virtualised patterns and deployments to ingest, transform and curate raw data with Python, PySpark, or out-of-the-box (OOTB) libraries and pipelines such as Azure Data Factory (ADF). The curated data can then be consumed downstream by end users, with tools ranging from Power BI to Jupyter or Colab notebooks, or by applications, databases, systems and SaaS products.
Data usage is accelerating in volume, variety, and velocity.
By segmenting data into distinct layers, a ‘Medallion Architecture’ aims to transform raw data into a strategic asset, enabling businesses to derive value from their data more efficiently and effectively.
Medallion Architecture organizes data into three distinct layers: Bronze for raw data ingestion, Silver for cleaning and basic processing, and Gold for advanced analytics and business-ready insights. The layering is akin to refining raw metal into a valuable medallion: the data gains value as it moves through each stage.
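The flow through the three layers can be illustrated with a minimal sketch in plain Python. It is not a production pipeline: on a real platform each layer would typically be a table written by PySpark or a managed pipeline. The order records, the field names, and the helper functions `to_silver` and `to_gold` are all hypothetical, chosen only to show what each layer is responsible for.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as ingested (hypothetical sample data).
bronze = [
    {"order_id": "1", "region": " north ", "amount": "120.50"},
    {"order_id": "2", "region": "South", "amount": "80"},
    {"order_id": "2", "region": "South", "amount": "80"},   # duplicate row
    {"order_id": "3", "region": "north", "amount": "n/a"},  # unparseable amount
    {"order_id": "4", "region": "SOUTH", "amount": "19.50"},
]


def to_silver(rows):
    """Silver: cast types, normalise text, drop invalid and duplicate rows."""
    seen_ids = set()
    silver_rows = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        order_id = int(row["order_id"])
        if order_id in seen_ids:
            continue  # skip duplicates of an order already kept
        seen_ids.add(order_id)
        silver_rows.append(
            {
                "order_id": order_id,
                "region": row["region"].strip().lower(),
                "amount": amount,
            }
        )
    return silver_rows


def to_gold(rows):
    """Gold: aggregate cleaned rows into a business-ready summary per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Bronze is left untouched so the raw input can always be reprocessed, Silver holds one clean row per order, and Gold holds only the aggregate that a report or dashboard would read.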
Databricks defines the medallion architecture on its website as follows:
A medallion architecture is a data design pattern used to logically organize data in a lakehouse, with the goal of incrementally and progressively improving the structure and quality of data as it flows through each layer of the architecture (from Bronze ⇒ Silver ⇒ Gold layer tables).
The Medallion Architecture is also referred to as a “multi-hop” architecture.
The idea of a “multi-hop” architecture extends the traditional three-layer system (Bronze, Silver, and Gold) into a more complex, multi-tiered data processing framework. This approach is designed to handle increasingly complex data workflows by adding further layers, each tailored to specific analytical or business needs:
