
Relational, Document, Key Value Databases: summary, use cases
There is a wide variety of databases. In cloud and hybrid architectures you will often see the following: Relational DB: a structured format with rows and columns, re…

A useful architecture for moving data from on-premises to AWS is to write outputs to AWS S3, sending the data directly over a Direct Connect link to S3 in AWS. This …
AWS Database Migration Service (DMS) is a mature service for moving on-premises data to the AWS cloud, including to an S3 data lake. It is not recommended that fi…
The problems with data pipelines and the hydration of a data lake include: Data teams often end up with technical debt surrounding CI/CD, IaS, observability, and …
Data Operations (‘DataOps’) is inspired by the Agile-based ‘Development Operations’ (‘DevOps’) model. The DevOps model, which usually includes security (DevSecOps…
The icebergth is hereth. Apache Iceberg is an open-source table format for large-scale data systems, designed to provide efficient and reliable management of s…
Databricks LakeFlow is built on top of Databricks Workflows and Delta Live Tables. It provides orchestration comparable to Apache Airflow, built into the Databricks eco syst…
A straightforward method to automate data ingestion from S3 buckets (data lake) to a Redshift (data warehouse) cluster is to use Glue. Create a Redshift cluster…
[Data engineering lifecycle from “Fundamentals of Data Engineering” by Joe Reis and Matt Housley] Data Ingestion Challenges: Data ingestion can be complicated. There are usu…