How to use Amazon EMR to process data with the broad ecosystem of Hadoop tools such as Hive and Hue, and to work with Amazon DynamoDB, Amazon Redshift, Amazon QuickSight, Amazon Athena, and Amazon Kinesis.
AWS DataSync is a data transfer service that makes it easy for you to automate moving data between on-premises storage and Amazon S3 or Amazon Elastic File System (Amazon EFS).
Amazon FSx for Lustre provides a high-performance file system optimized for fast processing of workloads such as machine learning, high performance computing (HPC), video processing, financial modeling, and electronic design automation (EDA).
AWS Glue DataBrew is a visual data preparation tool for cleaning and normalizing data to prepare it for analytics and machine learning.
Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data.
Quick Start Data Lake with SnapLogic builds a data lake environment on AWS in about 15 minutes by deploying SnapLogic components and AWS services such as Amazon Simple Storage Service (Amazon S3) and Amazon Redshift.
About Amazon EMR Releases: each release comprises different big-data applications, components, and features that you select to have Amazon EMR install and configure when you create a cluster.