AWS DEA-C01 Preparation
Exam Parts and Weighting
- 34% Data Ingestion and Transformation
- 26% Data Store Management
- 22% Data Operation and Support
- 18% Data Security and Governance
Exam Details
- 130 minutes (2hr 10 minutes)
- 64 MCQ
- Ask additional time as English is second language
DE Fundamentals
- Data from Sources lands on to AWS Data Lake
- AWS Data Lake is object file storage like S3.
- Data landed as is is called RAW Data.
- Then it is read by AWS Glue using Crawler.
- Crawler crawls and builds Catalog.
- Catalog is meta-data which lets do more operations and is schema-on-read.
- Then data is
processed
andstored
as Transformed Data back into AWS Data Lake. - Again
crawler
is built on Transformed Data, and then it can be used for Data Analytics byQuickSight
,SageMaker
,Athena
orOpenSearch
flowchart LR
s1[Data Sources \n
Database
Object
IoT
Mobile]
s2[Raw Data \n
Amazon S3
]
s31[[Glue \n
Crawler
Catalog]]
s32[[Glue \n
Crawler
Catalog]]
s4[Transformed Data \n
Amazon S3
]
s5[Data Analytics \n
Amazon QuickSight
Amazon SageMaker
Amazon OpenSearch
Amazon Athena
]
s1 --> s2 --> s31 --> s4 --> s32 --> s5
subgraph sub1[Data Lake]
s2
s31
s4
s32
end
Data Lake vs Data Warehouse
Why Data Lake?
- Cost: DL is cheap as it is file storage whereas DW being RDBMS is expensive.
- DL is central repo for raw data, DW is central repo for historical data.
- Structure - DL supports any format, DW store organised and structured.
- Schema - DL is schema-on-read; DW is schema-on-write
- Type - DL can store text, images, videos; DW is good for transactional, CRM, ERP etc.
2025-01-12
January 12, 2025