Practice Free DAS-C01 Exam Online Questions
A company wants to build a real-time data processing and delivery solution for streaming data. The data is being streamed through an Amazon Kinesis data stream. The company wants to use an Apache Flink application to process the data before writing the data to another Kinesis data stream. The data must be stored in an Amazon S3 data lake every 60 seconds for further analytics.
Which solution will meet these requirements with the LEAST operational overhead?
- A . Host the Flink application on an Amazon EMR cluster. Use Amazon Kinesis Data Firehose to write the data to Amazon S3.
- B . Host the Flink application on Amazon Kinesis Data Analytics. Use AWS Glue to write the data to Amazon S3.
- C . Host the Flink application on an Amazon EMR cluster. Use AWS Glue to write the data to Amazon S3.
- D . Host the Flink application on Amazon Kinesis Data Analytics. Use Amazon Kinesis Data Firehose to write the data to Amazon S3.
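For illustration of option D, a Kinesis Data Firehose delivery stream can read from the output Kinesis data stream and buffer to Amazon S3 on a 60-second interval. The sketch below is illustrative only; the stream name, bucket, and role ARNs are placeholders, not values from the question.

```python
import boto3

firehose = boto3.client("firehose")

# Hypothetical ARNs; Firehose flushes to S3 when either buffering hint is reached.
firehose.create_delivery_stream(
    DeliveryStreamName="clickstream-to-s3",
    DeliveryStreamType="KinesisStreamAsSource",
    KinesisStreamSourceConfiguration={
        "KinesisStreamARN": "arn:aws:kinesis:us-east-1:111122223333:stream/processed-stream",
        "RoleARN": "arn:aws:iam::111122223333:role/firehose-read-role",
    },
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::111122223333:role/firehose-write-role",
        "BucketARN": "arn:aws:s3:::example-data-lake",
        "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 5},
    },
)
```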
A data analytics specialist is building an automated ETL ingestion pipeline using AWS Glue to ingest compressed files that have been uploaded to an Amazon S3 bucket. The ingestion pipeline should support incremental data processing.
Which AWS Glue feature should the data analytics specialist use to meet this requirement?
- A . Workflows
- B . Triggers
- C . Job bookmarks
- D . Classifiers
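As a sketch of option C, job bookmarks are enabled per job with the --job-bookmark-option argument so that each run picks up only data that has not been processed yet. The job, role, and script names below are hypothetical.

```python
import boto3

glue = boto3.client("glue")

# Hypothetical names; the default argument turns on job bookmarks for every run.
glue.create_job(
    Name="ingest-compressed-files",
    Role="arn:aws:iam::111122223333:role/glue-etl-role",
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://example-scripts/ingest.py",
        "PythonVersion": "3",
    },
    DefaultArguments={"--job-bookmark-option": "job-bookmark-enable"},
)
```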
A company uses Amazon Redshift for its data warehouse. The company is running an ETL process that receives data in parts from five third-party providers. The data parts contain independent records that relate to one specific job. The company receives the data parts at various times throughout each day.
A data analytics specialist must implement a solution that loads the data into Amazon Redshift only after the company receives all five data parts.
Which solution will meet these requirements?
- A . Create an Amazon S3 bucket to receive the data. Use S3 multipart upload to collect the data from the different sources and to form a single object before loading the data into Amazon Redshift.
- B . Use an AWS Lambda function that is scheduled by cron to load the data into a temporary table in Amazon Redshift. Use Amazon Redshift database triggers to consolidate the final data when all five data parts are ready.
- C . Create an Amazon S3 bucket to receive the data. Create an AWS Lambda function that is invoked by S3 upload events. Configure the function to validate that all five data parts are gathered before the function loads the data into Amazon Redshift.
- D . Create an Amazon Kinesis Data Firehose delivery stream. Program a Python condition that will invoke a buffer flush when all five data parts are received.
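Option C could look roughly like the following Lambda sketch, assuming the five providers drop their parts under a shared S3 prefix per job and the load runs through the Amazon Redshift Data API; every name, table, and role below is hypothetical.

```python
import boto3

s3 = boto3.client("s3")
redshift_data = boto3.client("redshift-data")

EXPECTED_PARTS = 5

def lambda_handler(event, context):
    # Triggered by an S3 upload event; derive the job prefix from the uploaded key.
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = event["Records"][0]["s3"]["object"]["key"]
    prefix = key.rsplit("/", 1)[0] + "/"

    # Load into Amazon Redshift only when all five data parts have arrived.
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    if listing.get("KeyCount", 0) < EXPECTED_PARTS:
        return {"status": "waiting", "parts": listing.get("KeyCount", 0)}

    redshift_data.execute_statement(
        ClusterIdentifier="example-cluster",
        Database="dev",
        DbUser="etl_user",
        Sql=f"COPY jobs_staging FROM 's3://{bucket}/{prefix}' "
            "IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-copy-role' FORMAT AS JSON 'auto';",
    )
    return {"status": "loaded", "prefix": prefix}
```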
A large marketing company needs to store all of its streaming logs and create near-real-time dashboards. The dashboards will be used to help the company make critical business decisions and must be highly available.
Which solution meets these requirements?
- A . Store the streaming logs in Amazon S3 with replication to an S3 bucket in a different Availability Zone. Create the dashboards by using Amazon QuickSight.
- B . Deploy an Amazon Redshift cluster with at least three nodes in a VPC that spans two Availability Zones. Store the streaming logs and use the Redshift cluster as a source to create the dashboards by using Amazon QuickSight.
- C . Store the streaming logs in Amazon S3 with replication to an S3 bucket in a different Availability Zone. Every time a new log is added in the bucket, invoke an AWS Lambda function to update the dashboards in Amazon QuickSight.
- D . Store the streaming logs in Amazon OpenSearch Service deployed across three Availability Zones and with three dedicated master nodes. Create the dashboards by using OpenSearch Dashboards.
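Option D corresponds to a domain configuration along these lines; the domain name, engine version, instance types, and storage settings are assumptions chosen only for illustration.

```python
import boto3

opensearch = boto3.client("opensearch")

opensearch.create_domain(
    DomainName="streaming-logs",          # hypothetical name
    EngineVersion="OpenSearch_2.11",      # illustrative version
    ClusterConfig={
        "InstanceType": "r6g.large.search",
        "InstanceCount": 3,
        "ZoneAwarenessEnabled": True,
        "ZoneAwarenessConfig": {"AvailabilityZoneCount": 3},
        "DedicatedMasterEnabled": True,
        "DedicatedMasterType": "m6g.large.search",
        "DedicatedMasterCount": 3,
    },
    EBSOptions={"EBSEnabled": True, "VolumeType": "gp3", "VolumeSize": 100},
)
```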
A company is hosting an enterprise reporting solution with Amazon Redshift. The application provides reporting capabilities to three main groups: an executive group to access financial reports, a data analyst group to run long-running ad-hoc queries, and a data engineering group to run stored procedures and ETL processes. The executive team requires queries to run with optimal performance. The data engineering team expects queries to take minutes.
Which Amazon Redshift feature meets the requirements for this task?
- A . Concurrency scaling
- B . Short query acceleration (SQA)
- C . Workload management (WLM)
- D . Materialized views
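Option C (manual workload management) could be applied through a cluster parameter group, as in this sketch; the user groups, concurrency, and memory percentages are illustrative only, with the trailing entry acting as the default queue.

```python
import json
import boto3

redshift = boto3.client("redshift")

# Illustrative queues: one per user group, memory and concurrency chosen arbitrarily.
wlm_config = [
    {"user_group": ["executive"], "query_concurrency": 5, "memory_percent_to_use": 30},
    {"user_group": ["data_analyst"], "query_concurrency": 5, "memory_percent_to_use": 30},
    {"user_group": ["data_engineering"], "query_concurrency": 5, "memory_percent_to_use": 30},
    {"query_concurrency": 5},  # default queue
]

redshift.modify_cluster_parameter_group(
    ParameterGroupName="reporting-wlm",   # hypothetical parameter group
    Parameters=[{
        "ParameterName": "wlm_json_configuration",
        "ParameterValue": json.dumps(wlm_config),
        "ApplyType": "dynamic",
    }],
)
```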
An ecommerce company ingests a large set of clickstream data in JSON format and stores the data in Amazon S3. Business analysts from multiple product divisions need to use Amazon Athena to analyze the data. The company's analytics team must design a solution to monitor each product division's daily Athena data usage. The solution also must produce a warning when a division exceeds its quota.
Which solution will meet these requirements with the LEAST operational overhead?
- A . Use a CREATE TABLE AS SELECT (CTAS) statement to create separate tables for each product division. Use AWS Budgets to track Athena usage. Configure a threshold for the budget. Use Amazon Simple Notification Service (Amazon SNS) to send notifications when thresholds are breached.
- B . Create an AWS account for each division. Provide cross-account access to an AWS Glue Data Catalog to all the accounts. Set an Amazon CloudWatch alarm to monitor Athena usage. Use Amazon Simple Notification Service (Amazon SNS) to send notifications.
- C . Create an Athena workgroup for each division. Configure a data usage control for each workgroup with a time period of 1 day. Configure an action to send notifications to an Amazon Simple Notification Service (Amazon SNS) topic.
- D . Create an AWS account for each division. Configure an AWS Glue Data Catalog in each account. Set an Amazon CloudWatch alarm to monitor Athena usage. Use Amazon Simple Notification Service (Amazon SNS) to send notifications.
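Option C maps to one Athena workgroup per division. The sketch below only creates a workgroup with a per-query scan limit; the daily per-workgroup data usage control and its SNS notification action described in the option are configured on top of the workgroup, and the names and limits here are assumptions.

```python
import boto3

athena = boto3.client("athena")

athena.create_work_group(
    Name="division-electronics",   # one workgroup per product division (hypothetical name)
    Configuration={
        "ResultConfiguration": {"OutputLocation": "s3://example-athena-results/electronics/"},
        "EnforceWorkGroupConfiguration": True,
        "PublishCloudWatchMetricsEnabled": True,      # needed for workgroup-level usage alarms
        "BytesScannedCutoffPerQuery": 10 * 1024**3,   # illustrative 10 GB per-query cutoff
    },
)
```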
A company has developed several AWS Glue jobs to validate and transform its data from Amazon S3 and load it into Amazon RDS for MySQL in batches once every day. The ETL jobs read the S3 data using a DynamicFrame. Currently, the ETL developers are struggling to process only the incremental data, because the AWS Glue jobs reprocess all of the S3 input data on each run.
Which approach would allow the developers to solve the issue with minimal coding effort?
- A . Have the ETL jobs read the data from Amazon S3 using a DataFrame.
- B . Enable job bookmarks on the AWS Glue jobs.
- C . Create custom logic on the ETL jobs to track the processed S3 objects.
- D . Have the ETL jobs delete the processed objects or data from Amazon S3 after each run.
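If job bookmarks (option B) are enabled, the Glue script must also pass a transformation_ctx on the S3 read and commit the job so that bookmark state is saved between runs. A minimal PySpark sketch with a hypothetical path and format:

```python
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# transformation_ctx lets the bookmark track which S3 objects were already processed.
source = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-bucket/input/"]},  # hypothetical path
    format="json",                                                 # assumed format
    transformation_ctx="source",
)

# ... validate, transform, and write to Amazon RDS for MySQL ...

job.commit()  # persists the bookmark state for the next run
```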
A regional energy company collects voltage data from sensors attached to buildings. To address any known dangerous conditions, the company wants to be alerted when a sequence of two voltage drops is detected within 10 minutes of a voltage spike at the same building. It is important to ensure that all messages are delivered as quickly as possible. The system must be fully managed and highly available. The company also needs a solution that will automatically scale up as it covers additional cities with this monitoring feature. The alerting system is subscribed to an Amazon SNS topic for remediation.
Which solution meets these requirements?
- A . Create an Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster to ingest the data, and use Apache Spark Streaming with the Apache Kafka consumer API in an automatically scaled Amazon EMR cluster to process the incoming data. Use the Spark Streaming application to detect the known event sequence and send the SNS message.
- B . Create a REST-based web service using Amazon API Gateway in front of an AWS Lambda function. Create an Amazon RDS for PostgreSQL database with sufficient Provisioned IOPS (PIOPS). In the Lambda function, store incoming events in the RDS database and query the latest data to detect the known event sequence and send the SNS message.
- C . Create an Amazon Kinesis Data Firehose delivery stream to capture the incoming sensor data. Use an AWS Lambda transformation function to detect the known event sequence and send the SNS message.
- D . Create an Amazon Kinesis data stream to capture the incoming sensor data and create another stream for alert messages. Set up AWS Application Auto Scaling on both. Create a Kinesis Data Analytics for Java application to detect the known event sequence, and add a message to the message stream. Configure an AWS Lambda function to poll the message stream and publish to the SNS topic.
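The final hop in option D can be a small Lambda function attached to the alert stream through a Kinesis event source mapping; a sketch with a hypothetical SNS topic ARN:

```python
import base64
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:111122223333:voltage-alerts"  # hypothetical topic

def lambda_handler(event, context):
    # Records arrive base64-encoded from the Kinesis alert stream.
    for record in event["Records"]:
        message = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        sns.publish(TopicArn=TOPIC_ARN, Message=message)
```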
A company is creating a data lake by using AWS Lake Formation. The data that will be stored in the data lake contains sensitive customer information and must be encrypted at rest using an AWS Key Management Service (AWS KMS) customer managed key to meet regulatory requirements.
How can the company store the data in the data lake to meet these requirements?
- A . Store the data in an encrypted Amazon Elastic Block Store (Amazon EBS) volume. Register the Amazon EBS volume with Lake Formation.
- B . Store the data in an Amazon S3 bucket by using server-side encryption with AWS KMS (SSE-KMS). Register the S3 location with Lake Formation.
- C . Encrypt the data on the client side and store the encrypted data in an Amazon S3 bucket. Register the S3 location with Lake Formation.
- D . Store the data in an Amazon S3 Glacier Flexible Retrieval vault. Register the vault with Lake Formation.
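Option B could be set up along these lines: default SSE-KMS encryption with a customer managed key on the bucket, then registering the location with Lake Formation through a role that can use the key. The bucket, key ARN, and role below are placeholders.

```python
import boto3

s3 = boto3.client("s3")
lakeformation = boto3.client("lakeformation")

# Apply default server-side encryption with a customer managed KMS key (placeholder ARN).
s3.put_bucket_encryption(
    Bucket="example-data-lake",
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
            }
        }]
    },
)

# Register the encrypted S3 location with Lake Formation using a role that has access to the key.
lakeformation.register_resource(
    ResourceArn="arn:aws:s3:::example-data-lake",
    UseServiceLinkedRole=False,
    RoleArn="arn:aws:iam::111122223333:role/lakeformation-register-role",
)
```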
A company plans to store quarterly financial statements in a dedicated Amazon S3 bucket. The financial statements must not be modified or deleted after they are saved to the S3 bucket.
Which solution will meet these requirements?
- A . Create the S3 bucket with S3 Object Lock in governance mode.
- B . Create the S3 bucket with MFA delete enabled.
- C . Create the S3 bucket with S3 Object Lock in compliance mode.
- D . Create S3 buckets in two AWS Regions. Use S3 Cross-Region Replication (CRR) between the buckets.
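Option C in practice: Object Lock must be enabled when the bucket is created, and a default compliance-mode retention can then be applied. The bucket name, region, and retention period below are illustrative only.

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# Object Lock can only be enabled at bucket creation time.
s3.create_bucket(
    Bucket="example-quarterly-financials",
    ObjectLockEnabledForBucket=True,
)

# Compliance mode: retained objects cannot be overwritten or deleted by any user during retention.
s3.put_object_lock_configuration(
    Bucket="example-quarterly-financials",
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": 7}},
    },
)
```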