Practice Free D-DS-FN-23 Exam Online Questions
Refer to the exhibit.
To predict whether or not a customer will renew their annual property insurance policy, an insurance company built and operationalized a naïve Bayes classification model.
In the model, there are two class labels, renewal and non-renewal, that are assigned to each customer based on their attributes. A subset of the key attributes, their values, and corresponding conditional probabilities are provided in the exhibit.
A customer has the following attributes:
– Age is greater than 65 years
– Owns their own home
– Renewal month is August
If 20% of customers do not renew their policies every year, what is the score for a non-renewal in the naïve Bayesian model for the customer described above?
- A . 0.0002
- B . 0.0004
- C . 0.0020
- D . 0.0040
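The score asked for here is the unnormalized naïve Bayes score: the class prior multiplied by the conditional probability of each observed attribute given that class. A minimal sketch of that arithmetic follows. The exhibit is not reproduced in this text, so the three conditional probabilities below are hypothetical placeholders, not the exhibit's values; only the 20% prior comes from the question. The function name `naive_bayes_score` is also an assumption made for illustration.

```python
from math import prod


def naive_bayes_score(prior, conditionals):
    """Unnormalized naive Bayes score: P(class) * product of P(attribute | class)."""
    return prior * prod(conditionals)


# Prior taken from the question: 20% of customers do not renew.
prior_non_renewal = 0.20

# HYPOTHETICAL placeholders for P(age > 65 | non-renewal),
# P(owns home | non-renewal), P(renewal month = August | non-renewal).
# Replace these with the values read from the exhibit.
hypothetical_conditionals = [0.3, 0.5, 0.2]

score = naive_bayes_score(prior_non_renewal, hypothetical_conditionals)
print(f"non-renewal score (placeholder inputs): {score:.4f}")
```

With the real exhibit values substituted in, the same multiplication gives the score to compare against the answer options.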
What does the Receiver Operating Characteristic (ROC) curve show?
- A . Relationship between p-value and true positive rate
- B . Relationship between p-value and true negative rate
- C . Relationship between true positive rate and false positive rate
- D . Relationship between true positive rate and true negative rate
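As a reminder of how an ROC curve is constructed, the sketch below sweeps a decision threshold over classifier scores and records the false positive rate and true positive rate at each threshold. The scores, labels, and thresholds are made-up toy data, and `roc_points` is a hypothetical helper written for this illustration.

```python
def roc_points(scores, labels, thresholds):
    """Return (false_positive_rate, true_positive_rate) for each threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        # An example is predicted positive when its score is at or above t.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy data (assumed for illustration): 1 = positive class, 0 = negative class.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
thresholds = [1.0, 0.7, 0.5, 0.0]

points = roc_points(scores, labels, thresholds)
for t, (fpr, tpr) in zip(thresholds, points):
    print(f"threshold={t:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting the first element of each pair on the x-axis and the second on the y-axis traces the curve from (0, 0) to (1, 1).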
Refer to the exhibit.
In the exhibit, the x-axis represents the derived probability of a borrower defaulting on a loan. Also in the exhibit, the pink represents borrowers that are known to have not defaulted on their loan, and the blue represents borrowers that are known to have defaulted on their loan.
Which analytical method could produce the probabilities needed to build this exhibit?
- A . Logistic Regression
- B . Linear Regression
- C . Discriminant Analysis
- D . Association Rules
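For context on what a "derived probability" of default can look like, here is a minimal sketch of how a fitted logistic regression model turns a linear score into a value between 0 and 1. The weights, intercept, and feature values are hypothetical and do not come from the exhibit; `default_probability` is an illustrative name.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def default_probability(features, weights, intercept):
    """Probability of the positive class from a linear score."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


# HYPOTHETICAL coefficients and one borrower's (scaled) features.
weights = [0.8, -1.2]
intercept = -0.5
borrower = [1.5, 0.3]

p = default_probability(borrower, weights, intercept)
print(f"estimated probability of default: {p:.3f}")
```

Because the output is bounded between 0 and 1, scores like these can be binned along an x-axis and split by known outcome.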
In a decision tree, what is an example of a pure node?
- A . 25 positives; 75 negatives
- B . 50 positives; 50 negatives
- C . 75 positives; 25 negatives
- D . 100 positives; 0 negatives
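Node purity can be checked numerically. The sketch below computes Gini impurity, one common purity measure (entropy would rank the nodes the same way), for the four class splits listed in the options; an impurity of 0 means every record in the node belongs to a single class. The helper name `gini` is chosen for this illustration.

```python
def gini(positives, negatives):
    """Gini impurity of a two-class node: 1 - p_pos^2 - p_neg^2."""
    total = positives + negatives
    p_pos = positives / total
    p_neg = negatives / total
    return 1.0 - p_pos ** 2 - p_neg ** 2


# The four splits from the answer options.
splits = [(25, 75), (50, 50), (75, 25), (100, 0)]

for pos, neg in splits:
    print(f"{pos} positives; {neg} negatives -> Gini impurity = {gini(pos, neg):.3f}")
```

A 50/50 split gives the maximum two-class impurity of 0.5, while a node holding only one class gives 0.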
You submit a MapReduce job to a Hadoop cluster. However, you notice that although the job was successfully submitted, it is not completing.
What should be done to identify the issue?
- A . Ensure TaskTracker is running
- B . Ensure JobTracker is running
- C . Ensure NameNode is running
- D . Ensure DataNode is running
What is a consideration when building decision trees?
- A . Cannot handle variables that affect the outcome in a discontinuous way
- B . Short decision trees are likely subject to overfitting
- C . Correlated variables can cause double-counting
- D . Tree structure is sensitive to small changes in the training data
Which activity is performed in the Operationalize phase of the data analytics lifecycle?
- A . Try different variables
- B . Try different analytical techniques
- C . Assess the benefits
- D . Transform existing variables
Which word or phrase completes the statement? A data warehouse is to a centralized database for reporting as an analytic sandbox is to a _______?
- A . Collection of data assets for modeling
- B . Collection of low-volume databases
- C . Centralized database of KPIs
- D . Collection of data assets for ETL
You have been assigned to run a Logistic Regression model for each of 100 countries. All data is currently stored in a PostgreSQL database.
Which tool/library should be used to produce these models with the least effort?
- A . MADlib
- B . Mahout
- C . RStudio
- D . HBase