Practice Free D-DS-FN-23 Exam Online Questions
During the data preparation phase, you notice a high correlation between average spend on video games, age of players, and number of science fiction shows watched.
Which technique could you use to address the three correlated variables?
- A . Square the three variables to remove the correlation
- B . Combine the three variables into one new variable
- C . Drop the three variables to improve the model
- D . Use scaling to make the three variables equivalent in size
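To make the idea of combining correlated variables concrete, here is a minimal sketch that standardizes each variable and averages the z-scores into one composite index. The function names and the sample values are hypothetical and made up purely for illustration; in practice a technique such as principal component analysis is another common way to derive a single combined variable.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a column to mean 0 and (population) standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

def composite_index(*columns):
    """Average the z-scores of several columns row by row (hypothetical helper)."""
    standardized = [zscores(col) for col in columns]
    return [mean(row) for row in zip(*standardized)]

# Made-up illustrative data, not taken from any real dataset.
spend = [10.0, 20.0, 30.0, 40.0]   # average spend on video games
age = [18.0, 22.0, 26.0, 30.0]     # age of players
shows = [1.0, 2.0, 3.0, 4.0]       # science fiction shows watched

combined = composite_index(spend, age, shows)
```

The single `combined` column can then stand in for the three original columns, so the model no longer sees three near-duplicate predictors.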
Which data type value is used for the observed response variable in a logistic regression model?
- A . Any positive real number
- B . Any integer
- C . A binary value
- D . Any real number
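As background for this question, the sketch below shows the usual logistic regression setup: a linear score is passed through the sigmoid function to give a probability, and that probability is compared against observed labels coded as 0 or 1. The function names and the sample inputs are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(y_true, scores):
    """Mean negative log-likelihood; y_true is assumed to contain only 0s and 1s."""
    total = 0.0
    for y, z in zip(y_true, scores):
        p = sigmoid(z)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical example: two observations, both with a linear score of 0.
loss = log_loss([1, 0], [0.0, 0.0])
```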
You have two tables of customers in your database. Customers in cust_table_1 were sent an e-mail promotion last year, and customers in cust_table_2 received a newsletter last year.
Each customer can appear only once per table. You want to create a table that includes all customers, and any of the communications they received last year.
Which type of join would you use for this table?
- A . Full outer join
- B . Inner join
- C . Left outer join
- D . Cross join
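For reference, the sketch below emulates what a full outer join returns, using two dictionaries keyed by customer ID as stand-ins for cust_table_1 and cust_table_2. The helper name, the customer IDs, and the values are hypothetical.

```python
def full_outer_join(left, right):
    """Return one row per key found in either input; missing sides become None."""
    keys = sorted(set(left) | set(right))
    return [(k, left.get(k), right.get(k)) for k in keys]

# Hypothetical stand-ins for cust_table_1 and cust_table_2.
promo = {101: "e-mail promotion", 102: "e-mail promotion"}
newsletter = {102: "newsletter", 103: "newsletter"}

rows = full_outer_join(promo, newsletter)
```

Customer 101 appears only with the promotion, 103 only with the newsletter, and 102 with both. An inner join would keep only 102, and a left outer join from `promo` would drop 103.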
In a fitted ARIMA(1,2,3) model, how many differences are applied?
- A . 0
- B . 1
- C . 2
- D . 3
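In the ARIMA(p, d, q) notation, p is the autoregressive order, d is the order of differencing, and q is the moving-average order. The sketch below shows what applying differencing d times does to a series; the helper name and the sample series are made up for illustration.

```python
def difference(series, d):
    """Apply first-order differencing d times to a sequence of numbers."""
    out = list(series)
    for _ in range(d):
        # Each pass replaces the series with consecutive changes,
        # shortening it by one element.
        out = [b - a for a, b in zip(out, out[1:])]
    return out

series = [1, 4, 9, 16, 25]      # made-up quadratic trend
once = difference(series, 1)    # [3, 5, 7, 9]
twice = difference(series, 2)   # [2, 2, 2]
```

Differencing twice removes the quadratic trend from this toy series and leaves a constant.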
Consider the following SQL statement:
SELECT employee_id, year, salary,
       AVG(salary) OVER (
           PARTITION BY employee_id
           ORDER BY year
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS result_1
FROM employee
ORDER BY employee_id, year
For each employee_id, what is returned as result_1?
- A . Three year rolling average salary
- B . Four year rolling average salary
- C . Average salary across all employee_id values
- D . Average employee_id
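To see how the window frame in this statement behaves, the sketch below emulates it in plain Python: rows are grouped by employee, ordered by year, and each row is averaged with up to two rows before it. The function name and the salary figures are hypothetical.

```python
from collections import defaultdict

def rolling_avg(rows, preceding=2):
    """Emulate AVG(salary) OVER (PARTITION BY employee_id ORDER BY year
    ROWS BETWEEN <preceding> PRECEDING AND CURRENT ROW)."""
    by_emp = defaultdict(list)
    for emp, year, salary in sorted(rows, key=lambda r: (r[0], r[1])):
        by_emp[emp].append((year, salary))
    result = []
    for emp, items in by_emp.items():
        for i, (year, salary) in enumerate(items):
            # The frame holds the current row plus up to `preceding` earlier rows.
            window = [s for _, s in items[max(0, i - preceding): i + 1]]
            result.append((emp, year, salary, sum(window) / len(window)))
    return result

# Made-up salaries for a single employee.
rows = [(1, 2020, 100), (1, 2021, 200), (1, 2022, 300), (1, 2023, 400)]
averages = [r[3] for r in rolling_avg(rows)]
```

The frame counts rows, not calendar years, so the first two rows of each employee are averaged over fewer values, and a gap in the years would still yield a window of three rows.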
Only three variables (A, B, and C) have significant correlation with sales.
You build a linear regression model on the dependent variable of sales with the independent variables of A, B, and C. The results of the regression are seen in the exhibit.
Which interpretation is supported by the analysis?
- A . Variables A, B, and C are significantly impacting sales, but are not effectively estimating sales
- B . Variables A, B, and C are significantly impacting sales and are effectively estimating sales
- C . Due to the R² of 0.10, the model is not valid; the linear regression should be rerun with all 15 variables forced into the model to increase the R²
- D . Due to the R² of 0.10, the model is not valid; a different analytical model should be attempted
For which class of problem is MapReduce most suitable?
- A . Embarrassingly parallel
- B . Minimal result data
- C . Simple marginalization tasks
- D . Non-overlapping queries
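As a reminder of the programming model behind this question, here is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the task. All function names and the sample documents are hypothetical; a real framework such as Hadoop would distribute these phases across many machines.

```python
from collections import defaultdict

def map_phase(doc):
    """Emit a (word, 1) pair for every word; each document is handled independently."""
    return [(word.lower(), 1) for word in doc.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine the values for one key; each key is reduced independently."""
    return key, sum(values)

def word_count(docs):
    mapped = [pair for doc in docs for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

counts = word_count(["a b a", "b c"])
```

Because no map call depends on another document and no reduce call depends on another key, the work splits cleanly across workers.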
What is an appropriate data visualization to use in a presentation for an analyst audience?
- A . Pie chart
- B . Area chart
- C . Stacked bar chart
- D . ROC curve