Practice Free D-DS-FN-23 Exam Online Questions
What is Hadoop?
- A . Java classes for HDFS types and MapReduce job management and HDFS
- B . Java classes for HDFS types and MapReduce job management and the MapReduce paradigm
- C . MapReduce paradigm and HDFS
- D . MapReduce paradigm and massive unstructured data storage on commodity hardware
You have created a Linear Regression model to predict total sales based on variables M, N, P and Q as shown in the graphic. You originally expected all variables to have positive coefficients.
Which action would you take?
- A . Accept all variables and begin model validation steps against holdout data
- B . Accept only positive variables and investigate potential correlation with the dependent variable
- C . Accept only statistically significant variables and investigate correlated independent variables
- D . Accept none of the variables and investigate correlations between all variables
What is a property of window functions in SQL commands?
- A . They can be used to calculate moving averages over various intervals.
- B . They group rows into a single output row.
- C . They can be used between the keywords FROM and WHERE in a SELECT command.
- D . They don’t require ordering of data within a window.
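For reference, a window function computes a value for each row over a set of related rows and still returns one output row per input row. The sketch below runs a three-row moving average through Python's built-in `sqlite3` module. The table name `daily_sales` and its values are hypothetical and chosen only for illustration, and the query assumes the bundled SQLite library is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# In-memory database with a small, hypothetical table of daily sales.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)],
)

# The window function sits in the SELECT list. The frame covers the
# current row plus the two preceding rows, ordered by day.
rows = conn.execute(
    """
    SELECT day,
           amount,
           AVG(amount) OVER (
               ORDER BY day
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM daily_sales
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)  # one output row per input row

conn.close()
```

Changing the frame clause (for example, `ROWS BETWEEN 6 PRECEDING AND CURRENT ROW`) changes the interval over which the average is taken without changing the number of rows returned.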
A business colleague who is new to Hadoop approaches you with a question. The colleague wants to know the best approach to access their data. The colleague has previously worked extensively with SQL and databases.
Which query interface should be recommended?
- A . Hive
- B . Pig
- C . Howl
- D . HBase
Consider a database with 4 transactions:
Transaction 1: {cheese, bread, milk}
Transaction 2: {soda, bread, milk}
Transaction 3: {cheese, bread}
Transaction 4: {cheese, soda, juice}
You decide to run the association rules algorithm where minimum support is 50%.
Which rule has a confidence of at least 50%?
- A . {cheese} => {bread}
- B . {juice} => {cheese}
- C . {milk} => {soda}
- D . {soda} => {milk}
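The support and confidence figures needed for this question can be checked with a few lines of Python. This is a minimal sketch using the four transactions listed above; the helper names `support` and `confidence` are illustrative and not part of any library. Support of an itemset is the fraction of transactions containing it, and confidence of a rule X => Y is support(X and Y together) divided by support(X).

```python
# Transactions copied from the question above.
transactions = [
    {"cheese", "bread", "milk"},
    {"soda", "bread", "milk"},
    {"cheese", "bread"},
    {"cheese", "soda", "juice"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs):
    """Confidence of the rule lhs => rhs: support(lhs | rhs) / support(lhs)."""
    return support(set(lhs) | set(rhs)) / support(lhs)


# Candidate rules from the answer options.
rules = [
    ({"cheese"}, {"bread"}),
    ({"juice"}, {"cheese"}),
    ({"milk"}, {"soda"}),
    ({"soda"}, {"milk"}),
]

for lhs, rhs in rules:
    print(
        sorted(lhs), "=>", sorted(rhs),
        "support:", support(lhs | rhs),
        "confidence:", round(confidence(lhs, rhs), 3),
    )
```

A rule is only generated from itemsets that meet the minimum support threshold, so both numbers matter when comparing the options.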
In which lifecycle stage are appropriate analytical techniques determined?
- A . Model planning
- B . Model building
- C . Data preparation
- D . Discovery
Your company has 3 different sales teams. Each team’s sales manager has developed incentive offers to increase the size of each sales transaction.
Any sales manager whose incentive program can be shown to increase the size of the average sales transaction will receive a bonus. Data are available for the number and average sale amount for transactions offering one of the incentives as well as transactions offering no incentive.
The VP of Sales has asked you to determine analytically if any of the incentive programs has resulted in a demonstrable increase in the average sale amount.
Which analytical technique would be appropriate in this situation?
- A . One-way ANOVA
- B . Multi-way ANOVA
- C . Student’s t-test
- D . Wilcoxon Rank Sum Test
You are using a Logistic Regression model to determine if an applicant’s gender is a factor in determining whether or not they receive a bank loan. When you plot the results, you notice that the regression coefficient is zero.
What can be determined?
- A . Sample size of the data is too small
- B . Applicant’s gender influences the loan decision
- C . Sample size of the data is too large
- D . Applicant’s gender does not influence the loan decision
You have been assigned to study the effect of a pricing model for online transactions on daily revenue. All the data currently available to you has been loaded into your analytics database: revenue data, pricing data, and online transaction data.
You find that all the data comes in different levels of granularity. The transaction data has timestamps (day, hour, minutes, seconds), pricing is stored at the daily level, and revenue data is only reported monthly.
What is your next step?
- A . Report back to the business owner that the current data model does not support the business question.
- B . Interpolate a daily model for revenue from the monthly revenue data.
- C . Aggregate all data to the monthly level in order to create a monthly revenue model.
- D . Disregard revenue as a driver in the pricing model, and create a daily model based on pricing and transactions only.