Data Science Fundamentals Exam Questions - 167 Verified Questions

Data Science Fundamentals

Exam Questions

Course Introduction

Data Science Fundamentals introduces students to the foundational concepts, tools, and methods used in the field of data science. The course covers essential topics such as data collection and cleaning, exploratory data analysis, and basic statistical techniques. Students will gain practical experience with programming languages like Python or R and utilize libraries for data manipulation and visualization. Emphasis is placed on understanding the data science workflow, critical thinking about data-driven problems, and applying ethical considerations in data analysis. By the end of the course, students will be equipped with the skills necessary to tackle real-world data challenges and prepare for advanced study in data science.

Recommended Textbook

Data Mining: A Tutorial-Based Primer, 1st Edition, by Richard Roiger

Available Study Resources on Quizplus

14 Chapters

167 Verified Questions

167 Flashcards

Source URL: https://quizplus.com/study-set/3934

Chapter 1: Data Mining: A First View

Available Study Resources on Quizplus for this Chapter

22 Verified Questions

22 Flashcards

Source URL: https://quizplus.com/quiz/78433

Sample Questions

Q1) A person trained to interact with a human expert in order to capture their knowledge.

A) knowledge programmer

B) knowledge developer

C) knowledge engineer

D) knowledge extractor

Answer: C

Q2) Supervised learning and unsupervised clustering both require at least one

A) hidden attribute.

B) output attribute.

C) input attribute.

D) categorical attribute.

Answer: C

Q3) Determine whether a credit card transaction is valid or fraudulent.

A) supervised learning

B) unsupervised clustering

C) data query

Answer: A

To view all questions and flashcards with answers, click on the resource link above.

Chapter 2: Data Mining: A Closer Look

Available Study Resources on Quizplus for this Chapter

16 Verified Questions

16 Flashcards

Source URL: https://quizplus.com/quiz/78434

Sample Questions

Q1) Compute the lift for Model Y.

Answer: 8/7

Q2) The average positive difference between computed and desired outcome values.

A) root mean squared error

B) mean squared error

C) mean absolute error

D) mean positive error

Answer: C
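
As a worked illustration of the definition in Q2, here is a minimal Python sketch of mean absolute error. The function name and the sample values are illustrative assumptions and do not come from the textbook.

```python
def mean_absolute_error(desired, computed):
    """Average of the absolute differences between desired and computed values."""
    if not desired or len(desired) != len(computed):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(d - c) for d, c in zip(desired, computed))
    return total / len(desired)


# Hypothetical desired and computed outcome values
desired = [3.0, 5.0, 2.0, 7.0]
computed = [2.5, 5.0, 4.0, 8.0]

mae = mean_absolute_error(desired, computed)
print(mae)  # absolute differences are 0.5, 0.0, 2.0, 1.0, so the mean is 0.875
```

Because each difference is taken as an absolute value, errors above and below the target do not cancel out.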

Q3) Given desired class C and population P, lift is defined as

A) the probability of class C given population P divided by the probability of C given a sample taken from the population.

B) the probability of population P given a sample taken from P.

C) the probability of class C given a sample taken from population P.

D) the probability of class C given a sample taken from population P divided by the probability of C within the entire population P.

Answer: D
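
The definition of lift in Q3 can be sketched in Python as follows. The counts are made-up illustrative numbers, not the Model X or Model Y data referenced in the other questions, and the function name is an assumption.

```python
from fractions import Fraction


def lift(sample_hits, sample_size, population_hits, population_size):
    """P(C | sample) divided by P(C | population), kept as an exact fraction."""
    p_c_given_sample = Fraction(sample_hits, sample_size)
    p_c_given_population = Fraction(population_hits, population_size)
    return p_c_given_sample / p_c_given_population


# Hypothetical numbers: 100 of 1000 people in the population belong to class C,
# and a model selects a sample of 200 people, 50 of whom belong to class C.
model_lift = lift(sample_hits=50, sample_size=200,
                  population_hits=100, population_size=1000)
print(model_lift)  # (50/200) / (100/1000) = 5/2
```

A lift above 1 indicates that the sample chosen by the model is richer in class C than the population as a whole.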

Q4) How many instances were classified as an accept by Model X?

Answer: 35

Chapter 3: Basic Data Mining Techniques

Available Study Resources on Quizplus for this Chapter

13 Verified Questions

13 Flashcards

Source URL: https://quizplus.com/quiz/78435

Sample Questions

Q1) A data mining algorithm is unstable if

A) test set accuracy depends on the ordering of test set instances.

B) the algorithm builds models unable to classify outliers.

C) the algorithm is highly sensitive to small changes in the training data.

D) test set accuracy depends on the choice of input attributes.

Answer: C

Q2) A genetic learning operation that creates new population elements by combining parts of two or more existing elements.

A) selection

B) crossover

C) mutation

D) absorption

Answer: B
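
To make the crossover operation in Q2 concrete, here is a minimal Python sketch of single-point crossover on bit strings. Single-point crossover is only one common variant and is an assumption here; the question does not say which form is intended, and the function name is hypothetical.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length strings at the given cut point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < point < len(parent_a):
        raise ValueError("cut point must fall strictly inside the string")
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


# Two hypothetical population elements encoded as bit strings
child_1, child_2 = single_point_crossover("11110000", "00001111", point=3)
print(child_1)  # 11101111
print(child_2)  # 00010000
```

Each child combines the head of one parent with the tail of the other, which is what distinguishes crossover from mutation (a random change to a single element).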

Q3) Given a rule of the form IF X THEN Y, rule confidence is defined as the conditional probability that

A) Y is true when X is known to be true.

B) X is true when Y is known to be true.

C) Y is false when X is known to be false.

D) X is false when Y is known to be false.

Answer: A
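
The definition of rule confidence in Q3 can be sketched in Python by counting instances. The data below is invented for illustration, and representing each instance as a set of items is an assumption rather than the textbook's format.

```python
def rule_confidence(instances, antecedent, consequent):
    """Estimate P(Y | X) for the rule IF X THEN Y from a list of item sets."""
    covered = [inst for inst in instances if antecedent <= inst]
    if not covered:
        raise ValueError("no instance satisfies the antecedent")
    matches = sum(1 for inst in covered if consequent <= inst)
    return matches / len(covered)


# Hypothetical market-basket instances
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"milk", "eggs"},
]

confidence = rule_confidence(baskets, antecedent={"milk"}, consequent={"bread"})
print(confidence)  # 2 of the 4 baskets containing milk also contain bread: 0.5
```

Only instances where X holds enter the denominator, so the basket containing bread without milk does not affect the result.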

Chapter 4: An Excel-Based Data Mining Tool

Available Study Resources on Quizplus for this Chapter

12 Verified Questions

12 Flashcards

Source URL: https://quizplus.com/quiz/78436

Sample Questions

Q1) A particular categorical attribute value has a predictiveness score of 1.0 and a predictability score of 0.50. The attribute value is

A) necessary but not sufficient for class membership.

B) sufficient but not necessary for class membership.

C) necessary and sufficient for class membership.

D) neither necessary nor sufficient for class membership.

Q2) A dataset of 1000 instances contains one attribute specifying the color of an object. Suppose that 800 of the instances contain the value red for the color attribute. The remaining 200 instances hold green as the value of the color attribute. What is the domain predictability score for color = green?

A) 0.80

B) 0.20

C) 0.60

D) 0.40

Q3) What is the predictability score for the attribute value medium risk?

A) 0.10

B) 0.20

C) 0.25

D) 0.50

Chapter 5: Knowledge Discovery in Databases

Available Study Resources on Quizplus for this Chapter

10 Verified Questions

10 Flashcards

Source URL: https://quizplus.com/quiz/78437

Sample Questions

Q1) The relational database model is designed to

A) promote data redundancy.

B) minimize data redundancy.

C) eliminate the need for data transformations.

D) eliminate the need for data preprocessing.

Q2) The choice of a data mining tool is made at this step of the KDD process.

A) goal identification

B) creating a target dataset

C) data preprocessing

D) data mining

Q3) A common method used by some data mining techniques to deal with missing data items during the learning process.

A) replace missing real-valued data items with class means

B) discard records with missing data

C) replace missing attribute values with the values found within other similar instances

D) ignore missing attribute values

Chapter 6: The Data Warehouse

Available Study Resources on Quizplus for this Chapter

13 Verified Questions

13 Flashcards

Source URL: https://quizplus.com/quiz/78438

Sample Questions

Q1) This process removes redundancies that may be present in a data model.

A) abstraction

B) granularization

C) standardization

D) normalization

Q2) A variation of the star schema that allows more than one central fact table.

A) snowflake schema

B) linked star schema

C) distributed star schema

D) constellation schema

Q3) Consider the OLAP cube shown above. The vertical arrow points to:

A) region four, travel

B) region two, travel

C) Q2, travel

D) Q1, travel

Q4) The purpose of an intersection entity is to replace

A) two one-to-one relationships with a one-to-many relationship

B) two one-to-many relationships with one many-to-many relationship

C) a many-to-many relationship with two one-to-many relationships

D) a one-to-many relationship with two one-to-one relationships

Chapter 7: Formal Evaluation Techniques

Available Study Resources on Quizplus for this Chapter

13 Verified Questions

13 Flashcards

Source URL: https://quizplus.com/quiz/78439

Sample Questions

Q1) We have built and tested two supervised learner models, M1 and M2. We compare the test set accuracy of the models using the classical hypothesis testing paradigm with a 95% confidence setting.

The computed value of P is 2.53. What can we say about this result?

A) Model M1 performs significantly better than M2.

B) Model M2 performs significantly better than M1.

C) Both models perform at the same level of accuracy.

D) The models differ significantly in their performance.

E) More than one of A, B, C, or D is correct.

Q2) Data used to optimize the parameter settings of a supervised learner model.

A) training

B) test

C) verification

D) validation

Q3) The average squared difference between classifier predicted output and actual output.

A) mean squared error

B) root mean squared error

C) mean absolute error

D) mean relative error

Chapter 8: Neural Networks

Available Study Resources on Quizplus for this Chapter

10 Verified Questions

10 Flashcards

Source URL: https://quizplus.com/quiz/78440

Sample Questions

Q1) Which one of the following is not a major strength of the neural network approach?

A) Neural networks work well with datasets containing noisy data.

B) Neural networks can be used for both supervised learning and unsupervised clustering.

C) Neural network learning algorithms are guaranteed to converge to an optimal solution.

D) Neural networks can be used for applications that require a time element to be included in the data.

Q2) The values input into a feed-forward neural network

A) may be categorical or numeric.

B) must be either all categorical or all numeric but not both.

C) must be numeric.

D) must be categorical.

Q3) A feed-forward neural network is said to be fully connected when

A) all nodes are connected to each other.

B) all nodes at the same layer are connected to each other.

C) all nodes at one layer are connected to the nodes in the next higher layer.

D) all hidden layer nodes are connected to all output layer nodes.

Chapter 9: Building Neural Networks with iDA

Available Study Resources on Quizplus for this Chapter

4 Verified Questions

4 Flashcards

Source URL: https://quizplus.com/quiz/78441

Sample Questions

Q1) This type of supervised network architecture does not contain a hidden layer.

A) backpropagation

B) perceptron

C) self-organizing map

D) genetic

Q2) The test set accuracy of a backpropagation neural network can often be improved by

A) increasing the number of epochs used to train the network.

B) decreasing the number of hidden layer nodes.

C) increasing the learning rate.

D) decreasing the number of hidden layers.

Q3) The total delta measures the total absolute change in network connection weights for each pass of the training data through a neural network. This value is most often used to determine the convergence of a

A) perceptron network.

B) feed-forward network.

C) backpropagation network.

D) self-organizing network.

Chapter 10: Statistical Techniques

Available Study Resources on Quizplus for this Chapter

13 Verified Questions

13 Flashcards

Source URL: https://quizplus.com/quiz/78442

Sample Questions

Q1) This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.

A) agglomerative clustering

B) expectation maximization

C) conceptual clustering

D) K-Means clustering

Q2) Regression trees are often used to model _______ data.

A) linear

B) nonlinear

C) categorical

D) symmetrical

Q3) This technique associates a conditional probability value with each data instance.

A) linear regression

B) logistic regression

C) simple regression

D) multiple linear regression

Chapter 11: Specialized Techniques

Available Study Resources on Quizplus for this Chapter

10 Verified Questions

10 Flashcards

Source URL: https://quizplus.com/quiz/78443

Sample Questions

Q1) The automation of Web site adaptation involves creating and deleting

A) index pages

B) cookies

C) pageviews

D) clickstreams

Q2) A data mining algorithm designed to discover frequently accessed Web pages that occur in the same order.

A) serial miner

B) association rule miner

C) sequence miner

D) decision miner

Q3) Which of the following problems is best solved using time-series analysis?

A) Predict whether someone is a likely candidate for having a stroke.

B) Determine if an individual should be given an unsecured loan.

C) Develop a profile of a star athlete.

D) Determine the likelihood that someone will terminate their cell phone contract.

Chapter 12: Rule-Based Systems

Available Study Resources on Quizplus for this Chapter

15 Verified Questions

15 Flashcards

Source URL: https://quizplus.com/quiz/78444

Sample Questions

Q1) Developing a rough version of a system that is suitable for testing.

A) validating

B) field reporting

C) verifying

D) prototyping

Q2) Any technique that helps limit the size of a search space.

A) top-down technique

B) conflict resolution strategy

C) bottom-up technique

D) heuristic

Q3) A problem that cannot be solved with a computer using a traditional algorithmic technique.

A) exponentially hard problem

B) recursive problem

C) non-transformable problem

D) combinatorial problem

Q4) Construct a goal tree using the following production rules. Assume the goal is g.

Chapter 13: Managing Uncertainty in Rule-Based Systems

Available Study Resources on Quizplus for this Chapter

10 Verified Questions

10 Flashcards

Source URL: https://quizplus.com/quiz/78445

Sample Questions

Q1) A fuzzy set is associated with a

A) linguistic variable.

B) certainty factor.

C) hypothesis to be tested.

D) linguistic value.

Q2) With Bayes' theorem, the probability of hypothesis H, specified by P(H), is referred to as

A) an a priori probability

B) a conditional probability

C) a posterior probability

D) a bidirectional probability

Q3) A car mechanic tells you that there is a 75% chance that your car will need major repair work within the next six months. This statement is an example of

A) an objective probability.

B) an experimental probability.

C) a subjective probability.

D) a fuzzy probability.

Chapter 14: Intelligent Agents

Available Study Resources on Quizplus for this Chapter

6 Verified Questions

6 Flashcards

Source URL: https://quizplus.com/quiz/78446

Sample Questions

Q1) Autonomy is an agent's ability to

A) react to a changing environment.

B) act without direct intervention from others.

C) confer with other agents.

D) react to sensory information received from the environment.

Q2) This type of agent resides inside a data warehouse in an attempt to discover changes in business trends.

A) semiautonomous agent

B) cooperative agent

C) data mining agent

D) filtering agent

Q3) An expert system contains _________ knowledge whereas the knowledge processed by an intelligent agent is _____________

A) personal, general

B) general, personal

C) direct, indirect

D) indirect, direct
