Big Data Analytics Solved Exam Questions - 167 Verified Questions

Course Introduction

Big Data Analytics is a course designed to introduce students to the concepts, tools, and techniques used to analyze and interpret massive datasets. Through lectures and hands-on projects, students learn to manage, process, and extract valuable insights from structured and unstructured data using platforms such as Hadoop and Spark. The course covers data mining, machine learning, data visualization, and ethical considerations in big data, enabling students to solve real-world problems in various industries by making data-driven decisions.

Recommended Textbook

Data Mining: A Tutorial-Based Primer, 1st Edition, by Richard Roiger

Available Study Resources on Quizplus

14 Chapters

167 Verified Questions

167 Flashcards

Source URL: https://quizplus.com/study-set/3934

Chapter 1: Data Mining: A First View

Available Study Resources on Quizplus for this Chapter

22 Verified Questions

22 Flashcards

Source URL: https://quizplus.com/quiz/78433

Sample Questions

Q1) Database query is used to uncover this type of knowledge.

A) deep

B) hidden

C) shallow

D) multidimensional

Answer: C

Q2) A statement to be tested.

A) theory

B) procedure

C) principle

D) hypothesis

Answer: D

Q3) A nearest neighbor approach is best used

A) with large-sized datasets.

B) when irrelevant attributes have been removed from the data.

C) when a generalized model of the data is desirable.

D) when an explanation of what has been found is of primary importance.

Answer: B
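
To illustrate why Q3 singles out irrelevant attributes, the following is a minimal sketch of a one-nearest-neighbor lookup. The data, the function name `nearest_label`, and the choice of Euclidean distance are assumptions made for illustration, not taken from the textbook.

```python
import math

def nearest_label(train, query, attrs):
    """Return the label of the training instance closest to query,
    using Euclidean distance over the attribute indices in attrs."""
    def dist(row):
        return math.sqrt(sum((row[i] - query[i]) ** 2 for i in attrs))
    _, label = min(train, key=lambda pair: dist(pair[0]))
    return label

# Hypothetical data: attribute 0 is relevant to the class, attribute 1 is noise.
train = [((1.0, 9.0), "a"), ((5.0, 1.0), "b")]
query = (1.5, 1.0)

with_noise = nearest_label(train, query, attrs=(0, 1))    # noise attribute included
without_noise = nearest_label(train, query, attrs=(0,))   # noise attribute removed
print(with_noise, without_noise)
```

In this toy case the noisy attribute dominates the distance and flips the predicted class, which is why removing irrelevant attributes matters for this technique.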

To view all questions and flashcards with answers, click on the resource link above.

Chapter 2: Data Mining: A Closer Look

Available Study Resources on Quizplus for this Chapter

16 Verified Questions

16 Flashcards

Source URL: https://quizplus.com/quiz/78434

Sample Questions

Q1) Which statement about outliers is true?

A) Outliers should be identified and removed from a dataset.

B) Outliers should be part of the training dataset but should not be present in the test data.

C) Outliers should be part of the test dataset but should not be present in the training data.

D) The nature of the problem determines how outliers are used.

E) More than one of A, B, C, or D is true.

Answer: D

Q2) How many class 2 instances are in the dataset?

Answer: 23

Q3) Given desired class C and population P, lift is defined as

A) the probability of class C given population P divided by the probability of C given a sample taken from the population.

B) the probability of population P given a sample taken from P.

C) the probability of class C given a sample taken from population P.

D) the probability of class C given a sample taken from population P divided by the probability of C within the entire population P.

Answer: D
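
The definition of lift in option D can be sketched as a ratio of two proportions. The counts below are hypothetical, and the function name `lift` is chosen only for this example.

```python
def lift(sample_hits, sample_size, population_hits, population_size):
    """Lift = P(C | sample) / P(C | population)."""
    p_sample = sample_hits / sample_size
    p_population = population_hits / population_size
    return p_sample / p_population

# Hypothetical counts: 5% of the population is in class C,
# while 20% of the model-selected sample is in class C.
value = lift(sample_hits=200, sample_size=1000,
             population_hits=500, population_size=10000)
print(value)
```

A lift above 1 means the sample is richer in class C than the population as a whole.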

To view all questions and flashcards with answers, click on the resource link above.

Chapter 3: Basic Data Mining Techniques

Available Study Resources on Quizplus for this Chapter

13 Verified Questions

13 Flashcards

Source URL: https://quizplus.com/quiz/78435

Sample Questions

Q1) Given a rule of the form IF X THEN Y, rule confidence is defined as the conditional probability that

A) Y is true when X is known to be true.

B) X is true when Y is known to be true.

C) Y is false when X is known to be false.

D) X is false when Y is known to be false.

Answer: A
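
As a minimal sketch of the definition in Q1, rule confidence for IF X THEN Y can be estimated from data as the number of instances satisfying both X and Y divided by the number satisfying X. The basket data and the helper name `rule_confidence` are hypothetical.

```python
def rule_confidence(instances, x, y):
    """Estimate P(Y | X) as count(X and Y) / count(X)."""
    covered = [inst for inst in instances if x(inst)]
    if not covered:
        raise ValueError("antecedent X matches no instances")
    return sum(1 for inst in covered if y(inst)) / len(covered)

# Hypothetical market-basket data.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"milk", "eggs"},
]

# Rule: IF milk THEN bread.
conf = rule_confidence(baskets,
                       x=lambda b: "milk" in b,
                       y=lambda b: "bread" in b)
print(conf)
```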

Q2) An evolutionary approach to data mining.

A) backpropagation learning

B) genetic learning

C) decision tree learning

D) linear regression

Answer: B

Q3) A genetic learning operation that creates new population elements by combining parts of two or more existing elements.

A) selection

B) crossover

C) mutation

D) absorption

Answer: B
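
One common form of the crossover operation in Q3 is single-point crossover, sketched below on bit strings. The string encoding, the fixed cut point, and the function name are assumptions for illustration; genetic learners may use other encodings and crossover variants.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length parents at index point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < point < len(parent_a):
        raise ValueError("point must fall strictly inside the parents")
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2

# Hypothetical population elements encoded as bit strings.
children = single_point_crossover("110010", "001101", point=2)
print(children)
```

Each child combines the head of one parent with the tail of the other, so no new symbols are introduced; that is the job of mutation.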

To view all questions and flashcards with answers, click on the resource link above.

Chapter 4: An Excel-Based Data Mining Tool

Available Study Resources on Quizplus for this Chapter

12 Verified Questions

12 Flashcards

Source URL: https://quizplus.com/quiz/78436

Sample Questions

Q1) A particular categorical attribute value has a predictiveness score of 1.0 and a predictability score of 0.50. The attribute value is

A) necessary but not sufficient for class membership.

B) sufficient but not necessary for class membership.

C) necessary and sufficient for class membership.

D) neither necessary nor sufficient for class membership.

Q2) The single best representative of a class.

A) mean

B) centroid

C) signature

D) prototype

Q3) The first row of an iDAV formatted file contains attribute names. The second row reflects attribute types. What is specified in the third row of an iDAV formatted file?

A) attribute predictability

B) attribute tolerance

C) attribute similarity

D) attribute usage

To view all questions and flashcards with answers, click on the resource link above.

Chapter 5: Knowledge Discovery in Databases

Available Study Resources on Quizplus for this Chapter

10 Verified Questions

10 Flashcards

Source URL: https://quizplus.com/quiz/78437

Sample Questions

Q1) KDD has been described as the application of ___ to data mining.

A) the waterfall model

B) object-oriented programming

C) the scientific method

D) procedural intuition

Q2) Attributes may be eliminated from the target dataset during this step of the KDD process.

A) creating a target dataset

B) data preprocessing

C) data transformation

D) data mining

Q3) This technique uses mean and standard deviation scores to transform real-valued attributes.

A) decimal scaling

B) min-max normalization

C) z-score normalization

D) logarithmic normalization
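
For reference, two of the transformations listed in Q3 can be sketched as follows. This is a minimal illustration on hypothetical values; the use of the population standard deviation (`statistics.pstdev`) is an assumption, and some texts use the sample standard deviation instead.

```python
import statistics

def min_max(values):
    """Rescale values to [0, 1] using the observed minimum and maximum.
    Assumes the values are not all identical."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Rescale values using their mean and standard deviation.
    Assumes the standard deviation is nonzero."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumption)
    return [(v - mean) / sd for v in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical attribute values
scaled = min_max(data)
standardized = z_score(data)
print(scaled)
print(standardized)
```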

To view all questions and flashcards with answers, click on the resource link above.

Chapter 6: The Data Warehouse

Available Study Resources on Quizplus for this Chapter

13 Verified Questions

13 Flashcards

Source URL: https://quizplus.com/quiz/78438

Sample Questions

Q1) Operational databases are designed to support _____ whereas decision support systems are designed to support __________.

A) transactional processing, data analysis

B) data analysis, transactional processing

C) independent data marts, dependent data marts

D) dependent data marts, independent data marts

Q2) The level of detail of the information stored in a data warehouse.

A) granularity

B) scope

C) functionality

D) level of query

Q3) Which of the following is not an example of a slice operation?

A) Select all cells where purchase category = retail.

B) Select all cells where purchase category = retail or vehicle.

C) Provide a spreadsheet of quarter and region information for all cells pertaining to restaurant.

D) Identify the region of peak travel expenditure for each quarter.

To view all questions and flashcards with answers, click on the resource link above.

Chapter 7: Formal Evaluation Techniques

Available Study Resources on Quizplus for this Chapter

13 Verified Questions

13 Flashcards

Source URL: https://quizplus.com/quiz/78439

Sample Questions

Q1) The correlation between the number of years an employee has worked for a company and the salary of the employee is 0.75. What can be said about employee salary and years worked?

A) There is no relationship between salary and years worked.

B) Individuals that have worked for the company the longest have higher salaries.

C) Individuals that have worked for the company the longest have lower salaries.

D) The majority of employees have been with the company a long time.

E) The majority of employees have been with the company a short period of time.
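
A correlation such as the 0.75 quoted in Q1 is usually a Pearson coefficient: its sign gives the direction of the linear relationship and its magnitude gives the strength. The sketch below shows one way to compute it; the data are hypothetical and are not the figures behind the question.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: years worked and salary in thousands.
years = [1, 2, 3, 4, 5]
salary = [40, 42, 50, 47, 60]
r = pearson(years, salary)
print(round(r, 3))
```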

Q2) Selecting data so as to assure that each class is properly represented in both the training and test set.

A) cross validation

B) stratification

C) verification

D) bootstrapping

Q3) Data used to optimize the parameter settings of a supervised learner model.

A) training

B) test

C) verification

D) validation

To view all questions and flashcards with answers, click on the resource link above.

Chapter 8: Neural Networks

Available Study Resources on Quizplus for this Chapter

10 Verified Questions

10 Flashcards

Source URL: https://quizplus.com/quiz/78440

Sample Questions

Q1) A feed-forward neural network is said to be fully connected when

A) all nodes are connected to each other.

B) all nodes at the same layer are connected to each other.

C) all nodes at one layer are connected to the nodes in the next higher layer.

D) all hidden layer nodes are connected to all output layer nodes.

Q2) This neural network explanation technique is used to determine the relative importance of individual input attributes.

A) sensitivity analysis

B) average member technique

C) mean squared error analysis

D) absolute average technique

Q3) Neural network training is accomplished by repeatedly passing the training data through the network while

A) individual network weights are modified.

B) training instance attribute values are modified.

C) the ordering of the training instances is modified.

D) individual network nodes have the coefficients on their corresponding functional parameters modified.

To view all questions and flashcards with answers, click on the resource link above.

Chapter 9: Building Neural Networks with iDA

Available Study Resources on Quizplus for this Chapter

4 Verified Questions

4 Flashcards

Source URL: https://quizplus.com/quiz/78441

Sample Questions

Q1) The test set accuracy of a backpropagation neural network can often be improved by

A) increasing the number of epochs used to train the network.

B) decreasing the number of hidden layer nodes.

C) increasing the learning rate.

D) decreasing the number of hidden layers.

Q2) The total delta measures the total absolute change in network connection weights for each pass of the training data through a neural network. This value is most often used to determine the convergence of a

A) perceptron network.

B) feed-forward network.

C) backpropagation network.

D) self-organizing network.

Q3) This type of supervised network architecture does not contain a hidden layer.

A) backpropagation

B) perceptron

C) self-organizing map

D) genetic

To view all questions and flashcards with answers, click on the resource link above.

Chapter 10: Statistical Techniques

Available Study Resources on Quizplus for this Chapter

13 Verified Questions

13 Flashcards

Source URL: https://quizplus.com/quiz/78442

Sample Questions

Q1) This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.

A) agglomerative clustering

B) expectation maximization

C) conceptual clustering

D) K-Means clustering

Q2) This supervised learning technique can process both numeric and categorical input attributes.

A) linear regression

B) Bayes classifier

C) logistic regression

D) backpropagation learning

Q3) Machine learning techniques differ from statistical techniques in that machine learning methods

A) typically assume an underlying distribution for the data.

B) are better able to deal with missing and noisy data.

C) are not able to explain their behavior.

D) have trouble with large-sized datasets.

To view all questions and flashcards with answers, click on the resource link above.

Chapter 11: Specialized Techniques

Available Study Resources on Quizplus for this Chapter

10 Verified Questions

10 Flashcards

Source URL: https://quizplus.com/quiz/78443

Sample Questions

Q1) A set of pageviews requested by a single user from a Web server.

A) index page

B) common log

C) session

D) page frame

Q2) The automation of Web site adaptation involves creating and deleting

A) index pages

B) cookies

C) pageviews

D) clickstreams

Q3) A data file that contains session information.

A) cookie

B) pageview

C) page frame

D) common log

Q4) Usage profiles for Web-based personalization contain several

A) pageviews

B) clickstreams

C) cookies

D) session files

To view all questions and flashcards with answers, click on the resource link above.

Chapter 12: Rule-Based Systems

Available Study Resources on Quizplus for this Chapter

15 Verified Questions

15 Flashcards

Source URL: https://quizplus.com/quiz/78444

Sample Questions

Q1) An internal test of an expert system whose purpose is to determine if the system uses the same reasoning process as the expert(s) used to build the system.

A) validation

B) verification

C) reliability

D) suitability

Q2) A problem that cannot be solved with a computer using a traditional algorithmic technique.

A) exponentially hard problem

B) recursive problem

C) non-transformable problem

D) combinatorial problem

Q3) Knowledge about knowledge is known as

A) metaknowledge

B) class knowledge

C) structured knowledge

D) classified knowledge

Q4) Construct a goal tree using the following production rules. Assume the goal is g.

To view all questions and flashcards with answers, click on the resource link above.

Chapter 13: Managing Uncertainty in Rule-Based Systems

Available Study Resources on Quizplus for this Chapter

10 Verified Questions

10 Flashcards

Source URL: https://quizplus.com/quiz/78445

Sample Questions

Q1) A fuzzy set is associated with a

A) linguistic variable.

B) certainty factor.

C) hypothesis to be tested.

D) linguistic value.

Q2) This technique is used to determine the height of a rule consequent membership function as determined by the truth of the rule's antecedent condition.

A) fuzzy set union

B) fuzzy set intersection

C) center of gravity

D) clipping

Q3) Computing the probability of picking a heart from a deck of 52 cards can be determined using ______ probability technique.

A) an objective

B) an experimental

C) a subjective

D) an inexact

To view all questions and flashcards with answers, click on the resource link above.

Chapter 14: Intelligent Agents

Available Study Resources on Quizplus for this Chapter

6 Verified Questions

6 Flashcards

Source URL: https://quizplus.com/quiz/78446

Sample Questions

Q1) This type of agent resides inside a data warehouse in an attempt to discover changes in business trends.

A) semiautonomous agent

B) cooperative agent

C) data mining agent

D) filtering agent

Q2) A fundamental difference between a data mining approach to problem solving and an expert systems approach is

A) the output of an expert system is a set of rules and the output of a data mining technique is a decision tree.

B) a data mining technique builds a model without the aid of a human expert.

C) a model built using a data mining technique can explain how decisions are made but an expert system cannot.

D) an expert system is built using inductive learning whereas a data mining model is built using one or several deductive techniques.

To view all questions and flashcards with answers, click on the resource link above.
