Data Mining Short Questions and Answers

This post presents short questions and answers on data mining. It covers the history of data mining, the different levels of analysis, the techniques used for data mining, the steps used in data mining, the steps involved in the data mining knowledge process, data aggregation, data generalization, and the names of some books on data mining.


What is the History of Data Mining?

In the 1960s, statisticians used the terms Data Fishing or Data Dredging. Subsequently, the term Data Mining appeared around 1990, especially in the database community.

Name Different Levels of Analysis of Data Mining

  1. Artificial Neural Networks (ANNs)
  2. Genetic Algorithms
  3. Nearest Neighbour Method
  4. Rule Induction
  5. Data Visualization

What Techniques are Used for Data Mining?

The following techniques are used for data mining:

  • Artificial Neural Networks: Artificial Neural Networks (ANNs) are a type of machine learning algorithm and form the basis of deep learning. In data mining, they are used as non-linear predictive models to identify patterns, make predictions, and extract knowledge from large datasets.
  • Decision Trees: A decision tree is a tree-shaped structure that represents a set of decisions, and these decisions generate rules for classifying a dataset. It is a non-parametric supervised learning algorithm used for both classification and regression tasks. It has a hierarchical structure consisting of a root node, branches, internal nodes, and leaf nodes.
  • Genetic Algorithms: Genetic algorithms are optimization techniques based on genetic combination (crossover), mutation, and natural selection. In data mining, they are used to find good solutions to complex problems by mimicking evolution to iteratively improve a population of candidate solutions.
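
To make the genetic algorithm idea concrete, here is a minimal sketch in Python using only the standard library. The toy objective (count the 1-bits in a bit string), the function names `fitness` and `evolve`, and all parameter values are assumptions chosen for illustration. They are not part of any particular data mining tool.

```python
import random


def fitness(bits):
    # Toy objective (an assumption for illustration): number of 1-bits.
    return sum(bits)


def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))

        # Natural selection: the fitter half survives unchanged as parents.
        parents = population[: pop_size // 2]

        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Genetic combination: single-point crossover.
            cut = rng.randrange(1, n_bits)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            children.append(child)

        population = parents + children

    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print("best fitness:", fitness(best))
```

Because the parents are carried over unchanged, the best fitness in `history` never decreases from one generation to the next. A real application would replace `fitness` with a measure of model or rule quality.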

Name the Steps Used in Data Mining

  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment

Explain the Steps Involved in the Data Mining Knowledge Process

  • Data Cleaning: In the Data Cleaning Step, noisy and inconsistent data are removed.
  • Data Integration: In the Data Integration Step, multiple data sources are combined.
  • Data Selection: In the Data Selection Step, data relevant to the analysis task are retrieved from the database.
  • Data Transformation: In the Data Transformation Step, data are transformed into forms appropriate for data mining. Summary and aggregation operations are also performed in this step.
  • Data Mining: In the Data Mining Step, intelligent methods are applied to extract data patterns.
  • Pattern Evaluation: In the Pattern Evaluation Step, the extracted patterns are evaluated to identify the truly interesting ones.
  • Knowledge Presentation: In the Knowledge Presentation Step, the mined knowledge is presented to the user, for example through visualization.
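
The steps above can be sketched end to end on a tiny example. The following Python sketch uses only the standard library. The two data sources, the field names, the cleaning rule, and the `min_support` threshold are all hypothetical and chosen only to show how each step feeds the next.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records from two sources (assumed for illustration).
store_a = [
    {"customer": "c1", "item": "bread", "qty": 2},
    {"customer": "c2", "item": "milk", "qty": None},  # missing quantity
    {"customer": "c1", "item": "milk", "qty": 1},
]
store_b = [
    {"customer": "c3", "item": "bread", "qty": 1},
    {"customer": "c3", "item": "milk", "qty": 3},
    {"customer": "c4", "item": "eggs", "qty": -5},    # impossible quantity
]


def clean(rows):
    # Data cleaning: drop rows with missing or non-positive quantities.
    return [r for r in rows if r["qty"] is not None and r["qty"] > 0]


# Data integration: combine the cleaned sources.
rows = clean(store_a) + clean(store_b)

# Data selection: keep only the fields relevant to the analysis.
selected = [(r["customer"], r["item"]) for r in rows]

# Data transformation: aggregate rows into one basket per customer.
baskets = {}
for customer, item in selected:
    baskets.setdefault(customer, set()).add(item)

# Data mining: count how often each pair of items is bought together.
pair_counts = Counter()
for items in baskets.values():
    for pair in combinations(sorted(items), 2):
        pair_counts[pair] += 1

# Pattern evaluation: keep only pairs that meet a minimum support count.
min_support = 2
patterns = {pair: n for pair, n in pair_counts.items() if n >= min_support}

# Knowledge presentation: report the surviving patterns.
for (first, second), n in patterns.items():
    print(f"{first} and {second} were bought together by {n} customers")
```

In practice each step is far more involved, but the order of operations and the hand-off between steps are the same.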

Name Some Data Mining Books

  • Introduction to Data Mining by Tan, Steinbach & Kumar (2006)
  • Big Data, Data Mining, and Machine Learning: Value Creation for Business Leaders and Practitioners
  • Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking
  • Probabilistic Programming and Bayesian Methods for Hackers
  • Data Mining: The Textbook by Charu C. Aggarwal (2015)
  • Data Mining: Practical Machine Learning Tools and Techniques by Ian Witten (2016)
  • Data Mining and Machine Learning: Fundamental Concepts and Algorithms by Mohammed J. Zaki (2020)

What is Data Aggregation and Generalization?

Data Aggregation: Data aggregation is the process of combining and summarizing data from multiple sources into a single, more manageable format to facilitate analysis and decision-making.

Generalization: Generalization is the process of replacing low-level data with higher-level concepts so that the data become more general and easier to interpret. It is often used to enhance privacy or to summarize data for easier analysis, for example by replacing specific dates with months or specific values with ranges.
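
A short Python sketch may help separate the two ideas. The sales records, the helper names `generalize_date` and `generalize_amount`, and the range width of 50 are assumptions made up for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales records: (date, amount).
sales = [
    (date(2024, 1, 5), 120.0),
    (date(2024, 1, 20), 80.0),
    (date(2024, 2, 3), 45.0),
]


def generalize_date(d):
    # Generalization: replace a specific date with its month.
    return d.strftime("%Y-%m")


def generalize_amount(x, width=50):
    # Generalization: replace a specific value with a range label.
    low = int(x // width) * width
    return f"{low}-{low + width - 1}"


# Aggregation: summarize many detailed rows into one total per month.
monthly_total = defaultdict(float)
for day, amount in sales:
    monthly_total[generalize_date(day)] += amount

# Generalized view of each record: month and amount range only.
generalized = [(generalize_date(d), generalize_amount(a)) for d, a in sales]

print(dict(monthly_total))
print(generalized)
```

Here aggregation reduces three rows to two monthly totals, while generalization keeps every row but hides the exact date and amount.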
