"The knowledge discovery process is as old as Homo sapiens. Until some time ago this process was solely based on the 'natural personal' computer provided by Mother Nature. Fortunately, in recent decades the problem has begun to be solved based on the development of the Data mining technology, aided by the huge computational power of the 'artificial' computers. Digging intelligently in different large databases, data mining aims to extract implicit, previously unknown and potentially useful information from data, since "knowledge is power". The goal of this book is to provide, in a friendly way, both theoretical concepts and, especially, practical techniques of this exciting field, ready to be applied in real-world situations. Accordingly, it is meant for all those who wish to learn how to explore and analysis of large quantities of data in order to discover the hidden nugget of information."--Back cover Cover 1 Intelligent Systems Reference Library, Volume 12 3 Data Mining 4 ISBN 9783642197208 5 Preface 8 Contents 10 1 Introduction to Data Mining 14 What Is and What Is Not Data Mining? 14 Why Data Mining? 18 How to Mine the Data? 20 Problems Solvable with Data Mining 27 Classification 28 Cluster Analysis 32 Association Rule Discovery 36 Sequential Pattern Discovery 38 Regression 38 Deviation/Anomaly Detection 39 About Modeling and Models 39 Data Mining Applications 51 Data Mining Terminology 55 Privacy Issues 55 2 The “Data-Mine” 58 What Are Data? 58 Types of Datasets 59 Data Quality 63 Types of Attributes 65 3 Exploratory Data Analysis 70 What Is Exploratory Data Analysis? 70 Descriptive Statistics 72 Descriptive Statistics Parameters 73 Descriptive Statistics of a Couple of Series 81 Graphical Representation of a Dataset 94 Analysis of Correlation Matrix 98 Data Visualization 102 Examination of Distributions 112 Advanced Linear and Additive Models 118 Multiple Linear Regression 118 Logistic Regression 129 Cox Regression Model 133 Additive Models 136 Time Series: Forecasting 137 Multivariate Exploratory Techniques 143 Factor Analysis 143 Principal Components Analysis 146 Canonical Analysis 149 Discriminant Analysis 150 OLAP 151 Anomaly Detection 161 4 Classification and Decision Trees 172 What Is a Decision Tree? 172 Decision Tree Induction 174 GINI Index 179 Entropy 182 Misclassification Measure 184 Practical Issues Regarding Decision Trees 192 Predictive Accuracy 192 STOP Condition for Split 192 Pruning Decision Trees 193 Extracting Classification Rules from Decision Trees 195 Advantages of Decision Trees 196 5 Data Mining Techniques and M 198 Data Mining Methods 198 Bayesian Classifier 199 Artificial Neural Networks 204 Perceptron 205 Types of Artificial Neural Networks 218 Probabilistic Neural Networks 230 Some Neural Networks Applications 237 Support Vector Machines 247 Association Rule Mining 262 Rule-Based Classification 265 k-Nearest Neighbor 269 Rough Sets 273 Clustering 284 Hierarchical Clustering 295 Non-hierarchical/Partitional Clustering 297 Genetic Algorithms 302 Components of GAs 305 Architecture of GAs 323 Applications 326 6 Classification Performance Evaluation 332 Costs and Classification Accuracy 332 ROC (Receiver Operating Characteristic) Curve 336 Statistical Methods for Comparing Classifiers 341 Index 344 3642197205,9783642197208 Springer, 2011 Front Matter....Pages - Introduction to Data Mining....Pages 1-43 The “Data-Mine”....Pages 45-56 Exploratory Data Analysis....Pages 57-157 Classification and Decision Trees....Pages 159-183 Data Mining Techniques and Models....Pages 185-317 Classification Performance Evaluation....Pages 319-330 Back Matter....Pages -