Handbook of Data Mining
This book fits the bill for those who are looking for a single-volume overview of data mining concepts, methods, and techniques.
My contributed chapter, Data Collection, Preparation, Quality, and Visualization, provides a principled approach to the subject. Topics include: why data needs to be prepared before mining it, choosing the right data, enforcing data quality (advantages and disadvantages), data quality and model quality, absolute versus relative visualization, and visualizing multiple interactions.
About the Downloads
» Ref. 1 in text Demonstration showing the different ways neural networks and decision trees handle interactions between variables. Dataset, example neural network, example decision tree. Explanation of example. The document is in MS Word; the dataset is provided as an Excel spreadsheet.
» Ref. 2 in text Demonstration of using supervised binning techniques to translate a variable's values from numerical to categorical representation with a numeric output variable in the dataset. Dataset, explanation of example, binning and category numeration demonstration tool. Requires Windows 95 or later.
» Refs. 3 and 4 in text Visualization of the credit card solicitation dataset. Three Self-Organizing Maps (SOMs), credit dataset. The maps are in MS Word; the dataset is provided as an Excel spreadsheet. Note: Eudaptics no longer offers a free evaluation of its Viscovery SOMiner software for download. Errata: The link to the Eudaptics home page in the Resource list of Chapter 14 is incorrect. The correct link is www.eudaptics.com.
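For readers who want a feel for the supervised binning idea behind Ref. 2 before downloading anything, here is a minimal Python sketch. It is not the demonstration tool from the download and not necessarily the method used in the chapter. It assumes one common approach: pick cut points on the numeric input that most reduce the squared error of the numeric output within each bin, much as a regression tree chooses its splits. The function names (sse, best_split, supervised_bins, assign_bin) and the max_bins and min_gain parameters are illustrative choices made here.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(pairs):
    """Find the cut on x that most reduces the SSE of y.

    `pairs` is a list of (x, y) tuples sorted by x.
    Returns (cut, gain), or None if no cut is possible.
    """
    ys = [y for _, y in pairs]
    total = sse(ys)
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two equal x values
        gain = total - sse(ys[:i]) - sse(ys[i:])
        if best is None or gain > best[1]:
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cut, gain)
    return best


def supervised_bins(xs, ys, max_bins=3, min_gain=1e-9):
    """Greedily split the x range into at most `max_bins` bins.

    At each step the segment whose best cut gives the largest
    reduction in SSE of y is split. Returns the sorted cut points.
    """
    segments = [sorted(zip(xs, ys))]
    cuts = []
    while len(segments) < max_bins:
        candidates = []
        for idx, seg in enumerate(segments):
            split = best_split(seg)
            if split is not None and split[1] > min_gain:
                candidates.append((split[1], idx, split[0]))
        if not candidates:
            break  # no remaining cut improves the fit
        _, idx, cut = max(candidates)
        seg = segments.pop(idx)
        segments.insert(idx, [p for p in seg if p[0] < cut])
        segments.insert(idx + 1, [p for p in seg if p[0] > cut])
        cuts.append(cut)
    return sorted(cuts)


def assign_bin(x, cuts):
    """Map a numeric x to a category index (0 = lowest bin)."""
    return sum(x > c for c in cuts)
```

Given an input column xs and a numeric output column ys, supervised_bins(xs, ys) returns the cut points, and assign_bin(x, cuts) turns each numeric value into a category index. Because the cuts are chosen by looking at the output variable, the resulting categories follow changes in the output rather than being equal-width or equal-count.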
Questions about the Book
Email me: dpyle at model and mine dot com
