Data Preparation for Data Mining

Most data mining books focus on what various algorithms do, and how to apply them to data that's already prepared. This book provides a proven method to improve model performance or speed (or both) by applying data preparation techniques. It also provides a conceptual overview of the data exploration process for business managers and anyone new to the subject.

 Available on Amazon

About the Download

The download contains a suite of C source files that can be compiled into a DOS-based, command-line-driven toolkit. A compiled DOS command-line version is included as dp10.exe.

Four datasets are provided: CREDIT, SHOE, CARS, and HOUSE. They are based on or extracted from datasets used in actual modeling work. They are prepared only to the extent that they are in a format the compiled demonstration code can read. Otherwise they are unprepared and contain all of the problems discussed in the book. Some of the datasets also contain types of problems that the book discusses but does not illustrate.

Caveats

This is demonstration code only. There is no point-and-click interface.

It's not intended to have the functionality of a commercial product.

It's not intended to be fast, robust or fully optimized.

The toolkit requires the data to be in a specific format, and it must be given a steering file.

It also requires a PC running Windows 95 or later.

 Download now

Questions about the Book

Email me: dpyle at model and mine dot com

Copyright © 2000 Dorian Pyle. All rights reserved.