This book gives a general overview of data mining and was written for a broad-based audience. Providing an innovative, easy approach to learning data mining, the book emphasizes market focus and a hands-on teaching style.

  • Introduces the tools, resources, Web sites, and vendors that can increase students' success with data mining.
  • Provides examples of data mining in various industries, including banking and finance, retail and marketing, telecommunications, and healthcare.
  • This is the first data mining book devoted to the business field and the business professional.
  • Targets the business analyst and end users who do not necessarily have a statistics background but want to try data mining.
  • Enclosed CD-ROM provides a hands-on approach to learning data mining.
  • Trial versions of three leading data mining products for Windows PCs are included on the CD-ROM and explained in depth: DataMind®, Angoss KnowledgeSEEKER™, and NeuralWorks Predict™.

This book contains all the practical information, hands-on demos and software you need to understand data mining. This book doesn't just explain data mining concepts: it shows you exactly how to make the most of them. If you're in marketing, you'll learn how data mining can help you rank your customers by the likelihood they'll respond to your mailings. If you're in MIS, you'll learn exactly how to prepare relational data for data mining. You'll learn how to use each of three powerful data mining tools; demos for all three are included on CD-ROM. The book also includes detailed case studies for several of the industries that can benefit most from data mining, including banking, finance, retail, healthcare, direct marketing, and telecommunications. The book is replete with shortcuts and techniques that have never been published before. For all business and marketing professionals, systems analysts, database administrators, students and others who want to leverage the power of data mining.

The process of knowledge discovery is certainly not new. Someone first discovered how to start a fire, that the earth was round, and that passing out pretzels to customers at a restaurant promotes drinking. Even before computers were used to automate the knowledge discovery process, statisticians were using probability and regression techniques to model historical data. Certainly, people attempted to perform “data mining” before the popular term was coined. So why is data mining suddenly reaching the covers of magazines and capturing the imagination of corporate America? By explaining the current field of data mining and introducing popular tools easy enough for you to use, this book should help answer this question.

Data mining is hot. An article in Bank Systems & Technology, January 1996, stated, “Data mining is the most important application in Financial Services in 1996.” A 1996 commercial by IBM®, played during the Super Bowl, showed fashion models discussing the use and advantages of data mining. This is remarkable attention for an “emerging” market. Consider the projections in market opportunity for data mining shown in the figure on the next page.

Data mining, until recently, has been largely an academic field and required computer systems out of the reach of most business analysts. Something has happened to move knowledge discovery into the mainstream. Below are a few of the reasons why data mining has recently gained such popularity and, consequently, why this book was written:

  • The cost of personal computing power is decreasing to a point where data mining is now possible for business professionals. A few years ago, only the IT departments of large corporations, like finance and insurance institutions, had the computers to perform data mining; today, you can perform data mining studies on a PC. Data mining is a computationally intensive process, and it requires a fair amount of disk space as well. As the price of computing power and disk space continues to drop, the possibilities open to end users are mind-boggling.

  • Recent innovations in methodologies used for data mining are making data mining more powerful and easier to understand. Advances in computational power have enabled the development of innovative new algorithms for data mining. These new algorithms have increased the usefulness, power, and usability of data mining tools. Several techniques first appeared in the mid-1980s. Influential works by John Hopfield, on neural networks, and by Breiman, Friedman, Olshen, and Stone, on decision trees, propelled data mining into the corporate world. By 1996, the number of new techniques applied to data mining was staggering, whether in the form of decision trees, neural networks, fractals, genetic algorithms, or networked agent-based technology.

  • Software vendors are making data mining available to the end user. A few companies are trying to bring data mining out of its traditional roots in academia and within the reach of the business professional. There will always be the need for experts in the field of statistics, but business analysts have largely had to rely on others to answer fundamental business questions. Analysts need a tool easy enough to use to point them in the right direction. However you view this emerging market, a basic fact remains: Data mining has become available to business analysts and end users. That is an awesome statement! As a business professional, student, researcher, manager of information systems, or consultant, it should catch your attention enough to want to learn more about data mining.

    The Purpose of this Book

    This book is devoted to the business professional. Data Mining: A Hands-on Approach for Business Professionals resulted from the fundamental realization that data mining is heading into the mainstream and that there are no books about data mining devoted to the business professional. It provides an innovative, easy approach to learning data mining for business professionals, students, and consultants. The CD-ROM at the back of the book makes learning data mining a hands-on activity: you can try out several different software packages available for data mining. The book discusses how knowledge discovery is used in different industries and covers several data mining software products which, although they may be appropriate for IT organizations and run on larger servers, also run on personal computers. The focus on products that run on personal computers is deliberate, because these products offer data mining to the widest audience. Sample studies for specific industries, such as retail, banking, insurance, and healthcare, are provided. This book takes a distinctly different approach to introducing data mining than the academic-focused books currently on the market: its emphasis is on market focus and a hands-on teaching style.

    Market Focus

    Data mining is still largely an evolving field, with great variety in terminology and methodology. To gain a reasonable understanding of what data mining is all about, you must have a broad perspective on how it is being used within the market today and where to go to find information. Information vendors and Web sites are listed for you. Data mining tools currently on the market are also discussed to familiarize you with the market. Chapters 7 and 8 provide examples of data mining in various industries, including:

  • Banking and finance
  • Retail and marketing
  • Telecommunications
  • Healthcare

    This book broadens the scope of what is relevant to learning data mining. Not only should you learn the methodology and terminology needed to use data mining, you should also learn about specific examples of how to achieve fast results in the corporate environment.

    Hands-On Teaching Style

    This book also provides a hands-on approach to learning data mining. By devoting three hours of your time, you can use the enclosed CD-ROM to familiarize yourself with all the major processes.

    Once we cover the concepts of data mining, we go directly to exercises to show how easy it is to turn data into information. Data Mining: A Hands-on Approach for Business Professionals includes a CD-ROM containing demonstrations of three end-user data mining tools: Angoss® KnowledgeSEEKER™, NeuralWare®'s NeuralWorks Predict™, and DataMind® Corporation's DataMind Professional Edition. This book should increase your own marketability by showing how data mining is used in the database industry today. It should also answer the following basic questions:

  • What is data mining?
  • How is data mining used in the market today?
  • Why use data mining?
  • Which vendors are in the data mining market?
  • Where do you go to find information on data mining?
  • How do you data mine?


    Of the many books published to introduce data mining, this is the first devoted to the business professional. You do not have to have a statistics degree to use these tools. This book gives a general overview of data mining and was written for a broad-based audience. The book will be useful to:

  • Business Professionals: Anyone in business who deals with large amounts of data should be interested in data mining and this book in particular. A conscious effort is made to provide industry examples as well as make the use of data mining products understandable.
  • Database Administrators (DBAs): Database administrators will be interested in this book from the standpoint of how end users can extract data from today's relational databases and data warehouses in order to mine data. This book discusses example data structures for different industries and what data fields are used in different types of data mining studies.
  • Marketing Analysts: Data mining is especially useful to a marketing organization, because it allows you to profile customers to a level not possible before. Some people refer to this as “one-to-one” marketing. Distributors of mass mailers today generally use data mining tools. In several years, data mining will not be an advantage, but a requirement of marketing organizations.

  • Students: Students desiring a practical introduction to the basics of data mining and what is being done in the market can start with this book.
  • Systems Analysts and Consultants: Consultants can benefit from the discussions of the vendors involved in this market and from the industry-specific examples.

    Scope of the Book

    Data Mining: A Hands-on Approach for Business Professionals does not attempt to explain the algorithms used with data mining. If you want to learn more about the algorithms, I would suggest Advances in Knowledge Discovery and Data Mining, by Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and Ramasamy Uthurusamy. That book, at over 550 pages, is the most comprehensive work today on the academic approaches used in data mining. This book is devoted to the business professional and targets an audience of PC users who do not necessarily have a statistics background and who want to try data mining themselves. Data mining for business professionals is just one market segment. Consider the segmentation of the data mining market in the figure on the next page.

    While many data mining tools are available to business professionals, data mining tools are also sold into other market segments. For example, some data mining products are sold to IT organizations and require sophisticated hardware and specially trained technicians to use them. They also usually require a system administrator and are often focused on vertical markets, like Falcon™ from HNC® Software, which is used for fraud detection for banks and usually runs on mainframes. Some data mining products are used as managed systems and are sold with consultative services. For example, IBM sells outsourcing services where you can send data to them and they will return the result. Data mining for business professionals does not imply “low-end”; rather, it requires tools that are accessible and understandable to business users who are not necessarily statisticians. Many vendors have PC versions of their products, but they also sell high-end servers. As long as these tools provide an integrated approach to data mining and make the reporting of the results straightforward, they are selling closer to the business professional. The trend indicates that more vendors will be selling to this market within the next few years.

    CD-ROM Installation Requirements

    The minimum system requirements for installing the CD-ROM included in this book are discussed in Appendix B. Each of the three data mining software products included on the CD-ROM has its own requirements. The installed software enables you to run the CD-ROM-based tutorial included in this book. Additional files have been added specifically for this book beyond those provided by Angoss Software, DataMind Corporation, and NeuralWare.

    Organization of this Book

    Data Mining: A Hands-on Approach for Business Professionals is divided into three parts:

    Starting Out: The first three chapters introduce data mining, discuss the data mining process, and cover vendors involved in this market. Chapter 1, Introduction to Data Mining, introduces basic concepts of data mining, discusses the models used for data mining, introduces terminology, and provides a brief history. Chapter 2, The Data Mining Process, covers the process of data mining and introduces different types of studies as well as data cleaning issues. Chapter 3, The Data Mining Marketplace, introduces vendors in the data mining market today.

    A Rapid Tutorial: Chapters 4 through 6 introduce several of the leading data mining software products that are focused on business professionals. Chapter 4, A Look at Angoss: KnowledgeSEEKER, covers the leading commercial data mining software product based on a decision tree model that is focused on end users. Chapter 5, A Look at DataMind, covers an innovative commercial data mining software product that enables business professionals to mine data and make predictions. The front end to the product is based on Microsoft Excel to make the product understandable to end users. Chapter 6, A Look at NeuralWorks Predict, covers a leading commercial data mining software product based on neural networks and focused on business professionals.

    Industry Focus: Chapters 7 and 8 focus on specific industry uses of data mining. Examples for each study performed are provided, with tips on how these can be performed on corporate database systems. Chapter 7, Industry Applications of Data Mining, looks at types of data mining studies in the banking and finance, retail, healthcare, and telecommunications industries. Examples of companies performing data mining are provided. Chapter 8, Enabling Data Mining through Data Warehouses, looks at how data warehouses provide a methodology for helping perform data mining studies. Four data warehouse industry examples are provided to discuss the type of data that would be integrated and to introduce how some data mining studies could be performed using these data warehouses.

    Acknowledgments

    This book would not have been completed without the help of many individuals. Many thanks to my wife, Michele Groth, for her review of the book and to Leo Gelman, at Red Brick Systems®, who contributed greatly to the discussion on data warehousing in Chapter 8. Special thanks to Angoss Software, Belmont Research®, DataMind Corporation, HNC Software, MapInfo® Corporation, Neural Applications®, NeuralWare, Pilot Software®, and Silicon Graphics® for permission to use the images in this book. Angoss Software, NeuralWare, and DataMind Corporation all contributed to the demo CD-ROM included at the back of the book. Thanks to Penny Buckley and Fritz Vandenheuvel at Angoss; Patricia Campbell at HNC Software; Craig Zielazny and Casey Klimasaus at NeuralWare; Karen Gobler at Pilot Software; Lisa Jacobsen at MapInfo; A.J. Brown, Ram Srinivasan, Janet Fish, Karen Thomas, and Shaw Taylor at DataMind Corporation; Tracy Timpson, Mark Olsen, and Patricia Baumhart at Silicon Graphics; Jim Ong at Belmont Research; and Kurt Kimmerling at Neural Applications.

