

Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management...


by Ian H. Witten and Eibe Frank
Sales Rank: 17794
4.0 out of 5 stars
Price: $41.55 at Amazon on 10-11-2008.
Features
  • Cover Type: Paperback with 560 pages
  • Published by: Morgan Kaufmann
  • Edition: 2nd Edition June 10, 2005
  • Written in: English
  • ISBN 10 Number: 0120884070
  • ISBN 13 Number: 978-0120884070
  • Book Dimensions: 9.1 x 7.5 x 1.2 inches
  • Weighs: 2.6 pounds

Product Review
"I was a big fan of the first edition and I'm excited about this new edition."
- Peter Norvig, Director of Search Quality, Google, Inc.

"This book presents this new discipline in a very accessible form: both as a text to train the next generation of practitioners and researchers, and to inform lifelong learners like myself. Witten and Frank have a passion for simple and elegant solutions. They approach each topic with this mindset, grounding all concepts in concrete examples, and urging the reader to consider the simple techniques first, and then progress to the more sophisticated ones if the simple ones prove inadequate. If you have data that you want to analyze and understand, this book and the associated Weka toolkit are a great way to start."
- From the foreword by Jim Gray, Microsoft Research

"It covers cutting-edge, data mining technology that forward-looking organizations use to successfully tackle problems that are complex, highly dimensional, chaotic, non-stationary (changing over time), or plagued by. The writing style is well-rounded and engaging without subjectivity, hyperbole, or ambiguity. I consider this book a classic already!"
- Dr. Tilmann Bruckhaus, StickyMinds.com

Book Description
Highly anticipated second edition of the highly-acclaimed reference on data mining and machine learning.

Reader Reviews
The major virtue of this book is its emphasis on practical applications and bread-and-butter techniques for accomplishing tasks that one could expect in a business environment. That is not to say that these techniques could not be used in a scientific research environment. They indeed could be, and may be even easier to implement there because of the long time scales available in research environments for processing information. In the business world, however, data mining has proven to be an activity that gives a substantial competitive edge, and so many businesses are seeking even more sophisticated methods of data mining and Web mining.

Data mining could easily be considered a branch of artificial intelligence (AI), given its emphasis on learning patterns and performing classification, and the learning and classification tools it uses were discovered by individuals who would describe themselves as researchers in artificial intelligence. But many, and it is fair to include the authors of this book, do not want to view data mining as part of artificial intelligence, since the latter stirs up discussions on the origin of intelligence, autonomous robots, and conscious machines, to paraphrase a line from chapter 8 of this book. The authors make it a point to emphasize that data mining, or "machine learning," is concerned with the algorithms for the inference of structure from data and the validation of that structure.

Along with its practical emphasis, the book includes discussions of some very interesting developments that are not usually included in books or monographs on data mining. One of these concerns the current research in "programming by demonstration." This research is targeted at the "ordinary" computer user who has no programming knowledge but wants to automate predictable tasks. The only thing required from the user is knowledge of how to do the task in the usual way.
As an example, the authors briefly discuss the "Familiar" system, which extracts information from user applications to make predictions and then generates explanations for the user about its predictions. Even more interesting is that it learns the tasks that are specialized for each individual user. It learns from the unique style of each user and their interaction history. One of the most interesting and powerful claims of programming by demonstration is that it is domain-independent, which matters given the current intense interest in reasoning patterns or algorithms that can process information arising from multiple domains. In this regard, a successful system would be able to learn how to play chess from a user and perhaps also how to compose music. Again, the ability of a machine to reason in many domains is a step towards what many in the artificial intelligence community have called a "universal" learning machine. But the authors do not hold to this view, and in fact they open the chapter on the Weka workbench with a statement to the effect that there is no single learning algorithm that will work with all data mining problems. The "universal learner," they say, is an "idealistic fantasy."

Another interesting discussion included in the book is that of "co-training," a methodology that arises in the context of "semi-supervised learning." In this learning scheme the input contains both unlabeled and labeled data. Co-training relies on the classification task admitting two different and independent perspectives. Assuming there are a few labeled examples, a different model is learned for each perspective, and the models are then used separately to label the unlabeled examples. Each model contributes both negative and positive examples to the pool of labeled examples. The procedure is then repeated until the unlabeled pool is empty. This allows both models to be trained on the new pool of labeled examples.
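The co-training loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the book's or Weka's implementation: each example has exactly two numeric "perspectives," labels are 0 and 1, the seed set contains at least one example of each class, and a simple nearest-centroid classifier stands in for the naive Bayesian learner mentioned in the book. The function names `fit_centroids`, `score`, and `co_train` are hypothetical.

```python
from statistics import mean

def fit_centroids(values, labels):
    """Learn one centroid per class (0 and 1) from a single numeric view."""
    return {c: mean(v for v, y in zip(values, labels) if y == c) for c in (0, 1)}

def score(centroids, x):
    """Positive when x is closer to class 1, negative when closer to class 0."""
    return abs(x - centroids[0]) - abs(x - centroids[1])

def co_train(labeled, unlabeled):
    """labeled: list of ((view_a, view_b), label); unlabeled: list of (view_a, view_b).

    Each round, the model for each view labels its most confident positive
    and most confident negative example and moves them to the labeled pool.
    """
    labeled = list(labeled)   # copy so the caller's lists are not mutated
    pool = list(unlabeled)
    while pool:
        for view in (0, 1):
            if not pool:
                break
            # Retrain this view's model on everything labeled so far.
            model = fit_centroids([x[view] for x, _ in labeled],
                                  [y for _, y in labeled])
            scored = [(score(model, x[view]), x) for x in pool]
            best_pos = max(scored, key=lambda t: t[0])
            best_neg = min(scored, key=lambda t: t[0])
            picks = []
            if best_pos[0] > 0:
                picks.append((best_pos[1], 1))
            if best_neg[0] < 0:
                picks.append((best_neg[1], 0))
            if not picks:
                # Every score is exactly 0: label one arbitrarily to guarantee progress.
                picks.append((best_pos[1], 1))
            for x, y in picks:
                pool.remove(x)
                labeled.append((x, y))
    return labeled

# Toy data: both views are small for class 0 and large for class 1.
seed = [((0.0, 0.1), 0), ((1.0, 0.9), 1)]
pool = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.3), (0.8, 0.7)]
for example, label in co_train(seed, pool):
    print(example, label)
```

Because each model only commits to the examples it is most sure about, the other view gets to train on labels it did not produce itself, which is where the independence of the two perspectives is meant to help.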
The authors point out some evidence indicating that if a (naive) Bayesian learner is used throughout this procedure, it outperforms a learner that develops a single model from the labeled data. The intuition is that the independence of the two perspectives reduces the likelihood of an incorrect labeling. References are given for readers who want to investigate this approach in more detail, along with briefer discussions of its generalizations, such as co-EM, which involves probabilistic labeling of unlabeled data in one perspective, and of how to use support vector machines in place of the naive Bayesian learner.

For the practitioner, the most useful discussion in the book concerns the evaluation of the different methods for data mining. What makes one approach to data mining better than another, and is there a ranking of the different approaches? Can one in fact make judgments on the reliability or performance of data mining algorithms using solely the training or test data? A general methodology for ranking data mining algorithms according to their performance would be a major advance, since it would allow a classification scheme for machine learning in which one could speak of one machine being "more intelligent" than another. Unfortunately, this is difficult, and some researchers say it is impossible. There are results in the research literature, going by the name of "no free lunch" theorems, which seem to indicate that one cannot distinguish machine learning algorithms based solely on the way they deal with training or test data. The authors do not discuss these results in this book, but it is certainly apparent that they are aware of the difficult issues involved in predicting the performance of data mining algorithms.


List Price: $65.95




NOTICE: All prices, availability, and specifications
are subject to verification by their respective retailers.





Copyright © 2008 Dominant Systems Corporation
