Book Review - Programming Collective Intelligence by Toby Segaran
Have you ever wondered how some of those "collective intelligence" sites work? How Amazon can suggest books that you'll like based on your browsing history? How a search engine can rank and filter results? Toby Segaran does a very good job of revealing and teaching those types of algorithms in his book Programming Collective Intelligence: Building Smart Web 2.0 Applications. While I'm not ready to run out and build my own version of Facebook, at least I can start to understand how sites like that are designed.
Contents:
Introduction to Collective Intelligence; Making Recommendations; Discovering Groups; Searching and Ranking; Optimization; Document Filtering; Modeling with Decision Trees; Building Price Models; Advanced Classification - Kernel Methods and SVMs; Finding Independent Features; Evolving Intelligence; Algorithm Summary; Third-Party Libraries; Mathematical Formulas; Index
In each of the chapters, Segaran takes a type of capability, be it decision-making or filtering, and shows how a programming language can be used to build that feature. His examples are all in Python, so it helps to be familiar with that language if you want to actually work with the code. But even if you don't know Python, the examples are clear and detailed enough that you can follow along and get the gist of what's happening. I personally think that a background in mathematics and statistics would help immensely. You can use the code here without a detailed understanding of the math, but I'm sure you would appreciate much of it more deeply if you already knew about such things as Tanimoto similarity scores, Euclidean distances, or Pearson coefficients.
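For readers who are as hazy on those three measures as I was, here is a minimal Python sketch of each. It is my own illustration using the standard textbook definitions, not code from the book, and the function names and sample data are made up for the example.

```python
from math import sqrt


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length lists of numbers."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists.

    Ranges from -1 (opposite trends) to 1 (same trend). Returns 0.0
    when either list has no variation, as a simplifying assumption.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two collections treated as sets:
    size of the overlap divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical data: two readers' ratings of the same three books,
# and the sets of books each reader owns.
ratings_me = [1, 2, 3]
ratings_you = [2, 4, 6]
print(euclidean_distance(ratings_me, ratings_you))
print(pearson(ratings_me, ratings_you))
print(tanimoto({"a", "b", "c"}, {"b", "c", "d"}))
```

The comparison is the interesting part: the two sets of ratings are some distance apart in the Euclidean sense, yet their Pearson score is a perfect match because both readers rank the books in the same proportion. That difference is the kind of thing a recommendation engine has to choose between.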
From my perspective (a non-Python programmer *without* the math background), I was more interested in understanding the overall picture of how ranking systems work and how recommendation engines are structured. While there was more detail than I needed (or understood), I still felt that I accomplished my goal. I have a much greater appreciation for what companies like Google and Amazon have done in building web applications that gather the knowledge and wisdom of groups and apply it to my own preferences.
Statistical programmers will probably find years of entertainment here. :) "Normal" programmers will expand their horizons, too.


