While reading up on how Singular Value Decomposition (SVD) works, I found this quote by Simon Funk:

“In today’s foray, that model is called singular value decomposition, which is just a fancy way of saying what I’ve already eluded to above.”

Nice to hear other people shudder at how we tend to over-complicate things, especially in research and marketing departments.

As far as I understand, the intuition behind SVD (as applied to estimating ratings) is that you have a matrix and you want to find two matrices that, when multiplied, approximate the original matrix with minimal error. This can be done iteratively to improve the accuracy of the approximation, and the nice thing about SVD is that it will automagically converge on the optimal solution.
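To make that concrete, here is a minimal sketch in Python/NumPy on a tiny made-up ratings matrix (the numbers are just an assumption for illustration). It computes the SVD, keeps only the largest k singular values, and folds them into two factor matrices whose product is the best rank-k approximation of the original matrix in the least-squares sense:

```python
import numpy as np

# A small, fully observed ratings matrix (users x items); toy values.
R = np.array([[5.0, 3.0, 1.0, 4.0],
              [4.0, 2.0, 1.0, 4.0],
              [1.0, 1.0, 5.0, 2.0],
              [2.0, 1.0, 4.0, 1.0]])

k = 2  # rank of the approximation

# Full SVD: R = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Keep the k largest singular values; split them between the two factors.
P = U[:, :k] * np.sqrt(s[:k])        # user factors (users x k)
Q = np.sqrt(s[:k])[:, None] * Vt[:k]  # item factors (k x items)

R_hat = P @ Q  # the two matrices multiplied back together
print("approximation error (Frobenius):", np.linalg.norm(R - R_hat))
```

By the Eckart-Young theorem, no other rank-k pair of matrices can beat this approximation error, which is why SVD converges on the optimal solution.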

Alternating Least Squares
However, a complication arises when the matrix you are trying to decompose is not complete. In other words, when some fields in the matrix are empty or undefined, plain SVD is insufficient for finding the composing matrices. In this case Alternating Least Squares (ALS) is the way to go. More to come about this.
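As a teaser, here is a minimal ALS sketch in the spirit of Zhou et al., assuming NumPy, a toy matrix with NaN marking the missing ratings, and made-up hyperparameters. The trick: fix the item factors and each user's row becomes an ordinary regularized least-squares problem over only the ratings that user actually gave; then fix the user factors and do the same per item, alternating until the fit stops improving.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ratings matrix with missing entries as NaN (toy data).
R = np.array([[5.0,    3.0,    np.nan, 4.0],
              [4.0,    np.nan, 1.0,    4.0],
              [np.nan, 1.0,    5.0,    2.0],
              [2.0,    1.0,    4.0,    np.nan]])
mask = ~np.isnan(R)  # which entries are observed

n_users, n_items = R.shape
k, lam, n_sweeps = 2, 0.1, 20  # rank, regularization, iterations (assumptions)

P = rng.standard_normal((n_users, k))  # user factors
Q = rng.standard_normal((n_items, k))  # item factors

for _ in range(n_sweeps):
    # Fix Q; solve a small regularized least-squares system per user,
    # using only that user's observed ratings.
    for u in range(n_users):
        obs = mask[u]
        A = Q[obs].T @ Q[obs] + lam * np.eye(k)
        b = Q[obs].T @ R[u, obs]
        P[u] = np.linalg.solve(A, b)
    # Fix P; same thing per item.
    for i in range(n_items):
        obs = mask[:, i]
        A = P[obs].T @ P[obs] + lam * np.eye(k)
        b = P[obs].T @ R[obs, i]
        Q[i] = np.linalg.solve(A, b)

R_hat = P @ Q.T
rmse = np.sqrt(np.mean((R[mask] - R_hat[mask]) ** 2))
print("RMSE on observed entries:", rmse)
```

Note that each inner solve only ever touches observed entries, which is exactly why ALS sidesteps the missing-data problem that breaks plain SVD.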

Also, here is another remarkable quote I found in a paper by Zhou et al.:

“We have found parallel Matlab to be flexible and efficient, and very straightforward to program. Thus, from our experience, it seems to be a strong candidate for widespread, easily scalable parallel/distributed computing [compared to Hadoop and MapReduce].”

Some more references:

  • Zhou, Yunhong, Dennis Wilkinson, Robert Schreiber, and Rong Pan. “Large-Scale Parallel Collaborative Filtering for the Netflix Prize.” Proc. 4th Int’l Conf. Algorithmic Aspects in Information and Management, LNCS 5034 (2008): 337-348.
  • Koren, Y., R. Bell, and C. Volinsky. “Matrix Factorization Techniques for Recommender Systems.” Computer 42, no. 8 (August 2009): 30-37.
  • How does SVD work