Monday, October 30, 2006

 

Netflix & Rasch partial-credit-model

Modeling each of the 17,770 movies in the Netflix database to have its own rating scale (the Rasch-Masters partial-credit model) produces worse predictions than modeling them all to share the same rating scale! And modeling each of the 480,189 customers to have a unique rating scale is even worse!!
This accords with the Rasch proposition that "better description of the local dataset can result in worse inference for other data sets." I was already skeptical of the accidental nature of many partial credit analyses, particularly those with low category frequencies. The Netflix data confirm my skepticism.

Friday, October 27, 2006

 

Asia and Rasch

Asia is rushing into hard science research (see the Time magazine story), but seems to have overlooked soft science: psychology, sociology, etc. But it is advances in soft science that they need in order to solve the huge societal problems they are experiencing, and that are likely to become worse as populations grow and the migration to the cities continues.

Only two of the 30+ enrolled in my current online Rasch Course give Asian countries as their addresses. Yet Rasch measurement is a powerful tool in the advance of soft science because it demands that you know what you are doing. Books on Rasch have already been published in Korean and Chinese. Please email me links to where those Rasch books (or any in non-English languages) can be obtained so that I can announce them on this website.

Tuesday, October 24, 2006

 

The Netflix Prize - more

The Facets bug on the customer x movie analysis turned out to be rather silly. It was in computing the location of the progress bar across the screen. After fixing that, the analysis worked and I submitted its predictions to Netflix. The result was an RMSE of 0.98. This is better than merely submitting the mean rating, which gives an RMSE of 1.05. Netflix themselves have 0.95, and the best so far is 0.90.
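For anyone who has not met RMSE (root-mean-square error) before, here is a minimal Python sketch of how such a score can be computed. The function name `rmse` and the five ratings are made up for illustration. They are not Netflix data, and this is not necessarily how Netflix scores submissions internally.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical ratings on the 1-5 scale, for illustration only.
actual = [1, 2, 3, 4, 5]

# Baseline: predict the mean rating for every observation.
mean_rating = sum(actual) / len(actual)
baseline = rmse([mean_rating] * len(actual), actual)

# A perfect set of predictions scores zero.
perfect = rmse(actual, actual)
```

A lower RMSE means predictions closer to the observed ratings, so any model worth submitting should score below the mean-rating baseline.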

So now I am trying a movie x customer-style model. Each customer is modeled to have his/her own rating scale. This produces 480,189 Rasch-Masters partial credit scales. First time out, this blew up Facets. Facets had a maximum of 16,000 partial credit scales. But now that is fixed and various other optimizations have been made. Facets is running again, at about one iteration per hour, and needing about 20 iterations to converge. Perhaps I will be able to submit a new set of predictions today ....
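To make "its own rating scale" concrete, here is a small Python sketch of the category probabilities under the Rasch-Masters partial credit model, where the log-odds of category k over category k-1 is the measure minus the k-th threshold. This is only an illustration, not the Facets code. The names `pcm_probabilities` and `expected_rating` and the threshold values are hypothetical, and `measure` stands for the combined logit (for example, customer measure minus movie measure).

```python
import math

def pcm_probabilities(measure, thresholds):
    """Category probabilities for one partial credit rating scale.

    measure: combined logit for this customer-movie pair (assumed given).
    thresholds: one step threshold per pair of adjacent categories,
                so a 1-5 rating scale has 4 thresholds.
    """
    # Cumulative sums of (measure - threshold); the lowest category is 0.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (measure - tau))
    # Subtract the largest logit before exponentiating to avoid overflow.
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_rating(measure, thresholds, lowest=1):
    """Model-expected rating, used here as the prediction."""
    probs = pcm_probabilities(measure, thresholds)
    return sum((lowest + k) * p for k, p in enumerate(probs))

# Hypothetical thresholds for one customer's scale, symmetric about zero.
example_thresholds = [-1.5, -0.5, 0.5, 1.5]
example_probs = pcm_probabilities(0.0, example_thresholds)
example_prediction = expected_rating(0.0, example_thresholds)
```

With one set of thresholds per customer, each of the 480,189 scales has four thresholds to estimate, often from only a few ratings in some categories.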

Tuesday, October 03, 2006

 

The Netflix Prize

This $1 million prize, announced at http://www.netflixprize.com/, looks to be a challenging demonstration application for Rasch measurement - thanks for telling me about it, Martin Caust. It has to do with the analysis of customer ratings of movies, and the prediction of what movies they would like to obtain. Right now I'm downloading the initial 665MB data set. My computer tells me it will take 13 hours over my connection - but other folks have reported 20 minutes to download over theirs.

Netflix are likely to receive submissions that are local optimizations of the type that W.E. Deming decried. My suspicion is that someone will discover an opportunistic algorithm that beats the existing Netflix algorithm, but which does not generalize. A Rasch solution would generalize better even though apparently doing "worse" on the initial data set.

Any other Rasch folk going to give this a try?

Progress report on the Netflix analysis: there are 17,770 movies (items), 480,189 customers and 100,480,507 observations on a 1-5 rating scale. If this were set up for Winsteps (capacity 40,000 items, 10 million persons) the data set size would be about 20,000 x 500,000 cells = 10GB and the required workspace about 50GB - too big for my current hardware. So I'm running the data through Facets - which is very efficient with sparse data matrices - the Netflix data is about 99% missing. No unusual problem with the data file or data input. But there is a computational overflow halfway through the first estimation iteration - time to enhance the Facets software!
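The arithmetic behind those figures can be checked in a few lines of Python. The counts come from the paragraph above. The one-byte-per-cell storage is an assumption, chosen because it is what makes 20,000 x 500,000 cells come to 10GB.

```python
movies = 17_770
customers = 480_189
ratings = 100_480_507

# A complete movie x customer rectangle would have this many cells.
cells = movies * customers

# Fraction of cells that hold a rating, and fraction that are missing.
density = ratings / cells
missing = 1 - density

# Rounded rectangular layout, assuming one byte per cell.
rounded_cells = 20_000 * 500_000
rounded_gb = rounded_cells / 1e9

print(f"cells in full matrix: {cells:,}")
print(f"missing: {missing:.1%}")
print(f"rectangular size: {rounded_gb:.0f} GB")
```

Only a little over 1% of the cells hold a rating, which is why a format that stores just the observed ratings is so much smaller than the full rectangle.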
