Dividing data into training and test sets in recommender systems

I have implemented a recommendation system based on matrix factorization methods. I want to evaluate it.

I want to use 10-fold cross-validation with the All-but-one protocol ( https://ai2-s2-pdfs.s3.amazonaws.com/0fcc/45600283abca12ea2f422e3fb2575f4c7fc0.pdf ).

My dataset has the following structure:

user_id,item_id,rating
1,1,2
1,2,5
1,3,0
2,1,5
...
I am unsure how to split the data, because some triples (user, item, rating) cannot be placed in the test set. For example, if I put the triple (2,1,5) into the test set and this is the only rating given by user 2, the training set will contain no information about that user, and the trained model will not be able to predict anything for them.
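
For illustration, here is a minimal sketch (assuming pandas) of how a naive random hold-out runs into this problem:

    import pandas as pd

    # The toy data from above: user 2 has only a single rating.
    ratings = pd.DataFrame({"user_id": [1, 1, 1, 2],
                            "item_id": [1, 2, 3, 1],
                            "rating":  [2, 5, 0, 5]})

    # Naive hold-out: sample test rows at random.
    test = ratings.sample(n=1, random_state=0)
    train = ratings.drop(test.index)

    # Users that occur in the test set but not in the training set.
    cold_users = set(test["user_id"]) - set(train["user_id"])
    # If the sampled row happens to be (2,1,5), cold_users == {2} and the
    # model has nothing to learn from for user 2.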

Given this scenario, how do I do the splitting?

2 answers


Your basic logic is correct: if you have only one observation for a class (here, a single rating for a user), you need to make sure that observation is represented in the training set so the model learns something about that class.

However, how you split the input across these classes depends on the interplay between different observations. Can you identify classes of data, such as users with only a single rating? When you find other small classes, you also need to make sure that enough of their observations end up in your training data.



Unfortunately, this is a difficult process to automate. Most one-off applications simply pull those observations out of the data and then split the rest in the usual way. The drawback is that the special cases become over-represented in the training set, which can draw some of the model's attention away from the normal cases during training.
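
A minimal sketch of what that could look like (pandas assumed; column names taken from the question's data):

    import pandas as pd

    def split_keep_rare_users(ratings, test_frac=0.1, seed=0):
        # Per-row count of how many ratings each user has.
        counts = ratings.groupby("user_id")["user_id"].transform("count")

        rare = ratings[counts == 1]   # single-rating users: forced into training
        rest = ratings[counts > 1]    # everything else is split as usual

        test = rest.sample(frac=test_frac, random_state=seed)
        train = pd.concat([rare, rest.drop(test.index)])
        return train, test

Note that the rare users end up only in the training set, which is exactly the over-representation mentioned above.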

Do you have the ability to update the model as new data arrives later? That is generally the best way to handle sparsely populated input classes.

Collaborative filtering (matrix factorization) cannot make good recommendations for an unseen user with no feedback. However, the evaluation should consider this case and take it into account.

One thing you can do is report performance for all test users, for only the test users with some feedback, and for only the unseen users with no feedback.



So I would say keep the train/test split as it is, but report the evaluation separately for unseen users.
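
A rough sketch of that separate reporting (numpy and pandas assumed, RMSE used as an example metric; predict is a placeholder for your trained model's scoring function):

    import numpy as np

    def rmse(df, predict):
        # Squared error per (user, item, rating) triple in the frame.
        errs = [(predict(r.user_id, r.item_id) - r.rating) ** 2
                for r in df.itertuples()]
        return float(np.sqrt(np.mean(errs)))

    def report(train, test, predict):
        seen = test["user_id"].isin(train["user_id"])
        print("all test users      :", rmse(test, predict))
        print("users with feedback :", rmse(test[seen], predict))
        print("unseen users        :", rmse(test[~seen], predict))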

More details here.