mlpack_cf man page

mlpack_cf - collaborative filtering


 mlpack_cf [-h] [-v]  


This program performs collaborative filtering (CF) on the given dataset. Given a list of users, items, and preferences (--training_file), the program performs a matrix decomposition and can then perform a series of actions related to collaborative filtering. Alternatively, the program can load an existing saved CF model with the --input_model_file (-m) option and then use that model to provide recommendations or predict values.

The input file should contain a 3-column matrix of ratings, where the first column is the user, the second column is the item, and the third column is that user's rating of that item. Both the users and items should be numeric indices, not names. The indices are assumed to start from 0.
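As an illustration of this layout, the following Python sketch writes a small ratings file. The file name ratings.csv, the comma separator, and the sample values are assumptions made for the example; none of them comes from this page.

```python
import csv

# Hypothetical sample data: (user index, item index, rating).
# Users and items are numeric indices starting from 0.
ratings = [
    (0, 0, 5.0),
    (0, 2, 3.0),
    (1, 1, 4.0),
    (2, 0, 1.0),
    (2, 2, 4.5),
]


def write_ratings(path, rows):
    """Write one 'user,item,rating' row per line."""
    with open(path, "w", newline="") as f:
        csv.writer(f, lineterminator="\n").writerows(rows)


write_ratings("ratings.csv", ratings)
```

Each row of the resulting file holds one rating, for example "0,2,3.0" for user 0 rating item 2 with a score of 3.0.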

A set of query users for which recommendations can be generated may be specified with the --query_file (-q) option; alternatively, recommendations may be generated for every user in the dataset by specifying the --all_user_recommendations (-A) option. In addition, the number of recommendations to generate per user can be specified with the --recommendations (-c) parameter, and the number of similar users (the size of the neighborhood) to be considered when generating recommendations can be specified with the --neighborhood (-n) option.
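The options above can be combined into a single invocation. The Python sketch below only assembles the argument list and prints it; it does not run mlpack_cf. The helper name recommend_command and all file names are hypothetical, while the long option names are the ones listed on this page.

```python
import shlex


def recommend_command(training_file, output_file, query_file=None,
                      recommendations=5, neighborhood=5):
    """Build an mlpack_cf argument list that requests recommendations.

    With no query_file, recommendations are requested for every user.
    """
    args = ["mlpack_cf", "--training_file", training_file]
    if query_file is not None:
        args += ["--query_file", query_file]
    else:
        args.append("--all_user_recommendations")
    args += [
        "--recommendations", str(recommendations),
        "--neighborhood", str(neighborhood),
        "--output_file", output_file,
    ]
    return args


# Hypothetical file names, used only for illustration.
cmd = recommend_command("ratings.csv", "recommendations.csv",
                        query_file="query_users.csv")
print(shlex.join(cmd))
```

Building the arguments as a list rather than a single string avoids shell quoting problems if the list is later passed to subprocess.run.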

For performing the matrix decomposition, the following optimization algorithms can be specified via the --algorithm (-a) parameter:

'RegSVD' -- Regularized SVD using an SGD optimizer

'NMF' -- Non-negative matrix factorization with alternating least squares update rules

'BatchSVD' -- SVD batch learning

'SVDIncompleteIncremental' -- SVD incomplete incremental learning

'SVDCompleteIncremental' -- SVD complete incremental learning
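To give a feel for what a matrix decomposition of ratings does, here is a minimal Python sketch of a regularized factorization fitted by stochastic gradient descent. It is not mlpack's implementation of any of the algorithms listed above; the function names, step size, regularization weight, iteration count, and sample data are all assumptions chosen for the example.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rmse(W, H, ratings):
    """Root mean squared error of the predictions on the known ratings."""
    total = sum((r - dot(W[u], H[i])) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))


def init_factors(n_users, n_items, rank, seed):
    """Random user factors W (n_users x rank) and item factors H (n_items x rank)."""
    rng = random.Random(seed)
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    return W, H


def sgd_factorize(ratings, W, H, iterations=500, step=0.02, reg=0.02):
    """Update W and H in place so that dot(W[u], H[i]) approximates each rating."""
    for _ in range(iterations):
        for u, i, r in ratings:
            err = r - dot(W[u], H[i])
            for k in range(len(W[u])):
                wu, hi = W[u][k], H[i][k]
                W[u][k] += step * (err * hi - reg * wu)
                H[i][k] += step * (err * wu - reg * hi)
    return W, H


# Hypothetical (user, item, rating) triples with 0-based indices.
ratings = [(0, 0, 5.0), (0, 2, 3.0), (1, 1, 4.0), (2, 0, 1.0), (2, 2, 4.5)]

W, H = init_factors(n_users=3, n_items=3, rank=2, seed=1)
before = rmse(W, H, ratings)
sgd_factorize(ratings, W, H)
after = rmse(W, H, ratings)

print(f"training RMSE before: {before:.3f}, after: {after:.3f}")
print(f"predicted rating for user 1, item 0: {dot(W[1], H[0]):.2f}")
```

The rank argument plays the same role as the --rank option: it sets the number of columns in the two factor matrices. Once fitted, the factors give a predicted rating for any user and item pair, including pairs that have no rating in the input.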

A trained model may be saved to a file with the --output_model_file (-M) parameter.
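A typical use of a saved model is to train once and then generate recommendations in a later run. The Python sketch below builds the two invocations and provides a small runner. The file names (including cf_model.bin) and the helper run_all are hypothetical; the option names are the ones documented on this page.

```python
import shutil
import subprocess

MODEL = "cf_model.bin"  # hypothetical model file name

# Step 1: train on a ratings file and save the model.
train = [
    "mlpack_cf",
    "--training_file", "ratings.csv",
    "--algorithm", "NMF",
    "--output_model_file", MODEL,
]

# Step 2: load the saved model and write recommendations for all users.
reuse = [
    "mlpack_cf",
    "--input_model_file", MODEL,
    "--all_user_recommendations",
    "--output_file", "recommendations.csv",
]


def run_all(commands):
    """Run each command in order; return False if mlpack_cf is not on PATH."""
    if shutil.which("mlpack_cf") is None:
        return False
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return True
```

Calling run_all([train, reuse]) executes both steps when mlpack_cf is installed and ratings.csv exists; otherwise the lists can simply be printed and run by hand.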

Optional Input Options

--algorithm (-a) [string]

Algorithm used for matrix factorization. Default value 'NMF'.

--all_user_recommendations (-A)

Generate recommendations for all users.

--help (-h)

Default help info.

--info [string]

Get help on a specific module or option. Default value ''.

--input_model_file (-m) [string]

File to load trained CF model from. Default value ''.

--iteration_only_termination (-I)

Terminate only when the maximum number of iterations is reached.

--max_iterations (-N) [int]

Maximum number of iterations. Default value


--min_residue (-r) [double]

Residue required to terminate the factorization (lower values generally mean better fits).  Default value 1e-05.

--neighborhood (-n) [int]

Size of the neighborhood of similar users to consider for each query user. Default value 5.

--query_file (-q) [string]

List of users for which recommendations are to be generated. Default value ''.

--rank (-R) [int]

Rank of decomposed matrices (if 0, a heuristic is used to estimate the rank). Default value


--recommendations (-c) [int]

Number of recommendations to generate for each query user. Default value 5.

--seed (-s) [int]

Set the random seed (0 uses std::time(NULL)). Default value 0.

--test_file (-T) [string]

Test set to calculate RMSE on. Default value ''.

--training_file (-t) [string]

Input dataset to perform CF on. Default value ''.

--verbose (-v)

Display informational messages and the full list of parameters and timers at the end of execution.

--version (-V)

Display the version of mlpack.

Optional Output Options

--output_file (-o) [string]

File to save output recommendations to. Default value ''.

--output_model_file (-M) [string]

File to save trained CF model to. Default value ''.

Additional Information

