Abstract

This thesis presents a distributed recommender system based on item-item collaborative filtering. The recommendation algorithm builds an item-item similarity matrix from collaboratively collected data on user-item interactions for all users in the system. It supports several similarity measures, along with optional vector normalisation of the matrix rows, and three different distributed matrix multiplication algorithms. The recommender system itself is written in Scala on top of Apache Spark, whereas the data pre-processing scripts are written in C++ and run in a single-node environment. Testing and performance evaluation of the implemented algorithm were carried out on a Cloudera cluster using a real dataset obtained from a case study.
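To make the idea of an item-item similarity matrix concrete, here is a minimal single-machine sketch in plain Scala. It is not the thesis implementation and does not use Spark. The object name, the toy data and all function names are hypothetical, and it assumes binary interactions (a user either touched an item or did not) scored with cosine similarity.

```scala
object ItemItemSketch {
  // Hypothetical toy data: user id -> set of item ids the user interacted with.
  val interactions: Map[Int, Set[Int]] = Map(
    1 -> Set(10, 20),
    2 -> Set(10, 20, 30),
    3 -> Set(30)
  )

  // For every ordered pair (i, j) with i != j, count the users who touched both items.
  def cooccurrences(data: Map[Int, Set[Int]]): Map[(Int, Int), Int] =
    data.values.toSeq
      .flatMap(items => for (i <- items.toSeq; j <- items.toSeq if i != j) yield (i, j))
      .groupBy(identity)
      .map { case (pair, hits) => pair -> hits.size }

  // Number of users who interacted with each item.
  def itemCounts(data: Map[Int, Set[Int]]): Map[Int, Int] =
    data.values.toSeq.flatten
      .groupBy(identity)
      .map { case (item, hits) => item -> hits.size }

  // Cosine similarity for binary vectors: cooc(i, j) / sqrt(n_i * n_j).
  def cosine(data: Map[Int, Set[Int]]): Map[(Int, Int), Double] = {
    val counts = itemCounts(data)
    cooccurrences(data).map { case ((i, j), c) =>
      (i, j) -> c / math.sqrt(counts(i).toDouble * counts(j))
    }
  }

  // Score unseen items by summing their similarity to the user's items, keep the top k.
  def recommend(user: Int, data: Map[Int, Set[Int]], k: Int): Seq[(Int, Double)] = {
    val seen = data.getOrElse(user, Set.empty[Int])
    cosine(data).toSeq
      .collect { case ((i, j), s) if seen(i) && !seen(j) => (j, s) }
      .groupBy(_._1)
      .map { case (item, scores) => item -> scores.map(_._2).sum }
      .toSeq
      .sortWith((a, b) => a._2 > b._2)
      .take(k)
  }
}
```

The `recommend` step is the user-item by item-item product restricted to a single user, which is the operation a distributed version would spread across the cluster.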

Running

$ spark-submit \
    --class hr.fer.ztel.thesis.Main \
    --master yarn --deploy-mode cluster \
    ...
    <jar> \
    <mode: inner, outer, blocks> \
    <input folder> \
    <user-item file> \
    <item-item file> \
    <similarity measure: cos, yuleq, llr> \
    <normalisation: true, false> \
    <output file> \
    <topK: 5, ... > \
    <block size: 1024, 2048, ... >
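The positional arguments above could be validated before a job starts. The following is a hedged sketch only: `ArgsSketch`, `JobArgs` and `parse` are hypothetical names that do not come from the project, and the sketch assumes that all nine arguments after the jar are required in the order shown.

```scala
import scala.util.Try

object ArgsSketch {
  // Hypothetical container for the nine positional arguments listed above.
  final case class JobArgs(
    mode: String,
    inputFolder: String,
    userItemFile: String,
    itemItemFile: String,
    measure: String,
    normalise: Boolean,
    outputFile: String,
    topK: Int,
    blockSize: Int
  )

  // Allowed values taken from the usage text.
  val modes: Set[String] = Set("inner", "outer", "blocks")
  val measures: Set[String] = Set("cos", "yuleq", "llr")

  // Returns Right(parsed arguments) or Left(error message).
  def parse(args: Array[String]): Either[String, JobArgs] = args match {
    case Array(mode, input, userItem, itemItem, measure, norm, output, topK, blockSize) =>
      if (!modes(mode)) Left(s"unknown mode: $mode")
      else if (!measures(measure)) Left(s"unknown similarity measure: $measure")
      else if (norm != "true" && norm != "false") Left(s"normalisation must be true or false: $norm")
      else
        Try(JobArgs(mode, input, userItem, itemItem, measure,
                    norm.toBoolean, output, topK.toInt, blockSize.toInt))
          .toEither.left.map(e => s"invalid number: ${e.getMessage}")
    case _ =>
      Left(s"expected 9 arguments, got ${args.length}")
  }
}
```

Returning an `Either` keeps the failure message next to the offending argument, so a wrong mode or measure can be reported before any cluster resources are requested.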

fpopic/master-thesis
