Abstract (Chinese)
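The rapid development of Internet technology has brought users many conveniences and met many of their needs, but it has also brought the problem of information overload. Quickly finding items of interest in a vast body of information has become extremely important, and personalized recommendation has consequently become a popular topic. E-commerce platforms typically use users' purchase records, portal websites use the categories of news that users browse, and the entertainment industry analyzes the types of movies that users watch; each mines this historical behavior data to infer user interests and recommend relevant information. Recommendation algorithms are usually classified as user-based, item-based, or based on a deep learning model. Although traditional collaborative filtering algorithms are already widely used, they still suffer from problems such as low recommendation accuracy and the cold start of new items. This thesis aims to improve the accuracy of recommendation algorithms by using deep learning models.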
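The main work of this thesis is as follows. It first introduces the principles of traditional recommendation algorithms (the user-based UserCF algorithm, the item-based ItemCF algorithm, and the matrix-factorization-based FunkSVD algorithm) and runs experiments on them with the MovieLens 1M dataset. The experiments show that UserCF and ItemCF both have low precision and coverage, and that the predictions of FunkSVD deviate considerably from the actual values. With the rise of deep learning in recent years, combining deep learning models with collaborative filtering has become increasingly popular. To address the problems above, this thesis first describes how to combine the restricted Boltzmann machine (RBM) model with a recommendation algorithm and proposes a feature extraction method that extracts data features by setting a threshold. On this basis, it presents an improved K-Item RBM recommendation algorithm obtained by the weighted fusion of the RBM recommendation algorithm and the ItemCF recommendation algorithm. The model is then trained on the extracted features and used for prediction; comparative experiments show that K-Item RBM reduces the error between predicted and actual data and improves the performance of the recommender system. In addition, to improve recommendation precision, this thesis presents an improved CNN-CF neural network recommendation algorithm, which uses a convolutional neural network (CNN) to extract text features from the dataset, then trains the model, and finally makes personalized recommendations to users. Comparative experiments show that this algorithm significantly improves precision and coverage. Finally, through detailed analysis and requirements research on movie recommendation websites in the entertainment industry, and with the requirements and the core recommendation algorithm clarified, this thesis designs the overall framework and the database of a movie recommendation application and implements a recommendation application based on deep learning.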
Keywords: Personalized Recommendation; Data Mining; Collaborative Filtering; Deep Learning; Convolutional Neural Network
Abstract
The rapid development of computer technology has brought great convenience and met many user needs, but it has also led to the problem of information overload. Quickly finding information of interest within such a huge volume of data has become extremely important, and personalized recommendation has therefore become increasingly popular. E-commerce platforms usually draw on users' purchase records, portal websites on the categories of news that users browse, and the entertainment industry on the types of movies that users watch, in order to mine user interests and recommend relevant information. Recommendation algorithms are typically classified by user dimension, by item dimension, or by deep learning model. Although traditional collaborative filtering algorithms are widely used, they still suffer from low recommendation accuracy, the cold start of new items, and other problems. This thesis therefore aims to improve the accuracy of recommendation algorithms by using deep learning models.
The main work of this thesis is as follows. We introduce the principles of traditional recommendation algorithms (the user-based collaborative filtering algorithm, the item-based collaborative filtering algorithm, and the collaborative filtering algorithm based on matrix factorization) and evaluate them experimentally on the MovieLens 1M dataset. The experiments show that the precision and coverage of the user-based and item-based collaborative filtering algorithms are relatively low, and that the predictions of the matrix-factorization-based algorithm deviate considerably from the actual values. With the rise of deep learning in recent years, combining deep learning models with collaborative filtering has become increasingly popular. To address the problems above, we first describe how to combine the restricted Boltzmann machine (RBM) model with a recommendation algorithm and propose a method that extracts data features by setting thresholds. On this basis, we present an improved K-Item RBM recommendation algorithm that fuses the RBM and item-based recommendation algorithms by weighting. The model is then trained on the extracted features and used for prediction. Comparative experiments show that the K-Item RBM algorithm reduces the error between predicted and actual data and improves the performance of the recommender system. In addition, to improve recommendation precision, we present an improved CNN-CF neural network recommendation algorithm, which uses a convolutional neural network (CNN) to extract text features from the dataset, then trains the model, and finally makes personalized recommendations to users. Comparative experiments show that this algorithm significantly improves precision and coverage.
Finally, after a detailed analysis of and requirements research on movie recommendation websites in the entertainment industry, and with the requirements and the core recommendation algorithm clarified, we design the overall framework and the database of a movie recommendation application and implement a recommendation application based on deep learning.
Keywords: Personalized Recommendation; Data Mining; Collaborative Filtering; Deep Learning; Convolutional Neural Network
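The abstract describes K-Item RBM as a weighted fusion of the RBM and ItemCF recommenders, judged by the error between predicted and actual data. The sketch below is only one plausible reading of that idea and is not the thesis's implementation. The function names, the single weight `alpha`, the fallback for items scored by only one recommender, and the use of RMSE as the error measure are all assumptions made for illustration.

```python
import math


def fuse_predictions(rbm_scores, itemcf_scores, alpha=0.5):
    """Blend two {item_id: predicted_rating} dicts with one weight.

    alpha is a hypothetical weight on the RBM prediction; (1 - alpha)
    goes to the ItemCF prediction. Items scored by only one of the two
    recommenders keep that single score (an assumption of this sketch).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    fused = {}
    for item in rbm_scores.keys() | itemcf_scores.keys():
        if item in rbm_scores and item in itemcf_scores:
            fused[item] = (alpha * rbm_scores[item]
                           + (1.0 - alpha) * itemcf_scores[item])
        elif item in rbm_scores:
            fused[item] = rbm_scores[item]
        else:
            fused[item] = itemcf_scores[item]
    return fused


def rmse(predicted, actual):
    """Root-mean-square error over the items present in both dicts."""
    common = predicted.keys() & actual.keys()
    if not common:
        raise ValueError("no items in common")
    squared_error = sum((predicted[i] - actual[i]) ** 2 for i in common)
    return math.sqrt(squared_error / len(common))
```

In practice the weight would be chosen on held-out ratings, for example by trying several values of `alpha` and keeping the one with the lowest `rmse` against the known ratings.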
Contents
Abstract (Chinese)
Abstract
Contents
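Chapter 1 Introduction
1.1 Research Background and Significance
1.2 Research Status at Home and Abroad
1.3 Research Content
1.4 Thesis Structure and Organization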
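Chapter 2 Traditional Recommendation Algorithms
2.1 Intelligent Recommendation
2.2 Traditional Data Mining Recommendation Algorithms
2.2.1 User-Based Collaborative Filtering Recommendation Algorithm
2.2.2 Item-Based Collaborative Filtering Recommendation Algorithm
2.2.3 Model-Based Collaborative Filtering Recommendation Algorithm
2.3 Experimental Results
2.3.1 Dataset
2.3.2 Evaluation Metrics for Recommendation Algorithms
2.3.3 Experimental Results and Analysis
2.4 Chapter Summary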
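Chapter 3 Improved K-Item RBM Recommendation Algorithm
3.1 Restricted Boltzmann Machine
3.1.1 Boltzmann Machine
3.1.2 Restricted Boltzmann Machine
3.2 Algorithm Model
3.2.1 Selection of Model Training Features
3.2.2 Structure of the Restricted Boltzmann Machine Model
3.2.3 Improved K-Item RBM Recommendation Algorithm
3.3 Experimental Results
3.3.1 Data Preprocessing
3.3.2 Experimental Results
3.4 Comparative Analysis of Algorithms
3.5 Chapter Summary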
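Chapter 4 Improved CNN-CF Recommendation Algorithm
4.1 Neural Networks
4.1.1 Traditional Neural Networks
4.1.2 Convolutional Neural Network (CNN)
4.1.3 Convolutional Neural Network (CNN) and Text Classification
4.2 Algorithm Model
4.2.1 Feature Description of the Training Data
4.2.2 Structure of the Improved CNN-CF Algorithm Model
4.3 Experimental Results
4.4 Comparative Analysis of Algorithms
4.5 Chapter Summary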
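Chapter 5 Application of the Recommendation Algorithm
5.1 Requirements of the Movie Recommendation System
5.2 Design of the Movie Recommendation System
5.2.1 UI Design of the Movie Recommendation System
5.2.2 Database Design of the Movie Recommendation System
5.3 Implementation of the Movie Recommendation System
5.3.1 Introduction to the requests Module
5.3.2 Introduction to the lxml Module
5.3.3 Introduction to the django Framework
5.4 Results of the Movie Recommendation System
5.5 Chapter Summary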
Summary and Outlook
1. Summary of Main Work
2. Outlook for Future Work
References
Acknowledgments
























