Computer Science



Discovering E-commerce Data Sets Sequential Patterns for Recommendation

  • Fri, 09/14/2018 - 10:30am - 12:30pm

MSc Thesis Proposal by:

Raj Bhatta

Date:  Friday, September 14th, 2018
Time:  10:30 am – 12:30 pm
Location: 3105, Lambton Tower

Abstract: In e-commerce, recommendation system accuracy will be improved if more complex sequential patterns of user purchase behavior are learned and included in the user-item matrix, making it more informative before collaborative filtering is applied. Existing recommendation systems that have attempted to use mining and some sequence information include ChoiRec12, LiuRec07, and SuChenRec15. The LiuRec07 system clusters users with similar clickstream sequence data, then uses association rule mining and segmentation-based collaborative filtering to select the Top-N neighbours from the cluster to which a target user belongs. Using a binary (purchased/not purchased) analysis of the shopping basket, it derives the prediction score of items not yet purchased by the target user from the frequency count among the k neighbours. LiuRec07 does not learn users' sequential purchase behavior from historical data. The ChoiRec12 system derives implicit rating values from online transaction data, which it uses to improve the user-item rating matrix that is input to collaborative filtering. It also fails to learn detailed sequential customer behavior, including item utility measures such as frequency, price and profit gained. The SuChenRec15 system is based on clickstream sequence similarity, using the frequency of item purchases, the duration of time spent and the clickstream path, but it is also unable to integrate sequential patterns of customer purchases.
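The neighbour-frequency scoring described for the binary shopping-basket case can be illustrated with a minimal Python sketch. It is not the LiuRec07 implementation: the clustering and association rule steps are omitted, neighbours are chosen by Jaccard similarity of baskets (an assumption made here for illustration), and the function name and sample baskets are hypothetical.

```python
from collections import Counter


def neighbour_frequency_scores(baskets, target, k):
    """Score items the target has not bought by how many of the
    target's k most similar users bought them (binary baskets)."""
    target_items = baskets[target]

    def jaccard(a, b):
        # Assumed similarity measure; the abstract does not name one.
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    # Rank the other users by similarity to the target, ties broken by user id.
    others = sorted(
        (u for u in baskets if u != target),
        key=lambda u: (-jaccard(target_items, baskets[u]), u),
    )
    neighbours = others[:k]

    # Frequency count of each not-yet-purchased item among the k neighbours.
    counts = Counter()
    for u in neighbours:
        counts.update(baskets[u] - target_items)
    return dict(counts)


# Hypothetical purchased / not purchased baskets.
baskets = {
    "u1": {"item1", "item5"},
    "u2": {"item1", "item5", "item3"},
    "u3": {"item1", "item3"},
    "u4": {"item7"},
}
scores = neighbour_frequency_scores(baskets, "u1", k=2)
print(scores)
```

Because each basket is a plain set, the order in which items were bought plays no part in the score, which is the limitation the abstract points out.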

This thesis proposes two algorithms, SHOD (Sequence Historical Dataset) and SCID (Sequence Clickstream Dataset), which first pre-process large amounts of e-commerce historical and clickstream data, including the ACM RecSys datasets, into sequential databases. Sequential patterns of customer purchase behavior are then discovered from these databases and used to compute a more informative user-item matrix for collaborative filtering recommendation. The datasets will be organized into daily, weekly, monthly and yearly time sessions to track how users' purchase habits change over time. The thesis will extract, transform and load the data, cleaning, filtering, enriching and splitting it with data extraction and transformation modules, in which indicators representing each user’s record have to be added. Finally, a sequential pattern mining algorithm such as PLWAP or GSP will be run on the resulting sequential database to discover sequential patterns of customer purchases such as <user_i, (item5, item1), (item3)>. This pattern means that whenever user_i purchases item5 and item1 in one visit, they follow up with a purchase of item3 in the next visit.
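As a rough illustration of the pre-processing idea, the Python sketch below groups purchase records into daily sessions to form a small sequential database, then counts how many users' sequences contain the pattern <(item5, item1), (item3)>. It is not SHOD, SCID, PLWAP or GSP: the record layout, function names and sample data are assumptions, and containment is checked in the usual subsequence sense, so item3 may appear in any later session, not only the immediately following one.

```python
from collections import defaultdict
from datetime import date


def build_sequences(records):
    """Group (user, day, item) purchases into one time-ordered
    sequence of itemsets per user, one itemset per daily session."""
    sessions = defaultdict(lambda: defaultdict(set))
    for user, day, item in records:
        sessions[user][day].add(item)
    return {
        user: [frozenset(items) for _, items in sorted(by_day.items())]
        for user, by_day in sessions.items()
    }


def contains(sequence, pattern):
    """True if the pattern's itemsets occur in order, each as a
    subset of a distinct session of the sequence."""
    pos = 0
    for itemset in pattern:
        # Advance to the earliest session that includes this itemset.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def support(sequences, pattern):
    """Number of users whose sequence contains the pattern."""
    return sum(contains(seq, pattern) for seq in sequences.values())


# Hypothetical purchase records: (user, purchase date, item).
records = [
    ("u1", date(2018, 9, 1), "item5"),
    ("u1", date(2018, 9, 1), "item1"),
    ("u1", date(2018, 9, 2), "item3"),
    ("u2", date(2018, 9, 1), "item1"),
    ("u2", date(2018, 9, 3), "item3"),
    ("u3", date(2018, 9, 2), "item3"),
    ("u3", date(2018, 9, 4), "item5"),
    ("u3", date(2018, 9, 4), "item1"),
]
sequences = build_sequences(records)
pattern = [{"item5", "item1"}, {"item3"}]
print(support(sequences, pattern))
```

Weekly or monthly sessions could be obtained the same way by replacing the day key with, for example, the ISO week or the (year, month) pair of each purchase date.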
