This repo contains the code for the HKUST COMP4332 projects, which use data from the Yelp Challenge. The model for each project is provided as `model.py` or `model.ipynb` in that project's folder, along with the training and validation data used. The `report.pdf` in each project folder discusses the model and the data features in more detail, and explains the implementation and the final hyperparameters.
Project 1: Sentiment Analysis. Predicts the rating from the review provided by the user, mainly using the review text. The final model uses a Bidirectional GRU with Time-Distributed layers and achieves 70.25% validation accuracy.
Project 2: Link Prediction using DeepWalk. Predicts the presence of a relationship between vertices using a DFS-like approach. The final model is evaluated with the AUC score and achieves 95.87%.
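
As a rough illustration of the two ideas named in Project 2, the sketch below generates truncated random walks over a toy graph (the kind of node sequences DeepWalk feeds to a skip-gram model) and computes AUC from link scores. It is a minimal sketch, not the code in this repo: the function names `random_walks` and `auc_score`, the toy graph, and the parameter values are all assumptions made for illustration.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate truncated random walks starting from every node.

    `adjacency` maps each node to a list of its neighbors (hypothetical format).
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def auc_score(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy undirected graph, purely illustrative.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = random_walks(graph, walk_length=5, walks_per_node=2)
```

In a full pipeline the walks would be passed to a word2vec-style model to learn node embeddings, and the score for a candidate edge would be derived from the two endpoint embeddings. The pairwise AUC above is quadratic in the number of samples, so it is only suitable for small validation sets.
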
Project 3: Recommendation Prediction. Based on a Wide and Deep Learning implementation with some feature engineering. RMSE is used as the metric, and the final model achieves a value of 1.0293.
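
For reference, RMSE is the square root of the mean squared difference between predicted and true ratings. The helper below is a minimal sketch of that formula; the name `rmse` and the sample ratings are assumptions for illustration and are not taken from this repo.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of ratings."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative ratings only: squared errors are 1, 0 and 4.
example = rmse([5, 3, 4], [4, 3, 2])
```

Lower is better: an RMSE of 0 means every predicted rating matches the true rating exactly.
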
Most of the model training was done on Google Colab because of its TPU support. However, because grid search takes significantly longer to run and Google Colab limits session runtime, the Intel AI Cluster was used for grid search instead.
The easiest way to run the code locally is to create a Conda environment for Python 3.6 and install the required libraries in that environment:
- keras
- nltk
- tensorflow
- scikit-learn (imported as `sklearn`)
- numpy
- pandas
- tqdm
- node2vec
- networkx
- gensim
- and other basic libraries
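
Once the environment is set up, a quick way to confirm the libraries above are installed is to check whether each one can be located by the interpreter. The script below is a minimal sketch using only the standard library; the `REQUIRED_MODULES` list mirrors the list above using import names, and the function name `missing_modules` is an assumption for illustration.

```python
import importlib.util

# Import names for the libraries listed above (scikit-learn is imported as sklearn).
REQUIRED_MODULES = [
    "keras",
    "nltk",
    "tensorflow",
    "sklearn",
    "numpy",
    "pandas",
    "tqdm",
    "node2vec",
    "networkx",
    "gensim",
]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries were found.")
```

Run it with the Conda environment activated. It only checks that each module can be found, not that the installed versions are compatible with each other.
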