This is the page of the lab section of the Data Mining course for A.Y. 2020-21.
If you are looking for the main course website, click here.
Colab notebooks:
PyTorch Basics and Feedforward Neural Networks: here
Text Classification with TorchText and LSTMs: here
You may also find a notebook on CNNs and Image Classification useful.
Getting started:
In this lab we will see how to use PyTorch. You don't need to install anything fancy since we will use Google Colab.
Download the Quora Questions dataset here. I suggest uploading it to your Google Drive (remember the location!) so that you can access it quickly from Colab.
The commented notebook from the lab can be found here.
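If you uploaded the dataset to Google Drive as suggested, the snippet below is a minimal sketch of how you might reach it from a Colab notebook. It assumes Colab's default mount point (/content/drive, with your files under MyDrive); the helper names and the file name quora_questions.csv are hypothetical placeholders, so adapt them to wherever you actually stored the file.

```python
import posixpath


def dataset_path(filename, mount_point="/content/drive"):
    """Build the path of a file stored at the top level of your Drive.

    Assumes the Drive is mounted at `mount_point` and that Colab exposes
    your files under the "MyDrive" folder.
    """
    return posixpath.join(mount_point, "MyDrive", filename)


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive; this only works inside a Colab runtime."""
    # Imported here because google.colab exists only on Colab.
    from google.colab import drive

    drive.mount(mount_point)
```

In a Colab cell you would call mount_drive() once (it asks for authorization), then pass dataset_path("quora_questions.csv") to whatever loader you use. If you placed the file in a subfolder, include it in the file name, e.g. "data-mining/quora_questions.csv".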
Getting started:
You should all have Spark installed on your system now, but if you don't:
Install on Windows: https://phoenixnap.com/kb/install-spark-on-windows-10
Install on Linux: https://computingforgeeks.com/how-to-install-apache-spark-on-ubuntu-debian
Install pyspark: pip install pyspark.
Install Jupyter Notebook (it'll be easier to follow the lab).
Download James Joyce's Ulysses here (we will use it for examples).
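To check that the installation works, you could try a small word count on the text you just downloaded. This is only a sketch, not necessarily the example used in the lab: it assumes the book was saved as a plain-text file (the name ulysses.txt is a placeholder), and the tokenize and top_words helpers are made up for illustration.

```python
import re


def tokenize(line):
    """Lowercase a line and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", line.lower())


def top_words(path, n=10):
    """Return the n most frequent words in the text file at `path`."""
    # Imported here so the helper above can be used without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")          # run locally, using all cores
        .appName("ulysses-wordcount")
        .getOrCreate()
    )
    try:
        lines = spark.sparkContext.textFile(path)
        counts = (
            lines.flatMap(tokenize)              # one record per word
                 .map(lambda word: (word, 1))    # pair each word with a 1
                 .reduceByKey(lambda a, b: a + b)  # sum the 1s per word
        )
        # Sort by descending count and keep the first n pairs.
        return counts.takeOrdered(n, key=lambda pair: -pair[1])
    finally:
        spark.stop()
```

Calling top_words("ulysses.txt") from a Jupyter cell should return a list of (word, count) pairs. If it raises an error about Java or SPARK_HOME, the Spark installation is probably incomplete.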
Friday 15-16:30 (email me first).
Due to the current pandemic situation, office hours are suspended.
Should you need anything, just send me an email and we'll arrange an online meeting.