Programming environment

This guide is for setting up your own machine with the software needed to work through the weekly and mandatory exercises in IN4080 Natural Language Processing. For most of our work, we will use Python, extended with various packages.

We recommend that you install the environment on your own computer. This setup is covered in the section Setup with Anaconda below. You can also work on the IFI machines. Details about this option are available at the end of this document.

Setup with Anaconda

Download and install either Anaconda or Miniconda

For the differences between the two, see here.

Follow the online guides to install Anaconda or Miniconda. Once you have a working shell with conda installed, you can continue here.

You should now have access to the conda command. For a quick test, run the conda command without any arguments.

conda
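If the command is not recognized, it can help to check whether conda is on your PATH at all. The following is a minimal sketch using only the Python standard library; the function name conda_version is made up for this example.

```python
import shutil
import subprocess


def conda_version():
    """Return the output of `conda --version`, or None if conda is not on PATH."""
    conda_path = shutil.which("conda")
    if conda_path is None:
        return None
    result = subprocess.run(
        [conda_path, "--version"], capture_output=True, text=True, check=False
    )
    # Fall back to stderr in case the version string is printed there.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = conda_version()
    print(version if version else "conda was not found on PATH")
```

If this reports that conda was not found, open a new shell (or the Anaconda prompt on Windows) and try again before reinstalling.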

We recommend that you download the conda cheat sheet and keep it at hand.

Create environment

Make sure that you are in the directory with the environment.yml file that you can download here. The following command will create an environment named in4080_2023 with all the Python modules that we'll use in the course.

conda env create

If you want to change the name of the environment, you can edit the name in the environment.yml file before executing the command. Be aware that it may take some time to create the environment.
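To see which name the environment will get before you create it, you can read the top-level name entry from environment.yml. This is only a sketch: it assumes the name is written on a single unindented line of the form "name: ...", it does not use a YAML parser, and the helper environment_name is hypothetical.

```python
from pathlib import Path


def environment_name(yml_path="environment.yml"):
    """Return the value of the top-level `name:` key, or None if it is absent."""
    path = Path(yml_path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; run this from the directory that contains it"
        )
    for line in path.read_text(encoding="utf-8").splitlines():
        # Only unindented lines are top-level keys in the file.
        if line.startswith("name:"):
            value = line.split(":", 1)[1]
            # Drop a trailing comment and any surrounding quotes or spaces.
            return value.split("#", 1)[0].strip().strip("'\"") or None
    return None
```

Calling environment_name() in the directory where you saved the file should return the name that conda env create will use. A FileNotFoundError means you are in the wrong directory.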

Activate the environment

To activate the environment, run:

conda activate in4080_2023

If you want to get back to your standard Python and packages:

conda deactivate
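To confirm from inside Python which environment is in use, you can inspect the interpreter and the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated. A minimal sketch (the function name active_conda_env is made up for this example):

```python
import os
import sys


def active_conda_env(environ=os.environ):
    """Return the name conda reports for the active environment, or None."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    print("Active conda environment:", active_conda_env())
    # sys.prefix is the root directory of the running Python installation.
    print("Interpreter prefix:", sys.prefix)
    print("Interpreter path:", sys.executable)
```

With the course environment active, the first line should show the environment name, and the interpreter path should point inside that environment's directory.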

On Windows, you can choose the environment from the Start menu. There is also an option to open Jupyter Notebooks directly in your browser. This is the recommended way of working in this course.

Working on IFI machines

Please contact the course teachers to get access to the machines. The information below is from previous years and may be outdated.

IFI machines

On an IFI terminal, the default python is now

/opt/ifi/anaconda3/bin/python3

This means that when you start a Python or Jupyter Notebook session, you get access to all the packages in that Anaconda installation.

In the course, we will use several corpora and other data provided with NLTK. To get easy access to them on the IFI machines, you can put

export NLTK_DATA=/projects/nlp/nltk_data

in your .bashrc file. Then you don't have to install the data to your own disk area.
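After opening a new shell, you can check from Python that the variable is set and that the directory is reachable. The sketch below uses only the standard library; nltk_data_dirs is a hypothetical helper. It assumes, as NLTK does, that NLTK_DATA may hold several directories separated by the platform's path separator.

```python
import os


def nltk_data_dirs(environ=os.environ):
    """Return the directories listed in NLTK_DATA, in order, or an empty list."""
    raw = environ.get("NLTK_DATA", "")
    # Several directories may be given, separated by os.pathsep (":" on Linux).
    return [d for d in raw.split(os.pathsep) if d]


if __name__ == "__main__":
    for directory in nltk_data_dirs():
        status = "exists" if os.path.isdir(directory) else "missing"
        print(f"{directory}: {status}")
```

In a session where nltk is installed, the list nltk.data.path shows the full search path that NLTK actually uses, including the directories taken from NLTK_DATA.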

Remote login

You may log in remotely to the IFI machines when working from home. The recommended solution is now to use VDI, see here.

It will (eventually) give you access to the same environment as when you log in directly to an IFI Linux machine. However, you will not have access to /projects/nlp/, so you will have to download data from within NLTK when needed.
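Downloading from within NLTK can be sketched as follows. The helper ensure_nltk_resource is hypothetical, and the Brown corpus in the commented usage line is only an illustration; use whichever resource an exercise asks for. The sketch relies on nltk.data.find, which raises LookupError when a resource is missing, and on nltk.download.

```python
def ensure_nltk_resource(resource_path, package_name):
    """Download an NLTK package only if it is not already on the search path.

    Returns True if the resource is available afterwards, False otherwise
    (including when nltk itself is not installed).
    """
    try:
        import nltk
    except ImportError:
        return False
    try:
        nltk.data.find(resource_path)
        return True
    except LookupError:
        # nltk.download returns True on success and False on failure.
        return bool(nltk.download(package_name, quiet=True))


# Illustrative usage only; the corpus name is an assumption, not a course requirement.
# ensure_nltk_resource("corpora/brown", "brown")
```

By default, NLTK stores downloaded data in your own disk area, so keep an eye on your quota if you download many corpora.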