Your first GPU job

From ALICE Documentation

Revision as of 14:38, 13 October 2020 by Schulzrf

About this walkthrough

This walkthrough will guide you through running a job on one of ALICE's GPU nodes. It uses TensorFlow and Keras to train a model on an example dataset using one GPU. You can find the full tutorial here: Tutorial

What you will learn

  • Setting up the batch script for a job using GPUs
  • Setting up a basic TensorFlow+Keras job
  • Moving data to and from local node scratch
  • Loading the necessary modules
  • Submitting your job
  • Monitoring your job
  • Collecting information about your job

What this example will not cover

  • Introducing TensorFlow, Keras or machine learning in general
  • Installing your own or special Python modules
  • Using multiple GPUs
  • Compiling code for GPU

What you should know before starting

  • Basic Python. This walkthrough is not intended as a tutorial on Python. If you are completely new to Python, we recommend that you go through a generic Python tutorial first. There are many great ones out there.
  • A basic understanding of machine learning is recommended. However, this example is a kind of "Hello World" programme for TensorFlow, so you do not need prior knowledge of TensorFlow.
  • Basic knowledge of how to use a Linux OS from the command line.
  • How to connect to ALICE.
  • How to move files to and from ALICE.
  • How to set up a simple batch job, as shown in: Your first bash job