Before compiling and running the example to be profiled with TAU, the TAU environment variables TAUROOTDIR, TAULIBDIR, and TAU_MAKEFILE need to be set appropriately.
TAUROOTDIR and TAULIBDIR might already be set on a system where TAU has been installed (or, if the system uses 'modules', after loading the TAU module). The TAU_MAKEFILE variable, however, must be set by the user, and its value depends on which measurements are of interest. TAU_MAKEFILE should point to one of the makefiles in the TAU installation. The command 'ls $TAULIBDIR/Makefile.tau*' should print a list of all available makefiles to choose from.
In these examples, we are going to use the makefile set up for automatic instrumentation of MPI codes. To automatically insert annotations in the source code, TAU uses the Program Database Toolkit (PDT). Assuming TAULIBDIR is set correctly, we can use this variable to set the TAU_MAKEFILE variable to $TAULIBDIR/Makefile.tau-mpi-pdt, which matches the 'Makefile.tau*' naming listed by the command above.
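The makefile choice described above can be scripted. The following is a minimal sketch, not part of TAU itself: the helper name select_tau_makefile is hypothetical, and it assumes that makefile names are built from dash-separated option tags (as in Makefile.tau-mpi-pdt) and that TAULIBDIR points at the directory containing them.

```shell
# Hypothetical helper (not shipped with TAU): print the first makefile in a
# directory whose name contains every requested option tag (e.g. mpi, pdt).
select_tau_makefile() {
    stm_dir=$1
    shift
    for stm_mk in "$stm_dir"/Makefile.tau*; do
        # Skip the unexpanded pattern when no makefile exists.
        [ -f "$stm_mk" ] || continue
        # Append a trailing dash so every tag is delimited by dashes.
        stm_name="$(basename "$stm_mk")-"
        stm_ok=1
        for stm_tag in "$@"; do
            case "$stm_name" in
                *-"$stm_tag"-*) ;;
                *) stm_ok=0 ;;
            esac
        done
        if [ "$stm_ok" -eq 1 ]; then
            echo "$stm_mk"
            return 0
        fi
    done
    return 1
}

# Possible usage, assuming TAULIBDIR is already set:
#   export TAU_MAKEFILE=$(select_tau_makefile "$TAULIBDIR" mpi pdt)
```

If several makefiles match, the sketch simply takes the first one in glob order, so it is worth checking the result with 'echo $TAU_MAKEFILE' before compiling.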
The next step is to compile the target source code using TAU. TAU provides several shell scripts that act as the compiler. The most commonly used compiler scripts are tau_cc.sh (C), tau_cxx.sh (C++), and tau_f90.sh (Fortran).
For example, 'tau_f90.sh matmult.f90' should compile the example. After compiling the code, run the resulting executable. This should produce TAU profiling data in files named profile.x.x.x, with one profile data file per MPI process. If run with 4 MPI processes, the output might be the following 4 files: profile.0.0.0, profile.1.0.0, profile.2.0.0, and profile.3.0.0.
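A quick sanity check after the run is to confirm that every MPI process wrote its profile. The sketch below uses a hypothetical helper, check_tau_profiles, and assumes single-threaded processes whose files are named profile.<rank>.0.0; a threaded or multi-context run would need a different pattern.

```shell
# Hypothetical helper: verify that one profile file exists per MPI rank.
# Assumes files are named profile.<rank>.0.0 (context 0, thread 0).
check_tau_profiles() {
    ctp_dir=$1
    ctp_expected=$2
    ctp_missing=0
    ctp_rank=0
    while [ "$ctp_rank" -lt "$ctp_expected" ]; do
        if [ ! -f "$ctp_dir/profile.$ctp_rank.0.0" ]; then
            echo "missing: profile.$ctp_rank.0.0"
            ctp_missing=$((ctp_missing + 1))
        fi
        ctp_rank=$((ctp_rank + 1))
    done
    # Succeed only if no rank is missing its profile.
    [ "$ctp_missing" -eq 0 ]
}

# Possible usage after a 4-process run in the current directory:
#   check_tau_profiles . 4 && echo "all profiles present"
```

A missing file usually means a process exited before TAU could write its data, which is worth ruling out before interpreting any timings.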
A quick way to see the results is to use TAU's pprof command, a text-based tool that works similarly to GNU's gprof. In the same directory as the profile.x.x.x files, run 'pprof'. This should print profiling tables for each MPI process as well as a summary. The tables show values such as the time spent in each routine (inclusive and exclusive), the number of times the routine was called, and the percentage of time spent in each routine. A function summary table at the end aggregates the data from all MPI processes.
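Because pprof prints one table per process, it can be convenient to save the report to a file for later reading. The following is a minimal sketch with a hypothetical helper name, save_pprof_report; it assumes the 'pprof' found on PATH is TAU's (other tools share the name) and that running it with no arguments in the profile directory is sufficient, as described above.

```shell
# Hypothetical helper: run pprof inside a profile directory and save the
# text report to a file. Assumes the pprof on PATH is the one from TAU.
save_pprof_report() {
    spr_dir=$1
    spr_out=$2
    if ! command -v pprof >/dev/null 2>&1; then
        echo "pprof not found on PATH; is the TAU module loaded?" >&2
        return 1
    fi
    # pprof is run in the directory that holds the profile.* files.
    if ! ls "$spr_dir"/profile.* >/dev/null 2>&1; then
        echo "no profile.* files in $spr_dir" >&2
        return 1
    fi
    # The redirection is opened in the caller's directory, so a relative
    # output path is resolved before the subshell changes directory.
    ( cd "$spr_dir" && pprof ) > "$spr_out"
}

# Possible usage:
#   save_pprof_report . pprof_report.txt
```

The saved file can then be searched for the function summary table with any text tool.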
Another TAU tool is paraprof, which has a graphical user interface. To begin, run 'paraprof' in the same directory as the profile data. Note that paraprof can be run on a different platform from the one used to generate the data. An example of the initial window is shown below:

To show time spent in the routines, double-click on "TIME", which should present another window similar to:

To show the legend, bring the focus to the TIME metric window, then select Window->"Function Legend".
