$Id: README,v 1.41 2009/11/10 16:16:53 ruiliu Exp $

This directory contains various small example programs to help you get a feel
for using PerfSuite.  These are not built during the installation process, but
have Makefiles that you can tailor for your own environment.

If you create any interesting examples or test cases and would like to share
them with others, please let us know!

The current contents are:

cpi
   Serial, POSIX threads, OpenMP, MPI, and Java variants of a program that
   approximates the value of pi.  Only serial and pthreads are built by
   default, but the Makefile supports additional targets "ps" (PerfSuite),
   "omp" (OpenMP), "psomp" (OpenMP + PerfSuite), "mpi" (MPI)
   and "java" (Java + PerfSuite hwpc package support).
   "make all" will attempt to build them all.
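
   As a rough illustration of the kind of computation involved, the serial
   case can be sketched as a midpoint-rule integration of 4/(1+x*x) over
   [0,1].  This is an assumption made for illustration only: the function
   name approx_pi is hypothetical, and the cpi sources may use a different
   method.

```c
#include <assert.h>

/*
 * Approximate pi by integrating 4/(1+x^2) over [0,1] with the midpoint
 * rule, using n equal-width intervals.  (Hypothetical helper, not taken
 * from the cpi sources.)
 */
double approx_pi(long n)
{
    double h, sum = 0.0;
    long i;

    assert(n > 0);
    h = 1.0 / (double)n;                 /* width of each interval */
    for (i = 0; i < n; i++) {
        double x = h * ((double)i + 0.5); /* midpoint of interval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}
```

   Calling approx_pi(1000000) from your own main() gives a value close to
   3.14159; larger n trades run time for accuracy.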

hl
   A test case adapted from the PAPI distribution to demonstrate how to
   selectively monitor portions of source code.  This example assumes PAPI
   support; if PAPI is not available, you should still be able to build it,
   but you will have to hand-edit the Makefile.

matvec
   Matrix-vector multiplication in Fortran 77.  Try interchanging the loops in
   the sgemv subroutine and note the effect on performance.  A second version
   of this program is included that contains calls to the libpshwpc library.
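
   The loop-interchange experiment can be sketched in C as shown here.  This
   is not the sgemv routine from this directory: the names matvec_ij and
   matvec_ji are hypothetical.  C stores arrays in row-major order and
   Fortran in column-major order, so the faster loop order is expected to be
   the opposite of the Fortran case.

```c
#include <assert.h>
#include <stddef.h>

/*
 * y = A*x for an m-by-n matrix stored row-major in a[].
 * Inner loop walks along a row of A (stride 1 in C).
 */
void matvec_ij(size_t m, size_t n, const float *a, const float *x, float *y)
{
    size_t i, j;

    assert(a != NULL && x != NULL && y != NULL);
    for (i = 0; i < m; i++) {
        float sum = 0.0f;
        for (j = 0; j < n; j++)
            sum += a[i * n + j] * x[j];
        y[i] = sum;
    }
}

/*
 * Same product with the loops interchanged.
 * Inner loop walks down a column of A (stride n in C).
 */
void matvec_ji(size_t m, size_t n, const float *a, const float *x, float *y)
{
    size_t i, j;

    assert(a != NULL && x != NULL && y != NULL);
    for (i = 0; i < m; i++)
        y[i] = 0.0f;
    for (j = 0; j < n; j++)
        for (i = 0; i < m; i++)
            y[i] += a[i * n + j] * x[j];
}
```

   Both routines compute the same product; timing them on a large matrix, or
   comparing cache-miss counts, shows how much the access pattern matters.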

misc
   Miscellaneous small examples.  Currently contains:
   
   "geneventlist": a Bourne shell script that filters the output of psinv to
   produce an XML document that contains all available PAPI events.  This
   output could be used as a starting point for making a custom configuration
   file for psrun or libpshwpc (note that psinv will do the same thing if given
   the option "--papi=xml").
   
   "getsolibs": a Bourne shell script to help determine what shared libraries
   are used by a dynamically linked program.

   "misalign.c": a C program that attempts to force misaligned data accesses.
   You may need to tailor this for your computer to obtain observable results.
   Originally written for Itanium.

   "mpicheckargs.c": a C program that can be used to check if the MPI
   implementation you are using preserves command names and arguments, which is
   necessary to be able to use psrun with MPI programs.

   "profile_rate.c": a C program that suggests a threshold to use for profiling
   with the processor cycle counter to approximate the same sampling rate that
   Linux profil()/gprof would use.

   "psrun-batch": a shell script that executes a command with multiple psrun
   configuration files.

   "psrunfull.pl": a Perl script contributed by Kalev Leetaru that executes
   a given command using all available PAPI events on the system.  The script
   splits the events into chunks of a user-specified size (up to a maximum of
   32, the limit for PAPI multiplexing) and runs the program as many times as
   necessary until all events have been used.
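
   For "profile_rate.c" above, the underlying arithmetic can be sketched as
   dividing the processor clock rate by the desired number of samples per
   second.  The function name cycle_threshold is hypothetical, and the figure
   of 100 samples per second used in the note below is an assumption about a
   typical profil() rate; the actual program may derive both values
   differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of processor cycles between samples needed to obtain roughly
 * samples_per_sec samples each second on a clock running at clock_hz.
 * (Hypothetical helper, not taken from profile_rate.c.)
 */
uint64_t cycle_threshold(uint64_t clock_hz, uint64_t samples_per_sec)
{
    assert(samples_per_sec > 0);
    /* Integer division, rounded to the nearest cycle. */
    return (clock_hz + samples_per_sec / 2) / samples_per_sec;
}
```

   For example, a 2.4 GHz clock sampled 100 times per second gives
   cycle_threshold(2400000000, 100) == 24000000 cycles between samples.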

psrun-raw
   A shell script that filters the output of psrun to display only the raw
   event counts collected.  Requires PAPI support.  Note that you can achieve
   the same effect by using psrun's text-output mode ("-F text -o stdout") but
   this example still shows basic filtering of the default XML output with the
   shell.

pstcl
   Sample Tcl scripts that exercise a few PerfSuite Tcl features:

   "hwpcfilter.tcl": a filter program that allows you to examine a selected
   portion of a source file that has been profiled.  For example, you might
   want to see what the cache miss ratio or megaflop rate is for a key loop
   within your program.  You can supply arbitrary source line ranges to this
   script, as it is not restricted simply to loops.  If no range is specified,
   the program effectively converts a profiling report to a counting report
   based on the samples collected and the sampling rate for each event.
   Instructions for use are contained in the source code.

   "hwpcpatch.tcl": an example that "patches" a counting hwpcreport, allowing
   you to replace one or more event counts within it with alternate values
   that you specify.

   "maxwall.tcl": an example that adjusts the <wallclock> elements of each
   report contained within a <multihwpcreport> to the maximum among them.
   This can provide an alternate view of wallclock-based rate metrics for
   parallel programs.

   "multihwpc-extract.tcl": an example that extracts individual reports from a
   PerfSuite <multi>-document.  Note that psprocess has an option "--extract"
   that will do the same thing.  This script is included as a short
   demonstration of using tDOM with PerfSuite XML documents.

   "multihwpc-merge.tcl": an example that merges the individual event counts
   from a <multihwpcreport> (counting mode) into a single <hwpcreport>, as if
   generated from a single run.  This mimics multiplexing with an arbitrary
   number of events.

   "papiavail.tcl": a pure Tcl program that exercises the PerfSuite Tcl
   extension for PAPI.  It shows machine information and available performance
   events if you have PAPI installed on your system.  It's a basic alternative
   to PAPI's "avail" test case/utility.

   "runall.tcl": an example that runs a command using psrun multiple times,
   using every performance event available on your system.

psxml
   Sample Java programs to demonstrate PerfSuite Java XML parsing API features.

   PS_SimpleReportTest class:
       takes an XML file name as input, parses it,
       and prints out the parsed result in text format.

   PS_ReportTest class:
       takes an XML file name,
       a flag (true or false) indicating whether DTD validation is to be done,
       and a flag (print or noprint) indicating whether parsed output is
       to be printed out. The output also shows the amount of time and memory
       used to parse the given XML file and print out the parsed result.

   API_Demo class:
       takes an XML file name, and demonstrates the use of
       most methods in the Java XML parsing API.

   Usage:
       java -classpath <path_to_perfsuite.jar>:. PS_SimpleReportTest <xml_file_name>
           to parse the XML file and print the parsed result using the
           toString() method.

       java -classpath <path_to_perfsuite.jar>:. PS_ReportTest <true | false> <print | noprint> <xml_file_name>
           to parse and optionally print the XML file using the toString() method.

       java -classpath <path_to_perfsuite.jar>:. API_Demo <xml_file_name>
           to parse the XML file and print the parsed result using the "getter"
           methods. This demonstrates the usage of the "getter" APIs to
           programmatically access the fields after the XML is parsed.

   Example:
       java -classpath ../../javalib/perfsuite.jar:. PS_SimpleReportTest multi.xml
       java -classpath /usr/local/share/perfsuite/javalib/perfsuite.jar:. PS_ReportTest false print ls.27847.pureland.xml
       java -classpath ../../javalib/perfsuite.jar:. API_Demo multi.xml

metrics-java-api
   An example Java program to demonstrate PerfSuite Java metric calculation.

   Main class:
       takes the name of a PerfSuite hwpcreport in count mode and a metric
       definition file name as input, calculates the metric values, and
       displays them using localized description strings.
   Example:
       java -Duser.language=es -Duser.region=ES -classpath ../../javalib/perfsuite.jar:../../javalib/resources:. Main <a_counting_report_using_PAPI_event.xml> ../../xml/pshwpc/PAPI_metrics.xml

sampler
   An example program that uses libpshwpc in conjunction with interval timers
   to sample the value of performance counters at fixed intervals to provide
   real-time monitoring of performance data.
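
   The general technique can be sketched with setitimer() and a SIGALRM
   handler.  This is a minimal sketch under stated assumptions, not the
   sampler source: read_counter() is a hypothetical stand-in that returns a
   fake, steadily increasing value where a real program would read its
   counters through libpshwpc, and sample_n is likewise a made-up name.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Set by the signal handler each time the interval timer fires. */
static volatile sig_atomic_t tick_pending = 0;

static void on_tick(int signo)
{
    (void)signo;
    tick_pending = 1;
}

/* Hypothetical stand-in for reading a hardware performance counter. */
static unsigned long long read_counter(void)
{
    static unsigned long long fake = 0;
    fake += 1000;
    return fake;
}

/*
 * Take nsamples readings, one per timer tick, storing them in out[].
 * Returns the number of samples taken, or -1 if the timer or handler
 * could not be installed.
 */
int sample_n(long interval_usec, int nsamples, unsigned long long *out)
{
    struct sigaction sa, old_sa;
    struct itimerval timer, off;
    int taken = 0;

    assert(interval_usec > 0 && interval_usec < 1000000);
    assert(nsamples > 0 && out != NULL);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
        return -1;

    /* Periodic timer: first expiry and every later one after interval_usec. */
    memset(&timer, 0, sizeof timer);
    timer.it_interval.tv_usec = interval_usec;
    timer.it_value.tv_usec = interval_usec;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0) {
        sigaction(SIGALRM, &old_sa, NULL);
        return -1;
    }

    while (taken < nsamples) {
        pause();                     /* sleep until a signal arrives */
        if (tick_pending) {
            tick_pending = 0;
            out[taken++] = read_counter();
        }
    }

    /* Disarm the timer and restore the previous handler. */
    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    return taken;
}
```

   A real monitor would do useful work between ticks instead of calling
   pause(), and would report the difference between successive readings.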

threadrun
   A POSIX threads program in which individual threads execute different source
   code.  Useful for verifying thread-aware profiling.
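
   The idea can be sketched as two threads that each run a different
   function.  This is an illustrative sketch rather than the threadrun
   source; the names sum_ints, sum_squares, and run_two_threads are
   hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* First thread: replace *arg (holding n) with 1 + 2 + ... + n. */
static void *sum_ints(void *arg)
{
    long *p = arg;
    long i, total = 0;

    for (i = 1; i <= *p; i++)
        total += i;
    *p = total;
    return NULL;
}

/* Second thread: replace *arg (holding n) with 1*1 + 2*2 + ... + n*n. */
static void *sum_squares(void *arg)
{
    long *p = arg;
    long i, total = 0;

    for (i = 1; i <= *p; i++)
        total += i * i;
    *p = total;
    return NULL;
}

/*
 * Run the two functions concurrently, each in its own thread.
 * Returns 0 on success, -1 if a thread could not be created.
 */
int run_two_threads(long n, long *sum, long *sum_sq)
{
    pthread_t t1, t2;

    assert(n >= 0 && sum != NULL && sum_sq != NULL);
    *sum = n;       /* each thread reads n from, and writes to, its own slot */
    *sum_sq = n;

    if (pthread_create(&t1, NULL, sum_ints, sum) != 0)
        return -1;
    if (pthread_create(&t2, NULL, sum_squares, sum_sq) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

   Build with your compiler's POSIX threads option (for example "-pthread"
   with gcc).  A thread-aware profile of such a program should attribute
   samples in sum_ints to one thread and samples in sum_squares to the other.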
