StarPU Handbook
Performance Model

Data Structures

struct  starpu_perfmodel
struct  starpu_perfmodel_regression_model
struct  starpu_perfmodel_per_arch
struct  starpu_perfmodel_history_list
struct  starpu_perfmodel_history_entry

Enumerations

enum  starpu_perfmodel_archtype { STARPU_CPU_DEFAULT, STARPU_CUDA_DEFAULT, STARPU_OPENCL_DEFAULT }
enum  starpu_perfmodel_type {
  STARPU_PER_ARCH, STARPU_COMMON, STARPU_HISTORY_BASED, STARPU_REGRESSION_BASED,
  STARPU_NL_REGRESSION_BASED
}

Functions

void starpu_perfmodel_free_sampling_directories (void)
int starpu_perfmodel_load_symbol (const char *symbol, struct starpu_perfmodel *model)
int starpu_perfmodel_unload_model (struct starpu_perfmodel *model)
void starpu_perfmodel_debugfilepath (struct starpu_perfmodel *model, enum starpu_perfmodel_archtype arch, char *path, size_t maxlen, unsigned nimpl)
void starpu_perfmodel_get_arch_name (enum starpu_perfmodel_archtype arch, char *archname, size_t maxlen, unsigned nimpl)
enum starpu_perfmodel_archtype starpu_worker_get_perf_archtype (int workerid)
int starpu_perfmodel_list (FILE *output)
void starpu_perfmodel_directory (FILE *output)
void starpu_perfmodel_print (struct starpu_perfmodel *model, enum starpu_perfmodel_archtype arch, unsigned nimpl, char *parameter, uint32_t *footprint, FILE *output)
int starpu_perfmodel_print_all (struct starpu_perfmodel *model, char *arch, char *parameter, uint32_t *footprint, FILE *output)
void starpu_bus_print_bandwidth (FILE *f)
void starpu_bus_print_affinity (FILE *f)
void starpu_perfmodel_update_history (struct starpu_perfmodel *model, struct starpu_task *task, enum starpu_perfmodel_archtype arch, unsigned cpuid, unsigned nimpl, double measured)
double starpu_transfer_bandwidth (unsigned src_node, unsigned dst_node)
double starpu_transfer_latency (unsigned src_node, unsigned dst_node)
double starpu_transfer_predict (unsigned src_node, unsigned dst_node, size_t size)

Detailed Description


Data Structure Documentation

struct starpu_perfmodel

Contains all information about a performance model. At least the type and symbol fields have to be filled when defining a performance model for a codelet. For compatibility, make sure to initialize the whole structure to zero, either with an explicit memset or by letting the compiler do it implicitly, e.g. for objects with static storage duration. Any other field that is not provided has to be zero.
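
The zero-initialization rule above can be sketched as follows. To keep the block compilable without the StarPU headers, it uses a simplified stand-in structure (perfmodel_sketch, SKETCH_* and "my_kernel" are hypothetical names introduced only for this illustration). In a real application the same two patterns would presumably be applied to struct starpu_perfmodel from starpu.h.

```c
#include <string.h>

/* Hypothetical stand-ins for the StarPU types, reduced to a few fields.
 * With StarPU installed, include <starpu.h> and use struct starpu_perfmodel. */
enum perfmodel_type_sketch { SKETCH_PER_ARCH, SKETCH_COMMON, SKETCH_HISTORY_BASED };

struct perfmodel_sketch {
    enum perfmodel_type_sketch type;
    const char *symbol;
    unsigned is_loaded;
    unsigned benchmarking;
};

/* Option 1: file-scope object (static storage duration). Members that are
 * not named in the initializer are zero-initialized by the compiler. */
struct perfmodel_sketch global_model = {
    .type = SKETCH_HISTORY_BASED,
    .symbol = "my_kernel" /* hypothetical symbol name */
};

/* Option 2: explicit memset, then fill the two mandatory fields. */
void init_model(struct perfmodel_sketch *model)
{
    memset(model, 0, sizeof(*model));
    model->type = SKETCH_HISTORY_BASED;
    model->symbol = "my_kernel";
}
```

Either way, every field other than type and symbol starts out as zero, which is what the description above asks for.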

Data Fields

enum starpu_perfmodel_type type
double(* cost_model )(struct starpu_data_descr *)
double(* cost_function )(struct starpu_task *, unsigned nimpl)
size_t(* size_base )(struct starpu_task *, unsigned nimpl)
struct starpu_perfmodel_per_arch per_arch [STARPU_NARCH_VARIATIONS][STARPU_MAXIMPLEMENTATIONS]
const char * symbol
unsigned is_loaded
unsigned benchmarking
starpu_pthread_rwlock_t model_rwlock

Field Documentation

starpu_perfmodel::type

The type of performance model.

starpu_perfmodel::cost_model
Deprecated:
This field is deprecated. Use the field starpu_perfmodel::cost_function instead.

starpu_perfmodel::cost_function

Used by STARPU_COMMON: takes a task and an implementation number, and must return an estimate of the task duration in microseconds.

starpu_perfmodel::size_base

Used by STARPU_HISTORY_BASED, STARPU_REGRESSION_BASED and STARPU_NL_REGRESSION_BASED. If not NULL, takes a task and implementation number, and returns the size to be used as index for history and regression.

starpu_perfmodel::per_arch

Used by STARPU_PER_ARCH: array of structures starpu_perfmodel_per_arch.

starpu_perfmodel::symbol

The symbol name for the performance model, which will be used as the file name to store the model. It must be set, otherwise the model will be ignored.

starpu_perfmodel::is_loaded

Whether the performance model is already loaded from the disk.

starpu_perfmodel::benchmarking

Whether the performance model is still being calibrated.

starpu_perfmodel::model_rwlock

Lock to protect concurrency between loading from disk (W), updating the values (W), and making a performance estimation (R).

struct starpu_perfmodel_regression_model

...

Data Fields

double          sumlny      sum of ln(measured)
double          sumlnx      sum of ln(size)
double          sumlnx2     sum of ln(size)^2
unsigned long   minx        minimum size
unsigned long   maxx        maximum size
double          sumlnxlny   sum of ln(size)*ln(measured)
double          alpha       estimated = alpha * size ^ beta
double          beta        estimated = alpha * size ^ beta
unsigned        valid       whether the linear regression model is valid (i.e. enough measurements)
double          a           estimated = a * size ^ b + c
double          b           estimated = a * size ^ b + c
double          c           estimated = a * size ^ b + c
unsigned        nl_valid    whether the non-linear regression model is valid (i.e. enough measurements)
unsigned        nsample     number of sample values for non-linear regression

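
The sums listed above are the ones needed for an ordinary least-squares fit of ln(measured) = ln(alpha) + beta * ln(size). The sketch below shows that textbook fit. It assumes StarPU derives alpha and beta this way, which this page does not state. fit_power_law is a hypothetical helper, not a StarPU function, and the sample count n is passed in separately.

```c
/* Hypothetical helper, not part of StarPU: least-squares fit of
 * ln(measured) = ln(alpha) + beta * ln(size) from accumulated sums.
 * Returns 0 on success, -1 if the fit is undetermined. */
int fit_power_law(unsigned n, double sumlnx, double sumlny,
                  double sumlnx2, double sumlnxlny,
                  double *ln_alpha, double *beta)
{
    double denom = n * sumlnx2 - sumlnx * sumlnx;

    if (n < 2 || denom == 0.0)
        return -1; /* at least two distinct sizes are needed */

    /* Slope and intercept of the regression line in log-log space. */
    *beta = (n * sumlnxlny - sumlnx * sumlny) / denom;
    *ln_alpha = (sumlny - *beta * sumlnx) / n;

    /* alpha itself would be exp(*ln_alpha), using exp() from <math.h>. */
    return 0;
}
```

For instance, three samples whose (ln(size), ln(measured)) pairs are (0, 1), (1, 3) and (2, 5) give the sums 3, 9, 5 and 13, from which the fit yields beta = 2 and ln(alpha) = 1.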
struct starpu_perfmodel_per_arch

Contains information about the performance model of a given arch.

Data Fields

double(* cost_model )(struct starpu_data_descr *t)
double(* cost_function )(struct starpu_task *task, enum starpu_perfmodel_archtype arch, unsigned nimpl)
size_t(* size_base )(struct starpu_task *, enum starpu_perfmodel_archtype arch, unsigned nimpl)
struct starpu_perfmodel_history_table * history
struct starpu_perfmodel_history_list list
struct starpu_perfmodel_regression_model regression

Field Documentation

starpu_perfmodel_per_arch::cost_model
Deprecated:
This field is deprecated. Use the field starpu_perfmodel_per_arch::cost_function instead.

starpu_perfmodel_per_arch::cost_function

Used by STARPU_PER_ARCH: must point to functions which take a task, the target arch and the implementation number (merely as a convenience, since the array is already indexed by these), and must return an estimate of the task duration in microseconds.

starpu_perfmodel_per_arch::size_base

Same as in structure starpu_perfmodel, but per-arch, in case it depends on the architecture-specific implementation.

starpu_perfmodel_per_arch::history

The history of performance measurements.

starpu_perfmodel_per_arch::list

Used by STARPU_HISTORY_BASED and STARPU_NL_REGRESSION_BASED: records all execution history measurements.

starpu_perfmodel_per_arch::regression

Used by STARPU_REGRESSION_BASED and STARPU_NL_REGRESSION_BASED, contains the estimated factors of the regression.

struct starpu_perfmodel_history_list

todo

Data Fields

struct starpu_perfmodel_history_list *    next    todo
struct starpu_perfmodel_history_entry *   entry   todo

struct starpu_perfmodel_history_entry

todo

Data Fields

double     mean        mean_n = 1/n sum
double     deviation   n dev_n = sum2 - 1/n (sum)^2
double     sum         sum of samples (in µs)
double     sum2        sum of samples^2
unsigned   nsample     number of samples
uint32_t   footprint   data footprint
size_t     size        in bytes
double     flops       Provided by the application
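
The relations given for mean and deviation can be maintained incrementally from sum, sum2 and nsample. The sketch below applies the two formulas literally, as written in the table above. history_entry_sketch and history_entry_add are hypothetical names used for illustration only. Whether StarPU additionally takes a square root of the deviation term is not stated here, so none is taken.

```c
/* Hypothetical mirror of a few fields of starpu_perfmodel_history_entry. */
struct history_entry_sketch {
    double mean;
    double deviation;
    double sum;
    double sum2;
    unsigned nsample;
};

/* Add one measurement (in microseconds) and refresh the derived fields. */
void history_entry_add(struct history_entry_sketch *e, double measured)
{
    double n;

    e->sum += measured;
    e->sum2 += measured * measured;
    e->nsample++;

    n = (double)e->nsample;
    e->mean = e->sum / n;                               /* mean_n = 1/n sum */
    e->deviation = (e->sum2 - e->sum * e->sum / n) / n; /* n dev_n = sum2 - 1/n (sum)^2 */
}
```

Starting from a zeroed entry and adding the samples 2, 4, 4, 4, 5, 5, 7 and 9 gives sum = 40 and sum2 = 232, hence a mean of 5 and a deviation term of 4.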

Enumeration Type Documentation

enum starpu_perfmodel_archtype

Enumerates the various types of architectures.

It is possible to have multiple versions of the same kind of worker, for instance multiple GPUs or even different CPUs within the same machine, so the archtype enum type is not used directly for performance models.

Enumerator:
STARPU_CPU_DEFAULT 

CPU combined workers between 0 and STARPU_MAXCPUS-1

STARPU_CUDA_DEFAULT 

CUDA workers

STARPU_OPENCL_DEFAULT 

OpenCL workers

enum starpu_perfmodel_type

TODO

Enumerator:
STARPU_PER_ARCH 

Application-provided per-arch cost model function

STARPU_COMMON 

Application-provided common cost model function, with per-arch factor

STARPU_HISTORY_BASED 

Automatic history-based cost model

STARPU_REGRESSION_BASED 

Automatic linear regression-based cost model (alpha * size ^ beta)

STARPU_NL_REGRESSION_BASED 

Automatic non-linear regression-based cost model (a * size ^ b + c)

Function Documentation

void starpu_perfmodel_free_sampling_directories ( void  )

This function frees internal memory used for sampling directory management. It should only be called by an application which does not call starpu_shutdown(), as starpu_shutdown() already calls it. See for example tools/starpu_perfmodel_display.c.

int starpu_perfmodel_load_symbol ( const char *  symbol,
struct starpu_perfmodel *  model 
)

Loads a given performance model. The model structure has to be completely zero, and will be filled with the information saved in $STARPU_HOME/.starpu. The function is intended to be used by external tools that need to read the performance model files.

int starpu_perfmodel_unload_model ( struct starpu_perfmodel *  model)

Unloads the given model, which has been previously loaded through the function starpu_perfmodel_load_symbol().

void starpu_perfmodel_debugfilepath ( struct starpu_perfmodel *  model,
enum starpu_perfmodel_archtype  arch,
char *  path,
size_t  maxlen,
unsigned  nimpl 
)

Returns the path to the debugging information for the performance model.

void starpu_perfmodel_get_arch_name ( enum starpu_perfmodel_archtype  arch,
char *  archname,
size_t  maxlen,
unsigned  nimpl 
)

Returns the architecture name for arch.

enum starpu_perfmodel_archtype starpu_worker_get_perf_archtype ( int  workerid)

Returns the architecture type of a given worker.

int starpu_perfmodel_list ( FILE *  output)

Prints a list of all performance models on output.

void starpu_perfmodel_directory ( FILE *  output)

Prints the name of the directory storing performance models on output.

void starpu_perfmodel_print ( struct starpu_perfmodel *  model,
enum starpu_perfmodel_archtype  arch,
unsigned  nimpl,
char *  parameter,
uint32_t *  footprint,
FILE *  output 
)

todo

int starpu_perfmodel_print_all ( struct starpu_perfmodel *  model,
char *  arch,
char *  parameter,
uint32_t *  footprint,
FILE *  output 
)

todo

void starpu_bus_print_bandwidth ( FILE *  f)

Prints a matrix of bus bandwidths on f.

void starpu_bus_print_affinity ( FILE *  f)

Prints the affinity devices on f.

void starpu_perfmodel_update_history ( struct starpu_perfmodel *  model,
struct starpu_task *  task,
enum starpu_perfmodel_archtype  arch,
unsigned  cpuid,
unsigned  nimpl,
double  measured 
)

This feeds the performance model model with an explicit measurement measured (in µs), in addition to the measurements done by StarPU itself. This can be useful when the application already has an existing set of measurements done in good conditions, which StarPU could benefit from instead of doing on-line measurements. An example of use can be seen in PerformanceModelExample.

double starpu_transfer_bandwidth ( unsigned  src_node,
unsigned  dst_node 
)

Return the bandwidth of data transfer between two memory nodes.

double starpu_transfer_latency ( unsigned  src_node,
unsigned  dst_node 
)

Return the latency of data transfer between two memory nodes.

double starpu_transfer_predict ( unsigned  src_node,
unsigned  dst_node,
size_t  size 
)

Return the estimated time to transfer a given size between two memory nodes.
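
A common way to combine a latency and a bandwidth into a transfer-time estimate is a linear model, sketched below. This illustrates that general model only. It is not a statement of the formula StarPU uses, and the units (latency in microseconds, bandwidth in bytes per microsecond) are assumptions. transfer_time_sketch is a hypothetical helper, not a StarPU function.

```c
#include <stddef.h>

/* Hypothetical linear cost model: fixed latency plus size over bandwidth.
 * Assumed units: latency in microseconds, bandwidth in bytes per microsecond. */
double transfer_time_sketch(double latency, double bandwidth, size_t size)
{
    return latency + (double)size / bandwidth;
}
```

With a latency of 10 and a bandwidth of 1000, transferring 4000 bytes is estimated at 10 + 4000 / 1000 = 14 under this model.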