Application execution

Executing COMPSs applications

Prerequisites

Prerequisites vary depending on the application’s code language: for Java applications the users need to have a jar archive containing all the application classes, for Python applications there are no requirements and for C/C++ applications the code must have been previously compiled by using the buildapp command.

For further information about how to develop COMPSs applications please refer to Application development.

Runcompss command

COMPSs applications are executed using the runcompss command:

compss@bsc:~$ runcompss [options] application_name [application_arguments]

The application name must be the fully qualified name of the application in Java, the path to the .py file containing the main program in Python and the path to the master binary in C/C++.

The application arguments are the ones passed as command line to main application. This parameter can be empty.

The runcompss command allows the users to customize a COMPSs execution by specifying different options. For clarity purposes, parameters are grouped in Runtime configuration, Tools enablers and Advanced options.

compss@bsc:~$ runcompss -h

Usage: runcompss [options] application_name application_arguments

* Options:
General:
  --help, -h                              Print this help message

  --opts                                  Show available options

  --version, -v                         Print COMPSs version

Tools enablers:
  --graph=<bool>, --graph, -g             Generation of the complete graph (true/false)
                                          When no value is provided it is set to true
                                          Default: false
  --tracing=<level>, --tracing, -t        Set generation of traces and/or tracing level ( [ true | basic ] | advanced | scorep | arm-map | arm-ddt | false)
                                          True and basic levels will produce the same traces.
                                          When no value is provided it is set to true
                                          Default: false
  --monitoring=<int>, --monitoring, -m    Period between monitoring samples (milliseconds)
                                          When no value is provided it is set to 2000
                                          Default: 0
  --external_debugger=<int>,
  --external_debugger                     Enables external debugger connection on the specified port (or 9999 if empty)
                                          Default: false

Runtime configuration options:
  --task_execution=<compss|storage>       Task execution under COMPSs or Storage.
                                          Default: compss
  --storage_impl=<string>                 Path to an storage implementation. Shortcut to setting pypath and classpath. See Runtime/storage in your installation folder.
  --storage_conf=<path>                   Path to the storage configuration file
                                          Default: null
  --project=<path>                        Path to the project XML file
                                          Default: /apps/COMPSs/2.6.pr/Runtime/configuration/xml/projects/default_project.xml
  --resources=<path>                      Path to the resources XML file
                                          Default: /apps/COMPSs/2.6.pr/Runtime/configuration/xml/resources/default_resources.xml
  --lang=<name>                           Language of the application (java/c/python)
                                          Default: Inferred is possible. Otherwise: java
  --summary                               Displays a task execution summary at the end of the application execution
                                          Default: false
  --log_level=<level>, --debug, -d        Set the debug level: off | info | debug
                                          Warning: Off level compiles with -O2 option disabling asserts and __debug__
                                          Default: off

Advanced options:
  --extrae_config_file=<path>             Sets a custom extrae config file. Must be in a shared disk between all COMPSs workers.
                                          Default: null
  --comm=<ClassName>                      Class that implements the adaptor for communications
                                          Supported adaptors: es.bsc.compss.nio.master.NIOAdaptor | es.bsc.compss.gat.master.GATAdaptor
                                          Default: es.bsc.compss.nio.master.NIOAdaptor
  --conn=<className>                      Class that implements the runtime connector for the cloud
                                          Supported connectors: es.bsc.compss.connectors.DefaultSSHConnector
                                                              | es.bsc.compss.connectors.DefaultNoSSHConnector
                                          Default: es.bsc.compss.connectors.DefaultSSHConnector
  --streaming=<type>                      Enable the streaming mode for the given type.
                                          Supported types: FILES, OBJECTS, PSCOS, ALL, NONE
                                          Default: null
  --streaming_master_name=<str>           Use an specific streaming master node name.
                                          Default: null
  --streaming_master_port=<int>           Use an specific port for the streaming master.
                                          Default: null
  --scheduler=<className>                 Class that implements the Scheduler for COMPSs
                                          Supported schedulers: es.bsc.compss.scheduler.fullGraphScheduler.FullGraphScheduler
                                                              | es.bsc.compss.scheduler.fifoScheduler.FIFOScheduler
                                                              | es.bsc.compss.scheduler.resourceEmptyScheduler.ResourceEmptyScheduler
                                          Default: es.bsc.compss.scheduler.loadbalancing.LoadBalancingScheduler
  --scheduler_config_file=<path>          Path to the file which contains the scheduler configuration.
                                          Default: Empty
  --library_path=<path>                   Non-standard directories to search for libraries (e.g. Java JVM library, Python library, C binding library)
                                          Default: Working Directory
  --classpath=<path>                      Path for the application classes / modules
                                          Default: Working Directory
  --appdir=<path>                         Path for the application class folder.
                                          Default: /home/bsc19/bsc19234
  --pythonpath=<path>                     Additional folders or paths to add to the PYTHONPATH
                                          Default: /home/bsc19/bsc19234
  --base_log_dir=<path>                   Base directory to store COMPSs log files (a .COMPSs/ folder will be created inside this location)
                                          Default: User home
  --specific_log_dir=<path>               Use a specific directory to store COMPSs log files (no sandbox is created)
                                          Warning: Overwrites --base_log_dir option
                                          Default: Disabled
  --uuid=<int>                            Preset an application UUID
                                          Default: Automatic random generation
  --master_name=<string>                  Hostname of the node to run the COMPSs master
                                          Default:
  --master_port=<int>                     Port to run the COMPSs master communications.
                                          Only for NIO adaptor
                                          Default: [43000,44000]
  --jvm_master_opts="<string>"            Extra options for the COMPSs Master JVM. Each option separed by "," and without blank spaces (Notice the quotes)
                                          Default:
  --jvm_workers_opts="<string>"           Extra options for the COMPSs Workers JVMs. Each option separed by "," and without blank spaces (Notice the quotes)
                                          Default: -Xms1024m,-Xmx1024m,-Xmn400m
  --cpu_affinity="<string>"               Sets the CPU affinity for the workers
                                          Supported options: disabled, automatic, user defined map of the form "0-8/9,10,11/12-14,15,16"
                                          Default: automatic
  --gpu_affinity="<string>"               Sets the GPU affinity for the workers
                                          Supported options: disabled, automatic, user defined map of the form "0-8/9,10,11/12-14,15,16"
                                          Default: automatic
  --fpga_affinity="<string>"              Sets the FPGA affinity for the workers
                                          Supported options: disabled, automatic, user defined map of the form "0-8/9,10,11/12-14,15,16"
                                          Default: automatic
  --fpga_reprogram="<string>"             Specify the full command that needs to be executed to reprogram the FPGA with the desired bitstream. The location must be an absolute path.
                                          Default:
  --task_count=<int>                      Only for C/Python Bindings. Maximum number of different functions/methods, invoked from the application, that have been selected as tasks
                                          Default: 50
  --input_profile=<path>                  Path to the file which stores the input application profile
                                          Default: Empty
  --output_profile=<path>                 Path to the file to store the application profile at the end of the execution
                                          Default: Empty
  --PyObject_serialize=<bool>             Only for Python Binding. Enable the object serialization to string when possible (true/false).
                                          Default: false
  --persistent_worker_c=<bool>            Only for C Binding. Enable the persistent worker in c (true/false).
                                          Default: false
  --enable_external_adaptation=<bool>     Enable external adaptation. This option will disable the Resource Optimizer.
                                          Default: false
  --python_interpreter=<string>           Python interpreter to use (python/python2/python3).
                                          Default: python Version: 2
  --python_propagate_virtual_environment=<true>  Propagate the master virtual environment to the workers (true/false).
                                                 Default: true
  --python_mpi_worker=<false>             Use MPI to run the python worker instead of multiprocessing. (true/false).
                                          Default: false

* Application name:
    For Java applications:   Fully qualified name of the application
    For C applications:      Path to the master binary
    For Python applications: Path to the .py file containing the main program

* Application arguments:
    Command line arguments to pass to the application. Can be empty.

Running a COMPSs application

Before running COMPSs applications the application files must be in the CLASSPATH. Thus, when launching a COMPSs application, users can manually pre-set the CLASSPATH environment variable or can add the --classpath option to the runcompss command.

The next three sections provide specific information for launching COMPSs applications developed in different code languages (Java, Python and C/C++). For clarity purposes, we will use the Simple application (developed in Java, Python and C++) available in the COMPSs Virtual Machine or at https://compss.bsc.es/projects/bar webpage. This application takes an integer as input parameter and increases it by one unit using a task. For further details about the codes please refer to Sample Applications.

Running Java applications

A Java COMPSs application can be launched through the following command:

compss@bsc:~$ cd tutorial_apps/java/simple/jar/
compss@bsc:~/tutorial_apps/java/simple/jar$ runcompss simple.Simple <initial_number>

compss@bsc:~/tutorial_apps/java/simple/jar$ runcompss simple.Simple 1
[  INFO] Using default execution type: compss
[  INFO] Using default location for project file: /opt/COMPSs/Runtime/configuration/xml/projects/default_project.xml
[  INFO] Using default location for resources file: /opt/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
[  INFO] Using default language: java

----------------- Executing simple.Simple --------------------------

WARNING: COMPSs Properties file is null. Setting default values
[(1066)    API]  -  Starting COMPSs Runtime v<version>
Initial counter value is 1
Final counter value is 2
[(4740)    API]  -  Execution Finished

------------------------------------------------------------

In this first execution we use the default value of the --classpath option to automatically add the jar file to the classpath (by executing runcompss in the directory which contains the jar file). However, we can explicitly do this by exporting the CLASSPATH variable or by providing the --classpath value. Next, we provide two more ways to perform the same execution:

compss@bsc:~$ export CLASSPATH=$CLASSPATH:/home/compss/tutorial_apps/java/simple/jar/simple.jar
compss@bsc:~$ runcompss simple.Simple <initial_number>

compss@bsc:~$ runcompss --classpath=/home/compss/tutorial_apps/java/simple/jar/simple.jar \
                        simple.Simple <initial_number>

Running Python applications

To launch a COMPSs Python application users have to provide the --lang=python option to the runcompss command. If the extension of the main file is a regular Python extension (.py or .pyc) the runcompss command can also infer the application language without specifying the lang flag.

compss@bsc:~$ cd tutorial_apps/python/simple/
compss@bsc:~/tutorial_apps/python/simple$ runcompss --lang=python ./simple.py <initial_number>

compss@bsc:~/tutorial_apps/python/simple$ runcompss simple.py 1
[  INFO] Using default execution type: compss
[  INFO] Using default location for project file: /opt/COMPSs/Runtime/configuration/xml/projects/default_project.xml
[  INFO] Using default location for resources file: /opt/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
[  INFO] Inferred PYTHON language

----------------- Executing simple.py --------------------------

WARNING: COMPSs Properties file is null. Setting default values
[(616)    API]  -  Starting COMPSs Runtime v<version>
Initial counter value is 1
Final counter value is 2
[(4297)    API]  -  Execution Finished

------------------------------------------------------------

Attention: Executing without debug (e.g. default log level or --log_level=off) uses -O2 compiled sources, disabling asserts and __debug__.

Alternatively, it is possible to execute the a COMPSs Python application using pycompss as module:

compss@bsc:~$ python -m pycompss <runcompss_flags> <application> <application_parameters>

Consequently, the previous example could also be run as follows:

compss@bsc:~$ cd tutorial_apps/python/simple/
compss@bsc:~/tutorial_apps/python/simple$ python -m pycompss simple.py <initial_number>

If the -m pycompss is not set, the application will be run ignoring all PyCOMPSs imports, decorators and API calls, that is, sequentially.

In order to run a COMPSs Python application with a different interpreter, the runcompss command provides a specific flag:

compss@bsc:~$ cd tutorial_apps/python/simple/
compss@bsc:~/tutorial_apps/python/simple$ runcompss --python_interpreter=python3 ./simple.py <initial_number>

However, when using the pycompss module, it is inferred from the python used in the call:

compss@bsc:~$ cd tutorial_apps/python/simple/
compss@bsc:~/tutorial_apps/python/simple$ python3 -m pycompss simple.py <initial_number>

Finally, both runcompss and pycompss module provide a particular flag for virtual environment propagation (--python_propagate_virtual_environment=<bool>). This, flag is intended to activate the current virtual environment in the worker nodes when set to true.

Running C/C++ applications

To launch a COMPSs C/C++ application users have to compile the C/C++ application by means of the buildapp command. For further information please refer to Application development. Once complied, the --lang=c option must be provided to the runcompss command. If the main file is a C/C++ binary the runcompss command can also infer the application language without specifying the lang flag.

compss@bsc:~$ cd tutorial_apps/c/simple/
compss@bsc:~/tutorial_apps/c/simple$ runcompss --lang=c simple <initial_number>

compss@bsc:~/tutorial_apps/c/simple$ runcompss ~/tutorial_apps/c/simple/master/simple 1
[  INFO] Using default execution type: compss
[  INFO] Using default location for project file: /opt/COMPSs/Runtime/configuration/xml/projects/default_project.xml
[  INFO] Using default location for resources file: /opt/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
[  INFO] Inferred C/C++ language

----------------- Executing simple --------------------------

JVM_OPTIONS_FILE: /tmp/tmp.ItT1tQfKgP
COMPSS_HOME: /opt/COMPSs
Args: 1

WARNING: COMPSs Properties file is null. Setting default values
[(650)    API]  -  Starting COMPSs Runtime v<version>
Initial counter value is 1
[   BINDING]  -  @compss_wait_on  -  Entry.filename: counter
[   BINDING]  -  @compss_wait_on  -  Runtime filename: d1v2_1497432831496.IT
Final counter value is 2
[(4222)    API]  -  Execution Finished

------------------------------------------------------------

Additional configurations

The COMPSs runtime has two configuration files: resources.xml and project.xml . These files contain information about the execution environment and are completely independent from the application.

For each execution users can load the default configuration files or specify their custom configurations by using, respectively, the --resources=<absolute_path_to_resources.xml> and the --project=<absolute_path_to_project.xml> in the runcompss command. The default files are located in the /opt/COMPSs/Runtime/configuration/xml/ path. Users can manually edit these files or can use the Eclipse IDE tool developed for COMPSs. For further information about the Eclipse IDE please refer to COMPSs IDE Section.

For further details please check the Configuration Files Subsection inside the Installation and Administration Section.

Results and logs

Results

When executing a COMPSs application we consider different type of results:

Application Output: Output generated by the application.
Application Files: Files used or generated by the application.
Tasks Output: Output generated by the tasks invoked from the application.

Regarding the application output, COMPSs will preserve the application output but will add some pre and post output to indicate the COMPSs Runtime state. Figure 4 shows the standard output generated by the execution of the Simple Java application. The green box highlights the application stdout while the rest of the output is produced by COMPSs.

Figure 4 Output generated by the execution of the Simple Java application with COMPSs

Regarding the application files, COMPSs does not modify any of them and thus, the results obtained by executing the application with COMPSs are the same than the ones generated by the sequential execution of the application.

Regarding the tasks output, COMPSs introduces some modifications due to the fact that tasks can be executed in remote machines. After the execution, COMPSs stores the stdout and the stderr of each job (a task execution) inside the ``/home/$USER/.COMPSs/$APPNAME/$EXEC_NUMBER/jobs/`` directory of the main application node.

Figure 5 and Figure 6 show an example of the results obtained from the execution of the Hello Java application. While Figure 5 provides the output of the sequential execution of the application (without COMPSs), Figure 6 provides the output of the equivalent COMPSs execution. Please note that the sequential execution produces the Hello World! (from a task) message in the stdout while the COMPSs execution stores the message inside the job1_NEW.out file.

Figure 5 Sequential execution of the Hello java application

Figure 6 COMPSs execution of the Hello java application

Logs

COMPSs includes three log levels for running applications but users can modify them or add more levels by editing the logger files under the /opt/COMPSs/Runtime/configuration /log/ folder. Any of these log levels can be selected by adding the --log_level=<debug | info | off> flag to the runcompss command. The default value is off.

The logs generated by the NUM_EXEC execution of the application APP by the user USER are stored under /home/$USER/.COMPSs/$APP/$EXEC_NUMBER/ folder (from this point on: base log folder). The EXEC_NUMBER execution number is automatically used by COMPSs to prevent mixing the logs of data of different executions.

When running COMPSs with log level off only the errors are reported. This means that the base log folder will contain two empty files (runtime.log and resources.log) and one empty folder (jobs). If somehow the application has failed, the runtime.log and/or the resources.log will not be empty and a new file per failed job will appear inside the jobs folder to store the stdout and the stderr. Figure 7 shows the logs generated by the execution of the Simple java application (without errors) in off mode.

Figure 7 Structure of the logs folder for the Simple java application in off mode

When running COMPSs with log level info the base log folder will contain two files (runtime.log and resources.log) and one folder (jobs). The runtime.log file contains the execution information retrieved from the master resource, including the file transfers and the job submission details. The resources.log file contains information about the available resources such as the number of processors of each resource (slots), the information about running or pending tasks in the resource queue and the created and destroyed resources. The jobs folder will be empty unless there has been a failed job. In this case it will store, for each failed job, one file for the stdout and another for the stderr. As an example, Figure 8 shows the logs generated by the same execution than the previous case but with info mode.

Figure 8 Structure of the logs folder for the Simple java application in info mode

The runtime.log and resources.log are quite large files, thus they should be only checked by advanced users. For an easier interpretation of these files the COMPSs Framework includes a monitor tool. For further information about the COMPSs Monitor please check COMPSs Monitor.

Figure 9 and Figure 10 provide the content of these two files generated by the execution of the Simple java application.

Figure 9 runtime.log generated by the execution of the Simple java application

Figure 10 resources.log generated by the execution of the Simple java application

Running COMPSs with log level debug generates the same files as the info log level but with more detailed information. Additionally, the jobs folder contains two files per submitted job; one for the stdout and another for the stderr. In the other hand, the COMPSs Runtime state is printed out on the stdout. Figure 11 shows the logs generated by the same execution than the previous cases but with debug mode.

The runtime.log and the resources.log files generated in this mode can be extremely large. Consequently, the users should take care of their quota and manually erase these files if needed.

Figure 11 Structure of the logs folder for the Simple java application in debug mode

When running Python applications a pycompss.log file is written inside the base log folder containing debug information about the specific calls to PyCOMPSs.

Furthermore, when running runcompss with additional flags (such as monitoring or tracing) additional folders will appear inside the base log folder. The meaning of the files inside these folders is explained in COMPSs Tools.

COMPSs Tools

Application graph

At the end of the application execution a dependency graph can be generated representing the order of execution of each type of task and their dependencies. To allow the final graph generation the -g flag has to be passed to the runcompss command; the graph file is written in the base_log_folder/monitor/complete_graph.dot at the end of the execution.

Figure 12 shows a dependency graph example of a SparseLU java application. The graph can be visualized by running the following command:

compss@bsc:~$ compss_gengraph ~/.COMPSs/sparseLU.arrays.SparseLU_01/monitor/complete_graph.dot

Figure 12 The dependency graph of the SparseLU application

COMPSs Monitor

The COMPSs Framework includes a Web graphical interface that can be used to monitor the execution of COMPSs applications. COMPSs Monitor is installed as a service and can be easily managed by running any of the following commands:

compss@bsc:~$ /etc/init.d/compss-monitor usage
Usage: compss-monitor {start | stop | reload | restart | try-restart | force-reload | status}

Service configuration

The COMPSs Monitor service can be configured by editing the /opt/COMPSs/Tools/monitor/apache-tomcat/conf/compss-monitor.conf file which contains one line per property:

COMPSS_MONITOR: Default directory to retrieve monitored applications (defaults to the .COMPSs folder inside the root user).
COMPSs_MONITOR_PORT: Port where to run the compss-monitor web service (defaults to 8080).
COMPSs_MONITOR_TIMEOUT: Web page timeout between browser and server (defaults to 20s).

Usage

In order to use the COMPSs Monitor users need to start the service as shown in Figure 13.

Figure 13 COMPSs Monitor start command

And use a web browser to open the specific URL:

compss@bsc:~$ firefox http://localhost:8080/compss-monitor &

The COMPSs Monitor allows to monitor applications from different users and thus, users need to first login to access their applications. As shown in Figure 14, the users can select any of their executed or running COMPSs applications and display it.

Figure 14 COMPSs monitoring interface

To enable all the COMPSs Monitor features, applications must run the runcompss command with the -m flag. This flag allows the COMPSs Runtime to store special information inside inside the log_base_folder under the monitor folder (see Figure 14 and Figure 15). Only advanced users should modify or delete any of these files. If the application that a user is trying to monitor has not been executed with this flag, some of the COMPSs Monitor features will be disabled.

compss@bsc:~/tutorial_apps/java/simple/jar$ runcompss -dm simple.Simple 1
[  INFO] Using default execution type: compss
[  INFO] Using default location for project file: /opt/COMPSs/Runtime/configuration/xml/projects/default_project.xml
[  INFO] Using default location for resources file: /opt/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
[  INFO] Using default language: java

----------------- Executing simple.Simple --------------------------

WARNING: COMPSs Properties file is null. Setting default values
[(799)    API]  -  Deploying COMPSs Runtime v<version>
[(801)    API]  -  Starting COMPSs Runtime v<version>
[(801)    API]  -  Initializing components
[(1290)    API]  -  Ready to process tasks
[(1293)    API]  -  Opening /home/compss/tutorial_apps/java/simple/jar/counter in mode OUT
[(1338)    API]  -  File target Location: /home/compss/tutorial_apps/java/simple/jar/counter
Initial counter value is 1
[(1340)    API]  -  Creating task from method increment in simple.SimpleImpl
[(1340)    API]  -  There is 1 parameter
[(1341)    API]  -    Parameter 1 has type FILE_T
Final counter value is 2
[(4307)    API]  -  No more tasks for app 1
[(4311)    API]  -  Getting Result Files 1
[(4340)    API]  -  Stop IT reached
[(4344)    API]  -  Stopping Graph generation...
[(4344)    API]  -  Stopping Monitor...
[(6347)    API]  -  Stopping AP...
[(6348)    API]  -  Stopping TD...
[(6509)    API]  -  Stopping Comm...
[(6510)    API]  -  Runtime stopped
[(6510)    API]  -  Execution Finished

------------------------------------------------------------

Figure 15 Logs generated by the Simple java application with the monitoring flag enabled

Graphical Interface features

In this section we provide a summary of the COMPSs Monitor supported features available through the graphical interface:

Resources information Provides information about the resources used by the application
Tasks information Provides information about the tasks definition used by the application
Current tasks graph Shows the tasks dependency graph currently stored into the COMPSs Runtime
Complete tasks graph Shows the complete tasks dependecy graph of the application
Load chart Shows different dynamic charts representing the evolution over time of the resources load and the tasks load
Runtime log Shows the runtime log
Execution Information Shows specific job information allowing users to easily select failed or uncompleted jobs
Statistics Shows application statistics such as the accumulated cloud cost.

Important

To enable all the COMPSs Monitor features applications must run with the -m flag.

The webpage also allows users to configure some performance parameters of the monitoring service by accessing the Configuration button at the top-right corner of the web page.

For specific COMPSs Monitor feature configuration please check our FAQ section at the top-right corner of the web page.

Application tracing

COMPSs Runtime can generate a post-execution trace of the execution of the application. This trace is useful for performance analysis and diagnosis.

A trace file may contain different events to determine the COMPSs master state, the task execution state or the file-transfers. The current release does not support file-transfers informations.

During the execution of the application, an XML file is created in the worker nodes to keep track of these events. At the end of the execution, all the XML files are merged to get a final trace file.

In this manual we only provide information about how to obtain a trace and about the available Paraver (the tool used to analyze the traces) configurations. For further information about the application instrumentation or the trace visualization and configurations please check the Tracing Section.

Trace Command

In order to obtain a post-execution trace file one of the following options -t, --tracing, --tracing=true, --tracing=basic must be added to the runcompss command. All this options activate the basic tracing mode; the advanced mode is activated with the option --tracing=advanced. For further information about advanced mode check the Tracing Section. Next, we provide an example of the command execution with the basic tracing option enabled for a java K-Means application.

compss@bsc:~$ runcompss -t kmeans.Kmeans
*** RUNNING JAVA APPLICATION KMEANS
[  INFO] Relative Classpath resolved: /path/to/jar/kmeans.jar

----------------- Executing kmeans.Kmeans --------------------------

Welcome to Extrae VERSION
Extrae: Parsing the configuration file (/opt/COMPSs/Runtime/configuration/xml/tracing/extrae_basic.xml) begins
Extrae: Warning! <trace> tag has no <home> property defined.
Extrae: Generating intermediate files for Paraver traces.
Extrae: <cpu> tag at <counters> level will be ignored. This library does not support CPU HW.
Extrae: Tracing buffer can hold 100000 events
Extrae: Circular buffer disabled.
Extrae: Dynamic memory instrumentation is disabled.
Extrae: Basic I/O memory instrumentation is disabled.
Extrae: System calls instrumentation is disabled.
Extrae: Parsing the configuration file (/opt/COMPSs/Runtime/configuration/xml/tracing/extrae_basic.xml) has ended
Extrae: Intermediate traces will be stored in /user/folder
Extrae: Tracing mode is set to: Detail.
Extrae: Successfully initiated with 1 tasks and 1 threads

WARNING: COMPSs Properties file is null. Setting default values
[(751)    API]  -  Deploying COMPSs Runtime v<version>
[(753)    API]  -  Starting COMPSs Runtime v<version>
[(753)    API]  -  Initializing components
[(1142)   API]  -  Ready to process tasks
...
...
...
merger: Output trace format is: Paraver
merger: Extrae 3.3.0 (revision 3966 based on extrae/trunk)
mpi2prv: Assigned nodes < Marginis >
mpi2prv: Assigned size per processor < <1 Mbyte >
mpi2prv: File set-0/TRACE@Marginis.0000001904000000000000.mpit is object 1.1.1 on node Marginis assigned to processor 0
mpi2prv: File set-0/TRACE@Marginis.0000001904000000000001.mpit is object 1.1.2 on node Marginis assigned to processor 0
mpi2prv: File set-0/TRACE@Marginis.0000001904000000000002.mpit is object 1.1.3 on node Marginis assigned to processor 0
mpi2prv: File set-0/TRACE@Marginis.0000001980000001000000.mpit is object 1.2.1 on node Marginis assigned to processor 0
mpi2prv: File set-0/TRACE@Marginis.0000001980000001000001.mpit is object 1.2.2 on node Marginis assigned to processor 0
mpi2prv: File set-0/TRACE@Marginis.0000001980000001000002.mpit is object 1.2.3 on node Marginis assigned to processor 0
mpi2prv: File set-0/TRACE@Marginis.0000001980000001000003.mpit is object 1.2.4 on node Marginis assigned to processor 0
mpi2prv: File set-0/TRACE@Marginis.0000001980000001000004.mpit is object 1.2.5 on node Marginis assigned to processor 0
mpi2prv: Time synchronization has been turned off
mpi2prv: A total of 9 symbols were imported from TRACE.sym file
mpi2prv: 0 function symbols imported
mpi2prv: 9 HWC counter descriptions imported
mpi2prv: Checking for target directory existance... exists, ok!
mpi2prv: Selected output trace format is Paraver
mpi2prv: Stored trace format is Paraver
mpi2prv: Searching synchronization points... done
mpi2prv: Time Synchronization disabled.
mpi2prv: Circular buffer enabled at tracing time? NO
mpi2prv: Parsing intermediate files
mpi2prv: Progress 1 of 2 ... 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75% 80% 85% 90% 95% done
mpi2prv: Processor 0 succeeded to translate its assigned files
mpi2prv: Elapsed time translating files: 0 hours 0 minutes 0 seconds
mpi2prv: Elapsed time sorting addresses: 0 hours 0 minutes 0 seconds
mpi2prv: Generating tracefile (intermediate buffers of 838848 events)
         This process can take a while. Please, be patient.
mpi2prv: Progress 2 of 2 ... 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75% 80% 85% 90% 95% done
mpi2prv: Warning! Clock accuracy seems to be in microseconds instead of nanoseconds.
mpi2prv: Elapsed time merge step: 0 hours 0 minutes 0 seconds
mpi2prv: Resulting tracefile occupies 991743 bytes
mpi2prv: Removing temporal files... done
mpi2prv: Elapsed time removing temporal files: 0 hours 0 minutes 0 seconds
mpi2prv: Congratulations! ./trace/kmeans.Kmeans_compss_trace_1460456106.prv has been generated.
[   API]  -  Execution Finished

------------------------------------------------------------

At the end of the execution the trace will be stored inside the trace folder under the application log directory.

compss@bsc:~$ cd .COMPSs/kmeans.Kmeans_01/trace/
compss@bsc:~$ ls -1
kmeans.Kmeans_compss_trace_1460456106.pcf
kmeans.Kmeans_compss_trace_1460456106.prv
kmeans.Kmeans_compss_trace_1460456106.row

Trace visualization

The traces generated by an application execution are ready to be visualized with Paraver. Paraver is a powerful tool developed by BSC that allows users to show many views of the trace data by means of different configuration files. Users can manually load, edit or create configuration files to obtain different trace data views.

If Paraver is installed, issue the following command to visualize a given tracefile:

compss@bsc:~$ wxparaver path/to/trace/trace_name.prv

For further information about Paraver please visit the following site:

http://www.bsc.es/computer-sciences/performance-tools/paraver

COMPSs IDE

COMPSs IDE is an Integrated Development Environment to develop, compile, deploy and execute COMPSs applications. It is available through the Eclipse Market as a plugin and provides an even easier way to work with COMPSs.

For further information please check the COMPSs IDE User Guide available at: http://compss.bsc.es .

Special Execution Platforms

This section provides information about how to run COMPSs Applications in specific platforms such as Docker, Chameleon or MareNostrum.

Docker

Introduction

Docker is an open-source project that automates the deployment of applications inside software containers, by providing an additional layer of abstraction and automation of operating-system-level virtualization on Linux. In addition to the Docker container engine, there are other Docker tools that allow users to create complex applications (Docker-Compose) or to manage a cluster of Docker containers (Docker Swarm).

COMPSs supports running a distributed application in a Docker Swarm cluster.

Requirements

In order to use COMPSs with Docker, some requirements must be fulfilled:

Have Docker and Docker-Compose installed in your local machine.
Have an available Docker Swarm cluster and its Swarm manager ip and port to access it remotely.
A Dockerhub account. Dockerhub is an online repository for Docker images. We don’t currently support another sharing method besides uploading to Dockerhub, so you will need to create a personal account. This has the advantage that it takes very little time either upload or download the needed images, since it will reuse the existing layers of previous images (for example the COMPSs base image).

Execution

The runcompss-docker execution workflow uses Docker-Compose, which is in charge of spawning the different application containers into the Docker Swarm manager. Then the Docker Swarm manager schedules the containers to the nodes and the application starts running. The COMPSs master and workers will run in the nodes Docker Swarm decides. To see where the masters and workers are located in runtime, you can use:

$ docker -H '<swarm_manager_ip:swarm_port>' ps -a

The execution of an application using Docker containers with COMPSs consists of 2 steps:

Execution step 1: Creation of the application image

The very first step to execute a COMPSs application in Docker is creating your application Docker image.

This must be done only once for every new application, and then you can run it as many times as needed. If the application is updated for whatever reason, this step must be done again to create and share the updated image.

In order to do this, you must use the compss_docker_gen_image tool, which is available in the standard COMPSs application. This tool is the responsible of taking your application, create the needed image, and upload it to Dockerhub to share it.

The image is created injecting your application into a COMPSs base image. This base image is available in Dockerhub. In case you need it, you can pull it using the following command:

$ docker pull compss/compss

The compss_docker_gen_image script receives 2 parameters:

–c, –context-dir: Specifies the context directory path of the application. This path MUST BE ABSOLUTE, not relative. The context directory is a local directory that must contain the needed binaries and input files of the app (if any). In its simplest case, it will contain the executable file (a .jar for example). Keep the context-directory as lightest as possible.

For example: –context-dir=’/home/compss-user/my-app-dir’ (where ’my-app-dir’ contains ’app.jar’, ’data1.dat’ and ’data2.csv’). For more details, this context directory will be recursively copied into a COMPSs base image. Specifically, it will create all the path down to the context directory inside the image.
–image-name: Specifies a name for the created image. It MUST have this format: ’DOCKERHUB-USERNAME/image-name’. The DOCKERHUB_USERNAME must be the username of your personal Dockerhub account. The image_name can be whatever you want, and will be used as the identifier for the image in Dockerhub. This name will be the one you will use to execute the application in Docker. For example, if my Dockerhub username is john123 and I want my image to be named “my-image-app”: –image-name=“john123/my-image-app”.

As stated before, this is needed to share your container application image with the nodes that need it. Image tags are also supported (for example “john123/my-image-app:1.23).

Important

After creating the image, be sure to write down the absolute context-directory and the absolute classpath (the absolute path to the executable jar). You will need it to run the application using runcompss-docker. In addition, if you plan on distributing the application, you can use the Dockerhub image’s information tab to write them, so the application users can retrieve them.

Execution step 2: Run the application

To execute COMPSs in a Docker Swarm cluster, you must use the runcompss-docker command, instead of runcompss.

The command runcompss-docker has some additional arguments that will be needed by COMPSs to run your application in a distributed Docker Swarm cluster environment. The rest of typical arguments (classpath for example) will be delegated to runcompss command.

These additional arguments must go before the typical runcompss arguments. The runcompss-docker additional arguments are:

–w, –worker-containers:

Specifies the number of worker containers the app will execute on. One more container will be created to host the master. If you have enough nodes in the Swarm cluster, each container will be executed by one node. This is the default schedule strategy used by Swarm. For example: –worker-containers=3

–s, –swarm-manager:

Specifies the Swarm manager ip and port (format: ip:port). For example: –swarm-manager=’129.114.108.8:4000’

–i, –image-name:

Specify the image name of the application image in Dockerhub. Remember you must generate this with compss_docker_gen_image Remember as well that the format must be: ’DOCKERHUB_USERNAME/APP_IMAGE_NAME:TAG’ (the :TAG is optional). For example: –image-name=’john123/my-compss-application:1.9’

–c, –context-dir:

Specifies the context directory of the app. It must be specified by the application image provider. For example: –context-dir=’/home/compss-user/my-app-context-dir’.

As optional arguments:

–c-cpu-units:

Specifies the number of cpu units used by each container (default value is 4). For example: –c-cpu-units:=16

–c-memory:

Specifies the physical memory used by each container in GB (default value is 8 GB). For example, in this case, each container would use as maximum 32 GB of physical memory: –c-memory=32

Here is the format you must use with runcompss-docker command:

$ runcompss-docker --worker-containers=N \
                   --swarm-manager='<ip>:<port>' \
                   --image-name='DOCKERHUB_USERNAME/image_name' \
                   --context-dir='CTX_DIR' \
                   [rest of classic runcompss args]

Or alternatively, in its shortest form:

$ runcompss-docker --w=N --s='<ip>:<port>' --i='DOCKERHUB_USERNAME/image_name' --c='CTX_DIR' \
                   [rest of classic runcompss args]

Execution with TLS

If your cluster uses TLS or has been created using Docker-Machine, you will have to export two environment variables before using runcompss-docker:

On one hand, DOCKER_TLS_VERIFY environment variable will tell Docker that you are using TLS:

export DOCKER_TLS_VERIFY="1"

On the other hand, DOCKER_CERT_PATH variable will tell Docker where to find your TLS certificates. As an example:

export DOCKER_CERT_PATH="/home/compss-user/.docker/machine/machines/my-manager-node"

In case you have created your cluster using docker-machine, in order to know what your DOCKER_CERT_PATH is, you can use this command:

$ docker-machine env my-swarm-manager-node-name | grep DOCKER_CERT_PATH

In which swarm-manager-node-name must be changed by the name docker-machine has assigned to your swarm manager node. With these environment variables set, you are ready to use runcompss-docker in a cluster using TLS.

Execution results

The execution results will be retrieved from the master container of your application.

If your context-directory name is ’matmul’, then your results will be saved in the ’matmul-results’ directory, which will be located in the same directory you executed runcompss-docker on.

Inside the ’matmul-results’ directory you will have:

A folder named ’matmul’ with all the result files that were in the same directory as the executable when the application execution ended. More precisely, this will contain the context-directory state right after finishing your application execution. Additionally, and for more advanced debug purposes, you will have some intermediate files created by runcompss-docker (Dockerfile, project.xml, resources.xml), in case you want to check for more complex errors or details.
A folder named ’debug’, which (in case you used the runcompss debug option (-d)), will contain the ’.COMPSs’ directory, which contains another directory in which there are the typical debug files runtime.log, jobs, etc. Remember .COMPSs is a hidden directory, take this into account if you do ls inside the debug directory (add the -a option).

To make it simpler, we provide a tree visualization of an example of what your directories should look like after the execution. In this case we executed the Matmul example application that we provide you:

Figure 16 Result and log folders of a Matmul execution with COMPSs and Docker

Execution examples

Next we will use the Matmul application as an example of a Java application running with COMPSs and Docker.

Imagine we have our Matmul application in /home/john/matmul and inside the matmul directory we only have the file matmul.jar.

We have created a Dockerhub account with username ’john123’.

The first step will be creating the image:

$ compss_docker_gen_image --context-dir='/home/john/matmul' \
                          --image-name='john123/matmul-example'

Now, we write down the context-dir (/home/john/matmul) and the classpath (/home/john/matmul/matmul.jar). We do this because they will be needed for future executions. Since the image is created and uploaded, we won’t need to do this step anymore.

Now we are going to execute our Matmul application in a Docker cluster.

Take as assumptions:

We will use 5 worker docker containers.
The swarm-manager ip will be 129.114.108.8, with the Swarm manager listening to the port 4000.
We will use debug (-d).
Finally, as we would do with the typical runcompss, we specify the main class name and its parameters (16 and 4 in this case).

In addition, we know from the former step that the image name is john123/matmul-example, the context directory is /home/john/matmul, and the classpath is /home/john/matmul/matmul.jar. And this is how you would run runcompss-docker:

$ runcompss-docker --worker-containers=5 \
                   --swarm-manager='129.114.108.8:4000' \
                   --context-dir='/home/john/matmul' \
                   --image-name='john123/matmul-example' \
                   --classpath=/home/john/matmul/matmul.jar \
                   -d \
                   matmul.objects.Matmul 16 4

Here we show another example using the short arguments form, with the KMeans example application, that is also provided as an example COMPSs application to you:

First step, create the image once:

$ compss_docker_gen_image --context-dir='/home/laura/apps/kmeans' \
                          --image-name='laura-67/my-kmeans'

And now execute with 30 worker containers, and Swarm located in ’110.3.14.159:26535’.

$ runcompss-docker --w=30 \
                   --s='110.3.14.159:26535' \
                   --c='/home/laura/apps/kmeans' \
                   --image-name='laura-67/my-kmeans' \
                   --classpath=/home/laura/apps/kmeans/kmeans.jar \
                   kmeans.KMeans

Chameleon

Introduction

The Chameleon project is a configurable experimental environment for large-scale cloud research based on a OpenStack KVM Cloud. With funding from the National Science Foundation (NSF), it provides a large-scale platform to the open research community allowing them explore transformative concepts in deeply programmable cloud services, design, and core technologies. The Chameleon testbed, is deployed at the University of Chicago and the Texas Advanced Computing Center and consists of 650 multi-core cloud nodes, 5PB of total disk space, and leverage 100 Gbps connection between the sites.

The project is led by the Computation Institute at the University of Chicago and partners from the Texas Advanced Computing Center at the University of Texas at Austin, the International Center for Advanced Internet Research at Northwestern University, the Ohio State University, and University of Texas at San Antoni, comprising a highly qualified and experienced team. The team includes members from the NSF supported FutureGrid project and from the GENI community, both forerunners of the NSFCloud solicitation under which this project is funded. Chameleon will also sets of partnerships with commercial and academic clouds, such as Rackspace, CERN and Open Science Data Cloud (OSDC).

For more information please check https://www.chameleoncloud.org/ .

Execution

Currently, COMPSs can only handle the Chameleon infrastructure as a cluster (deployed inside a lease). Next, we provide the steps needed to execute COMPSs applications at Chameleon:

Make a lease reservation with 1 minimum node (for the COMPSs master instance) and a maximum number of nodes equal to the number of COMPSs workers needed plus one
Instantiate the master image (based on the published image COMPSs__CC-CentOS7)
Attach a public IP and login to the master instance (the instance is correctly contextualized for COMPSs executions if you see a COMPSs login banner)
Set the instance as COMPSs master by running /etc/init.d/chameleon_init start
Copy your CH file (API credentials) to the Master and source it
Run the chameleon_cluster_setup script and fill the information when prompted (you will be asked for the name of the master instance, the reservation id and number of workers). This scripts may take several minutes since it sets up the all cluster.
Execute your COMPSs applications normally using the runcompss script

As an example you can check this video https://www.youtube.com/watch?v=BrQ6anPHjAU performing a full setup and execution of a COMPSs application at Chameleon.

SuperComputers

To maintain the portability between different environments, COMPSs has a pre-build structure (see Figure [fig:queue_scripts_structure]) to execute applications in SuperComputers. For this purpose, users must use the enqueue_compss script provided in the COMPSs installation. This script has several parameters (see enqueue_compss -h) that allow users to customize their executions for any SuperComputer.

Figure 17 Structure of COMPSs queue scripts. In Blue user scripts, in Green queue scripts and in Orange system dependant scripts

To make this structure works, the administrators must define a configuration file for the queue system and a configuration file for the specific SuperComputer parameters. The COMPSs installation already provides queue configurations for LSF and SLURM and several examples for SuperComputer configurations. To create a new configuration we recommend to use one of the configurations provided by COMPSs (such as the configuration for the MareNostrum IV SuperComputer) or to contact us at support-compss@bsc.es .

For information about how to submit COMPSs applications at any Supercomputer please refer to Supercomputers.

Common Issues

This section provides answers for the most common issues of the execution of COMPSs applications. For specific issues not covered in this section, please do not hesitate to contact us at: support-compss@bsc.es .

How to debug

When the application does not behave as expected the first thing users must do is to run it in debug mode executing the runcompss command withthe -d flag to enable the debug log level.

In this case the application execution will produce the following files:

runtime.log
resources.log
jobs folder

First, users should check the last lines of the runtime.log. If the file-transfers or the tasks are failing an error message will appear in this file. If the file-transfers are successfully and the jobs are submitted, users should check the jobs folder and look at the error messages produced inside each job. Users should notice that if there are RESUBMITTED files something inside the job is failing.

Tasks are not executed

If the tasks remain in Blocked state probably there are no existing resources matching the specific task constraints. This error can be potentially caused by two facts: the resources are not correctly loaded into the runtime, or the task constraints do not match with any resource.

In the first case, users should take a look at the resouces.log and check that all the resources defined in the project.xml file are available to the runtime. In the second case users should re-define the task constraints taking into account the resources capabilities defined into the resources.xml and project.xml files.

Jobs fail

If all the application’s tasks fail because all the submitted jobs fail, it is probably due to the fact that there is a resource miss-configuration. In most of the cases, the resource that the application is trying to access has no passwordless access through the configured user. This can be checked by:

Open the project.xml. (The default file is stored under /opt/COMPSs/ Runtime/configuration/xml/projects/project.xml)
For each resource annotate its name and the value inside the User tag. Remember that if there is no User tag COMPSs will try to connect this resource with the same username than the one that launches the main application.
For each annotated resourceName - user please try ssh user@resourceName. If the connection asks for a password then there is an error in the configuration of the ssh access in the resource.

The problem can be solved running the following commands:

compss@bsc:~$ scp ~/.ssh/id_dsa.pub user@resourceName:./mydsa.pub
compss@bsc:~$ ssh user@resourceName "cat mydsa.pub >> ~/.ssh/authorized_keys; rm ./mydsa.pub"

These commands are a quick solution, for further details please check the Additional Configuration Section.

Exceptions when starting the Worker processes

When the COMPSs master is not able to communicate with one of the COMPSs workers described in the project.xml and resources.xml files, different exceptions can be raised and logged on the runtime.log of the application. All of them are raised during the worker start up and contain the [WorkerStarter] prefix. Next we provide a list with the common exceptions:

InitNodeException: Exception raised when the remote SSH process to start the worker has failed.
UnstartedNodeException: Exception raised when the worker process has aborted.
Connection refused: Exception raised when the master cannot communicate with the worker process (NIO).

All these exceptions encapsulate an error when starting the worker process. This means that the worker machine is not properly configured and thus, you need to check the environment of the failing worker. Further information about the specific error can be found on the worker log, available at the working directory path in the remote worker machine (the worker working directory specified in the project.xml} file).

Next, we list the most common errors and their solutions:

java command not found: Invalid path to the java binary. Check the JAVA_HOME definition at the remote worker machine.
Cannot create WD: Invalid working directory. Check the rw permissions of the worker’s working directory.
No exception: The worker process has started normally and there is no exception. In this case the issue is normally due to the firewall configuration preventing the communication between the COMPSs master and worker. Please check that the worker firewall has in and out permissions for TCP and UDP in the adaptor ports (the adaptor ports are specified in the resources.xml file. By default the port rank is 43000-44000.

Compilation error: @Method not found

When trying to compile Java applications users can get some of the following compilation errors:

error: package es.bsc.compss.types.annotations does not exist
import es.bsc.compss.types.annotations.Constraints;
                                          ^
error: package es.bsc.compss.types.annotations.task does not exist
import es.bsc.compss.types.annotations.task.Method;
                                          ^
error: package es.bsc.compss.types.annotations does not exist
import es.bsc.compss.types.annotations.Parameter;
                                          ^
error: package es.bsc.compss.types.annotations.Parameter does not exist
import es.bsc.compss.types.annotations.parameter.Direction;
                                                    ^
error: package es.bsc.compss.types.annotations.Parameter does not exist
import es.bsc.compss.types.annotations.parameter.Type;
                                                    ^
error: cannot find symbol
@Parameter(type = Type.FILE, direction = Direction.INOUT)
^
  symbol:   class Parameter
  location: interface APPLICATION_Itf

error: cannot find symbol
@Constraints(computingUnits = "2")
^
  symbol:   class Constraints
  location: interface APPLICATION_Itf

error: cannot find symbol
@Method(declaringClass = "application.ApplicationImpl")
^
  symbol:   class Method
  location: interface APPLICATION_Itf

All these errors are raised because the compss-engine.jar is not listed in the CLASSPATH. The default COMPSs installation automatically inserts this package into the CLASSPATH but it may have been overwritten or deleted. Please check that your environment variable CLASSPATH containts the compss-engine.jar location by running the following command:

$ echo $CLASSPATH | grep compss-engine

If the result of the previous command is empty it means that you are missing the compss-engine.jar package in your classpath.

The easiest solution is to manually export the CLASSPATH variable into the user session:

$ export CLASSPATH=$CLASSPATH:/opt/COMPSs/Runtime/compss-engine.jar

However, you will need to remember to export this variable every time you log out and back in again. Consequently, we recommend to add this export to the .bashrc file:

$ echo "# COMPSs variables for Java compilation" >> ~/.bashrc
$ echo "export CLASSPATH=$CLASSPATH:/opt/COMPSs/Runtime/compss-engine.jar" >> ~/.bashrc

Warning

The compss-engine.jar is installed inside the COMPSs installation directory. If you have performed a custom installation, the path of the package may be different.

Jobs failed on method reflection

When executing an application the main code gets stuck executing a task. Taking a look at the runtime.log users can check that the job associated to the task has failed (and all its resubmissions too). Then, opening the jobX_NEW.out or the jobX_NEW.err files users find the following error:

[ERROR|es.bsc.compss.Worker|Executor] Can not get method by reflection
es.bsc.compss.nio.worker.executors.Executor$JobExecutionException: Can not get method by reflection
        at es.bsc.compss.nio.worker.executors.JavaExecutor.executeTask(JavaExecutor.java:142)
        at es.bsc.compss.nio.worker.executors.Executor.execute(Executor.java:42)
        at es.bsc.compss.nio.worker.JobLauncher.executeTask(JobLauncher.java:46)
        at es.bsc.compss.nio.worker.JobLauncher.processRequests(JobLauncher.java:34)
        at es.bsc.compss.util.RequestDispatcher.run(RequestDispatcher.java:46)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoSuchMethodException: simple.Simple.increment(java.lang.String)
        at java.lang.Class.getMethod(Class.java:1678)
        at es.bsc.compss.nio.worker.executors.JavaExecutor.executeTask(JavaExecutor.java:140)
        ... 5 more

This error is due to the fact that COMPSs cannot find one of the tasks declared in the Java Interface. Commonly this is triggered by one of the following errors:

The declaringClass of the tasks in the Java Interface has not been correctly defined.
The parameters of the tasks in the Java Interface do not match the task call.
The tasks have not been defined as public.

Jobs failed on reflect target invocation null pointer

When executing an application the main code gets stuck executing a task. Taking a look at the runtime.log users can check that the job associated to the task has failed (and all its resubmissions too). Then, opening the jobX_NEW.out or the jobX_NEW.err files users find the following error:

[ERROR|es.bsc.compss.Worker|Executor]
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at es.bsc.compss.nio.worker.executors.JavaExecutor.executeTask(JavaExecutor.java:154)
        at es.bsc.compss.nio.worker.executors.Executor.execute(Executor.java:42)
        at es.bsc.compss.nio.worker.JobLauncher.executeTask(JobLauncher.java:46)
        at es.bsc.compss.nio.worker.JobLauncher.processRequests(JobLauncher.java:34)
        at es.bsc.compss.util.RequestDispatcher.run(RequestDispatcher.java:46)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
        at simple.Ll.printY(Ll.java:25)
        at simple.Simple.task(Simple.java:72)
        ... 10 more

This cause of this error is that the Java object accessed by the task has not been correctly transferred and one or more of its fields is null. The transfer failure is normally caused because the transferred object is not serializable.

Users should check that all the object parameters in the task are either implementing the serializable interface or following the java beans model (by implementing an empty constructor and getters and setters for each attribute).

Tracing merge failed: too many open files

When too many nodes and threads are instrumented, the tracing merge can fail due to an OS limitation, namely: the maximum open files. This problem usually happens when using advanced mode due to the larger number of threads instrumented. To overcome this issue users have two choices. First option, use Extrae parallel MPI merger. This merger is automatically used if COMPSs was installed with MPI support. In Ubuntu you can install the following packets to get MPI support:

$ sudo apt-get install libcr-dev mpich2 mpich2-doc

Please note that extrae is never compiled with MPI support when building it locally (with buildlocal command).

To check if COMPSs was deployed with MPI support, you can check the installation log and look for the following Extrae configuration output:

Package configuration for Extrae VERSION based on extrae/trunk rev. 3966:
-----------------------
Installation prefix: /gpfs/apps/MN3/COMPSs/Trunk/Dependencies/extrae
Cross compilation: no
CC: gcc
CXX: g++
Binary type: 64 bits

MPI instrumentation: yes
    MPI home: /apps/OPENMPI/1.8.1-mellanox
    MPI launcher: /apps/OPENMPI/1.8.1-mellanox/bin/mpirun

On the other hand, if you already installed COMPSs, you can check Extrae configuration executing the script /opt/COMPSs/Dependencies/extrae/etc/configured.sh. Users should check that flags --with-mpi=/usr and --enable-parallel-merge are present and that MPI path is correct and exists. Sample output:

EXTRAE_HOME is not set. Guessing from the script invoked that Extrae was installed in /opt/COMPSs/Dependencies/extrae
The directory exists .. OK
Loaded specs for Extrae from /opt/COMPSs/Dependencies/extrae/etc/extrae-vars.sh

Extrae SVN branch extrae/trunk at revision 3966

Extrae was configured with:
$ ./configure --enable-gettimeofday-clock --without-mpi --without-unwind --without-dyninst --without-binutils --with-mpi=/usr --enable-parallel-merge --with-papi=/usr --with-java-jdk=/usr/lib/jvm/java-7-openjdk-amd64/ --disable-openmp --disable-nanos --disable-smpss --prefix=/opt/COMPSs/Dependencies/extrae --with-mpi=/usr --enable-parallel-merge --libdir=/opt/COMPSs/Dependencies/extrae/lib

CC was gcc
CFLAGS was -g -O2 -fno-optimize-sibling-calls -Wall -W
CXX was g++
CXXFLAGS was -g -O2 -fno-optimize-sibling-calls -Wall -W

MPI_HOME points to /usr and the directory exists .. OK
LIBXML2_HOME points to /usr and the directory exists .. OK
PAPI_HOME points to /usr and the directory exists .. OK
DYNINST support seems to be disabled
UNWINDing support seems to be disabled (or not needed)
Translating addresses into source code references seems to be disabled (or not needed)

Please, report bugs to tools@bsc.es

Disclaimer: the parallel merge with MPI will not bypass the system’s maximum number of open files, just distribute the files among the resources. If all resources belong to the same machine, the merge will fail anyways.

The second option is to increase the OS maximum number of open files. For instance, in Ubuntu add `` ulimit -n 40000 `` just before the start-stop-daemon line in the do_start section.