SpeQuloS administrator guide
INRIA - EDGI JRA2

Purpose of the document
=======================

This document contains guidelines for administrators of SpeQuloS. It
describes the installation and configuration of the SpeQuloS service inside
an infrastructure of Desktop Grids and Clouds.

Requirements
============

The following software is required for SpeQuloS to run:
- Python 2.6
- Python paramiko package (the paraproxy package is also optionally
  supported)
- Python libcloud package (version >= 0.6)
- MySQL database
- Apache 2 web server

Although SpeQuloS should work on any Linux system, Debian 6.0 (Squeeze) is
the supported platform.

1. SpeQuloS installation instructions
=====================================

This tutorial was carried out on a Debian 6.0 (Squeeze) operating system.
Commands must be executed as root.

1. Install the packages needed to run SpeQuloS:

    aptitude update
    aptitude install apache2 mysql-server python python-mysqldb python-paramiko bzip2 ca-certificates

   The Scheduler module also requires the libcloud API to be installed in
   order to use the "libcloud" handler:

    wget http://mirror.speednetwork.de/apache/libcloud/apache-libcloud-0.7.1.tar.bz2
    tar xfj apache-libcloud-0.7.1.tar.bz2
    cd apache-libcloud-0.7.1
    python setup.py install
    cd ..

2. Download SpeQuloS:

    wget http://graal.ens-lyon.fr/~sdelamar/spequlos/spequlos-latest.tar.gz
    tar xfz spequlos-latest.tar.gz
    cd spequlos/

3. Edit the installation configuration file install.conf (you need to
   provide the MySQL administrator login and password).

4. Install SpeQuloS. To install all SpeQuloS modules on the same host, use
   this command:

    ./install.sh all

   You can also install only one (or more) modules on the host and
   distribute SpeQuloS across several hosts.

2. Verifying SpeQuloS installation
==================================

Here are the steps to verify that SpeQuloS is correctly installed.

2.1. Oracle module
==================

- Verify that the $INSTALL_DIR/oracle/info_db.cfg file contains the correct
  information to connect to the Information module database. (If the
  Information module is installed on a different host than the Oracle
  module, verify that the host parameter refers to the address of the MySQL
  server of the Information module.)

2.2. Information module
=======================

- Verify that the $INSTALL_DIR/information/db.cfg file contains the correct
  information to connect to the Information module database.
- Desktop Grids are monitored using the $INSTALL/information/info_grab.py
  script. This program is periodically executed using cron. Verify that the
  /etc/cron.d/spequlos_information file has been created and that the
  script runs correctly by checking the log file
  $INSTALL/log/info_grab.log.

2.3. Scheduler module
=====================

- Verify that the $INSTALL_DIR/scheduler/db.cfg file contains the correct
  information to connect to the Scheduler module database.
- Verify that the $INSTALL_DIR/scheduler/creditsystem.cfg file contains the
  address parameter with the URL of the CreditSystem module.
- Verify that the $INSTALL_DIR/scheduler/oracle.cfg file contains the
  address parameter with the URL of the Oracle module.
- Verify that the $INSTALL_DIR/scheduler/info_db.cfg file contains the
  correct information to connect to the Information module database.
- Cloud usage is managed using the $INSTALL/scheduler/monitor_batches.py
  and $INSTALL/scheduler/monitor_cloud_workers.py scripts. These programs
  are periodically executed using cron. Verify that the
  /etc/cron.d/spequlos_scheduler file has been created and that the scripts
  run correctly by checking the log files $INSTALL/log/monitor_batches.log
  and $INSTALL/log/monitor_cloud_workers.log.

2.4. CreditSystem module
========================

- Verify that the $INSTALL_DIR/creditsystem/db.cfg file contains the
  correct information to connect to the CreditSystem module database.
- Desktop Grid credits are collected using the
  $INSTALL/creditsystem/credits_grab.py script. This program is
  periodically executed using cron. Verify that the
  /etc/cron.d/spequlos_creditsystem file has been created and that the
  script runs correctly by checking the log file
  $INSTALL/log/credits_grab.log (the script only runs once per hour).
- Desktop Grid credits are converted to SpeQuloS credits using the
  $INSTALL/creditsystem/deposit.py script. This program is executed every
  day at 12am using cron. This can be changed in the
  /etc/cron.d/spequlos_creditsystem file. Once the scheduled time has
  passed, verify that the script ran correctly by checking the log file
  $INSTALL/log/deposit.log.

3. Desktop Grids configuration
==============================

SpeQuloS has some requirements to work with Desktop Grids. In particular,
it needs plugins to be installed on the Desktop Grid servers. These plugins
are PHP web pages and must be deployed on an HTTP server on each Desktop
Grid server.

3.1. XtremWeb-HEP
=================

SpeQuloS requires XtremWeb-HEP version >= 7.2.0.

The PHP files inside the DG-plugin/XWHEP/ directory must be copied to the
XtremWeb-HEP HTTP server, for instance into the directory
/var/www/spequlos.

The file config.php must be edited. The $mysql, $login, $password and
$database variables must be changed according to the XWHEP database
parameters.

3.2. BOINC
==========

The PHP files inside the DG-plugin/BOINC/ directory must be copied to the
BOINC administration web page, i.e. into the sub-directory html/ops inside
the BOINC project directory. The plugins will then be accessible at the URL
http://:@/_ops/.

Additionally, the BOINC server must export XML statistics to
http:////stats/. This requires enabling the dbdump tool, as explained in
[http://boinc.berkeley.edu/trac/wiki/DbDump].

To enable full QoS support in BOINC using the "Reschedule" strategy,
SpeQuloS also requires a patched version of the BOINC server.
The BOINC server must be built after replacing the files
sched/sched_assign.cpp, db/boinc_db.h and tools/create_work.cpp in the
BOINC source root directory with those provided in the
DG-plugins/BOINC/src/ directory.

Additionally, SpeQuloS requires a dedicated client account, registered on
the BOINC server, to be used by the Cloud workers. The script
DG-plugins/BOINC/create_cloud_id.sql can be applied to the BOINC server
database to create such a client. The account number associated with this
client ("1c3516472b9a86f40f0d96f8124fb403" in the script) must be noted
down, as it will be used later for the Cloud configuration.

For detailed instructions on how to set up a BOINC server for SpeQuloS, see
the install_boinc_for_spequlos.txt file.

4. SpeQuloS configuration
=========================

Each SpeQuloS module must be configured according to the Desktop Grids and
Cloud services it interacts with. Most of the configuration can be
performed using a Web interface accessible at a URL similar to:
http:///spequlos//admin_.py.

Here are the configuration steps for each module.

4.1. Oracle module
==================

The Oracle module does not require any additional configuration.

4.2. Information module
=======================

Each Desktop Grid monitored by SpeQuloS must be registered with the
Information module using the http:///admin_dg.py page.

The page displays a form to register a new Desktop Grid as well as the
Desktop Grids already registered. Here are the fields to fill in to
register a new Desktop Grid:
- Desktop_Grid_Identifier: A unique identifier for the Desktop Grid.
- Desktop_Grid_Plugin_Address: The HTTP URL of the Desktop Grid SpeQuloS
  "plugins" (PHP pages). It must contain the login and password if they are
  required to access the URL. For instance,
  http://boincadmin:boincpassword@my-boinc.eu/myproject_ops/.
- Desktop_Grid_Type: The middleware used in the Desktop Grid (may be BOINC
  or XWHEP).
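The credentials embedded in a Desktop_Grid_Plugin_Address can be checked
before the form is submitted. The following Python 3 sketch is not part of
SpeQuloS: the function name describe_plugin_address is hypothetical, and it
only relies on the standard urllib.parse module to split the example address
given above.

```python
from urllib.parse import urlsplit


def describe_plugin_address(address):
    """Split a plugin address into scheme, credentials, host and path."""
    parts = urlsplit(address)
    return {
        "scheme": parts.scheme,
        "login": parts.username,      # None when the URL carries no login
        "password": parts.password,   # None when the URL carries no password
        "host": parts.hostname,
        "path": parts.path,
    }


# Example address taken from the field description above.
info = describe_plugin_address(
    "http://boincadmin:boincpassword@my-boinc.eu/myproject_ops/"
)
print(info)
```

If "login" or "password" comes back as None for a plugin directory that is
password protected, the address is probably missing its credentials.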

To deregister a Desktop Grid, click on the Delete link in front of it.

4.3. CreditSystem module
========================

- Desktop Grid registration:

  Each Desktop Grid used with SpeQuloS must be registered with the
  CreditSystem module using the http:///admin_dg.py page.

  The page displays a form to register a new Desktop Grid as well as the
  Desktop Grids already registered. Here are the fields to fill in to
  register a new Desktop Grid:
  - Desktop_Grid_Identifier: A unique identifier for the Desktop Grid.
  - Desktop_Grid_Plugin_Address: The HTTP URL of the Desktop Grid SpeQuloS
    "plugins" (PHP pages). It must contain the login and password if they
    are required to access the URL. For instance,
    http://boincadmin:boincpassword@my-boinc.eu/myproject_ops/.
  - Desktop_Grid_Type: The middleware used in the Desktop Grid (may be
    BOINC or XWHEP).

  To deregister a Desktop Grid, click on the Delete link in front of it.

- User and Institution registration:

  SpeQuloS Users are registered so that their Cloud usage can be accounted
  for. Each SpeQuloS User belongs to an Institution, which holds an account
  with credits.

  Each Institution used with SpeQuloS must be registered with the
  CreditSystem module using the http:///admin_institution.py page.

  The page displays a form to register a new Institution as well as the
  Institutions already registered. Here are the fields to fill in to
  register a new Institution:
  - Institution_Identifier: A unique identifier for the Institution.
  - Initial_Credits_(optional): The amount of credits to set on the
    Institution's account at registration (if it is not set, the account
    starts with 0 credits).

  Each SpeQuloS User must be registered with the CreditSystem module using
  the http:///admin_user.py page.

  The page displays a form to register a new User as well as the Users
  already registered.
  Here are the fields to fill in to register a new User:
  - User_Identifier: A unique identifier for the User.
  - Institution_Identifier: The identifier of the User's Institution.

- Resource Providers and supported Institutions:

  Desktop Grids registered with CreditSystem are monitored to measure the
  computing power contributed by the various Resource Providers. These
  contributions are expressed as Desktop Grid credits (such as BOINC
  credits). For each Resource Provider, the Desktop Grid credits earned
  through participation are collected using the
  $INSTALL/creditsystem/credits_grab.py script.

  Resource Providers are identified by BOINC "team" or by XWHEP "project".
  Therefore, all Desktop Grid nodes tagged with the same team or project
  are considered to belong to the same Resource Provider. The
  credits_grab.py script automatically adds the Resource Providers found in
  registered Desktop Grids to the CreditSystem database.

  In SpeQuloS, each Resource Provider supports an Institution. Depending on
  the amount of Desktop Grid credits earned by the Resource Provider, some
  SpeQuloS credits are added (by the deposit.py script) to the account of
  the supported Institution, where they become available to the
  Institution's Users to support their batch executions with Cloud
  resources.

  The SpeQuloS administrator has to assign the Institution supported by
  each Resource Provider registered in the CreditSystem database. This
  operation is done using the http:///admin_resource_provider.py page.

  The page displays the registered Resource Providers and allows the
  supported Institutions to be changed. Any modification must be validated
  by clicking on the "Update Association" button.
  The page also displays the number of available Desktop Grid credits
  ("Available DG credits"), which is the number of credits earned since the
  last execution of the deposit.py script, and the total number of Desktop
  Grid credits earned by the Resource Provider ("Total DG credits").

- Fine tuning of CreditSystem parameters:

  Other CreditSystem parameters may be edited in the
  $INSTALL/creditsystem/creditsystem_config.py file. In particular, the
  number of Cloud CPUs available to SpeQuloS in the infrastructure should
  be set there. The remaining parameters may also be changed for advanced
  tuning of CreditSystem.

4.4. Scheduler module
=====================

4.4.1. Desktop Grid registration
================================

Each Desktop Grid managed by SpeQuloS must be declared to the Scheduler
module using the Web service http:///admin_dg.py.

The page displays a form to register a new Desktop Grid as well as the
Desktop Grids already registered. Here are the fields to fill in to
register a new Desktop Grid:
- Desktop_Grid_Identifier: A unique identifier for the Desktop Grid.
- Desktop_Grid_Configuration_File: The path to the Desktop Grid
  configuration file.

The Desktop Grid configuration file must exist on the Scheduler module
host. This configuration file is composed of several = lines. Here are the
parameters recognized by SpeQuloS:
- DG_TYPE (mandatory): The Desktop Grid middleware. Currently, only BOINC
  and XWHEP are supported by SpeQuloS.
- DG_PLUGIN_URL (mandatory): The URL of the Desktop Grid plugins (PHP web
  pages). The HTTP login and password must be present in the URL if needed
  (example: http://boincadmin:passwd@boinc-server.org/boinc-project_ops/
  for BOINC or http://xwhep-server/spequlos/ for XWHEP).
- CW_SSH_CMD (mandatory when Cloud workers are managed by libcloud): The
  command line that makes Cloud workers connect to the Desktop Grid. See
  section Cloud resource registration/Using libcloud handler.

4.4.2. Cloud resource registration
==================================

The Cloud resources available to the SpeQuloS Scheduler module must be
declared. Each Cloud service that the Scheduler module may use must be
registered using the Web service http:///admin_cloud.py.

The page displays a form to register a new Cloud service as well as the
Cloud services already registered. Here are the fields to fill in to
register a new Cloud service:
- Cloud_Service_Identifier: A unique identifier for the Cloud service.
- Cloud_Service_Configuration_File: The path to the Cloud service
  configuration file, which must exist on the Scheduler module host.
- Number_Of_Instances: The maximum number of Cloud workers that can be
  instantiated simultaneously on the Cloud service.
- Cloud_Service_Handler: Indicates how the Cloud service is managed.

SpeQuloS offers two ways to manage Cloud resources: it can use the libcloud
library and an SSH connection to the Cloud instances, or it can use any
command line program to manage the Cloud. These handlers are respectively
called "libcloud" and "command".

4.4.2.1. Using the "libcloud" handler
=====================================

SpeQuloS is able to manage any Cloud service supported by libcloud. To use
a Cloud service with SpeQuloS, an authorization and a Cloud worker image
loaded for this service are necessary. To build a Cloud worker image, see
the tutorial "how_to_create_a_cw_vm.txt".

Each Cloud service used in SpeQuloS is described by its configuration file.
A configuration file is composed of several = lines. The parameters
recognized by SpeQuloS/libcloud are:
- LC_DRIVER (mandatory): The name of the libcloud driver to use:
  "EC2NodeDriver" for the Amazon EC2 Cloud, or "EucNodeDriver" for
  Eucalyptus or any EC2 compatible Cloud. For the full list of drivers
  available in libcloud, see the libcloud/compute/providers.py file in the
  libcloud source code.
- LC_KEY (mandatory): The access key (or username) to connect to the Cloud
  service.
- LC_SECRET (mandatory): The secret key (or password) to connect to the
  Cloud service.
- LC_HOST: The Cloud service hostname.
- LC_PORT: The Cloud service port number.
- LC_PATH: If a web interface is used to manage the Cloud service, the path
  part of the URL (excluding the hostname) of the web interface root
  address (for instance, it is usually "service/Cloud" for the EC2
  interface of OpenStack based Clouds).
- LC_SECURE: Must be set to "0" if SSL is not used to connect to the Cloud
  service.
- LC_DUMMY_EC2: Should be set to "1" for Cloud services that provide an
  incomplete or buggy EC2 interface (it is needed to use the EC2 interfaces
  of OpenNebula or OpenStack).
- LC_: SpeQuloS is able to pass any parameter as an argument of the
  libcloud driver init() function. For instance, if a libcloud driver
  init() function takes an argument called "myarg", use the following line
  to instantiate the driver with "myarg" set to the value "myvalue":
  LC_MYARG=myvalue (note that the argument name must be in uppercase and
  prefixed with "LC_" in the configuration file).
- CW_IMAGE_ID (mandatory): The identifier of the Cloud worker image.
- CW_SIZE_ID (mandatory): The size of the Cloud worker to instantiate.
- CW_SSH_PWD or CW_SSH_KEY (mandatory): The SSH password, or the location
  of the SSH private key, to use for the connection.
- CW_SSH_USER: The user name for the SSH connection (default is root).

Once the Cloud worker is started, SpeQuloS makes it execute some commands
through an SSH connection. The aim of these commands is to make the Cloud
worker connect to the Desktop Grid it supports. The command line to execute
is defined in each Desktop Grid configuration file by the CW_SSH_CMD
parameter, which is mandatory if the Desktop Grid is supported by Cloud
workers managed with libcloud.
If some configuration related to the batch supported by the Cloud worker is
needed, the $SQS_BATCH_ID and $SQS_BATCH_DATA variables are set in the
environment where the commands are executed. They contain the batch
identifier and the data associated with the batch, and are accessible as
shell variables.

For instance, here is a possible content for the XW@LRI Desktop Grid
configuration file:

    CW_SSH_CMD="apt-get update; apt-get -y install openjdk-6-jre; wget http://xw.lri.fr:4330/XWHEP/download/lripub/xwhep-worker-7.6.0.deb; dpkg -i xwhep-worker-7.6.0.deb; sed -i -e's/batchid=//' /opt/xwhep-worker-7.6.0/conf/xtremweb.worker.conf; echo batchid=xw://xw.lri.fr/$SQS_BATCH_DATA >> /opt/xwhep-worker-7.6.0/conf/xtremweb.worker.conf; /etc/init.d/xtremweb.worker restart > /dev/null 2>&1 &"

Or, here is an example for a BOINC project:

    CW_SSH_CMD="wget http://some_server/data/BOINC_CLIENT.tgz; tar xfz BOINC_CLIENT.tgz; cd BOINC_CLIENT; ./boinc_client --daemon; sleep 5; ./boinccmd --project_attach http://boinc-server.org/project 1c3516472b9a86f40f0d96f8124fb403"

The value "1c3516472b9a86f40f0d96f8124fb403" is the account number of the
special Cloud client created on the BOINC server.

4.4.2.2. Using the "command" handler
====================================

SpeQuloS is able to execute any command line to start or stop a Cloud
worker. Therefore, any tool can be used. The only requirement of SpeQuloS
is that the command line returns 0 if the Cloud worker is correctly started
to support the Desktop Grid and batch for which QoS has been ordered, and
any other value otherwise.

The commands executed to start and stop Cloud workers in a Cloud service
are described in the Cloud service configuration file. A configuration file
is composed of several = lines. Here are the parameters recognized by
SpeQuloS:
- CMD_CW_START (mandatory): The command executed to start the Cloud worker
  to support the appropriate Desktop Grid and batch.
- CMD_CW_STOP (mandatory): The command executed to stop the Cloud worker.
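
Before registering a Cloud service that uses the "command" handler, it can
be useful to check that its configuration file defines both mandatory
parameters. The Python 3 sketch below is not the parser used by SpeQuloS: it
assumes one parameter per line, split at the first "=" sign, and the helper
names parse_config and missing_parameters are hypothetical.

```python
def parse_config(text):
    """Parse NAME=VALUE lines into a dict, splitting on the first '=' only."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines and lines without '=' are ignored.
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def missing_parameters(params, mandatory):
    """Return the mandatory parameter names absent from params, in order."""
    return [name for name in mandatory if name not in params]


# Mandatory parameters of the "command" handler, as listed above.
COMMAND_HANDLER_MANDATORY = ["CMD_CW_START", "CMD_CW_STOP"]

# Deliberately incomplete example: CMD_CW_STOP is not defined.
example = '''
CMD_CW_START=echo "starting $SQS_INSTANCE_ID" >> /tmp/cw.out; SQS_DATA="id=42"
'''

config = parse_config(example)
missing = missing_parameters(config, COMMAND_HANDLER_MANDATORY)
print(missing)
```

Splitting at the first "=" only keeps the rest of the line intact, so a
command that itself contains "=" signs (such as an assignment to SQS_DATA)
is preserved as the parameter value.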

Several variables are set in the environment where the commands are
executed. They contain values related to the Desktop Grid and batch to
support, and are all accessible as shell variables. Here are the recognized
variables:
- $SQS_CLOUD_ID: The Cloud service identifier.
- $SQS_CLOUD_CONF: The path to the Cloud service configuration file.
- $SQS_DG_ID: The Desktop Grid identifier.
- $SQS_DG_CONF: The path to the Desktop Grid configuration file.
- $SQS_BATCH_ID: The identifier of the batch to support.
- $SQS_BATCH_DATA: Some data associated with the batch to support. For
  XWHEP, it is used to store the xwgroup URI associated with the batch.
- $SQS_INSTANCE_ID: The identifier of the Cloud worker.
- $SQS_DATA: A special variable used to store data in SpeQuloS. After the
  startup command is executed, the content of this variable is parsed and
  stored by SpeQuloS. When the stop command is executed, this variable is
  set to the data previously stored and is accessible by the command line
  as a shell environment variable. Therefore, any data needed to stop a
  Cloud worker should be stored inside the SQS_DATA variable by the startup
  command so that the stop command can access it.

For instance, here is a possible configuration file content that implements
a fake Cloud service:

    CMD_CW_START=echo "Fake start of CW with environment $SQS_CLOUD_ID $SQS_CLOUD_CONF $SQS_DG_ID $SQS_DG_CONF $SQS_BATCH_ID $SQS_BATCH_DATA $SQS_INSTANCE_ID" >> /tmp/cw.out; echo "Also storing important data" >> /tmp/cw.out; SQS_DATA="Important Data"
    CMD_CW_STOP=echo "Fake stop of CW with environment $SQS_CLOUD_ID $SQS_CLOUD_CONF $SQS_DG_ID $SQS_DG_CONF $SQS_BATCH_ID $SQS_BATCH_DATA $SQS_INSTANCE_ID" >> /tmp/cw.out; echo "My important data is also here: $SQS_DATA" >> /tmp/cw.out

4.4.3. Cloud Provisioning Strategies
====================================

Depending on the Desktop Grid type, various Cloud deployment strategies may
be used by SpeQuloS.
The strategy used for each type of DG middleware is defined inside the
$INSTALL/scheduler/dg_rpc.py file.

BOINC uses the "RESCHEDULE" strategy by default, which requires the BOINC
server to be patched. As an alternative, the "FLAT" strategy may be used.
It works with an unmodified BOINC server, but offers lower QoS performance.
To use the "FLAT" strategy, the variable CLOUD_DEPLOYMENT["BOINC"], at the
beginning of the dg_rpc.py file, must be set to "FLAT".

By default, XWHEP uses the simple "FLAT" strategy. A more efficient
strategy, called "DUPLICATION", is also implemented for XWHEP, but it
requires more work to configure. For more information on the Duplication
strategy and how to use it, see the duplication.txt document.
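
As an illustration of the BOINC setting described above, the change could
look like the following Python sketch. Only the CLOUD_DEPLOYMENT variable
name, its "BOINC" key and the strategy names come from this guide; the
dictionary layout and the "XWHEP" key are assumptions, and the actual
content of dg_rpc.py may differ.

```python
# Hypothetical excerpt mimicking the beginning of dg_rpc.py.
# Assumption: CLOUD_DEPLOYMENT maps a middleware name to a strategy name.
CLOUD_DEPLOYMENT = {
    "BOINC": "RESCHEDULE",  # default, requires the patched BOINC server
    "XWHEP": "FLAT",        # default for XWHEP
}

# Switch BOINC to the strategy that works with an unmodified BOINC server.
CLOUD_DEPLOYMENT["BOINC"] = "FLAT"

print(CLOUD_DEPLOYMENT["BOINC"])
```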