Index of /vacuum

[ICO] Name  Last modified  Size  Description

[PARENTDIR]Parent Directory   -  
[DIR]fecmsfnalitb/ 2020-12-09 06:34 -  
[DIR]fecmsglobalitb/ 2024-10-04 06:34 -  
[DIR]fefermilab/ 2020-04-06 06:40 -  
[DIR]feglowitb/ 2021-08-05 06:34 -  
[DIR]fegluex/ 2021-12-03 06:34 -  
[DIR]feligo/ 2020-04-06 06:40 -  
[DIR]feosgflock/ 2020-04-06 06:40 -  
[DIR]feosgospool/ 2024-06-22 06:34 -  
[DIR]feucsditb/ 2020-04-06 06:40 -  
[DIR]gpfrontend01/ 2020-04-06 06:40 -  
[DIR]gpfrontend02/ 2024-02-21 06:34 -  
[   ]err 2024-10-11 06:34 172  
[   ]out 2024-10-11 06:34 958  
[   ]glidein_startup.sh 2024-10-11 06:34 85K 

Validating a worker node using "Glideins in a Vacuum"

Using the "glideins in a vacuum" strategy, it is possible to start a glidein directly on a worker node without having to set up a CE. The starting point is usually the manual_glidein_startup tool, which generates a wrapper script for glidein_startup.sh. Normally, that tool requires an entry in the factory and a group in the frontend that selects this entry. This page instead shows how to use the existing TEST_ENTRY and the nightly generated wrappers to verify that a worker node is ready to accept jobs from your VO. For more details, or to use manual_glidein_startup more systematically in a production environment, see the manual_glidein_startup tool in the glideinWMS documentation.

In the following sections we show how to run a CMS glidein on an lxplus machine using the nightly wrapper scripts.

Step 1: Getting the glidein startup wrapper

Click on the folder corresponding to the VO you are interested in (e.g. fecmsglobalitb for CMS), and download the wrapper script for the frontend group of your choice (e.g. main_glidein_startup_wrapper for the main group).

[mmascher@lxplus705 tmp]$ wget http://gfactory-itb-1.opensciencegrid.org/vacuum/fecmsglobalitb/main_glidein_startup_wrapper
--2020-09-02 11:56:53--  http://gfactory-itb-1.opensciencegrid.org/vacuum/fecmsglobalitb/main_glidein_startup_wrapper
Resolving gfactory-itb-1.opensciencegrid.org (gfactory-itb-1.opensciencegrid.org)... 169.228.38.36
Connecting to gfactory-itb-1.opensciencegrid.org (gfactory-itb-1.opensciencegrid.org)|169.228.38.36|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2262 (2,2K)
Saving to: ‘main_glidein_startup_wrapper’

100%[=============================================================================================================================] 2.262       --.-K/s   in 0s      

2020-09-02 11:56:53 (219 MB/s) - ‘main_glidein_startup_wrapper’ saved [2262/2262]

[mmascher@lxplus705 tmp]$ 
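
If you need wrappers for several frontends or groups, building the download URL by hand gets repetitive. The sketch below assumes that every wrapper follows the <group>_glidein_startup_wrapper naming seen in the example above; that pattern is inferred from this single example, so check the folder listing before relying on it. The helper name wrapper_url is hypothetical and not part of glideinWMS.

```shell
#!/bin/sh
# Base URL taken from the wget example above.
VACUUM_BASE_URL="http://gfactory-itb-1.opensciencegrid.org/vacuum"

# wrapper_url FRONTEND_DIR GROUP
# Hypothetical helper: prints the URL of the nightly wrapper for a frontend
# folder and group. ASSUMPTION: wrappers are named
# <group>_glidein_startup_wrapper, as in the "main" example.
wrapper_url() {
    frontend_dir=$1
    group=$2
    if [ -z "$frontend_dir" ] || [ -z "$group" ]; then
        echo "usage: wrapper_url FRONTEND_DIR GROUP" >&2
        return 1
    fi
    printf '%s/%s/%s_glidein_startup_wrapper\n' \
        "$VACUUM_BASE_URL" "$frontend_dir" "$group"
}
```

With the function defined, the download from the transcript becomes: wget "$(wrapper_url fecmsglobalitb main)"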

Step 2: Set up your X509 proxy

The glidein_startup.sh script needs a valid proxy, so create one and export its location in X509_USER_PROXY.

[mmascher@lxplus705 tmp]$  voms-proxy-init -voms cms -valid 24:00
Enter GRID pass phrase for this identity:
Contacting voms2.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=voms2.cern.ch] "cms"...
Remote VOMS server contacted succesfully.


Created proxy in /tmp/x509up_u8440.

Your proxy is valid until Wed Sep 02 23:59:49 CEST 2020
[mmascher@lxplus705 tmp]$ export X509_USER_PROXY=/tmp/x509up_u8440
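
Before moving on, it can help to confirm that the variable really points at a usable file, since a typo in the path only shows up much later in the glidein output. The following is a minimal sketch; check_proxy_file is a hypothetical helper name, and it only inspects the file on disk, not the certificate contents.

```shell
#!/bin/sh
# check_proxy_file: hypothetical helper, not part of glideinWMS or VOMS.
# Succeeds only if X509_USER_PROXY is set and points to a readable,
# non-empty file. It does NOT verify the proxy lifetime or VO attributes.
check_proxy_file() {
    if [ -z "${X509_USER_PROXY:-}" ]; then
        echo "X509_USER_PROXY is not set" >&2
        return 1
    fi
    if [ ! -r "$X509_USER_PROXY" ] || [ ! -s "$X509_USER_PROXY" ]; then
        echo "proxy file missing, unreadable or empty: $X509_USER_PROXY" >&2
        return 1
    fi
    echo "proxy file looks usable: $X509_USER_PROXY"
}
```

To also check the remaining lifetime, voms-proxy-info -timeleft prints the number of seconds left, assuming the VOMS clients are installed on the node.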

Step 3: Get the glidein_startup.sh script and mark it as executable

glidein_startup.sh is the glidein itself: it is the script sent by the factory and executed on the worker nodes. The wrapper script you downloaded in Step 1 needs it.

[mmascher@lxplus705 tmp]$ wget http://gfactory-itb-1.opensciencegrid.org/vacuum/glidein_startup.sh
HTTP request sent, awaiting response... 200 OK
Length: 70975 (69K) [text/plain]
Saving to: ‘glidein_startup.sh’

100%[=============================================================================================================================] 70.975      --.-K/s   in 0,02s   

2020-09-02 12:03:54 (3,24 MB/s) - ‘glidein_startup.sh’ saved [70975/70975]

[mmascher@lxplus705 tmp]$ chmod +x glidein_startup.sh 
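
A quick sanity check after the download can be sketched as follows. The helper name check_startup_script is hypothetical. It verifies that the file exists and is executable, then prints its MD5 sum. The use of MD5 is an assumption based on the 32-hex-digit script_checksum value that the glidein prints at startup in Step 4; if the two values differ, the file was probably modified or truncated in transit.

```shell
#!/bin/sh
# check_startup_script [PATH]: hypothetical helper.
# Defaults to ./glidein_startup.sh, the path shown in script_checksum.
check_startup_script() {
    script=${1:-./glidein_startup.sh}
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable (run: chmod +x $script)" >&2
        return 1
    fi
    # ASSUMPTION: script_checksum is an md5sum-style line, so print the
    # same format for a by-eye comparison with the glidein output.
    md5sum "$script"
}
```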

Step 4: Execute the wrapper script and the glidein on the node!

This might take a while. The script prints the results of the validation scripts to the screen. Eventually, the condor startd is started and tries to connect to the user pool. In this example, since my proxy is not authorized by the CMS collector, the startd will not be able to connect and no user job will start.

[mmascher@lxplus705 tmp]$ sh main_glidein_startup_wrapper 
Starting glidein_startup.sh at mer  2 set 2020, 12.11.32, CEST (1599041492)
script_checksum   = 'de5f02a2aac92885e2d4c10be41f2a2b  ./glidein_startup.sh'
debug_mode        = 'std'
condorg_cluster   = '0'
condorg_subcluster= '0'
condorg_schedd    = 'UNAVAILABLE'
glidein_credential_id = 'UNAVAILABLE'
glidein_factory   = 'OSG-ITB'
glidein_name      = 'gfactory_instance'
glidein_entry     = 'TEST_ENTRY'
client_name       = 'CMSG-ITB_gWMSFrontend-v1_0'
client_group      = 'main'
multi_glidein/restart = ''/''
work_dir          = '.'
web_dir           = 'http://gfactory-itb-1.opensciencegrid.org/factory/stage'
sign_type         = 'sha1'
proxy_url         = 'None'
descript_fname    = 'description.k8s1IJ.cfg'
descript_entry_fname = 'description.k8s1IJ.cfg'
sign_id           = '54ef26baeeb1d451b434e2ac17b9adefe547eeaa'
sign_entry_id     = '0008f02a693a26ac35a02c932fd79d86d4f47090'
client_web_dir              = 'http://vocms0802.cern.ch/vofrontend/stage'
client_descript_fname       = 'description.k8s9V2.cfg'
client_sign_type            = 'sha1'
client_sign_id              = 'beb430e6ddd11689187c4bb6787f1af700986dfa'
client_web_group_dir        = 'http://vocms0802.cern.ch/vofrontend/stage/group_main'
client_descript_group_fname = 'description.k8s9V2.cfg'
client_sign_group_id        = '23db589303f998c9fc1d167e57d6db233ffdc487'

Running on lxplus705.cern.ch
System: Linux lxplus705.cern.ch 3.10.0-1127.18.2.el7.x86_64 #1 SMP Sun Jul 26 15:27:06 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Release: CentOS Linux release 7.8.2003 (Core) 
As: uid=8440(mmascher) gid=1399(zh) groups=1399(zh),1094569658 context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
PID: 15227

------- Initial environment ---------------
[..... Cutting some text here .....]

etting X509_USER_PROXY
=== Condor starting mer  2 set 2020, 12.12.36, CEST (1599041556) ===
=== Condor started in background, now waiting on process 26176 ===
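
Because the output is long, it can be convenient to save it to a file (for example with sh main_glidein_startup_wrapper 2>&1 | tee glidein.log) and check from a second terminal whether the validation phase has finished and Condor was launched. The sketch below looks for the "Condor started in background" marker shown above. The helper name condor_started and the log file name glidein.log are hypothetical, and the marker text is taken from this one example run, so it may differ in other glideinWMS versions.

```shell
#!/bin/sh
# condor_started LOGFILE: hypothetical helper.
# Returns 0 if the saved glidein output contains the marker line printed
# once Condor has been launched, 1 if it does not, 2 if the log is unreadable.
condor_started() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "cannot read log: $log" >&2
        return 2
    fi
    # ASSUMPTION: marker text copied from the example output above.
    grep -q '^=== Condor started in background' "$log"
}
```

A successful match only means the startd process was launched. As noted in Step 4, it can still fail to join the user pool if the proxy is not authorized by the collector.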