Details
Task
Resolution: Done
Low
Description
Uploads to LAPP-DCACHE, LAPP-WEBDAV and EULAKE-1 are failing for SKA upload/replication tests (https://monit-grafana.cern.ch/d/O8MinE5Gk/es-ska-rmb?orgId=51).
For LAPP-DCACHE, the errors are consistent across 3 machines and with 2 separate certificates:
[root@f81e809bbe76 user]# export FILE=`uuidgen` && echo "test" > $FILE && rucio upload --rse LAPP-DCACHE --lifetime 3600 --scope SKA_SKAO_BARNSLEY-testing $FILE
2020-11-03 12:03:38,067 INFO Preparing upload for file 9654b818-5c99-435e-ac19-4b932d3e2fea
2020-11-03 12:03:38,300 INFO Successfully added replica in Rucio catalogue at LAPP-DCACHE
2020-11-03 12:03:38,411 INFO Successfully added replication rule at LAPP-DCACHE
2020-11-03 12:04:38,602 ERROR The requested service is not available at the moment. Details: An unknown exception occurred. Details: Connection timed out
The failure is also visible at the gfal level:
[root@624071128b1d src]# gfal-ls -la davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/
gfal-ls error: 110 (Connection timed out) - Connection timed out
Y.Grange observes the same issue:
Singularity> export FILE=`uuidgen` && echo "test">> $FILE && rucio -v upload --rse LAPP-DCACHE --lifetime 3600 --scope LOFAR_ASTRON_GRANGE --register-after-upload $FILE
2020-11-03 15:00:59,579 DEBUG uploadclient.py upload Num. of files that upload client is processing: 1
2020-11-03 15:00:59,716 DEBUG uploadclient.py upload Input validation done.
2020-11-03 15:00:59,716 INFO Preparing upload for file 292d2866-5732-45ad-9475-917c29b14f79
2020-11-03 15:00:59,825 DEBUG uploadclient.py upload wan domain is used for the upload
2020-11-03 15:00:59,841 DEBUG gfal.py connect connecting
2020-11-03 15:00:59,881 DEBUG gfal.py exists path None
2020-11-03 15:00:59,881 DEBUG gfal.py __gfal2_exist path None
2020-11-03 15:00:59,925 DEBUG rsemanager.py exists Checking if davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/LOFAR_ASTRON_GRANGE/3e/6a/292d2866-5732-45ad-9475-917c29b14f79 exists
2020-11-03 15:00:59,925 DEBUG gfal.py exists path davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/LOFAR_ASTRON_GRANGE/3e/6a/292d2866-5732-45ad-9475-917c29b14f79
2020-11-03 15:00:59,925 DEBUG gfal.py __gfal2_exist path davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/LOFAR_ASTRON_GRANGE/3e/6a/292d2866-5732-45ad-9475-917c29b14f79
2020-11-03 15:02:00,090 ERROR The requested service is not available at the moment. Details: An unknown exception occurred. Details: Connection timed out
Completed in 60.5808 sec.
but R.DiMaria does not:
[root@escape-crons-78bc6669f8-sg2wk scripts]# gfal-ls -la davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/
drwxrwxrwx 0 0 0 0 Oct 1 14:34 gfal_sam
drwxrwxrwx 0 0 0 0 Oct 14 08:03 ESCAPE_CERN_TEAM-noise
drwxrwxrwx 0 0 0 0 Oct 8 07:02 LSST_CCIN2P3_GOUNON
drwxrwxrwx 0 0 0 0 Oct 9 14:07 SKA_SKAO_BARNSLEY-testing
drwxrwxrwx 0 0 0 0 Oct 6 15:03 FAIR_GSI_SZUBA
drwxrwxrwx 0 0 0 0 Oct 28 09:23 CTA_LAPP_FREDERIC
drwxrwxrwx 0 0 0 0 Oct 15 09:38 ESCAPE_DESY_TEAM-testing
drwxrwxrwx 0 0 0 0 Oct 5 15:17 fts-testing
drwxrwxrwx 0 0 0 0 Oct 6 15:02 rucio-testing
drwxrwxrwx 0 0 0 0 Oct 21 16:47 SKA_SKAO_COLL-testing
drwxrwxrwx 0 0 0 0 Oct 27 16:44 SKA_SKAO_JOSHI-testing
drwxrwxrwx 0 0 0 0 Oct 30 10:16 ATLAS_LAPP_JEZEQUEL
For EULAKE-1, the error is not consistent across machines: uploads work on 2 of the 3 machines tested. Other sites work fine.
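Since the LAPP-DCACHE timeout reproduces with two different certificates but depends on where the client runs, it may help to check plain TCP reachability of the endpoint from each machine before any TLS or certificate handling takes place. The following is only a minimal sketch using the Python standard library; the helper name `tcp_probe` and the 10-second default timeout are assumptions made for illustration and are not part of Rucio or gfal.

```python
import socket


def tcp_probe(host, port, timeout=10.0):
    """Attempt a plain TCP connection and classify the outcome.

    Returns "open", "timeout", "refused", or "error: <reason>".
    No TLS handshake is performed, so certificates play no role here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within `timeout` seconds (e.g. packets silently dropped).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar problems.
        return f"error: {exc}"


# Example call, with host and port taken from the davs:// URL in this ticket:
#   tcp_probe("lapp-dcache01.in2p3.fr", 2880)
```

Running the same call on a machine where `gfal-ls` works and on one where it fails would show whether the difference already exists at the network level. A "timeout" on the failing machines only would point towards routing or firewall filtering rather than the certificates, although this check alone cannot confirm the cause.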