  1. ESCAPE DataLake Operations
  2. EDLK-94

Upload failures from SKA team to EULAKE-1, LAPP-DCACHE and LAPP-WEBDAV

Details

    Description

      Uploads to LAPP-DCACHE, LAPP-WEBDAV and EULAKE-1 are failing in the SKA upload/replication tests (https://monit-grafana.cern.ch/d/O8MinE5Gk/es-ska-rmb?orgId=51).

       

      For LAPP-DCACHE, the errors are consistent across 3 machines and 2 separate certificates:

      [root@f81e809bbe76 user]# export FILE=`uuidgen` && echo "test" > $FILE && rucio upload --rse LAPP-DCACHE --lifetime 3600 --scope SKA_SKAO_BARNSLEY-testing $FILE
      2020-11-03 12:03:38,067	INFO	Preparing upload for file 9654b818-5c99-435e-ac19-4b932d3e2fea
      2020-11-03 12:03:38,300	INFO	Successfully added replica in Rucio catalogue at LAPP-DCACHE
      2020-11-03 12:03:38,411	INFO	Successfully added replication rule at LAPP-DCACHE
      2020-11-03 12:04:38,602	ERROR	The requested service is not available at the moment.
      Details: An unknown exception occurred.
      Details: Connection timed out
      

      The same timeout occurs at the gfal level:

      [root@624071128b1d src]# gfal-ls -la davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/
      gfal-ls error: 110 (Connection timed out) - Connection timed out
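
Since the failure is a connection timeout (error 110) rather than an authentication error, a plain TCP probe of the same host and port can help separate a network or firewall problem from a certificate problem. The following is a minimal sketch, assuming bash and the coreutils `timeout` command are available on the test machines; the function name `check_tcp` is hypothetical and not part of rucio or gfal2.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of rucio or gfal2): probe a TCP port and
# report whether the connection opens, times out, or fails for another reason.
check_tcp() {
    local host="$1" port="$2" wait="${3:-10}"
    local rc=0
    # bash's /dev/tcp pseudo-device opens a plain TCP connection. No TLS or
    # X.509 handshake is attempted, so certificates play no role in this test.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null || rc=$?
    case "$rc" in
        0)   echo "open" ;;     # TCP connection established
        124) echo "timeout" ;;  # timeout(1) killed the attempt after $wait seconds
        *)   echo "failed" ;;   # refused, unresolved host, or other error
    esac
}

# Example invocation for the endpoint in this ticket:
# check_tcp lapp-dcache01.in2p3.fr 2880
```

Running this from each affected machine and from one where `gfal-ls` works would show whether the port is reachable at all; a `timeout` result on some machines and `open` on others would point towards the network path rather than the certificates.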
      

      Y. Grange observes the same issue:

      Singularity> export FILE=`uuidgen` && echo "test">> $FILE && rucio -v upload --rse LAPP-DCACHE --lifetime 3600 --scope LOFAR_ASTRON_GRANGE --register-after-upload $FILE
      2020-11-03 15:00:59,579	DEBUG	uploadclient.py	upload	Num. of files that upload client is processing: 1
      2020-11-03 15:00:59,716	DEBUG	uploadclient.py	upload	Input validation done.
      2020-11-03 15:00:59,716	INFO	Preparing upload for file 292d2866-5732-45ad-9475-917c29b14f79
      2020-11-03 15:00:59,825	DEBUG	uploadclient.py	upload	wan domain is used for the upload
      2020-11-03 15:00:59,841	DEBUG	gfal.py	connect	connecting
      2020-11-03 15:00:59,881	DEBUG	gfal.py	exists	path None
      2020-11-03 15:00:59,881	DEBUG	gfal.py	__gfal2_exist	path None
      2020-11-03 15:00:59,925	DEBUG	rsemanager.py	exists	Checking if davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/LOFAR_ASTRON_GRANGE/3e/6a/292d2866-5732-45ad-9475-917c29b14f79 exists
      2020-11-03 15:00:59,925	DEBUG	gfal.py	exists	path davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/LOFAR_ASTRON_GRANGE/3e/6a/292d2866-5732-45ad-9475-917c29b14f79
      2020-11-03 15:00:59,925	DEBUG	gfal.py	__gfal2_exist	path davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/LOFAR_ASTRON_GRANGE/3e/6a/292d2866-5732-45ad-9475-917c29b14f79
      2020-11-03 15:02:00,090	ERROR	The requested service is not available at the moment.
      Details: An unknown exception occurred.
      Details: Connection timed out
      Completed in 60.5808 sec.
      

      but R. Di Maria does not; the same gfal-ls listing succeeds:

      [root@escape-crons-78bc6669f8-sg2wk scripts]# gfal-ls -la davs://lapp-dcache01.in2p3.fr:2880//data/escape/rucio/lapp_dcache/
      drwxrwxrwx   0 0     0             0 Oct  1 14:34 gfal_sam	
      drwxrwxrwx   0 0     0             0 Oct 14 08:03 ESCAPE_CERN_TEAM-noise	
      drwxrwxrwx   0 0     0             0 Oct  8 07:02 LSST_CCIN2P3_GOUNON	
      drwxrwxrwx   0 0     0             0 Oct  9 14:07 SKA_SKAO_BARNSLEY-testing	
      drwxrwxrwx   0 0     0             0 Oct  6 15:03 FAIR_GSI_SZUBA	
      drwxrwxrwx   0 0     0             0 Oct 28 09:23 CTA_LAPP_FREDERIC	
      drwxrwxrwx   0 0     0             0 Oct 15 09:38 ESCAPE_DESY_TEAM-testing	
      drwxrwxrwx   0 0     0             0 Oct  5 15:17 fts-testing	
      drwxrwxrwx   0 0     0             0 Oct  6 15:02 rucio-testing	
      drwxrwxrwx   0 0     0             0 Oct 21 16:47 SKA_SKAO_COLL-testing	
      drwxrwxrwx   0 0     0             0 Oct 27 16:44 SKA_SKAO_JOSHI-testing	
      drwxrwxrwx   0 0     0             0 Oct 30 10:16 ATLAS_LAPP_JEZEQUEL
      

       

      For EULAKE-1, the error is not consistent across machines (uploads work on 2 of the 3 machines tested). Uploads to other sites work fine.
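
Because the EULAKE-1 result depends on the machine, it may help to run the same probe upload against each affected RSE from every machine and compare the outcomes. A minimal sketch, reusing the `rucio upload` flags from the commands above; the function name `try_uploads` and the `upload-<RSE>.log` file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: upload one small probe file per RSE and print a
# one-line OK/FAILED summary for each. Full rucio output goes to a log file.
try_uploads() {
    local scope="$1"; shift
    local rse file
    for rse in "$@"; do
        file="$(uuidgen)"            # unique file name, as in the ticket's commands
        echo "test" > "$file"
        if rucio upload --rse "$rse" --lifetime 3600 --scope "$scope" "$file" \
                > "upload-$rse.log" 2>&1; then
            echo "$rse OK"
        else
            echo "$rse FAILED"
        fi
        rm -f "$file"                # remove the local probe file
    done
}

# Example invocation for the RSEs in this ticket:
# try_uploads SKA_SKAO_BARNSLEY-testing EULAKE-1 LAPP-DCACHE LAPP-WEBDAV
```

The per-RSE log files keep the full error text, so the summary lines from different machines can be compared side by side without losing detail.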

      Attachments

        1. EULAKE-1.txt (6 kB)
        2. LAPP-DCACHE.txt (11 kB)
        3. LAPP-WEBDAV.txt (10 kB)

              People

                r.joshi Joshi, Rohini
                r.bolton Bolton, Rosie
                Barnsley, Rob, Collinson, James, Di Maria, Riccardo [X] (Inactive), Dona, Rizart [X] (Inactive), Grange, Yan, Joshi, Rohini