  1. SAFe Program
  2. SP-4139

SDC2 dataset is available to interactive environments in JupyterHub/CANFAR SciPlat for SKAO science team evaluation


    • National SRC

      This work needs to be done in some form regardless, but it could also benefit SRCNet development if the data lake is used to manage the SDC2 dataset and provide read access without triggering downloads.


      AC1: SDC2 dataset is attached to user containers started in the JupyterHub instance.

      AC2 (stretch): SDC2 dataset is managed by Rucio with science metadata set. RSE storage is attached to user sessions in JupyterHub. The DataLink service can be used to determine the local path to the data for read access.

    • 0.5
    • 1.5
    • 0
    • Team_MAGENTA
    • Sprint 2

      Planned work has been done; can be tested at https://jupyterhub.srcdev.skao.int 

      Summary of work done:

      • A new volume provisioned on STFC Cloud, SDC2 dataset downloaded onto it
      • Volume NFS-shared to each of the workers of the K8s workload cluster running the JupyterHub prototype
      • K8s resources (PV/PVCs) created to make this hostpath visible to the Jupyter user containers

      This meets AC1 but not AC2. AC2 was not considered a high priority at backlog prioritisation, so it was left for the future.
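
A minimal sketch of how AC1 could be spot-checked from inside a running Jupyter user container: it confirms that a dataset directory exists, is readable, and is not writable. The mount path `/data/sdc2` and the function name `check_readonly_mount` are assumptions for illustration only; the actual mount point used in the prototype is not stated in this ticket.

```python
import os


def check_readonly_mount(mount_path):
    """Report whether mount_path looks like a readable, read-only directory."""
    result = {
        # The directory must be present in the container.
        "exists": os.path.isdir(mount_path),
        # The current user must be able to read it.
        "readable": os.access(mount_path, os.R_OK),
        # On a read-only mount this is expected to be False.
        "writable": os.access(mount_path, os.W_OK),
    }
    result["ok"] = (
        result["exists"] and result["readable"] and not result["writable"]
    )
    return result


if __name__ == "__main__":
    # Hypothetical mount point; replace with the real path in the container.
    print(check_readonly_mount("/data/sdc2"))
```

Note that `os.access` reflects the permissions of the current user, so a check run as root on a writable filesystem would report the directory as writable.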

    • PI23 - UNCOVERED

    • science-platform-services


      We previously presented a customised JupyterHub (with Binder and Dask Gateway services) to the SKAO science team as a way of offering a common user experience to the SDC participants. They have asked for the SDC2 dataset (a single FITS cube of 1 TB) to be attached for further testing/evaluation.

      The minimal version of this would be to download the SDC2 dataset to a volume and attach this to all nodes of the cluster on which the JupyterHub is running. Then a PV/PVC can be made and attached as read-only to user containers. (1 FP)
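
The minimal version above could be sketched as follows: a Python helper that builds PersistentVolume and PersistentVolumeClaim manifests for a host path, plus the pod-level fragments that mount the claim read-only. All names, the host path, the namespace, and the size below are hypothetical placeholders, not the values used in the prototype.

```python
import json


def build_readonly_manifests(name, host_path, size, namespace):
    """Build PV and PVC manifests exposing host_path as a read-only volume."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": f"{name}-pv"},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadOnlyMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            # Empty storage class so the claim binds to this PV statically.
            "storageClassName": "",
            "hostPath": {"path": host_path},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"{name}-pvc", "namespace": namespace},
        "spec": {
            "accessModes": ["ReadOnlyMany"],
            "storageClassName": "",
            "volumeName": f"{name}-pv",
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


def build_pod_fragments(name, mount_path):
    """Build the pod 'volumes' and container 'volumeMounts' entries."""
    volume = {
        "name": name,
        "persistentVolumeClaim": {"claimName": f"{name}-pvc", "readOnly": True},
    }
    mount = {"name": name, "mountPath": mount_path, "readOnly": True}
    return volume, mount


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    pv, pvc = build_readonly_manifests("sdc2", "/mnt/sdc2", "1Ti", "jupyterhub")
    print(json.dumps(pv, indent=2))
    print(json.dumps(pvc, indent=2))
```

The resulting dictionaries can be serialised to JSON and applied with `kubectl apply -f`, and the pod fragments would need to be added to the spawner's user pod configuration.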

      A more sophisticated version, perhaps yielding more value to SRCNet development, would be to store the file in Rucio and enable data access via RSE storage mounts, as previously demonstrated. It may also be possible to use (manually created) symlinks to provide user access while preserving security for the rest of the RSE data. The same approach could then be demonstrated in CANFAR SciPlat as well. (2 FP)






                r.bolton Bolton, Rosie
                j.collinson Collinson, James

                Feature Progress

                  Story Point Burn-up: (100.00%)

                  Feature Estimate: 0.5

                  Status        Issues   Story Points
                  To Do         0        0.0
                  In Progress   0        0.0


