Details
-
Epic
-
Not Assigned
-
None
-
Data Management v0.1 - Roadmap
-
SRCnet
-
0
Description
Implementation Overview for Local Data Management and Proxy Service
To proxy requests to a SODA/visualization service that may sit behind a firewall, while still enforcing permission checks, we propose developing a Local Data Management Service (LDMS). The service will act as an intermediary that links data files from the Rucio Data Lake into local user environments, so that only authorized users can access the data.
Key Functionalities of the Local Data Management Service (LDMS)
1. *User Token Validation*:
- The LDMS will accept user tokens to authenticate users against the SRCNet Identity and Access Management (IAM) system.
- It will determine the corresponding local user using the CANFAR PosixMapper, ensuring consistent mapping of network users to local users.
2. *Data File Linking*:
- Upon receiving a request with specific data file IDs, the LDMS will create symbolic links to the requested files in a scratch area attached to the local computing environment.
- The service will ensure the local RSE (Rucio Storage Element) is mounted in read-only mode to safeguard against unauthorized modifications.
3. *Proxying Requests*:
- The LDMS will act as a proxy, relaying requests from the client to the SODA service while enforcing permissions checks based on the IAM group memberships linked to the user's access token.
- The service will implement WLCG path-based authorization, so that a token grants access only to the storage paths named in its scopes.
4. *Configuration and Synchronization*:
- The LDMS will require modifications to the Rucio server configuration to point to SRCNet IAM for permissions checks.
- The sync script in the Rucio task manager will need adjustments to accommodate this new configuration.
5. *API Development*:
- The LDMS will implement a RESTful API in line with the interface defined in JIRA SP-4678, including methods for linking data files and validating permissions.
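As an illustration of the token handling and path-based authorization described above, the following is a minimal Python sketch, not the LDMS implementation. It assumes the access token is a JWT whose `scope` claim carries WLCG-style storage scopes such as `storage.read:/data/projA`. The function names `unverified_claims` and `scope_allows` are hypothetical, and the decoder deliberately skips signature verification, which a real service must perform against SRCNet IAM before trusting any claim. Only exact operation matches are handled; any implication between operations is left out.

```python
import base64
import json
from pathlib import PurePosixPath


def unverified_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature.

    Illustration only: a real service must validate the signature, issuer,
    audience and expiry before trusting any claim.
    """
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def scope_allows(scope_claim: str, operation: str, path: str) -> bool:
    """Return True if a 'storage.<operation>:<prefix>' scope covers path."""
    target = PurePosixPath(path)
    if ".." in target.parts:
        return False  # refuse paths that could climb out of the prefix
    wanted = f"storage.{operation}"
    for scope in scope_claim.split():
        name, _, prefix = scope.partition(":")
        if name != wanted or not prefix.startswith("/"):
            continue
        base = PurePosixPath(prefix)
        # Compare whole path components, so /data/projA does not match /data/projAB.
        if target == base or base in target.parents:
            return True
    return False
```

Matching on whole path components rather than on string prefixes avoids granting access to a sibling directory whose name merely starts with the authorized prefix.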
Architecture Components
- *Web Service*: The LDMS will be implemented as a web service that can be installed at each SRCNet site, designed to handle incoming requests and manage local file links.
- *Integration with CANFAR*: By leveraging CANFAR’s PosixMapper, the service will translate the network user identity carried in the user token to the corresponding local user, which in turn determines where that user's links are created on the local filesystem.
- *Storage Access*: The service must be able to interact with Rucio and with the underlying storage technologies, such as S3 and Ceph, to facilitate data access and management.
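The link management handled by the web service could be sketched as follows. This is a minimal Python example under stated assumptions: the requested data file IDs have already been resolved to paths relative to the read-only RSE mount (how that resolution happens is not specified here), and the function name `link_files` and its parameters are hypothetical.

```python
import os
from pathlib import Path


def link_files(rse_mount: Path, scratch_dir: Path, relative_paths: list[str]) -> list[Path]:
    """Create one symlink in scratch_dir per requested file under rse_mount."""
    root = rse_mount.resolve()
    scratch_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for rel in relative_paths:
        source = (root / rel).resolve()
        # Refuse anything that escapes the RSE mount or is not a regular file.
        if root not in source.parents or not source.is_file():
            raise ValueError(f"not a file inside the RSE mount: {rel}")
        link = scratch_dir / source.name
        if link.is_symlink():
            link.unlink()  # replace a stale link left by an earlier request
        os.symlink(source, link)
        links.append(link)
    return links
```

Resolving each path before linking means a request such as `../other-user/file` is rejected rather than linked, which complements the read-only mount of the RSE.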
Pending Actions
1. *Modify Rucio Server Configuration*:
- Update configurations to redirect permission checks to SRCNet IAM.
- Synchronize user accounts and permissions using the updated configuration.
2. *Develop the LDMS*:
- Create the service architecture, including the API, user authentication, and data linking functionality.
- Implement logging and error handling mechanisms for robust operation.
3. *Testing*:
- Conduct thorough testing to ensure that permissions checks are functioning correctly and that data linking works as intended.
- Test the proxy functionality to ensure that the SODA service can process requests securely.
4. *Documentation*:
- Document the service setup, configuration changes, and API usage for future reference and training of site operators.
Additional Considerations
- *Performance*: Benchmark the LDMS to ensure it can handle the expected load, particularly with large datasets.
- *Scalability*: Design the service with scalability in mind to accommodate growing data volumes and user requests.
- *Monitoring*: Integrate with existing monitoring solutions (e.g., Grafana) to track usage metrics and service health.
Together, these functionalities and considerations will give the SRCNet ecosystem a secure and efficient means to manage and access data at sites whose services sit behind firewalls.
Issue Links
- Child Of
-
SP-4749 Data Management All Versions - Roadmap
- Funnel
-
SP-4873 Development VS v0.1 - Roadmap
- Funnel
- Parent Of
-
SP-4565 v0.1 Test Campaign preparation: Ingest image data into Rucio
- Implementing
-
SP-4258 Rucio site naming convention and definitions of attributes
- Releasing
-
SP-3308 Cyan: Review Architecture Document and Top-Level Roadmap
- Done
-
SP-4098 Add capability to ingestion service to ingest data into "non-deterministic" Rucio storage elements
- Done
-
SP-4227 Define requirements for SRCNet Ingestion Nodes
- Done
-
SP-4256 Develop rucio-task-manager to ingest data using the ska-src-ingestion tool as part of tests
- Done
-
SP-4271 Investigate RSE use in Canadian context
- Done
-
SP-3946 SODA accessing data over HTTPS
- Done
-
SP-4099 Explore User Storage Options
- Done
-
SP-3316 Complete the integration of Swiss storage into Rucio data lake
- Done
-
SP-3320 Rucio capacity and capability enhancements in B-L team
- Done