MetadataKeys Synchronization Service Overview
Overview & Problem Statement
The Metadata Keys Module is a dedicated standalone component designed to manage and retrieve metadata keys across the platform. This module replaces the legacy GET /datasets/metadataKeys endpoint.
Problem Addressed
The previous implementation in the Datasets service lacked a permission-based filtering layer. Because it attempted to return all global keys without ownership validation, it caused the following problems:
- Performance: Significant latency when processing large datasets.
- Stability: Crashes when retrieval limits were missing or improperly configured.
- Security: Users could see metadata keys they did not have permission to access.
Module Architecture
This module consists of a dedicated Controller and a Service layer that together implement permission-aware retrieval of metadata keys.
MetadataKeysController
Provides the API interface for searching keys. The allowed filters can be found in src/metadata-keys/metadatakeys.service.ts, and an example can be found in src/metadata-keys/types/metadatakeys-filter-content.ts.
- Endpoint: GET /metadatakeys (replaces GET /datasets/metadataKeys)
- Method: findAll
- Endpoint access: The endpoint can be accessed by any user.
MetadataKeysService
The service handles the business logic and talks to the database. It is divided into user-facing search logic and internal data synchronization.
Permission Layer (applies to findAll only):
When a user searches for keys, the service uses accessibleBy to automatically append access filters based on CASL permissions:
- Admins: Can search and get all metadata keys in the system.
- Authenticated Users: Can only get keys where they are part of the ownerGroup or accessGroups.
- Unauthenticated Users: Can only get keys that are marked as isPublished.
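The three permission rules above can be illustrated with a minimal sketch. This is not the module's actual code: the real service obtains its filters from CASL through accessibleBy, whereas the hypothetical buildAccessFilter function and User type below only show what kind of MongoDB-style condition each rule would correspond to, assuming group membership is matched with $in.

```typescript
// Hypothetical user shape, for illustration only. null means unauthenticated.
type User = { isAdmin: boolean; groups: string[] } | null;

// A MongoDB-style query condition.
type AccessFilter = Record<string, unknown>;

// Returns the condition that would be appended to a findAll query.
// Assumption: the real filters come from CASL (accessibleBy), not from this function.
function buildAccessFilter(user: User): AccessFilter {
  if (user === null) {
    // Unauthenticated users: published keys only.
    return { isPublished: true };
  }
  if (user.isAdmin) {
    // Admins: no restriction.
    return {};
  }
  // Authenticated users: keys whose ownerGroup or accessGroups match one of their groups.
  return {
    $or: [
      { ownerGroup: { $in: user.groups } },
      { accessGroups: { $in: user.groups } },
    ],
  };
}
```

The resulting condition would be combined with the user's own where clause, so a caller can narrow the result set but never widen it beyond what the permission rule allows.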
Service Methods:
- findAll: The only public-facing method. It applies the permission layer and then uses a database aggregation pipeline to find and return the specific keys requested by the user. If no limit is provided, every search is limited to 100 results.
- insertManyFromSource: An internal method that takes an original document (like a Dataset), extracts fields from scientificMetadata, metadata, and customMetadata, and creates new records in the Metadata Keys collection.
- deleteMany: Removes metadata key entries associated with a source document when that document is deleted from the system.
- replaceManyFromSource: Triggered when a source document (e.g., a Dataset or Proposal) is updated. It calls deleteMany and insertManyFromSource sequentially.
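How these methods relate to each other can be sketched with an in-memory stand-in for the Metadata Keys collection. Everything here is an assumption made for illustration: the InMemoryMetadataKeys class, the SourceDoc and MetadataKeyRecord shapes, the method signatures, and the de-duplication of keys are hypothetical. The real service works against the database and applies the permission layer in findAll, which this sketch omits.

```typescript
// Hypothetical shapes; the real schema is not described in this document.
interface SourceDoc {
  id: string;
  scientificMetadata?: Record<string, unknown>;
  metadata?: Record<string, unknown>;
  customMetadata?: Record<string, unknown>;
}

interface MetadataKeyRecord {
  sourceType: string;
  sourceId: string;
  key: string;
}

// Default applied by findAll when the caller gives no limit.
const DEFAULT_LIMIT = 100;

class InMemoryMetadataKeys {
  private records: MetadataKeyRecord[] = [];

  // Extracts keys from the three metadata fields and stores one record per key.
  insertManyFromSource(sourceType: string, doc: SourceDoc): void {
    const allKeys = [
      ...Object.keys(doc.scientificMetadata ?? {}),
      ...Object.keys(doc.metadata ?? {}),
      ...Object.keys(doc.customMetadata ?? {}),
    ];
    // Assumption: a key that appears in several fields is stored once.
    const uniqueKeys = allKeys.filter((key, index) => allKeys.indexOf(key) === index);
    for (const key of uniqueKeys) {
      this.records.push({ sourceType, sourceId: doc.id, key });
    }
  }

  // Removes every record that belongs to the given source document.
  deleteMany(sourceType: string, sourceId: string): void {
    this.records = this.records.filter(
      (r) => !(r.sourceType === sourceType && r.sourceId === sourceId),
    );
  }

  // Update path: delete the old records, then insert the current ones.
  replaceManyFromSource(sourceType: string, doc: SourceDoc): void {
    this.deleteMany(sourceType, doc.id);
    this.insertManyFromSource(sourceType, doc);
  }

  // Search path (permission layer omitted): at most `limit` records are returned.
  findAll(
    sourceType: string,
    sourceId: string,
    limit: number = DEFAULT_LIMIT,
  ): MetadataKeyRecord[] {
    return this.records
      .filter((r) => r.sourceType === sourceType && r.sourceId === sourceId)
      .slice(0, limit);
  }
}
```

Replacing by delete-then-insert keeps the update path simple, because keys removed from the source document disappear without a separate diffing step.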
Usage Example
To list all metadata keys associated with a dataset, the user must provide the sourceType and sourceId. If the fields array is provided, only those specific fields will be returned:
{
  "where": {
    "sourceType": "dataset",
    "sourceId": "datasetId"
  },
  "fields": ["humanreadableName", "key"],
  "limits": {
    "limit": 10,
    "skip": 0,
    "sort": {
      "createdAt": "asc | desc"
    }
  }
}
To retrieve a specific metadata key, use the following filter:
{
  "where": {
    "sourceType": "dataset",
    "sourceId": "datasetId",
    "key": "metadata_key_name"
  },
  "fields": ["key"],
  "limits": {
    "limit": 10,
    "skip": 0,
    "sort": {
      "createdAt": "asc | desc"
    }
  }
}
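For client code, the two filters above could be assembled with a small typed helper. This is a sketch only: the MetadataKeysFilter interface and the buildMetadataKeysFilter function are hypothetical names that mirror the JSON shown here, the defaults of limit 10 and skip 0 are taken from the examples, and how the filter is transmitted to GET /metadatakeys is not covered because this document does not specify it.

```typescript
type SortDirection = "asc" | "desc";

// Hypothetical type mirroring the filter JSON in the examples above.
interface MetadataKeysFilter {
  where: { sourceType: string; sourceId: string; key?: string };
  fields?: string[];
  limits?: {
    limit?: number;
    skip?: number;
    sort?: Record<string, SortDirection>;
  };
}

interface FilterOptions {
  key?: string; // restrict the search to a single metadata key
  fields?: string[]; // return only these fields
  limit?: number;
  skip?: number;
  sort?: Record<string, SortDirection>;
}

// Builds a filter for one source document; sourceType and sourceId are always required.
function buildMetadataKeysFilter(
  sourceType: string,
  sourceId: string,
  options: FilterOptions = {},
): MetadataKeysFilter {
  const where: MetadataKeysFilter["where"] = { sourceType, sourceId };
  if (options.key !== undefined) {
    where.key = options.key;
  }

  const filter: MetadataKeysFilter = { where };
  if (options.fields !== undefined) {
    filter.fields = options.fields;
  }

  // Defaults copied from the examples in this section, not from the service itself.
  const limits: NonNullable<MetadataKeysFilter["limits"]> = {
    limit: options.limit ?? 10,
    skip: options.skip ?? 0,
  };
  if (options.sort !== undefined) {
    limits.sort = options.sort;
  }
  filter.limits = limits;

  return filter;
}
```

Calling buildMetadataKeysFilter("dataset", "datasetId", { key: "metadata_key_name", fields: ["key"], sort: { createdAt: "asc" } }) produces the same structure as the second example, with the sort direction set to one concrete value.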