Central Configuration of Backend: `.env`¶

The .env configuration file is read by the backend at runtime. It lets the system administrator configure connected services, such as authentication services and message queues, and switch almost all available features on or off.

SciCat currently supports many configurable additions, which make it very flexible. These are:

OIDC for identification
LDAP for authentication
Elasticsearch
SMTP for sending emails to notify users of SciCat jobs
AMQP to provide a message queue for the jobs

Environment Variables¶

All environment variables can be set in the .env file. The current source code contains an example file, named .env.example, listing all (79) environment variables available to configure the backend. They can be found here and define:

How SciCat handles access rights and connects to other services, e.g. to identity providers such as LDAP or OIDC for authentication.
How to configure DOIs.
How to configure Elasticsearch (ES).
How to configure jobs.

The list is compiled according to the configuration class defined in src/config/configuration.ts.

ADMIN_GROUPS: list of groups that have admin privileges. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed.
DELETE_GROUPS: list of groups that are allowed to delete content default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed.
CREATE_DATASET_GROUPS: list of non admin groups that are allowed to create datasets without pid. The pid is assigned by the system. If set to "#all", all users can create a dataset belonging to any of the groups they belong to. default: "#all" format: comma separated list of strings. Leading and trailing spaces are trimmed.
CREATE_DATASET_WITH_PID_GROUPS: list of non admin groups that are allowed to create datasets with explicit pid. If set to "#all", all users can create a dataset belonging to any of the groups they belong to and with explicit pid. If the pid verification is enabled, pid will be validated against the specification passed. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed.
CREATE_DATASET_PRIVILEGED_GROUPS: list of non admin groups that are allowed to create datasets for groups they do not belong to. If set to "#all", all users can create a dataset belonging to any group with explicit pid. If the pid verification is enabled, pid will be validated against the specification passed. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed.
PROPOSAL_GROUPS: list of non admin groups that are allowed to create and update proposals for groups they do not belong to. If set to "#all", all users can create a dataset belonging to any group with explicit pid. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
SAMPLE_GROUPS: list of non admin groups that are allowed to create and update samples for the groups they belong to. If set to "#all", all users can create a dataset belonging to their group. default: "#all" format: comma separated list of strings. Leading and trailing spaces are trimmed
SAMPLE_PRIVILEGED_GROUPS: list of non admin groups that are allowed to create samples for any groups, but can only update samples belonging to groups they belong to. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
POLICY_GROUPS: list of groups that are allowed to create, read, and update policies. Users in ADMIN_GROUPS always have this permission regardless of this setting. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
UPDATE_DATASET_LIFECYCLE_GROUPS: list of groups that are allowed to update the lifecycle state of a dataset. Authenticated users not in this list (or ADMIN_GROUPS) cannot modify lifecycle fields. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
CREATE_JOB_PRIVILEGED_GROUPS: list of groups that are allowed to create jobs for any user or group, regardless of the job configuration's create.auth field. Users in this group can also read any job. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
UPDATE_JOB_PRIVILEGED_GROUPS: list of groups that are allowed to update any job, regardless of the job configuration's update.auth field. Users in this group can also read any job. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
DELETE_JOB_GROUPS: list of groups that are allowed to delete any job. Authenticated users not in this list (or ADMIN_GROUPS) cannot delete jobs. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
ATTACHMENT_GROUPS: list of groups that are allowed to create, read, update, and delete attachments belonging to groups they are a member of. If set to "#all", all authenticated users have these permissions. default: "#all" format: comma separated list of strings. Leading and trailing spaces are trimmed
ATTACHMENT_PRIVILEGED_GROUPS: list of groups that are allowed to create attachments for any owner group, and to read, update, and delete attachments belonging to groups they are a member of or that they have access to. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
HISTORY_ACCESS_DATASET_GROUPS: list of groups that are allowed to read the change history of datasets. Users in ADMIN_GROUPS always have this access. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
HISTORY_ACCESS_PROPOSAL_GROUPS: list of groups that are allowed to read the change history of proposals. Users in ADMIN_GROUPS always have this access. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
HISTORY_ACCESS_SAMPLE_GROUPS: list of groups that are allowed to read the change history of samples. Users in ADMIN_GROUPS always have this access. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
HISTORY_ACCESS_INSTRUMENT_GROUPS: list of groups that are allowed to read the change history of instruments. Users in ADMIN_GROUPS always have this access. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
HISTORY_ACCESS_PUBLISHED_DATA_GROUPS: list of groups that are allowed to read the change history of published data records. Users in ADMIN_GROUPS always have this access. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
HISTORY_ACCESS_POLICIES_GROUPS: list of groups that are allowed to read the change history of policies. Users in ADMIN_GROUPS always have this access. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
HISTORY_ACCESS_DATABLOCK_GROUPS: list of groups that are allowed to read the change history of datablocks. Users in ADMIN_GROUPS always have this access. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
HISTORY_ACCESS_ATTACHMENT_GROUPS: list of groups that are allowed to read the change history of attachments. Users in ADMIN_GROUPS always have this access. default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed
ACCESS_GROUPS_STATIC_VALUES: List of groups assigned by default to all users. Used in the vanilla implementation for easy configuration. If you do not want or need to assign any default group, it should be set to empty string "". default: "" format: comma separated list of strings. Leading and trailing spaces are trimmed. example: "group1,group2,group3,..."
ACCESS_GROUP_SERVICE_TOKEN: Access token needed to access the API specified in ACCESS_GROUP_SERVICE_API_URL, used to retrieve access groups from a third party system. format: string
ACCESS_GROUP_SERVICE_API_URL: Well formed url of the service API used to provide access groups. Only one value is allowed. format: string example: "https://my.access.group/service/api/url"
DOI_PREFIX: The facility DOI prefix, with trailing slash. default: "" format: string
EXPRESS_SESSION_SECRET: Secret used to set up express session. default: "" format: string
LOGOUT_URL: URL specified upon successful logout. It is returned in the json object for the frontend, or third party UI, to be used locally. default: "" format: string
HTTP_MAX_REDIRECTS: Max number of redirects for http requests. default: 5 format: integer
HTTP_TIMEOUT: Timeout for HTTP requests in ms. default: 5000 format: integer
JWT_SECRET: The secret used to create any JWT token, used for authorization. default: "" format: string
JWT_EXPIRES_IN: Expiration time of any JWT token in seconds. default: 3600 (s) format: integer
JWT_NEVER_EXPIRES: Length of time that the never expiring jwt token will last. default: 100y format: string as in number of years
LDAP_URL: Full URI (including port) of your local LDAP server, if this is your selected authentication method. default: No default example: ldaps://ldap.server.com:636/ format: string
LDAP_BIND_DN: Bind DN to access information on your LDAP server. default: No default format: string
LDAP_BIND_CREDENTIALS: Credentials associated with your bind DN to access your LDAP server. default: No default format: string
LDAP_SEARCH_BASE: Search base for your LDAP server. default: No default format: string
LDAP_SEARCH_FILTER: Search filter for your LDAP server. default: No default format: string example: "(LDAPUsername={{username}})"
LDAP_MODE: type of LDAP server we are communicating with. This entry needs to be updated: it is not clear which other values are accepted. default: ad format: string acceptable values: ad
LDAP_EXTERNAL_ID: LDAP matching field that provides the external id default: sAMAccountName format: string
LDAP_USERNAME: LDAP field providing the username default: displayName format: string
OIDC_ISSUER: Full URL of your OIDC identity provider default: No default format: string example: "https://identity.your.facility/your/realm"
OIDC_CLIENT_ID: Client id used to convert OIDC code to OIDC token. This is assigned in the OIDC service when the token is generated default: No default format: string example: "scicat"
OIDC_CLIENT_SECRET: Token used to convert OIDC code to OIDC token. This is assigned in the OIDC service when the token is generated example: "90f1268..."
OIDC_CALLBACK_URL: URL of the endpoint that is called when the authentication has been executed with the OIDC service. default: No default format: string example: "http://localhost:3000/api/v3/oidc/callback"
OIDC_SCOPE: Information returned by the OIDC service together with token default: No default format: string example: "openid profile email"
OIDC_SUCCESS_URL: Frontend URL that the user is directed to after a successful authentication. It must be a valid frontend URL. default: No default format: string example: "http://localhost:3000/Datasets"
OIDC_ACCESS_GROUPS: field used to retrieve access groups from the OIDC service. It is not used in the vanilla implementation. default: No default format: string example: "access_groups"
OIDC_ACCESS_GROUPS_PROPERTY: name of the OIDC property used to retrieve the users groups from OIDC. default: none format: string
OIDC_AUTO_LOGOUT: if enabled, logging out of SciCat also logs the user out of OIDC. default: false format: boolean
OIDC_RETURN_URL: URL the user is redirected to after a successful logout. default: none format: string
LOGBOOK_ENABLED: Flag to enable/disable the Logbook endpoints. accept values: "yes", "no" default: no format: string
LOGBOOK_BASE_URL: The base URL to the SciChat wrapper API. Only required if Logbook is enabled. default: "http://localhost:3030/scichatapi" format: string
LOGBOOK_USERNAME: The username used to authenticate to the SciChat wrapper API. Only required if Logbook is enabled. default: No default format: string
LOGBOOK_PASSWORD: The password used to authenticate to the SciChat wrapper API. Only required if Logbook is enabled. default: No default format: string
METADATA_KEYS_RETURN_LIMIT: The maximum number of keys returned by the /Datasets/metadataKeys endpoint. default: No default format: integer
METADATA_PARENT_INSTANCES_RETURN_LIMIT: The maximum number of Datasets used to extract metadata keys in the /Datasets/metadataKeys endpoint. default: No default format: integer
MONGODB_URI: The URI for your MongoDB instance. default: No default format: string example: "mongodb://<USERNAME>:<PASSWORD>@<HOST>:27017/<DB_NAME>"
OAI_PROVIDER_ROUTE: URI to OAI provider, which is used in the /publisheddata/:id/resync endpoint. default: no default format: string
PID_PREFIX: The facility PID prefix, with trailing slash. default: no default format: string
PUBLIC_URL_PREFIX: The base URL to the facility Landing Page. default: No default format: string example: "https://doi.ess.eu/detail/"
PORT: The port on which the backend listens. default: 3000 format: integer
RABBITMQ_ENABLED: Flag to enable/disable RabbitMQ consumer. accepted values: "yes", "no" deprecated. Will be removed in future releases. default: no format: string
RABBITMQ_HOSTNAME: The hostname of the RabbitMQ message broker. Only required if RabbitMQ is enabled. deprecated. Will be removed in future releases. default: no default format: string
RABBITMQ_USERNAME: The username used to authenticate to the RabbitMQ message broker. Only required if RabbitMQ is enabled. deprecated. Will be removed in future releases. default: no default format: string
RABBITMQ_PASSWORD: The password used to authenticate to the RabbitMQ message broker. Only required if RabbitMQ is enabled. deprecated. Will be removed in future releases. default: no default format: string
REGISTER_DOI_URI: URI to the organization that registers the facility's DOIs. default: no default format: string example: "https://mds.test.datacite.org/doi"
REGISTER_METADATA_URI: URI to the organization that registers the facility's published data metadata. default: no default format: string example: "https://mds.test.datacite.org/metadata"
DOI_USERNAME: Username used to authenticate on the DOI site default: no default format: string
DOI_PASSWORD: Password used to authenticate on the DOI site default: no default format: string
SITE: The name of your site. default: no default format: string
SMTP_HOST: Host of SMTP server. deprecated. Will be removed in future releases. default: no default format: string
SMTP_MESSAGE_FROM: Email address that emails should be sent from. deprecated. Will be removed in future releases. default: no default format: string, email
SMTP_PORT: Port of SMTP server. deprecated. Will be removed in future releases. default: no default format: string
SMTP_SECURE: Whether to use a secure connection to the SMTP server. deprecated. Will be removed in future releases. default: no default format: string
POLICY_PUBLICATION_SHIFT: Number of years that need to elapse before the dataset is made publicly accessible. default: 3 format: integer
POLICY_RETENTION_SHIFT: Number of years that the datasets are kept online before they are archived or deleted. A negative value means that they are never archived/deleted. default: -1 format: integer
ELASTICSEARCH_ENABLED: Flag to enable/disable the ElasticSearch service accept values: "yes", "no" default: no default format: string
ES_HOST: The base URL to the Elasticsearch cluster. Use http if xpack.security is disabled default: no default format: string example: "https://localhost:9200" or "http://localhost:9200"
MONGODB_COLLECTION: Collection name to be mapped into specified Elasticsearch index default: no default format: string
ES_MAX_RESULT: Maximum number of records that can be indexed into Elasticsearch. default: 10000 format: number
ES_FIELDS_LIMIT: The total number of fields in an index. default: 1000 format: number
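ES_INDEX: Name of the Elasticsearch index used by the application. default: no default format: string
ES_REFRESH: Refresh behaviour for indexing requests. If set to "wait_for", Elasticsearch waits until the data is visible in the index before returning a response. accept values: true, false, "wait_for" default: false format: boolean or string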
ES_USERNAME: Elasticsearch cluster username. default: no default, optional. format: string
ELASTIC_PASSWORD: Elasticsearch cluster password. default: no default. format: string
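
Most of the *_GROUPS variables above share the same format: a comma separated list of strings, trimmed of leading and trailing spaces, with "#all" as a wildcard for some of them. The following is a minimal sketch of how such a value could be interpreted. The function names parseGroupList and isInConfiguredGroups are hypothetical and not taken from the SciCat source, and dropping empty entries is an assumption. The sketch also ignores the special handling of ADMIN_GROUPS.

```typescript
// Hypothetical helpers for illustration; not the backend's actual implementation.

// Split a comma separated value into group names, trimming surrounding
// spaces and dropping empty entries (an assumption of this sketch).
function parseGroupList(raw: string | undefined): string[] {
  if (raw === undefined) {
    return [];
  }
  return raw
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Return true if a user with the given groups matches a *_GROUPS setting.
// "#all" admits every user; otherwise one shared group is enough.
function isInConfiguredGroups(userGroups: string[], configured: string[]): boolean {
  if (configured.indexOf("#all") !== -1) {
    return true;
  }
  return userGroups.some((group) => configured.indexOf(group) !== -1);
}

// Example: a value as it might appear in .env, with stray spaces.
const createDatasetGroups = parseGroupList(" ingestor , beamline1 ,");
console.log(createDatasetGroups.length); // 2
console.log(isInConfiguredGroups(["beamline1"], createDatasetGroups)); // true
console.log(isInConfiguredGroups(["guest"], parseGroupList("#all"))); // true
```

With an empty value (the default for most of these variables), the parsed list is empty and no non-admin user matches.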

Environment Variables as of now¶

ACCESS_GROUP_SERVICE_API_URL=""
ACCESS_GROUP_SERVICE_TOKEN=""
DOI_PREFIX="<DOI_PREFIX>"
EXPRESS_SESSION_SECRET="<EXPRESS_SESSION_SECRET>"
HTTP_MAX_REDIRECTS=5
HTTP_TIMEOUT=5000
JWT_SECRET=<JWT_SECRET>
JWT_EXPIRES_IN=3600
LDAP_URL="ldaps://ldap.server.com:636/"
LDAP_BIND_DN="<USERNAME>@server.com"
LDAP_BIND_CREDENTIALS=<PASSWORD>
LDAP_SEARCH_BASE=<SEARCH_BASE>
LDAP_SEARCH_FILTER="(LDAPUsername={{username}})"
LOGBOOK_ENABLED="no"
LOGBOOK_BASE_URL="http://localhost:3030/scichatapi"

METADATA_KEYS_RETURN_LIMIT=100
METADATA_PARENT_INSTANCES_RETURN_LIMIT=100
MONGODB_URI="mongodb://<USERNAME>:<PASSWORD>@<HOST>:27017/<DB_NAME>"
OAI_PROVIDER_ROUTE="<OAI_PROVIDER_ROUTE>"
PID_PREFIX="<PID_PREFIX>"
PUBLIC_URL_PREFIX="https://doi.esss.se/detail/"
PORT=3000
RABBITMQ_ENABLED=<"yes"|"no">
RABBITMQ_HOSTNAME="localhost"
RABBITMQ_USERNAME="rabbitmq"
RABBITMQ_PASSWORD="rabbitmq"
REGISTER_DOI_URI="https://mds.test.datacite.org/doi"
REGISTER_METADATA_URI="https://mds.test.datacite.org/metadata"
DOI_USERNAME="username"
DOI_PASSWORD="password"
SITE=<SITE>
EMAIL_TYPE=<"smtp"|"ms365">
EMAIL_FROM=<MESSAGE_FROM>
SMTP_HOST=<SMTP_HOST>
SMTP_PORT=<SMTP_PORT>
SMTP_SECURE=<"yes"|"no">
MS365_TENANT_ID=<tenantId>
MS365_CLIENT_ID=<clientId>
MS365_CLIENT_SECRET=<clientSecret>

DATASET_CREATION_VALIDATION_ENABLED=true
DATASET_CREATION_VALIDATION_REGEX="^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$"

ADMIN_GROUPS=""
DELETE_GROUPS=""
CREATE_DATASET_GROUPS="#all"
CREATE_DATASET_WITH_PID_GROUPS=""
CREATE_DATASET_PRIVILEGED_GROUPS=""
UPDATE_DATASET_LIFECYCLE_GROUPS=""
CREATE_JOB_PRIVILEGED_GROUPS=""
UPDATE_JOB_PRIVILEGED_GROUPS=""
DELETE_JOB_GROUPS=""
SAMPLE_PRIVILEGED_GROUPS="sampleingestor"
SAMPLE_GROUPS="group1"
PROPOSAL_GROUPS=""
POLICY_GROUPS=""
ATTACHMENT_GROUPS="#all"
ATTACHMENT_PRIVILEGED_GROUPS=""
HISTORY_ACCESS_DATASET_GROUPS=""
HISTORY_ACCESS_PROPOSAL_GROUPS=""
HISTORY_ACCESS_SAMPLE_GROUPS=""
HISTORY_ACCESS_INSTRUMENT_GROUPS=""
HISTORY_ACCESS_PUBLISHED_DATA_GROUPS=""
HISTORY_ACCESS_POLICIES_GROUPS=""
HISTORY_ACCESS_DATABLOCK_GROUPS=""
HISTORY_ACCESS_ATTACHMENT_GROUPS=""

ACCESS_GROUPS_GRAPHQL_ENABLED=true
ACCESS_GROUP_SERVICE_TOKEN=""
ACCESS_GROUP_SERVICE_API_URL=""
ACCESS_GROUP_SERVICE_HANDLER=""
ACCESS_GROUPS_STATIC_ENABLED=true
ACCESS_GROUPS_OIDCPAYLOAD_ENABLED=true

OIDC_USERINFO_MAPPING_FIELD_USERNAME="iss, sub"
OIDC_USERINFO_MAPPING_FIELD_DISPLAYNAME="preferred_username"
OIDC_USERINFO_MAPPING_FIELD_EMAIL="email"
OIDC_USERINFO_MAPPING_FIELD_GROUP="groups"
OIDC_USERQUERY_OPERATOR=<"or"|"and">
OIDC_USERQUERY_FILTER="username:username, email:email"

ELASTICSEARCH_ENABLED=<"yes"|"no">
STACK_VERSION="8.8.2"
CLUSTER_NAME="es-cluster"
MEM_LIMIT="4G"
MONGODB_COLLECTION="Dataset"
ES_MAX_RESULT=100000
ES_FIELDS_LIMIT=400000
ES_INDEX="dataset"
ES_PORT=9200
ES_HOST="https://localhost:9200"
ES_USERNAME="elastic"
ES_PASSWORD="duo-password"
ES_REFRESH=<"wait_for"|"false">

LOGGERS_CONFIG_FILE="loggers.json"
DATASET_TYPES_FILE="datasetTypes.json"
PROPOSAL_TYPES_FILE="proposalTypes.json"

FRONTEND_CONFIG_FILE="src/config/frontend.config.json"
FRONTEND_THEME_FILE="src/config/frontend.theme.json"
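
Some variables are described earlier as only required when a feature flag such as LOGBOOK_ENABLED or RABBITMQ_ENABLED is set to "yes". A deployment script could check this before starting the backend. The sketch below is one possible way to do so. The names EnvMap, requiredWhenEnabled and findMissingVariables are assumptions made for illustration, and this page does not state that the backend performs such a check itself.

```typescript
// Illustrative pre-start check; all names here are hypothetical.
type EnvMap = { [key: string]: string | undefined };

// Variables described on this page as "only required if ... is enabled"
// and listed without a default value.
const requiredWhenEnabled: { [flag: string]: string[] } = {
  LOGBOOK_ENABLED: ["LOGBOOK_USERNAME", "LOGBOOK_PASSWORD"],
  RABBITMQ_ENABLED: ["RABBITMQ_HOSTNAME", "RABBITMQ_USERNAME", "RABBITMQ_PASSWORD"],
};

// Return the names of variables that are unset or empty although
// the feature flag that needs them is set to "yes".
function findMissingVariables(env: EnvMap): string[] {
  const missing: string[] = [];
  for (const flag of Object.keys(requiredWhenEnabled)) {
    if (env[flag] !== "yes") {
      continue; // feature disabled, nothing to check
    }
    const names = requiredWhenEnabled[flag] || [];
    for (const name of names) {
      const value = env[name];
      if (value === undefined || value === "") {
        missing.push(name);
      }
    }
  }
  return missing;
}

// Example with values written inline instead of being read from a file.
const exampleEnv: EnvMap = {
  LOGBOOK_ENABLED: "yes",
  LOGBOOK_USERNAME: "logbook-user",
  RABBITMQ_ENABLED: "no",
};
console.log(findMissingVariables(exampleEnv)); // only LOGBOOK_PASSWORD is reported
```

In a Node.js script the same function could be called with process.env after the .env file has been loaded.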

How to configure to connect the backend to other services¶

In scicatlive you can find documentation on how to integrate your SciCat system with services providing identities (e.g. Keycloak) and authentication (OpenLDAP).
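
For LDAP, the LDAP_SEARCH_FILTER example listed earlier contains a {{username}} placeholder that stands for the login name. The sketch below illustrates the idea of filling in such a template. It assumes that the placeholder is replaced literally and that special characters are escaped as described in RFC 4515. In a real deployment this substitution is expected to happen inside the LDAP client library, so escapeLdapFilterValue and buildSearchFilter are hypothetical names used only for this illustration.

```typescript
// Sketch only; not part of the SciCat backend.

// Escape characters with special meaning in LDAP search filters
// (backslash, asterisk, parentheses, NUL) as a backslash plus two hex digits.
function escapeLdapFilterValue(value: string): string {
  return value.replace(/[\\*()\0]/g, (ch) => {
    const hex = ch.charCodeAt(0).toString(16);
    return "\\" + (hex.length < 2 ? "0" + hex : hex);
  });
}

// Replace every {{username}} placeholder in the filter template.
function buildSearchFilter(template: string, username: string): string {
  return template.split("{{username}}").join(escapeLdapFilterValue(username));
}

console.log(buildSearchFilter("(LDAPUsername={{username}})", "jdoe"));
// prints (LDAPUsername=jdoe)
```

Escaping matters because a login name containing parentheses or an asterisk would otherwise change the meaning of the filter.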

How to configure DOI minting¶

In SciCat, one can publish selected datasets, which triggers a DOI minting process. Find here a short introduction to SciCat's Published Data class. For instructions on how to configure this DOI minting service and, in addition, make datasets publicly available via APIs, follow this link.

More advanced options¶

If you are compiling the application from source, you can edit the file src/config/configuration.ts with the correct values for your infrastructure. This option is still undocumented, although it is our intention to provide a detailed how-to guide as soon as we can.
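
To give an idea of what such a configuration file does, the following is a minimal sketch of a function that maps environment values to a typed object, using a few of the defaults documented above. It is not the content of src/config/configuration.ts. The names EnvValues, BackendConfigSketch, intFromEnv and buildConfigSketch are hypothetical, and falling back to the default on unparseable input is an assumption of this sketch.

```typescript
// Hypothetical sketch; the real configuration file is larger and may be organised differently.
type EnvValues = { [key: string]: string | undefined };

interface BackendConfigSketch {
  port: number;
  httpMaxRedirects: number;
  httpTimeout: number; // milliseconds
  jwtExpiresIn: number; // seconds
  logbookEnabled: boolean;
}

// Read an integer variable, using the fallback when it is unset,
// empty, or cannot be parsed as a number.
function intFromEnv(env: EnvValues, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const parsed = parseInt(raw, 10);
  return isNaN(parsed) ? fallback : parsed;
}

// Build the typed configuration from raw string values.
function buildConfigSketch(env: EnvValues): BackendConfigSketch {
  return {
    port: intFromEnv(env, "PORT", 3000),
    httpMaxRedirects: intFromEnv(env, "HTTP_MAX_REDIRECTS", 5),
    httpTimeout: intFromEnv(env, "HTTP_TIMEOUT", 5000),
    jwtExpiresIn: intFromEnv(env, "JWT_EXPIRES_IN", 3600),
    logbookEnabled: env["LOGBOOK_ENABLED"] === "yes",
  };
}

// Example: only PORT is overridden, everything else uses the defaults.
const sketchConfig = buildConfigSketch({ PORT: "3001" });
console.log(sketchConfig.port, sketchConfig.httpTimeout, sketchConfig.logbookEnabled);
```

Keeping the defaults in one place like this means the .env file only needs to list the values that differ for a given installation.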
