scitacean.transfer.link.LinkFileTransfer
- class scitacean.transfer.link.LinkFileTransfer(*, source_folder=None)
Upload / download files by creating symlinks.
This file transfer does not actually upload or download files. Instead, it requires that the ‘remote’ file system is directly accessible from the ‘local’ file system. It creates symlinks in the local download folder to the ‘remote’ files.
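The effect described above can be illustrated with the standard library alone. The following is a minimal sketch, not Scitacean's implementation: the helper `link_into` and the temporary directory layout are hypothetical, and a temporary folder stands in for the directly accessible 'remote' file system.

```python
import tempfile
from pathlib import Path


def link_into(download_dir: Path, remote_file: Path) -> Path:
    """Create a symlink in download_dir that points at remote_file."""
    download_dir.mkdir(parents=True, exist_ok=True)
    link = download_dir / remote_file.name
    link.symlink_to(remote_file)
    return link


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Stand-in for a file on the directly accessible 'remote' file system.
    remote_file = root / "remote" / "file1.dat"
    remote_file.parent.mkdir()
    remote_file.write_text("payload")

    # 'Downloading' only creates a link; no data is copied.
    link = link_into(root / "downloads", remote_file)
    is_link = link.is_symlink()
    content = link.read_text()  # Reads through the link to the remote file.
```

Because the link merely points at the original file, reading it only works while the 'remote' file system stays reachable at the same path.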
Note
A note on terminology: In Scitacean, ‘remote’ refers to the file server that stores the data files belonging to SciCat datasets. In contrast, ‘local’ refers to the file system of the machine that runs the Python process. The two file systems can be the same. However, Scitacean maintains a strict separation between the two and uses ‘downloaders’ and ‘uploaders’ to transfer between them, even if that transfer is a simple symlink.
See also the documentation of scitacean.File.
Warning
This file transfer does not work on Windows because it converts between RemotePath and pathlib.Path. This requires that both use the same directory separators. Since RemotePath uses UNIX-style forward slashes, it is incompatible with Windows paths. In practice, this should not be a problem because SciCat file storage should never be a Windows server.
Warning
This file transfer cannot upload files. Instead, consider copying or moving the files to the SciCat source folder, e.g., by using scitacean.transfer.copy.CopyFileTransfer or writing the files there directly from your workflow.
Attempting to upload files will raise NotImplementedError.
Examples
Given a dataset with source_folder="/dataset/source" and a file with path "file1.dat", this

    from scitacean import Client
    from scitacean.transfer.link import LinkFileTransfer

    client = Client.from_token(
        url="...",
        token="...",
        file_transfer=LinkFileTransfer(),
    )
    ds = client.get_dataset(pid="...")
    ds = client.download_files(ds, target="/downloads")

creates the following symlink:
/downloads/file1.dat -> /dataset/source/file1.dat
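The mapping in this example can be reproduced with pure path arithmetic, which also shows why the Windows warning applies. This is a hedged sketch only: `planned_link` is a hypothetical helper, and how Scitacean computes these paths internally is not documented on this page.

```python
from pathlib import PurePosixPath, PureWindowsPath


def planned_link(
    source_folder: str, file_path: str, target: str
) -> tuple[PurePosixPath, PurePosixPath]:
    """Return (link location, link target) for one file, using POSIX paths."""
    remote = PurePosixPath(source_folder) / file_path
    link = PurePosixPath(target) / file_path
    return link, remote


link, remote = planned_link("/dataset/source", "file1.dat", "/downloads")
summary = f"{link} -> {remote}"

# The same components rendered as a Windows path use backslashes,
# so they no longer match a forward-slash remote path as a string.
windows_rendering = str(PureWindowsPath("/dataset/source") / "file1.dat")
```

The pure path classes never touch the file system, so the sketch runs on any platform.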
Constructors
__init__(*[, source_folder]) – Construct a new Link file transfer.
Methods
connect_for_download(dataset, ...) – Create a connection for downloads, use as a context manager.
connect_for_upload(dataset, ...) – Create a connection for uploads, use as a context manager.
source_folder_for(dataset) – Return the source folder used for the given dataset.
- __init__(*, source_folder=None)
Construct a new Link file transfer.
- Parameters:
source_folder (str | RemotePath | None, default: None) – Upload files to this folder if set. Otherwise, upload to the dataset’s source_folder. Ignored when downloading files.
- connect_for_download(dataset, representative_file_path)
Create a connection for downloads, use as a context manager.
- Parameters:
dataset (Dataset) – The dataset for which to download files.
representative_file_path (RemotePath) – A path to a file that can be used to check whether files for this dataset are accessible. The transfer assumes that, if this path is accessible, all files for this dataset are.
- Returns:
Iterator[LinkDownloadConnection] – A connection object that can download files.
- Raises:
FileNotAccessibleError – If files for the given dataset cannot be accessed based on representative_file_path.
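The "check one representative file, then trust the rest" behaviour can be sketched as a small context manager. Everything here is an assumption for illustration: `NotAccessible` is a hypothetical stand-in for FileNotAccessibleError, `connect_if_accessible` is not part of Scitacean, and the real check may differ from a plain existence test.

```python
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path


class NotAccessible(Exception):
    """Hypothetical stand-in for scitacean's FileNotAccessibleError."""


@contextmanager
def connect_if_accessible(representative_file: Path) -> Iterator[Path]:
    """Yield the folder of representative_file if that file exists."""
    if not representative_file.exists():
        # One unreachable file is taken to mean the dataset is unreachable.
        raise NotAccessible(f"Cannot access {representative_file}")
    yield representative_file.parent


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "file1.dat"
    present.write_text("payload")
    with connect_if_accessible(present) as folder:
        reachable = folder == Path(tmp)

    try:
        with connect_if_accessible(Path(tmp) / "missing.dat"):
            rejected = False
    except NotAccessible:
        rejected = True
```

Checking a single file keeps connection setup cheap, at the cost of not detecting individual files that are missing or unreadable.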
- connect_for_upload(dataset, representative_file_path)
Create a connection for uploads, use as a context manager.
- Parameters:
dataset (Dataset) – The connection will be used to upload files of this dataset. Used to determine the target folder.
representative_file_path (RemotePath) – A path on the remote to check whether files for this dataset can be written. The transfer assumes that, if it is possible to write to this path, it is possible to write to the paths of all files to be uploaded.
- Raises:
NotImplementedError – This file transfer does not implement uploading files.
- Return type:
Iterator[LinkUploadConnection]