scitacean.transfer.copy.CopyFileTransfer
- class scitacean.transfer.copy.CopyFileTransfer(*, source_folder=None, hard_link=False)
Upload / download files by copying files on the same filesystem.
This file transfer requires that the ‘remote’ file system is directly accessible from the ‘local’ file system. It copies the ‘remote’ files directly to the local download folder.
Note
A note on terminology: In Scitacean, ‘remote’ refers to the file server where the data files are stored that belong to SciCat datasets. In contrast, ‘local’ refers to the file system of the machine that runs the Python process. The two filesystems can be the same. However, Scitacean maintains a strict separation between the two and uses ‘downloaders’ and ‘uploaders’ to transfer between them even if that transfer is a simple copy.
See also the documentation of scitacean.File.
Warning
This file transfer does not work on Windows because it converts between RemotePath and pathlib.Path. This requires that both use the same directory separators. Since RemotePath uses UNIX-style forward slashes, it is incompatible with Windows paths. In practice, this should not be a problem because SciCat file storage should never be a Windows server.
Examples
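The separator mismatch described in the warning above can be seen with the standard library alone. The sketch below does not use Scitacean: `pathlib.PurePosixPath` serves as a stand-in for a forward-slash remote path, which is an assumption made purely for illustration and says nothing about how RemotePath is implemented.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Stand-in for a remote path with UNIX-style separators.
# Assumption: this is NOT Scitacean's RemotePath, only an illustration.
remote = PurePosixPath("/dataset/source/file1.dat")

# The same string interpreted with Windows path rules.
as_windows = PureWindowsPath("/dataset/source/file1.dat")

posix_text = str(remote)
windows_text = str(as_windows)

print(posix_text)    # uses forward slashes
print(windows_text)  # uses backslashes
```

Because the two string forms differ, a naive conversion between a forward-slash path and a native Windows path does not round-trip.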
Given a dataset with source_folder="/dataset/source" and a file with path "file1.dat", this

    from scitacean import Client
    from scitacean.transfer.copy import CopyFileTransfer

    client = Client.from_token(
        url="...",
        token="...",
        file_transfer=CopyFileTransfer(),
    )
    ds = client.get_dataset(pid="...")
    ds = client.download_files(ds, target="/downloads")

copies the file from /dataset/source/file1.dat to /downloads/file1.dat.
Constructors
__init__(*[, source_folder, hard_link]) – Construct a new Copy file transfer.
Methods
connect_for_download(dataset, ...) – Create a connection for downloads, use as a context manager.
connect_for_upload(dataset, ...) – Create a connection for uploads, use as a context manager.
source_folder_for(dataset) – Return the source folder used for the given dataset.
- __init__(*, source_folder=None, hard_link=False)
Construct a new Copy file transfer.
Warning
When using hard links (with hard_link=True), the downloaded or uploaded file and the file it was created from refer to the same bytes. So if one is modified, the other is modified as well. Use this feature with care!
- Parameters:
source_folder (str | RemotePath | None, default: None) – Upload files to this folder if set. Otherwise, upload to the dataset’s source_folder. Ignored when downloading files.
hard_link (bool, default: False) – If True, try to use hard links instead of copies.
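The aliasing that the hard-link warning describes can be reproduced without Scitacean. The following is a minimal sketch using only `os.link` from the standard library; whether CopyFileTransfer creates its links this way is an assumption, and the file names are made up for the demonstration.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, used only for this demonstration.
    original = Path(tmp) / "file1.dat"
    linked = Path(tmp) / "linked.dat"
    original.write_text("first version")

    # Create a hard link: both names now refer to the same bytes on disk.
    os.link(original, linked)

    # Modify the file through one name ...
    with open(linked, "a") as f:
        f.write(" + edit")

    # ... and observe the change through the other name.
    content_via_original = original.read_text()
    same_file = os.path.samefile(original, linked)

print(content_via_original)
print(same_file)
```

A plain copy would leave the original untouched, which is why copying is the safer default when the files may be edited later.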
- connect_for_download(dataset, representative_file_path)
Create a connection for downloads, use as a context manager.
- Parameters:
dataset (Dataset) – The dataset for which to download files.
representative_file_path (RemotePath) – A path to a file that can be used to check whether files for this dataset are accessible. The transfer assumes that, if this path is accessible, all files for this dataset are.
- Returns:
Iterator[CopyDownloadConnection] – A connection object that can download files.
- Raises:
FileNotAccessibleError – If files for the given dataset cannot be accessed based on representative_file_path.
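The pattern described above, checking one representative file and then yielding a connection from a context manager, might be sketched as follows. This is not Scitacean's implementation: every name prefixed with `Sketch` or `sketch_` is hypothetical, and the accessibility check is assumed here to be a simple existence test.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


class SketchNotAccessibleError(Exception):
    """Hypothetical stand-in for FileNotAccessibleError."""


class SketchDownloadConnection:
    """Hypothetical connection that 'downloads' by copying."""

    def download(self, remote: Path, local: Path) -> None:
        local.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(remote, local)


@contextmanager
def sketch_connect_for_download(
    representative_file_path: Path,
) -> Iterator[SketchDownloadConnection]:
    # Check a single representative file and assume that all other
    # files of the dataset are accessible if this one is.
    if not representative_file_path.exists():
        raise SketchNotAccessibleError(str(representative_file_path))
    yield SketchDownloadConnection()


with tempfile.TemporaryDirectory() as tmp:
    remote_file = Path(tmp) / "remote" / "file1.dat"
    remote_file.parent.mkdir()
    remote_file.write_text("data")
    local_file = Path(tmp) / "downloads" / "file1.dat"

    with sketch_connect_for_download(remote_file) as con:
        con.download(remote_file, local_file)
    downloaded = local_file.read_text()

    # A missing representative file fails before any transfer starts.
    try:
        with sketch_connect_for_download(Path(tmp) / "missing.dat"):
            pass
        missing_raised = False
    except SketchNotAccessibleError:
        missing_raised = True
```

Failing early on the representative path means an inaccessible dataset is reported once, up front, rather than file by file.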
- connect_for_upload(dataset, representative_file_path)
Create a connection for uploads, use as a context manager.
- Parameters:
dataset (Dataset) – The connection will be used to upload files of this dataset. Used to determine the target folder.
representative_file_path (RemotePath) – A path on the remote to check whether files for this dataset can be written. The transfer assumes that, if it is possible to write to this path, it is possible to write to the paths of all files to be uploaded.
- Returns:
Iterator[CopyUploadConnection] – An open CopyUploadConnection object.
- Raises:
FileNotAccessibleError – If the remote folder cannot be accessed based on representative_file_path.
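As a rough illustration of the upload side, the sketch below combines the two ideas documented on this page: the target folder falls back to the dataset's source folder unless an override is set, and writability is checked once before any file is copied. All `Sketch`/`sketch_` names are hypothetical, and the writability probe (creating the folder and calling `os.access`) is an assumption rather than Scitacean's actual check.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional


class SketchNotWritableError(Exception):
    """Hypothetical stand-in for FileNotAccessibleError."""


def sketch_target_folder(override: Optional[Path], dataset_source_folder: Path) -> Path:
    # Upload to the override folder if set, otherwise to the dataset's folder.
    return override if override is not None else dataset_source_folder


class SketchUploadConnection:
    """Hypothetical connection that 'uploads' by copying."""

    def __init__(self, target_folder: Path) -> None:
        self._target_folder = target_folder

    def upload(self, local: Path) -> Path:
        destination = self._target_folder / local.name
        shutil.copy(local, destination)
        return destination


@contextmanager
def sketch_connect_for_upload(target_folder: Path) -> Iterator[SketchUploadConnection]:
    # Probe writability once, before any file is transferred.
    try:
        target_folder.mkdir(parents=True, exist_ok=True)
    except OSError as err:
        raise SketchNotWritableError(str(target_folder)) from err
    if not os.access(target_folder, os.W_OK):
        raise SketchNotWritableError(str(target_folder))
    yield SketchUploadConnection(target_folder)


with tempfile.TemporaryDirectory() as tmp:
    local_file = Path(tmp) / "local" / "file1.dat"
    local_file.parent.mkdir()
    local_file.write_text("data")
    dataset_folder = Path(tmp) / "dataset" / "source"

    # No override given, so the dataset's own folder is the target.
    target = sketch_target_folder(None, dataset_folder)
    with sketch_connect_for_upload(target) as con:
        uploaded = con.upload(local_file)
    uploaded_in_dataset_folder = uploaded.parent == dataset_folder
    uploaded_text = uploaded.read_text()
```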