diff --git a/README.md b/README.md index 9d958732..cf7e01ce 100644 --- a/README.md +++ b/README.md @@ -24,7 +24,6 @@ And in contrast to [tarindexer](https://github.com/devsnd/tarindexer), which als *Capabilities:* - - **Random Access:** Care was taken to achieve fast random access inside compressed streams for bzip2, gzip, xz, and zstd and inside TAR files by building indices containing seek points. - **Highly Parallelized:** By default, all cores are used for parallelized algorithms like for the gzip, bzip2, and xz decoders. This can yield huge speedups on most modern processors but requires more main memory. It can be controlled or completely turned off using the `-P ` option. @@ -37,9 +36,44 @@ And in contrast to [tarindexer](https://github.com/devsnd/tarindexer), which als All changes below the mountpoint will be redirected to this folder and deletions are tracked so that all changes can be applied back to the archive. - **Remote Files and Folders:** A remote archive or whole folder structure can be mounted similar to tools like [sshfs](https://github.com/libfuse/sshfs) thanks to the [filesystem_spec](https://github.com/fsspec/filesystem_spec) project. These can be specified with URIs as explained in the section ["Remote Files"](#remote-files). - Supported remote protocols include: FTP, HTTP, HTTPS, SFTP, [SSH](https://github.com/fsspec/sshfs), Git, Github, [S3](https://github.com/fsspec/s3fs), Samba [v2 and v3](https://github.com/jborean93/smbprotocol), Dropbox, ... Many of these are very experimental and may be slow. Please open a feature request if further backends are desired. + Supported remote protocols include: FTP, SFTP, HTTP, HTTPS, SSH, Git, Github, S3, Samba, Azure Datalake, Dropbox, Google Cloud Storage (GCS), ... Many of these are very experimental and may be slow. Azure and GCS are not even tested. 
+ + +*TAR compressions supported for random access:* + + - **BZip2** as provided by [indexed_bzip2](https://github.com/mxmlnkn/indexed_bzip2) as a backend, which is a refactored and extended version of [bzcat](https://github.com/landley/toybox/blob/c77b66455762f42bb824c1aa8cc60e7f4d44bdab/toys/other/bzcat.c) from [toybox](https://landley.net/code/toybox/). See also the [reverse engineered specification](https://github.com/dsnet/compress/blob/master/doc/bzip2-format.pdf). + - **Gzip** and **Zlib** as provided by [rapidgzip](https://github.com/mxmlnkn/rapidgzip) or [indexed_gzip](https://github.com/pauldmccarthy/indexed_gzip) by Paul McCarthy. See also [RFC1952](https://tools.ietf.org/html/rfc1952) and [RFC1950](https://tools.ietf.org/html/rfc1950). + - **Xz** as provided by [python-xz](https://github.com/Rogdham/python-xz) by Rogdham or [lzmaffi](https://github.com/r3m0t/backports.lzma) by Tomer Chachamu. See also [The .xz File Format](https://tukaani.org/xz/xz-file-format.txt). + - **Zstd** as provided by [indexed_zstd](https://github.com/martinellimarco/indexed_zstd) by Marco Martinelli. See also [Zstandard Compression Format](https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md). + +*Other supported archive formats:* + + - **Rar** as provided by [rarfile](https://github.com/markokr/rarfile) by Marko Kreen. See also the [RAR 5.0 archive format](https://www.rarlab.com/technote.htm). + - **SquashFS, AppImage, Snap** as provided by [PySquashfsImage](https://github.com/matteomattei/PySquashfsImage) by Matteo Mattei. There seems to be no authoritative, open format specification, only [this nicely-done reverse-engineered description](https://dr-emann.github.io/squashfs/squashfs.html), I assume based on the [source code](https://github.com/plougher/squashfs-tools). 
Note that [Snaps](https://snapcraft.io/docs/the-snap-format) and [Appimages](https://github.com/AppImage/AppImageSpec/blob/master/draft.md#type-2-image-format) are both SquashFS images, with an executable prepended for AppImages. + - **Zip** as provided by [zipfile](https://docs.python.org/3/library/zipfile.html), which is distributed with Python itself. See also the [ZIP File Format Specification](https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT). + - **Many Others** as provided by [libarchive](https://github.com/libarchive/libarchive) via [python-libarchive-c](https://github.com/Changaco/python-libarchive-c). + - Formats with tests: + [7z](https://github.com/ip7z/7zip/blob/main/DOC/7zFormat.txt), + ar, + [cab](https://download.microsoft.com/download/4/d/a/4da14f27-b4ef-4170-a6e6-5b1ef85b1baa/[ms-cab].pdf), + compress, cpio, + [iso](http://www.brankin.com/main/technotes/Notes_ISO9660.htm), + [lrzip](https://github.com/ckolivas/lrzip), + [lzma](https://www.7-zip.org/a/lzma-specification.7z), + [lz4](https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md), + [lzip](https://www.ietf.org/archive/id/draft-diaz-lzip-09.txt), + lzo, + [warc](https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.0/), + xar. + - Untested formats that might work or not: deb, grzip, + [rpm](https://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/pkgformat.html), + [uuencoding](https://en.wikipedia.org/wiki/Uuencoding). + - Beware that libarchive has no performant random access to files and to file contents. + In order to seek or open a file, in general, it needs to be assumed that the archive has to be parsed from the beginning. + If you have a performance-critical use case for a format only supported via libarchive, + then please open a feature request for a faster customized archive format implementation. 
+ The hope would be to add suitable stream compressors such as "short"-distance LZ-based compressions to [rapidgzip](https://github.com/mxmlnkn/rapidgzip). -A complete list of supported formats can be found [here](supported-formats). # Examples @@ -49,11 +83,6 @@ A complete list of supported formats can be found [here](supported-formats). - `ratarmount folder1 folder2 mountpoint` to bind-mount a merged view of two (or more) folders under `mountpoint`. - `ratarmount folder archive.zip folder` to mount a merged view of a folder on top of archive contents. - `ratarmount -o modules=subdir,subdir=squashfs-root archive.squashfs mountpoint` to mount an archive subfolder `squashfs-root` under `mountpoint`. - - `ratarmount http://server.org:80/archive.rar folder folder` Mount an archive that is accessible via HTTP range requests. - - `ratarmount ssh://hostname:22/relativefolder/ mountpoint` Mount a folder hierarchy via SSH. - - `ratarmount ssh://hostname:22//tmp/tmp-abcdef/ mountpoint` - - `ratarmount github://mxmlnkn:ratarmount@v0.15.2/tests/ mountpoint` Mount a github repo as if it was checked out at the given tag or SHA or branch. - - `AWS_ACCESS_KEY_ID=01234567890123456789 AWS_SECRET_ACCESS_KEY=0123456789012345678901234567890123456789 ratarmount s3://127.0.0.1/bucket/single-file.tar mounted` Mount an archive inside an S3 bucket reachable via a custom endpoint with the given credentials. Bogus credentials may be necessary for unsecured endpoints. # Table of Contents @@ -64,9 +93,6 @@ A complete list of supported formats can be found [here](supported-formats). 1. [Arch Linux](#arch-linux) 3. [System Dependencies for PIP Installation (Rarely Necessary)](#system-dependencies-for-pip-installation-rarely-necessary) 4. [PIP Package Installation](#pip-package-installation) -2. [Supported Formats](#supported-formats) - 1. [TAR compressions supported for random access](tar-compressions-supported-for-random-access) - 2. 
[Other supported archive formats](other-supported-archive-formats) 2. [Benchmarks](#benchmarks) 3. [The Problem](#the-problem) 4. [The Solution](#the-solution) @@ -112,9 +138,6 @@ chmod u+x -- "$appImageName" sudo cp -- "$appImageName" /usr/local/bin/ratarmount # Example installation ``` -
-Other Installation Methods - ## Installation via Package Manager [![Packaging status](https://repology.org/badge/vertical-allrepos/ratarmount.svg)](https://repology.org/project/ratarmount/versions) @@ -182,45 +205,6 @@ If there are troubles with the compression backend dependencies, you can try the Ratarmount will work without the compression backends. The hard requirements are `fusepy` and for Python versions older than 3.7.0 `dataclasses`. -
- -# Supported Formats - -## TAR compressions supported for random access - - - **BZip2** as provided by [indexed_bzip2](https://github.com/mxmlnkn/indexed_bzip2) as a backend, which is a refactored and extended version of [bzcat](https://github.com/landley/toybox/blob/c77b66455762f42bb824c1aa8cc60e7f4d44bdab/toys/other/bzcat.c) from [toybox](https://landley.net/code/toybox/). See also the [reverse engineered specification](https://github.com/dsnet/compress/blob/master/doc/bzip2-format.pdf). - - **Gzip** and **Zlib** as provided by [rapidgzip](https://github.com/mxmlnkn/rapidgzip) or [indexed_gzip](https://github.com/pauldmccarthy/indexed_gzip) by Paul McCarthy. See also [RFC1952](https://tools.ietf.org/html/rfc1952) and [RFC1950](https://tools.ietf.org/html/rfc1950). - - **Xz** as provided by [python-xz](https://github.com/Rogdham/python-xz) by Rogdham or [lzmaffi](https://github.com/r3m0t/backports.lzma) by Tomer Chachamu. See also [The .xz File Format](https://tukaani.org/xz/xz-file-format.txt). - - **Zstd** as provided by [indexed_zstd](https://github.com/martinellimarco/indexed_zstd) by Marco Martinelli. See also [Zstandard Compression Format](https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md). - -## Other supported archive formats - - - **Rar** as provided by [rarfile](https://github.com/markokr/rarfile) by Marko Kreen. See also the [RAR 5.0 archive format](https://www.rarlab.com/technote.htm). - - **SquashFS, AppImage, Snap** as provided by [PySquashfsImage](https://github.com/matteomattei/PySquashfsImage) by Matteo Mattei. There seems to be no authoritative, open format specification, only [this nicely-done reverse-engineered description](https://dr-emann.github.io/squashfs/squashfs.html), I assume based on the [source code](https://github.com/plougher/squashfs-tools). 
Note that [Snaps](https://snapcraft.io/docs/the-snap-format) and [Appimages](https://github.com/AppImage/AppImageSpec/blob/master/draft.md#type-2-image-format) are both SquashFS images, with an executable prepended for AppImages. - - **Zip** as provided by [zipfile](https://docs.python.org/3/library/zipfile.html), which is distributed with Python itself. See also the [ZIP File Format Specification](https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT). - - **Many Others** as provided by [libarchive](https://github.com/libarchive/libarchive) via [python-libarchive-c](https://github.com/Changaco/python-libarchive-c). - - Formats with tests: - [7z](https://github.com/ip7z/7zip/blob/main/DOC/7zFormat.txt), - ar, - [cab](https://download.microsoft.com/download/4/d/a/4da14f27-b4ef-4170-a6e6-5b1ef85b1baa/[ms-cab].pdf), - compress, cpio, - [iso](http://www.brankin.com/main/technotes/Notes_ISO9660.htm), - [lrzip](https://github.com/ckolivas/lrzip), - [lzma](https://www.7-zip.org/a/lzma-specification.7z), - [lz4](https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md), - [lzip](https://www.ietf.org/archive/id/draft-diaz-lzip-09.txt), - lzo, - [warc](https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.0/), - xar. - - Untested formats that might work or not: deb, grzip, - [rpm](https://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/pkgformat.html), - [uuencoding](https://en.wikipedia.org/wiki/Uuencoding). - - Beware that libarchive has no performant random access to files and to file contents. - In order to seek or open a file, in general, it needs to be assumed that the archive has to be parsed from the beginning. - If you have a performance-critical use case for a format only supported via libarchive, - then please open a feature request for a faster customized archive format implementation. 
- The hope would be to add suitable stream compressors such as "short"-distance LZ-based compressions to [rapidgzip](https://github.com/mxmlnkn/rapidgzip). - # Benchmarks @@ -534,15 +518,13 @@ The [fsspec](https://github.com/fsspec/filesystem_spec) API backend adds support - `github://org:repo@[sha]/path-to/file-or-folder` E.g. github://mxmlnkn:ratarmount@v0.15.2/tests/single-file.tar - `http[s]://hostname[:port]/path-to/archive.rar` - - `s3://[endpoint-hostname[:port]]/bucket[/single-file.tar[?versionId=some_version_id]]` - Will default to AWS according to the Boto3 library defaults when no endpoint is specified. - Boto3 will check, among others, [these environment variables](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html), for credentials: + - `s3://[endpoint-hostname[:port]]/bucket/single-file.tar` + Will default to AWS according to the Boto3 library defaults + when no endpoint is specified. Boto3 will check these environment + variables for credentials: - `AWS_ACCESS_KEY_ID` - `AWS_SECRET_ACCESS_KEY` - `AWS_SESSION_TOKEN` - - `AWS_DEFAULT_REGION`, e.g., `us-west-1` - fsspec/s3fs furthermore supports these environment variables: - - [`FSSPEC_S3_ENDPOINT_URL`](https://github.com/fsspec/s3fs/pull/704), e.g., `http://127.0.0.1:8053` - `[s]ftp://[user[:password]@]hostname[:port]/path-to/archive.rar` - `ssh://[user[:password]@]hostname[:port]/path-to/archive.rar` - `smb://[workgroup;][user:password@]server[:port]/share/folder/file.tar` diff --git a/core/pyproject.toml b/core/pyproject.toml index 93f667f4..56bb07f6 100644 --- a/core/pyproject.toml +++ b/core/pyproject.toml @@ -77,7 +77,7 @@ full = [ # fsspec: "requests", "aiohttp", - "sshfs", # For performance, asyncssh > 2.17 would be recommended: https://github.com/ronf/asyncssh/issues/691 + "sshfs", # Need newer pyopenssl than comes with Ubuntu 22.04. # https://github.com/ronf/asyncssh/issues/690 "pyopenssl>=23", @@ -99,7 +99,7 @@ fsspec = [ # Copy-pasted from fsspec[full] list. 
Some were excluded because they are too unproportionally large. "requests", "aiohttp", - "sshfs", # For performance, asyncssh > 2.17 would be recommended: https://github.com/ronf/asyncssh/issues/691 + "sshfs", # Need newer pyopenssl than comes with Ubuntu 22.04. # https://github.com/ronf/asyncssh/issues/690 "pyopenssl>=23", diff --git a/core/ratarmountcore/factory.py b/core/ratarmountcore/factory.py index 414f97a5..4871b1fd 100644 --- a/core/ratarmountcore/factory.py +++ b/core/ratarmountcore/factory.py @@ -127,6 +127,9 @@ class FixedSSHFileSystem(SSHFileSystem): def openFsspec(url, options, printDebug: int) -> Optional[Union[MountSource, IO[bytes], str]]: + if not fsspec: + return None + splitURI = url.split('://', 1) protocol = splitURI[0] if len(splitURI) > 1 else '' if not protocol: @@ -135,11 +138,6 @@ def openFsspec(url, options, printDebug: int) -> Optional[Union[MountSource, IO[ if protocol == 'file': return splitURI[1] - if not fsspec: - print("[Warning] An URL was detected but fsspec is not installed. 
You may want to install it with:") - print("[Warning] python3 -m pip install ratarmount[fsspec]") - return None - result = None try: if printDebug >= 3: @@ -224,7 +222,7 @@ def newDel(): def openMountSource(fileOrPath: Union[str, IO[bytes]], **options) -> MountSource: printDebug = int(options.get("printDebug", 0)) if isinstance(options.get("printDebug", 0), int) else 0 - if isinstance(fileOrPath, str): + if fsspec and isinstance(fileOrPath, str): result = openFsspec(fileOrPath, options, printDebug=printDebug) if isinstance(result, MountSource): return result diff --git a/ratarmount.py b/ratarmount.py index b6946e2c..d9f5eb3b 100755 --- a/ratarmount.py +++ b/ratarmount.py @@ -1158,12 +1158,6 @@ def _parseArgs(rawArgs: Optional[List[str]] = None): - ratarmount folder1 folder2 mountpoint - ratarmount folder archive.zip folder - ratarmount -o modules=subdir,subdir=squashfs-root archive.squashfs mountpoint - - ratarmount http://server.org:80/archive.rar folder folder - - ratarmount ssh://hostname:22/relativefolder/ mountpoint - - ratarmount ssh://hostname:22//tmp/tmp-abcdef/ mountpoint - - ratarmount github://mxmlnkn:ratarmount@v0.15.2/tests/single-file.tar mountpoint - - AWS_ACCESS_KEY_ID=aaaaaaaaaaaaaaaaaaaa AWS_SECRET_ACCESS_KEY=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb \\ - ratarmount s3://127.0.0.1/bucket/single-file.tar mounted For further information, see the ReadMe on the project's homepage: @@ -1460,9 +1454,8 @@ def _parseArgs(rawArgs: Optional[List[str]] = None): # This is a hack but because we have two positional arguments (and want that reflected in the auto-generated help), # all positional arguments, including the mountpath will be parsed into the tar file path's namespace and we have to # manually separate them depending on the type. 
- lastArgument = args.mount_source[-1] - if '://' not in lastArgument and (os.path.isdir(lastArgument) or not os.path.exists(lastArgument)): - args.mount_point = lastArgument + if os.path.isdir(args.mount_source[-1]) or not os.path.exists(args.mount_source[-1]): + args.mount_point = args.mount_source[-1] args.mount_source = args.mount_source[:-1] if not args.mount_source and not args.write_overlay: raise argparse.ArgumentTypeError( @@ -1516,8 +1509,6 @@ def checkMountSource(path): args.mount_point = os.path.splitext(args.mount_source[0])[0] else: args.mount_point = autoMountPoint - if '://' in args.mount_point: - args.mount_point = "ratarmount.mounted" args.mount_point = os.path.abspath(args.mount_point) # Preprocess the --index-folders list as a string argument diff --git a/tests/ratarmount-help.txt b/tests/ratarmount-help.txt index 6b75f319..f9738a53 100644 --- a/tests/ratarmount-help.txt +++ b/tests/ratarmount-help.txt @@ -205,12 +205,6 @@ Examples: - ratarmount folder1 folder2 mountpoint - ratarmount folder archive.zip folder - ratarmount -o modules=subdir,subdir=squashfs-root archive.squashfs mountpoint - - ratarmount http://server.org:80/archive.rar folder folder - - ratarmount ssh://hostname:22/relativefolder/ mountpoint - - ratarmount ssh://hostname:22//tmp/tmp-abcdef/ mountpoint - - ratarmount github://mxmlnkn:ratarmount@v0.15.2/tests/single-file.tar mountpoint - - AWS_ACCESS_KEY_ID=aaaaaaaaaaaaaaaaaaaa AWS_SECRET_ACCESS_KEY=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb \ - ratarmount s3://127.0.0.1/bucket/single-file.tar mounted For further information, see the ReadMe on the project's homepage: diff --git a/tests/requirements-tests.txt b/tests/requirements-tests.txt index 5149275d..958e0693 100644 --- a/tests/requirements-tests.txt +++ b/tests/requirements-tests.txt @@ -32,4 +32,3 @@ pyftpdlib pyminizip pyopenssl>=23 rangehttpserver -boto3 diff --git a/tests/runtests.sh b/tests/runtests.sh index 30dcd6c3..376fe267 100755 --- a/tests/runtests.sh +++ 
b/tests/runtests.sh @@ -184,10 +184,10 @@ funmount() waitForMountpoint() { for (( i=0; i<10; ++i )); do - if mountpoint -q -- "$1"; then break; fi + if mountpoint -- "$1"; then break; fi sleep 1s done - if ! mountpoint -q -- "$1"; then return 1; fi + if ! mountpoint -- "$1"; then return 1; fi } @@ -1806,39 +1806,6 @@ checkURLProtocolFile() } -checkFileInTARForeground() -{ - # Similar to checkFileInTAR but calls ratarmount with -f as is necessary for some threaded fsspec backends. - # TODO make those fsspec backends work without -f, e.g., by only mounting them in FuseMount.init, maybe - # trying to open in __init__ and close them at the end of __init__ and reopen them in init for better - # error reporting, or even better, somehow find out how to close only those threads and restart them - # in FuseMount.init. - local archive="$1"; shift - local fileInTar="$1"; shift - local correctChecksum="$1" - - local startTime - startTime=$( date +%s ) - - rm -f ratarmount.{stdout,stderr}.log - - local mountFolder - mountFolder="$( mktemp -d )" || returnError "$LINENO" 'Failed to create temporary directory' - MOUNT_POINTS_TO_CLEANUP+=( "$mountFolder" ) - - $RATARMOUNT_CMD -c -f -d 3 "$archive" "$mountFolder" >ratarmount.stdout.log 2>ratarmount.stderr.log & - waitForMountpoint "$mountFolder" || returnError 'Waiting for mountpoint timed out!' - ! 'grep' -C 5 -Ei '(warn|error)' ratarmount.stdout.log ratarmount.stderr.log || - returnError "$LINENO" "Found warnings while executing: $RATARMOUNT_CMD $*" - - echo "Check access to $archive" - verifyCheckSum "$mountFolder" "$fileInTar" "$archive" "$correctChecksum" || returnError "$LINENO" 'Checksum mismatches!' 
- funmount "$mountFolder" - - safeRmdir "$mountFolder" -} - - checkURLProtocolHTTP() { local pid mountPoint protocol port @@ -1863,13 +1830,29 @@ checkURLProtocolHTTP() sleep 5 wget 127.0.0.1:$port + archive="$protocol://127.0.0.1:$port/tests/single-file.tar" + runAndCheckRatarmount -c -f -d 3 "$archive" "$mountPoint" & + waitForMountpoint "$mountPoint" || returnError 'Waiting for mountpoint timed out!' + echo "Check access to $archive" + verifyCheckSum "$mountPoint" "bar" "$archive" d3b07384d113edec49eaa6238ad5ff00 || + returnError "$LINENO" 'Checksum mismatches!' + funmount "$mountPoint" - checkFileInTARForeground "$protocol://127.0.0.1:$port/tests/single-file.tar" 'bar' d3b07384d113edec49eaa6238ad5ff00 || - returnError "$LINENO" 'Failed to read from HTTP server' - checkFileInTARForeground "$protocol://127.0.0.1:$port/tests/" 'single-file.tar' 1a28538854d1884e4415cb9bfb7a2ad8 || - returnError "$LINENO" 'Failed to read from HTTP server' - checkFileInTARForeground "$protocol://127.0.0.1:$port/tests" 'single-file.tar' 1a28538854d1884e4415cb9bfb7a2ad8 || - returnError "$LINENO" 'Failed to read from HTTP server' + archive="$protocol://127.0.0.1:$port/tests/" + runAndCheckRatarmount -c -f -d 3 "$archive" "$mountPoint" & + waitForMountpoint "$mountPoint" || returnError 'Waiting for mountpoint timed out!' + echo "Check access to $archive" + verifyCheckSum "$mountPoint" 'single-file.tar' "$archive" 1a28538854d1884e4415cb9bfb7a2ad8 || + returnError "$LINENO" 'Checksum mismatches!' + funmount "$mountPoint" + + archive="$protocol://127.0.0.1:$port/tests" + runAndCheckRatarmount -c -f -d 3 "$archive" "$mountPoint" & + waitForMountpoint "$mountPoint" || returnError 'Waiting for mountpoint timed out!' + echo "Check access to $archive" + verifyCheckSum "$mountPoint" 'single-file.tar' "$archive" 1a28538854d1884e4415cb9bfb7a2ad8 || + returnError "$LINENO" 'Checksum mismatches!' 
+ funmount "$mountPoint" kill $pid &>/dev/null rmdir "$mountPoint" @@ -1902,7 +1885,7 @@ checkURLProtocolFTP() killRogueSSH() { local pid - for pid in $( pgrep -f start-asyncssh-server ) $( pgrep -f ssh:// ); do + for pid in $( pgrep start-asyncssh-server ) $( pgrep -f ssh:// ); do kill "$pid" sleep 1 kill -9 "$pid" @@ -2015,12 +1998,29 @@ EOF mountPoint=$( mktemp -d ) - checkFileInTARForeground "ssh://127.0.0.1:$port/tests/single-file.tar" 'bar' d3b07384d113edec49eaa6238ad5ff00 || - returnError "$LINENO" 'Failed to read from SSH server' - checkFileInTARForeground "ssh://127.0.0.1:$port/tests/" 'single-file.tar' 1a28538854d1884e4415cb9bfb7a2ad8 || - returnError "$LINENO" 'Failed to read from SSH server' - checkFileInTARForeground "ssh://127.0.0.1:$port/tests" 'single-file.tar' 1a28538854d1884e4415cb9bfb7a2ad8 || - returnError "$LINENO" 'Failed to read from SSH server' + archive="ssh://127.0.0.1:$port/tests/single-file.tar" + runAndCheckRatarmount -d 3 -c -f "$archive" "$mountPoint" & + waitForMountpoint "$mountPoint" || returnError 'Waiting for mountpoint timed out!' + echo "Check access to $archive" + verifyCheckSum "$mountPoint" "bar" "$archive" d3b07384d113edec49eaa6238ad5ff00 || + returnError "$LINENO" 'Checksum mismatches!' + funmount "$mountPoint" + + archive="ssh://127.0.0.1:$port/tests/" + runAndCheckRatarmount -c -f "$archive" "$mountPoint" & + waitForMountpoint "$mountPoint" || returnError 'Waiting for mountpoint timed out!' + echo "Check access to $archive" + verifyCheckSum "$mountPoint" 'single-file.tar' "$archive" 1a28538854d1884e4415cb9bfb7a2ad8 || + returnError "$LINENO" 'Checksum mismatches!' + funmount "$mountPoint" + + archive="ssh://127.0.0.1:$port/tests" + runAndCheckRatarmount -c -f "$archive" "$mountPoint" & + waitForMountpoint "$mountPoint" || returnError 'Waiting for mountpoint timed out!' 
+ echo "Check access to $archive" + verifyCheckSum "$mountPoint" 'single-file.tar' "$archive" 1a28538854d1884e4415cb9bfb7a2ad8 || + returnError "$LINENO" 'Checksum mismatches!' + funmount "$mountPoint" kill $pid killRogueSSH @@ -2095,42 +2095,37 @@ checkURLProtocolGithub() checkURLProtocolS3() { - local mountPoint pid weedFolder port - mountPoint=$( mktemp -d ) - port=8053 + local pid weedFolder if [[ ! -f weed ]]; then - wget -q 'https://github.com/seaweedfs/seaweedfs/releases/download/3.74/linux_amd64_large_disk.tar.gz' + wget 'https://github.com/seaweedfs/seaweedfs/releases/download/3.74/linux_amd64_large_disk.tar.gz' tar -xf 'linux_amd64_large_disk.tar.gz' fi [[ -x weed ]] || chmod u+x weed weedFolder=$( mktemp -d ) TMP_FILES_TO_CLEANUP+=( "$weedFolder" ) - ./weed server -dir="$weedFolder" -s3 -s3.port "$port" -idleTimeout=30 -ip 127.0.0.1 2>weed.log & + ./weed server -dir="$weedFolder" -s3 -s3.port 8053 -idleTimeout=30 -ip 127.0.0.1 2>weed.log & pid=$! # Wait for port to open - echo "Waiting for seaweedfs to start up and port $port to open..." 
python3 -c ' import socket -import sys import time from contextlib import closing t0 = time.time() with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as sock: for i in range(10): - if sock.connect_ex(("127.0.0.1", int(sys.argv[1]))) == 0: + if sock.connect_ex(("127.0.0.1", 8053)) == 0: print(f"Weed port opened after {time.time() - t0:.1f} s.") break time.sleep(5) -' "$port" +' kill "$pid" # Create bucket and upload test file python3 -c " -import sys import boto3 def list_buckets(client): @@ -2142,32 +2137,28 @@ def list_bucket_files(client, bucket_name): return [x['Key'] for x in result['Contents']] if 'Contents' in result else [] client = boto3.client( - 's3', endpoint_url='http://127.0.0.1:' + sys.argv[1], - aws_access_key_id = '01234567890123456789', - aws_secret_access_key = '0123456789012345678901234567890123456789' + 's3', endpoint_url='http://127.0.0.1:8053', + aws_access_key_id = 'aaaaaaaaaaaaaaaaaaaa', + aws_secret_access_key = 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb' ) bucket_name = 'bucket' if bucket_name not in list_buckets(client): client.create_bucket(Bucket=bucket_name) client.upload_file('tests/single-file.tar', bucket_name, 'single-file.tar') -" "$port" - - export FSSPEC_S3_ENDPOINT_URL="http://127.0.0.1:$port" - # Even though no credentials are configured for the seaweedfs server, we need dummy credentials for boto3 -.- - export AWS_ACCESS_KEY_ID=01234567890123456789 - export AWS_SECRET_ACCESS_KEY=0123456789012345678901234567890123456789 +" # At last, test ratarmount. 
- checkFileInTARForeground "s3://bucket/single-file.tar" 'bar' d3b07384d113edec49eaa6238ad5ff00 || - returnError "$LINENO" 'Failed to read from S3 server' - checkFileInTARForeground "s3://bucket/" 'single-file.tar' 1a28538854d1884e4415cb9bfb7a2ad8 || - returnError "$LINENO" 'Failed to read from S3 server' - checkFileInTARForeground "s3://bucket" 'single-file.tar' 1a28538854d1884e4415cb9bfb7a2ad8 || + checkFileInTAR 's3://127.0.0.1:8053/bucket/single-file.tar' bar d3b07384d113edec49eaa6238ad5ff00 || returnError "$LINENO" 'Failed to read from S3 server' + # TODO + #checkFileInTAR 's3://127.0.0.1:8053/bucket/' single-file.tar 1a28538854d1884e4415cb9bfb7a2ad8 || + # returnError "$LINENO" 'Failed to read from S3 server' + #checkFileInTAR 's3://127.0.0.1:8053/bucket' single-file.tar 1a28538854d1884e4415cb9bfb7a2ad8 || + # returnError "$LINENO" 'Failed to read from S3 server' - kill $pid &>/dev/null + kill $pid - 'rm' -rf "$weedFolder" + 'rm' -r "$weedFolder" } @@ -2235,12 +2226,12 @@ checkRemoteSupport() checkURLProtocolFTP || returnError 'Failed ftp:// check' checkURLProtocolHTTP || returnError 'Failed http:// check' - checkURLProtocolS3 || returnError 'Failed s3:// check' + #checkURLProtocolS3 # TODO suddenly broken again ... checkURLProtocolSSH || returnError 'Failed ssh:// check' checkURLProtocolSamba || returnError 'Failed smb:// check' - # TODO Add and test IPFS - # TODO look for other fsspec implementations in an automated manner + # TODO Add and test IPFS? + # TODO look for other fsspec implementations }
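
Review note on the `ratarmount.py` hunk in `_parseArgs`: the `+` lines decide whether the last positional argument is the mount point using only `os.path.isdir` and `os.path.exists`, without the `'://'` check present in the `-` lines. The following is a minimal sketch of that resulting rule, not the project's actual code. The helper name `split_mount_point` and its return shape are hypothetical and exist only for illustration.

```python
import os
from typing import List, Optional, Tuple


def split_mount_point(arguments: List[str]) -> Tuple[List[str], Optional[str]]:
    """Hypothetical helper mirroring the '+' lines of the _parseArgs hunk.

    Returns (mount sources, mount point or None).
    """
    if not arguments:
        return arguments, None

    last = arguments[-1]
    # An existing directory, or a path that does not exist yet, is taken
    # as the mount point. An existing regular file stays a mount source.
    if os.path.isdir(last) or not os.path.exists(last):
        return arguments[:-1], last
    return arguments, None
```

Under this rule a trailing URL such as `ssh://hostname:22/relativefolder/` would normally not exist as a local path, so it would be classified as the mount point rather than as a source. That appears to be the case the removed `'://' not in lastArgument` condition guarded against.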