Modify download apis for minio mounted fs #117
Draft
APIs for File Downloads with MinIO Mounted Directory
This update changes how an extractor downloads files when the environment variable MINIO_MOUNTED_PATH is set. Instead of downloading each file to the /tmp folder, the API reads it directly from the s3fs-mounted directory. This improves performance by eliminating redundant downloads and giving the extractor direct access to files stored in MinIO.
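The behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `resolve_input_file`, the `download_to_tmp` callback, and the assumption that a file sits at `<MINIO_MOUNTED_PATH>/<file id>` are all hypothetical (the last one is inferred from the note below that the mount lists files by file ID).

```python
import os
from pathlib import Path


def resolve_input_file(file_id, download_to_tmp, env=None):
    """Return a local path for file_id, preferring the mounted MinIO directory.

    download_to_tmp is a hypothetical callback standing in for the existing
    download-to-/tmp behaviour; it takes a file ID and returns a local path.
    """
    env = os.environ if env is None else env
    mounted_root = env.get("MINIO_MOUNTED_PATH")
    if mounted_root:
        # Assumed layout: one object per file, named by its file ID.
        candidate = Path(mounted_root).expanduser() / file_id
        if candidate.is_file():
            return candidate
    # Variable unset, or file not visible through the mount: download as before.
    return Path(download_to_tmp(file_id))
```

Falling back to the regular download when the file is not visible through the mount keeps the extractor working if the mount is stale or misconfigured; whether the PR does the same is not stated here.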
Testing Steps
To test this functionality, we need to set up an S3fs mount and configure the environment appropriately.
1. Prerequisites
Ensure the following are in place before testing:
2. Expose the minio-nginx Container
To expose MinIO at port 9000, add the following configuration to your docker-compose.yml file:
3. Set the MINIO_ENDPOINT Environment Variable
Set the MINIO_ENDPOINT environment variable to the URL of the exposed MinIO service. For local development, the value should be:
MINIO_ENDPOINT="http://localhost:9000"
4. Create a .miniocred File
In your home directory (or another preferred location), create a file named .miniocred. Populate it with your MinIO credentials in the format:
ACCESS_KEY_ID:SECRET_ACCESS_KEY
For default Docker values, use:
minioadmin:minioadmin
Ensure the file has secure permissions:
chmod 600 ~/.miniocred
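If you prefer to script this step, the following is a minimal sketch that writes the credentials file and verifies that only the owner can access it (s3fs refuses credential files that are readable by group or others). The helper names `write_minio_credentials` and `has_owner_only_permissions` are hypothetical and are not part of PyClowder.

```python
import os
import stat
from pathlib import Path


def write_minio_credentials(path, access_key, secret_key):
    """Write ACCESS_KEY_ID:SECRET_ACCESS_KEY to path with mode 600."""
    path = Path(path).expanduser()
    # Create the file with owner-only permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{access_key}:{secret_key}\n")
    # Tighten permissions again in case the file already existed.
    os.chmod(path, 0o600)
    return path


def has_owner_only_permissions(path):
    """Return True if neither group nor others have any access to path."""
    mode = stat.S_IMODE(os.stat(Path(path).expanduser()).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

For the default Docker values, `write_minio_credentials("~/.miniocred", "minioadmin", "minioadmin")` produces the same file as the manual steps above.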
5. Mount the MinIO Filesystem
mkdir ~/clowderfs
ls ~/clowderfs
You should see the Clowder files stored in the MinIO bucket, listed by file ID.
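The listing only works once the bucket has been mounted with s3fs. As a rough sketch, the mount command can be assembled like this, assuming s3fs-fuse is installed. The bucket name `clowder` is an assumption and must be replaced with the bucket your Clowder instance uses; the helper `build_s3fs_command` is hypothetical. The options `passwd_file`, `url`, and `use_path_request_style` are standard s3fs options, and path-style requests are generally needed for MinIO.

```python
import os
import shlex


def build_s3fs_command(bucket, mount_point, passwd_file, endpoint):
    """Build the argument list for mounting a MinIO bucket with s3fs."""
    return [
        "s3fs",
        bucket,
        os.path.expanduser(mount_point),
        "-o", f"passwd_file={os.path.expanduser(passwd_file)}",
        "-o", f"url={endpoint}",
        # MinIO serves buckets as URL paths, not as virtual-host subdomains.
        "-o", "use_path_request_style",
    ]


if __name__ == "__main__":
    # "clowder" is an assumed bucket name; adjust it to your deployment.
    command = build_s3fs_command(
        "clowder", "~/clowderfs", "~/.miniocred", "http://localhost:9000"
    )
    # Print the command so it can be reviewed and run in a shell.
    print(shlex.join(command))
```

The script only prints the command rather than running it, so you can check the bucket name and paths before mounting.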
6. Set the MINIO_MOUNTED_PATH Environment Variable
Set the MINIO_MOUNTED_PATH environment variable to the mounted directory:
MINIO_MOUNTED_PATH=~/clowderfs
Test the Download File API in PyClowder
Use the PyClowder API to test file downloads:
OR
Test with the Image-Classification-Dataset Extractor