How to mount a Google Cloud Storage (GCS) bucket as a folder

https://g.co/gemini/share/d2a1b5c3e021

Mounting a Google Cloud Storage (GCS) bucket as a folder in a Debian-based VM is typically done with Cloud Storage FUSE (gcsfuse). Once the bucket is mounted, your applications can read and write its objects through ordinary file-system paths, as if they were local files.

Here’s a step-by-step guide:

1. Prerequisites:

  • A Debian-based VM: This guide assumes you’re working on a Debian or Ubuntu VM in Google Cloud.
  • A Google Cloud Storage Bucket: You need an existing GCS bucket that you want to mount.
  • Appropriate IAM Permissions: The VM’s service account (or the user account if you’re using Application Default Credentials with gcloud auth application-default login) needs the necessary permissions to access the bucket. At a minimum, the Storage Object User role (roles/storage.objectUser) on the bucket is usually sufficient for read/write operations. For broader access, Storage Admin (roles/storage.admin) can be used.
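
The role grant described above can be sketched as a small wrapper around the gcloud CLI. This is a minimal sketch, not the only way to do it: the function name grant_bucket_access and the example service account email are hypothetical, and it assumes a gcloud version that includes the `gcloud storage` command group.

```shell
# Sketch: grant a service account access to a single bucket.
# grant_bucket_access is a hypothetical helper name; the bucket and
# service account values are placeholders you must substitute.
grant_bucket_access() {
  local bucket="$1"
  local sa_email="$2"
  # Default to the role named in the prerequisites above.
  local role="${3:-roles/storage.objectUser}"

  if [ -z "$bucket" ] || [ -z "$sa_email" ]; then
    echo "usage: grant_bucket_access BUCKET SA_EMAIL [ROLE]" >&2
    return 2
  fi

  gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member="serviceAccount:${sa_email}" \
    --role="${role}"
}

# Example call (substitute real values before running):
# grant_bucket_access YOUR_BUCKET_NAME my-vm-sa@my-project.iam.gserviceaccount.com
```

Granting the role on the bucket rather than on the whole project keeps the service account's access limited to the bucket you intend to mount.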

2. Install Cloud Storage FUSE (gcsfuse):

First, you need to install the gcsfuse utility on your Debian VM.

Bash

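# Add the Cloud Storage FUSE distribution URL as a package source,
# pinned to the key file imported in the next step
export GCSFUSE_REPO=gcsfuse-$(lsb_release -c -s)
echo "deb [signed-by=/usr/share/keyrings/cloud.google.asc] https://packages.cloud.google.com/apt $GCSFUSE_REPO main" | sudo tee /etc/apt/sources.list.d/gcsfuse.list

# Import the Google Cloud public key
# (apt-key is deprecated on current Debian and Ubuntu releases, so store the key in a keyring file instead)
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo tee /usr/share/keyrings/cloud.google.asc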

# Update the list of available packages
sudo apt-get update

# Install Cloud Storage FUSE and its dependency FUSE
sudo apt-get install fuse gcsfuse -y
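
After the install finishes, it can help to confirm that the tools are on your PATH and that the FUSE device exists before trying to mount anything. The following is a rough sketch; the helper names require_cmd and check_gcsfuse_install are hypothetical, and it assumes the fuse package provides the fusermount command, as it does on the Debian releases this guide targets.

```shell
# Sketch: sanity-check the installation from this step.
# require_cmd and check_gcsfuse_install are hypothetical helper names.

# Succeeds if the named command can be found by the shell.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Reports anything that is missing and returns non-zero if so.
check_gcsfuse_install() {
  local missing=0
  local cmd
  for cmd in gcsfuse fusermount; do
    if ! require_cmd "$cmd"; then
      echo "missing command: $cmd" >&2
      missing=1
    fi
  done
  if [ ! -e /dev/fuse ]; then
    echo "missing device: /dev/fuse" >&2
    missing=1
  fi
  return "$missing"
}

# Example: print the installed version only if everything is in place.
# check_gcsfuse_install && gcsfuse --version
```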

3. Authenticate Cloud Storage FUSE Requests:

Cloud Storage FUSE needs credentials to access your bucket. The most common methods are:

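  • Using the VM’s Service Account (Recommended for GCP VMs): If your VM is running in Google Cloud, it normally has an attached service account. Ensure this service account has the necessary GCS permissions (e.g., Storage Object User), and that the VM’s access scopes permit Cloud Storage access; the default scopes for the Compute Engine default service account allow only read access to Cloud Storage. gcsfuse will automatically use these credentials.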
  • Using Application Default Credentials: If you’re running outside of a Google Cloud VM or need to use a specific user’s credentials, you can authenticate using gcloud CLI:
    Bash

    gcloud auth application-default login
    

    Follow the prompts to complete the authentication in your browser. This creates local authentication credentials that gcsfuse can use.

  • Using a Service Account Key File (Less Recommended for VMs, more for local development/specific scenarios):
    1. Create a service account in your GCP project.
    2. Grant it the necessary GCS roles (e.g., Storage Object Admin).
    3. Create a new JSON key for the service account and download it.
    4. Copy this JSON key file to your VM.
    5. When mounting, you’ll specify the key file using --key-file /path/to/your-key.json.
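
To see which of the credential sources above is likely to be picked up on a given machine, you can walk the standard Application Default Credentials lookup order by hand. This is only a sketch: the function name credential_source is hypothetical, the lookup order (environment variable, then gcloud's well-known file, then the metadata server) is the documented ADC order, and the assumption that gcsfuse follows it exactly when no --key-file is given should be checked against your gcsfuse version.

```shell
# Sketch: report which credential source the standard ADC lookup
# would find first. credential_source is a hypothetical helper name.
credential_source() {
  # Location where "gcloud auth application-default login" stores credentials.
  local adc_file="${HOME}/.config/gcloud/application_default_credentials.json"

  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo "key file: ${GOOGLE_APPLICATION_CREDENTIALS}"
  elif [ -f "$adc_file" ]; then
    echo "gcloud application-default credentials: ${adc_file}"
  else
    echo "VM service account (metadata server)"
  fi
}

# On a Compute Engine VM, the attached service account can be looked up with:
# curl -s -H "Metadata-Flavor: Google" \
#   "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"
```

Running credential_source as the same user that will run gcsfuse matters, because both the environment variable and the gcloud credentials file are per-user.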

4. Create a Mount Point Directory:

Create a directory on your VM where you want to mount the GCS bucket.

Bash

mkdir ~/my-gcs-bucket-mount
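
If you script this step, a slightly more defensive version creates the directory only when needed and stops if it already contains files, since mounting over them would hide them. This is a minimal sketch; prepare_mount_point is a hypothetical helper name.

```shell
# Sketch: create the mount point if needed and refuse a non-empty one.
# prepare_mount_point is a hypothetical helper name.
prepare_mount_point() {
  local dir="$1"

  # -p makes this safe to re-run when the directory already exists.
  mkdir -p "$dir" || return 1

  # ls -A lists every entry except "." and ".."; any output means non-empty.
  if [ -n "$(ls -A "$dir")" ]; then
    echo "mount point is not empty: $dir" >&2
    return 1
  fi
}

# Example:
# prepare_mount_point "$HOME/my-gcs-bucket-mount"
```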

5. Mount the Bucket:

Now, use the gcsfuse command to mount your bucket. Replace YOUR_BUCKET_NAME with the actual name of your GCS bucket and ~/my-gcs-bucket-mount with your desired mount point.

Bash

gcsfuse YOUR_BUCKET_NAME ~/my-gcs-bucket-mount

If successful, you’ll see a message like “File system has been successfully mounted.”

Important gcsfuse Options:

  • --implicit-dirs: If your bucket has “simulated” directories (objects with / in their names), this option allows gcsfuse to infer these directories and list them. Highly recommended for most use cases.
  • --file-mode <octal_permissions>: Sets the default file permissions (e.g., 0644 for read/write by owner, read-only by others).
  • --dir-mode <octal_permissions>: Sets the default directory permissions (e.g., 0755 for read/write/execute by owner, read/execute by others).
  • --uid <user_id>: Sets the user ID of the owner of all files and directories in the mounted bucket.
  • --gid <group_id>: Sets the group ID of the owner of all files and directories in the mounted bucket.
  • -o allow_other: Allows users other than the mounting user to access the mounted bucket. gcsfuse takes this as a FUSE mount option passed with -o, not as a --allow-other flag. Use it with caution, as it has security implications. If you mount as a non-root user, you also need to enable user_allow_other in /etc/fuse.conf.

Example with common options:

Bash

gcsfuse --implicit-dirs --file-mode 0644 --dir-mode 0755 YOUR_BUCKET_NAME ~/my-gcs-bucket-mount
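
The --uid and --gid options take numeric IDs, which you can read from the id command instead of typing them by hand. The sketch below wraps the example above so that the mounted files are owned by whoever runs it; mount_bucket_for_me is a hypothetical helper name, and the mode values are simply the ones used in this guide.

```shell
# Sketch: mount a bucket so its files are owned by the current user.
# mount_bucket_for_me is a hypothetical helper name.
mount_bucket_for_me() {
  local bucket="$1"
  local dir="$2"

  # id -u and id -g print the numeric user and group IDs of the caller.
  gcsfuse --implicit-dirs \
    --uid "$(id -u)" --gid "$(id -g)" \
    --file-mode 0644 --dir-mode 0755 \
    "$bucket" "$dir"
}

# Example:
# mount_bucket_for_me YOUR_BUCKET_NAME "$HOME/my-gcs-bucket-mount"
```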

6. Verify the Mount:

You can verify that the bucket is mounted by listing its contents:

Bash

ls ~/my-gcs-bucket-mount

You should see the contents of your GCS bucket. You can also create files and directories within this mount point, and they will be reflected in your GCS bucket.
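
Listing the directory shows the bucket contents only if the mount actually succeeded; an empty listing can also mean nothing is mounted there. One way to tell the difference is to look the directory up in the kernel's mount table. This is a rough sketch: is_mounted is a hypothetical helper name, it assumes a Linux system with /proc/mounts, and it compares paths literally, so pass the absolute path without a trailing slash.

```shell
# Sketch: check whether a directory appears as a mount point.
# is_mounted is a hypothetical helper name. The second argument lets
# you point it at a different mount table (useful for testing).
is_mounted() {
  local dir="$1"
  local table="${2:-/proc/mounts}"

  # Field 2 of each mount-table line is the mount point.
  awk -v d="$dir" '$2 == d { found = 1 } END { exit (found ? 0 : 1) }' "$table"
}

# Example:
# if is_mounted "$HOME/my-gcs-bucket-mount"; then
#   echo "bucket is mounted"
# else
#   echo "nothing is mounted there" >&2
# fi
```

The mountpoint -q command from util-linux answers the same question if you prefer not to parse the table yourself.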

7. Persistent Mounting (Mount at Boot):

To make the mount persistent across VM reboots, you can add an entry to /etc/fstab.

  1. Unmount the bucket first if it’s currently mounted:
    Bash

    fusermount -u ~/my-gcs-bucket-mount
    
  2. Edit /etc/fstab:
    Bash

    sudo nano /etc/fstab
    
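  3. Add the following line to the end of the file. Replace YOUR_BUCKET_NAME and /home/your_username/my-gcs-bucket-mount with your actual values. The mount point must be written as an absolute path, because /etc/fstab does not expand ~.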
    YOUR_BUCKET_NAME /home/your_username/my-gcs-bucket-mount gcsfuse rw,noatime,implicit_dirs,allow_other,_netdev 0 0
    

    Explanation of options in /etc/fstab:

    • rw: Read/write access.
    • noatime: Don’t update access times on files. This can improve performance.
    • implicit_dirs: Important for correctly handling “simulated” directories in GCS.
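    • allow_other: Allows all users to access the mounted bucket. Because entries in /etc/fstab are mounted by root at boot, the files are owned by root unless you say otherwise; to give a regular user ownership or to tighten permissions, add uid=<user_id>,gid=<group_id>,file_mode=0644,dir_mode=0755, either alongside allow_other or in place of it. If you use allow_other and are not mounting as root, remember to enable user_allow_other in /etc/fuse.conf.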
    • _netdev: This crucial option ensures that the network is available before attempting to mount the GCS bucket. This prevents boot issues if the network isn’t ready.
    • 0 0: The last two numbers are for dump and fsck passes, which are usually set to 0 for FUSE filesystems.
  4. Save and exit the fstab file.
  5. Test the fstab entry:
    Bash

    sudo mount -a
    

    This command attempts to mount all filesystems listed in /etc/fstab that are not already mounted. If there are no errors, your configuration is likely correct.

  6. Reboot your VM to confirm the persistent mount works as expected.
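
Since a malformed /etc/fstab line can cause problems at boot, it may be safer to generate the entry and review it before pasting it in. The sketch below builds a line with the same options as the one above, plus explicit uid and gid values; fstab_line is a hypothetical helper name, and it only prints the line rather than editing /etc/fstab.

```shell
# Sketch: print an /etc/fstab entry for a gcsfuse mount.
# fstab_line is a hypothetical helper name; it never writes to /etc/fstab.
fstab_line() {
  local bucket="$1"
  local dir="$2"
  local uid="$3"
  local gid="$4"

  # /etc/fstab does not expand "~", so insist on an absolute path.
  case "$dir" in
    /*) ;;
    *)
      echo "mount point must be an absolute path: $dir" >&2
      return 1
      ;;
  esac

  printf '%s %s gcsfuse rw,noatime,implicit_dirs,allow_other,uid=%s,gid=%s,_netdev 0 0\n' \
    "$bucket" "$dir" "$uid" "$gid"
}

# Example: print the entry for the current user, then review it
# before adding it to /etc/fstab by hand.
# fstab_line YOUR_BUCKET_NAME "$HOME/my-gcs-bucket-mount" "$(id -u)" "$(id -g)"
```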

Troubleshooting Tips:

  • Permission Denied:
    • Ensure the service account or user has the correct IAM roles for the GCS bucket.
    • Check file-mode and dir-mode options if you’re experiencing issues with specific file/directory permissions.
    • If using allow_other, make sure user_allow_other is uncommented in /etc/fuse.conf.
  • gcsfuse: command not found: gcsfuse was not installed correctly or is not in your PATH. Double-check the installation steps.
  • Mount point not empty: If you’re trying to mount to a directory that already contains files, gcsfuse will refuse to mount it unless you specify the nonempty option (though it’s generally best to mount to an empty directory).
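  • Debugging: Run gcsfuse in the foreground (--foreground) with the --debug_fuse and --debug_gcs flags to get more detailed output if you’re facing issues. Recent releases deprecate these two flags in favor of a log severity setting (for example, --log-severity trace); check gcsfuse --help for the options your version supports.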
  • Check logs: Review your system logs (journalctl -xe or /var/log/syslog) for any gcsfuse related errors.
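
The user_allow_other check from the list above is easy to get wrong by eye, because the default /etc/fuse.conf on Debian ships with the line present but commented out. A small sketch of the check follows; has_user_allow_other is a hypothetical helper name.

```shell
# Sketch: succeed only if user_allow_other is enabled (uncommented).
# has_user_allow_other is a hypothetical helper name. The optional
# argument lets you check a file other than /etc/fuse.conf.
has_user_allow_other() {
  local conf="${1:-/etc/fuse.conf}"

  # Match the option alone on a line, allowing surrounding whitespace.
  # A leading "#" means the line is a comment and does not match.
  grep -Eq '^[[:space:]]*user_allow_other[[:space:]]*$' "$conf"
}

# Example:
# if ! has_user_allow_other; then
#   echo "user_allow_other is not enabled in /etc/fuse.conf" >&2
# fi
```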

By following these steps, you should be able to successfully mount your Google Cloud Storage bucket as a folder on your Debian-based VM.

 
