Storage
In my current setup, I have a single FUSE mountpoint that I can point software at, and the software is none the wiser about what it really is.
My /data mountpoint
My /data consists of:
- An ext2 partition on an mdraid stripe of 2x300GB NVMe SSDs for caching and things that need to be really fast, mounted as /scratch.
- A ZFS RAIDZ1 ("RAID5") of 3x8TB, mounted as /local. It has redundancy and is fast enough. This is where I store most things initially.
- A ZFS stripe of 2x4TB, mounted as /mnt/raid1. This has very fast sequential speeds, but low IOPS. This is for off-loading from the main pool.
- A ZFS stripe of 3x1TB + 1x2TB, mounted as /mnt/raid2. This has more disks with slower sequential speeds, but higher IOPS. This is also for off-loading from the main pool.
- Google Drive mounted as /remote. This has very high overhead per file, but unlimited storage (df will show 1PB). Great for long-term storage.
With this I combine insane speed with redundancy and higher IOPS, all in one mount, without any application knowing which disk holds what.
Local storage
I trust that you know how to set up your own local storage, but if you're looking to combine Google Drive with local storage, I suggest you mount the latter at a different location before starting. For instance, if you have /data now, move it to something like /local. You'll still be using it later to move things to Google, so you might as well use an easy-to-remember name.
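How you move it depends on your setup; as a purely illustrative sketch for ZFS (the pool name tank is hypothetical):
zfs set mountpoint=/local tank # remounts the pool from /data to /local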
Google Suite with Google Drive and Unlimited Storage
Please note that you can just sign up with 1 user, and it will give you unlimited storage, regardless of the 5-user requirement that they state on the website!
Getting Google Drive
On the Google Suite website, they sell "Google for Business" packages; you need a domain name to sign up. It includes e-mail and all the other Google features. You will need the most expensive per-user tier to get unlimited storage, which is about € 10,40 excl. VAT. Yeah, 10 bucks per month for unlimited storage, and no longer having to worry about redundancy, disk replacements and the like.
If you haven't already, just sign up with a domain you already have, or get one. Google also sells domains and lets you quickly manage things, but any local domain provider will do (including my favorite, TransIP).
What to do and not to do
I just use the Drive of the user itself, not a Team Drive. Team Drives can have additional benefits, but also additional disadvantages. I've seen many people run into problems with Team Drives, so be careful; I'd stick with the regular Drive of the user.
If you had the idea of putting a lot of media files on there, note that Google has stated that your Drive is like your local hard drive and that they don't care what you put on it. However, Google being Google, I'd recommend using encryption. rclone (which I'll get back to in a moment) comes with built-in encryption.
Do not use the "Share..." functionality on rclone data on Drive if you can avoid it; this has multiple reasons, which I won't get into here.
Create a folder on Drive in which to store your rclone stuff; I call mine "Backups".
Rate limits
- You can upload 1TB per authorized user per day.
- You can download 5TB per authorized user per day.
- There's a rate limit on actions per second, so you can't just keep hammering the API.
rclone is your friend in this.
Setting up rclone
You can get rclone on its website; Debian/Ubuntu packages are available. Keep it up-to-date! It's for your own good!
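For example, on Debian/Ubuntu (the packaged version can lag behind, so the install script on rclone's website is a good alternative):
sudo apt install rclone
rclone version # check that you're reasonably up-to-date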
Basics
This can seem a little complicated, and there is an official guide. What you basically do is this:
rclone config
- Choose n for "New remote".
- Give it a name; I call mine d, but you can pick whatever. You won't be using this remote often once you're using encryption!
- Tell rclone it's a "drive" storage (Google Drive).
- You can leave the client_id and client_secret empty, though you can set up your own (see the guide).
- Give rclone full access (option 1).
- Since you're probably working remotely, choose "headless" or N here. Copy/paste the URL into your browser, log in with your new Google Suite account and authorize it.
- Say no to "configure this as a team drive".
- It will print the config on the screen and ask you to accept.
You now have a remote that can access your Google Drive storage, but it is unencrypted.
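If you're curious, the resulting stanza in rclone's config file (run rclone config file to see where it lives) will look roughly like this, with the token filled in by the authorization step:
[d]
type = drive
scope = drive
token = {"access_token":"...","expiry":"..."}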
Encryption
Time to set up encryption by creating another remote (this also has a guide):
rclone config
- Choose n for "New remote".
- Give it a name; I call mine x since I type it quite often. You will be using this one relatively often, so pick something short and easy.
- Tell rclone it's a "crypt" storage ("Encrypt/Decrypt a remote").
- If you created that "Backups" folder on Drive earlier, now tell it to point to "d:Backups" (assuming you picked d in the one above).
- Encrypt filenames (option 2).
- Encrypt directory names (option 1).
- You can enter your own password and a pass phrase for the salt. YOU CANNOT CHANGE THESE LATER; PICK VERY SECURE ONES IF YOU DON'T LET IT GENERATE THEM FOR YOU.
- It will print the config on the screen and ask you to accept.
Now the "x" remote will be fully encrypted, and files will show up under hashed-looking names in the Backups folder of your Google Drive once you start putting things there.
Caching
Google has pretty strict rate limits and you don't want to (or need to!) ask it about everything on the Drive every single time.
If you're using the Drive in only one location (read: server), you can set up a pretty nice caching mechanism that rclone has. Do note that my setup requires about 25GB worth of (fast!) cache/scratch space, and I put mine on NVMe SSDs.
Anyway, time to configure it:
rclone config
- Choose n for "New remote".
- Give it a name; I call mine google since it's easy to remember. You won't be using this name very often, mostly for setting up a mount point after this.
- Tell rclone you want one of type "cache" ("Cache a remote").
- Point it to "x:" here, as that's your encrypted remote.
- The Plex options can be skipped; they're useless without Plex, and not terribly useful with Plex, either.
- I kept the chunk sizes at their defaults, and you can configure how much disk space the chunk cache gets. I use 25GB.
- It will print the config on the screen and ask you to accept.
Note that the cache only gets updated automatically by rclone if you use rclone to upload or change things directly through the cache remote. More on that later.
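The resulting stanza should be along these lines, with chunk_total_size matching the 25GB mentioned above:
[google]
type = cache
remote = x:
chunk_total_size = 25G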
In summary
We now have:
- Local storage mounted on /local.
- The actual unencrypted Drive available to us as remote d:.
- The Backups directory on Drive available to us with on-the-fly en-/decryption as remote x:.
- A cached version of that available to us as remote google:.
Testing
Some commands (in order) to check if everything is working:
rclone ls d:
rclone ls x:
rclone ls google:
mkdir /local/test
echo "This is a test." | tee /local/test/this_is_a_test.txt
rclone move /local/test google: # this moves the contents of /local/test into the root-level directory of google:
rclone cat google:this_is_a_test.txt
None should return errors, and the last command should show "This is a test.", pulled from the remote.
Mounting the remote
To test, you can do:
rclone mount --allow-other google: /remote
You can look there and you should find this_is_a_test.txt. You can delete it, too.
When you're done, do fusermount -uz /remote.
Persistent mount:
I don't recommend mounting it as root. I use a user data (uid/gid 1001004) for it. First, edit /etc/fuse.conf and make sure user_allow_other is enabled.
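That means uncommenting (or adding) a single line in /etc/fuse.conf:
user_allow_other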
Be sure to replace my specific settings with yours below (like uid/gid). Also adapt your cache paths to suit your needs. Pick a fast disk (that gets TRIMmed) and/or tmpfs (RAM).
Make sure /remote exists (and is empty), and then create a new systemd unit, as /etc/systemd/system/rclone.service, as follows:
[Unit]
Description=RClone Mount
AssertPathIsDirectory=/remote
[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/bin/rclone mount --config=/path/to/your/rclone/config/file --allow-other --log-file=/var/log/rclone.log --gid=1001004 --uid=1001004 --dir-cache-time 1h --fast-list --cache-chunk-path=/scratch/cache/rclone/chunks_mount --cache-db-path=/scratch/cache/rclone/db_mount --cache-dir=/scratch/cache/rclone/vfs google: /remote
ExecStop=/bin/fusermount -uz /remote
Restart=always
RestartSec=5
StartLimitInterval=60s
StartLimitBurst=3
[Install]
WantedBy=default.target
Do a systemctl daemon-reload, followed by a systemctl start rclone. If your Drive has grown big, it might take a while, but /remote should be populated soon, show up in df, and contain this_is_a_test.txt. If you're satisfied, make it start at boot by issuing systemctl enable rclone.
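A quick sanity check after starting the service:
systemctl status rclone
df -h /remote
ls /remote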
Another summary
You should now have two main mounts:
- /local with your, well, locally stored things.
- /remote, a cached and encrypted Google Drive mount.
Combining the two (or more) with MergerFS
For this, you will want to use mergerfs, which is available in Ubuntu as a package with that name. So: apt install mergerfs.
MergerFS has multiple advantages over the older UnionFS, of which one is hardlinking that just works. And it's more versatile.
Next, try:
mergerfs -o defaults,allow_other,use_ino,hard_remove,category.create=ff,category.action=ff,category.search=ff,fsname=data: /local:/remote /data
You can add other disks if you want, for instance my /mnt/raid1 and /mnt/raid2. Just :-separate them into the /local:/remote part there, as shown below. I appended mine after those two, as they're less important and I just manually do things on them.
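With those two pools appended, the invocation from above becomes:
mergerfs -o defaults,allow_other,use_ino,hard_remove,category.create=ff,category.action=ff,category.search=ff,fsname=data: /local:/remote:/mnt/raid1:/mnt/raid2 /data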
You should now have a combined mount of /local and /remote as /data. You can point applications at that and they won't know where's what.
Unmount it with fusermount -uz /data.
Next, add it to systemd in a similar fashion. Create /etc/systemd/system/merger.service and do this:
[Unit]
Description=MergerFS Mount
AssertPathIsDirectory=/data
After=rclone.service
Requires=rclone.service
PartOf=rclone.service
[Service]
Type=forking
User=data
Group=data
ExecStart=/usr/bin/mergerfs -o defaults,allow_other,use_ino,hard_remove,category.create=ff,category.action=ff,category.search=ff,fsname=data: /local:/remote /data
ExecStop=/bin/fusermount -uz /data
Restart=on-abort
RestartSec=5
StartLimitInterval=60s
StartLimitBurst=3
[Install]
WantedBy=default.target
Do a systemctl daemon-reload, followed by a systemctl start merger. It should be immediately available in /data and show up in df. If you're satisfied, make it start at boot by issuing systemctl enable merger.
Note: I make this service depend on rclone.service.
Using rclone to move data
I've been using it in many ways and for a long time; the way I eventually settled on is:
- Copy data from /local to /remote with rclone copy /local google:.
- Later, move data from /local to /remote with rclone move /local google:.
- I usually cherry-pick some big things manually rather than doing the whole thing in one go, but that's me. Or just copy and delete data manually, too.
I often add --stats=5s -v -v to see what it's doing or when it's hitting limits, and you can also specify a log file.
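For example, a fuller invocation could look like this (the log path is just an illustration; --bwlimit=8M is optional, but 8 MB/s works out to roughly 690GB per day, which keeps you safely under the 1TB/day upload limit):
rclone move /local google: --stats=5s -v -v --log-file=/var/log/rclone_move.log --bwlimit=8M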
LEAVE MOVING TO TRASH ENABLED
LEAVE THE TRASH ENABLED; DO NOT DISABLE RCLONE'S "MOVE TO TRASH". It's easy to make mistakes, and the Trash is sometimes the only way to fix them.