How to Build a Dynamic Filesystem With FUSE and Node.js: A Practical Approach

13 Jun 2024

Do you ever wonder what happens when you run sshfs user@remote:~/ /mnt/remoteroot? How do files from a remote server appear on your local system and synchronize so quickly? Have you heard of WikipediaFS, which allows you to edit a Wikipedia article as if it were a file in your filesystem? It's not magic—it's the power of FUSE (Filesystem in Userspace). FUSE lets you create your own filesystem without needing deep knowledge of the OS kernel or low-level programming languages.

This article introduces a practical solution using FUSE with Node.js and TypeScript. We will explore how FUSE works under the hood and demonstrate its application by solving a real-world task. Join me on an exciting adventure into the world of FUSE and Node.js.

Introduction

I was responsible for media files (primarily images) in my work. This includes many things: side- or top-banners, media in chats, stickers, etc. Of course, there are a lot of requirements for these, such as "banner is PNG or WEBP, 300x1000 pixels." If the requirements are unmet, our back office will not let an image through. And an object deduplication mechanism is there: no image can enter the same river twice.

This leads us to a situation where we have a massive set of images for testing purposes. I used shell one-liners or aliases to make my life easier.

For instance:

convert -size 300x1000 xc:gray +noise random /tmp/out.png

Example of the noise image

A combination of bash and convert is a great tool, but obviously, this is not the most convenient way to address the problem. Discussing the QA team’s situation reveals further complications. Apart from the appreciable time spent on image generation, the first question when we investigate a problem is "Are you sure you uploaded a unique image?" I believe you understand how annoying this is.

Let’s Choose the Tech We Would Like to Use

You could take a simple approach: create a web service that serves a route with a self-explanatory file, like GET /image/1000x100/random.zip?imagesCount=100. The route would return a ZIP file with a set of unique images. This sounds good, but it doesn’t address our main issue: all uploaded files need to be unique for testing.

Your next thought might be "Can we replace a payload when sending it?" The QA team uses Postman for API calls. I investigated Postman internals and realized we can't change the request body "on the fly."

Another solution is to replace a file in the file system each time something tries to read it. Linux has a notification subsystem called inotify, which alerts you about file system events: it can fire an event when a directory is changed, a file is renamed, a file is opened, and so on. If you have ever seen "Visual Studio Code is unable to watch for file changes in this large workspace," you have already run into an inotify limit.

So, the plan is:

  1. Listening to the IN_OPEN event and counting file descriptors.

  2. Listening to the IN_CLOSE event; if the count drops to 0, we will replace the file.
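The counting logic behind this plan can be sketched in a few lines. This is only an illustration: the FileEvent shape, the OpenCounter class, and the replaceFile callback are hypothetical names of mine, and the events are plain objects rather than real inotify notifications.

```typescript
// Hypothetical event shape. A real program would receive IN_OPEN / IN_CLOSE
// from an inotify binding; plain objects keep this sketch runnable anywhere.
type FileEvent = { kind: 'open' | 'close'; path: string };

class OpenCounter {
  // Number of currently open descriptors per path
  private readonly counts = new Map<string, number>();

  // replaceFile is called when the last open descriptor of a path is closed
  constructor(private readonly replaceFile: (path: string) => void) {}

  handle(event: FileEvent): void {
    const current = this.counts.get(event.path) ?? 0;

    if (event.kind === 'open') {
      this.counts.set(event.path, current + 1);
      return;
    }

    // Ignore a close for which we never saw the matching open
    if (current === 0) return;

    if (current === 1) {
      this.counts.delete(event.path);
      this.replaceFile(event.path);
    } else {
      this.counts.set(event.path, current - 1);
    }
  }
}

// Two overlapping readers: the file is replaced only after the second close
const replaced: string[] = [];
const counter = new OpenCounter((path) => {
  replaced.push(path);
});
counter.handle({ kind: 'open', path: '/tmp/out.png' });
counter.handle({ kind: 'open', path: '/tmp/out.png' });
counter.handle({ kind: 'close', path: '/tmp/out.png' }); // still one reader left
counter.handle({ kind: 'close', path: '/tmp/out.png' }); // replaced is now ['/tmp/out.png']
```

Note that both readers in this sketch saw the same content, and that a file which always has at least one open descriptor is never replaced.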

Sounds good, but there are a couple of problems with this:

  • Only Linux supports inotify.
  • Parallel requests to the file would receive the same data.
  • If a file is under constant I/O, the open count never drops to 0, so the replacement never happens.
  • If a service that serves Inotify events crashes, the files will stay in the user file system.

To address these problems, we can write our own file system. But there is another problem: a regular file system runs in OS kernel space. Writing one requires knowledge of the OS kernel and a language such as C or Rust. Also, we would have to write a specific module (driver) for each kernel.

Therefore, writing a kernel file system is overkill for the problem we want to solve, even if there is a long weekend ahead. Fortunately, there is a way to tame this beast: Filesystem in Userspace (FUSE). FUSE is a project that lets you create file systems without editing kernel code. This means that any program or script can use FUSE to present itself as a flash drive, hard drive, or SSD, without any complex kernel-related logic.

In other words, an ordinary userspace process can create its own file system, which can be accessed normally through any ordinary program you wish – Nautilus, Dolphin, ls, etc.

Why is FUSE good for covering our requirements? FUSE-based file systems are built on top of userspace processes. Therefore, you can use any language you know that has a binding to libfuse. Also, you get a cross-platform solution with FUSE.

I have had a lot of experience with Node.js and TypeScript, and I would like to choose this (wonderful) combination as an execution environment for our brand-new FS. Furthermore, TypeScript provides an excellent object-oriented base. This will allow me to show you not only the source code, which you can find in the public GitHub repo, but also the structure of the project.

Deep Dive Into FUSE

Let me provide a telling quote from the official FUSE page:

FUSE is a userspace filesystem framework. It consists of a kernel module (fuse.ko), a userspace library (libfuse.*), and a mount utility (fusermount).

A framework for writing file systems sounds exciting.

I should explain what each FUSE part means:

  1. fuse.ko does all the kernel-related low-level work, so we don't have to modify the OS kernel ourselves.

  2. libfuse is a library that provides a high-level layer for communication with fuse.ko.

  3. fusermount allows users to mount/unmount userspace file systems (call me Captain Obvious!).

The general principles look like this:
The general principles of FUSE

The userspace process (ls in this case) makes a request to the kernel's Virtual File System, which routes the request to the FUSE kernel module. The FUSE module, in turn, routes the request back to userspace, to the file system implementation (./hello in the picture above).

Don't be deceived by the Virtual File System name. It isn't directly related to FUSE. It is the software layer in the kernel that provides the filesystem interface to userspace programs. For the sake of simplicity, you can perceive it as a Composite pattern.

libfuse offers two types of APIs: high-level and low-level. They have similarities but crucial differences. The low-level one is asynchronous and works only with inodes. Asynchronous, in this case, means that a client that uses low-level API should call the response methods by itself.

The high-level one provides the ability to use convenient paths (for example, /etc/shadow) instead of more "abstract" inodes and returns responses synchronously. In this article, I will explain how the high-level API works and leave the low-level API and inodes aside.

If you want to implement your own file system, you should implement a set of methods responsible for requests serving from VFS. The most common methods are:

  • open(path, accessFlags): fd -- open a file by path. The method returns a numeric identifier, the so-called file descriptor (fd from here on). The access flags are a bit mask that describes which operation the client program wants to perform (read-only, write-only, read-write, execute, or search).

  • read(path, fd, Buffer, size, offset): count of bytes read -- read size bytes from a file linked with fd File Descriptor to the passed Buffer. The path argument is ignored because we will use fd.

  • write(path, fd, Buffer, size, offset): count of bytes written -- write size bytes from the Buffer to a file linked with fd.

  • release(fd) -- close the fd.

  • truncate(path, size) -- change a file size. The method should be defined if you want to rewrite files (and we do).

  • getattr(path) -- returns file attributes such as size, creation time, access time, etc. This is the method the file system calls most often, so make sure your implementation is fast.

  • readdir(path) -- list the entries (files and subdirectories) of a directory.

The methods above are vital for every fully operable file system built on top of the high-level FUSE API. The list is not complete, though; you can find the full list at https://libfuse.github.io/doxygen/structfuse__operations.html
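To make the "binary mask" wording from the open description concrete, here is a minimal sketch of how the read/write part of the access flags can be decoded. It assumes the Linux values O_RDONLY = 0, O_WRONLY = 1, O_RDWR = 2, and O_ACCMODE = 3; the accessMode helper is my own name, not part of libfuse or node-fuse-bindings.

```typescript
// Linux values of the access-mode part of the open(2) flags (assumed here)
const O_ACCMODE = 3; // mask that selects the two lowest bits
const O_RDONLY = 0;
const O_WRONLY = 1;
const O_RDWR = 2;

type AccessMode = 'read-only' | 'write-only' | 'read-write';

// Hypothetical helper: extracts the access mode and ignores all other flag bits
function accessMode(flags: number): AccessMode {
  switch (flags & O_ACCMODE) {
    case O_RDONLY:
      return 'read-only';
    case O_WRONLY:
      return 'write-only';
    case O_RDWR:
      return 'read-write';
    default:
      throw new Error(`unexpected access mode in flags ${flags}`);
  }
}

// 32768 has only a high bit set, so the access-mode bits are 0
const plainRead = accessMode(32768); // 'read-only'
// 513 = 512 + 1: an extra flag bit plus O_WRONLY
const writeWithExtraFlag = accessMode(513); // 'write-only'
```

The remaining bits carry other options, which is why the raw flags value logged by a FUSE handler is usually not a small number.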

To revisit the concept of a file descriptor: In UNIX-like systems, including macOS, a file descriptor is an abstraction for files and other I/O resources like sockets and pipes. When a program opens a file, the OS returns a numerical identifier called a file descriptor. This integer serves as an index in the OS's file descriptor table for each process. When implementing a filesystem using FUSE, we will need to generate file descriptors ourselves.

Let's consider call flow when the client opens a file:

  1. getattr(path: /random.png) → { size: 98 }; the client got the file size.

  2. open(path: /random.png) → 10; opened file by path; FUSE implementation returns a file descriptor number.

  3. read(path: /random.png, fd: 10, buffer, size: 50, offset: 0) → 50; read the first 50 bytes.

  4. read(path: /random.png, fd: 10, buffer, size: 50, offset: 50) → 48; requested the next 50 bytes, but only 48 were read because the file is 98 bytes long.

  5. release(10); all data was read, so close the fd.
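The same flow can be replayed against a tiny in-memory stand-in. This is a sketch under my own assumptions: the TinyFs class is hypothetical, it is synchronous, it omits the path argument of read, and it has nothing to do with libfuse itself. It only reproduces the numbers from the list above.

```typescript
class TinyFs {
  // Start at 10 only to mirror the example above
  private nextFd = 10;
  private readonly openFiles = new Map<number, Uint8Array>();

  constructor(private readonly files: Map<string, Uint8Array>) {}

  private mustFind(path: string): Uint8Array {
    const data = this.files.get(path);
    if (!data) throw new Error(`ENOENT: ${path}`);
    return data;
  }

  getattr(path: string): { size: number } {
    return { size: this.mustFind(path).length };
  }

  open(path: string): number {
    const fd = this.nextFd++;
    this.openFiles.set(fd, this.mustFind(path));
    return fd;
  }

  // Copies up to `size` bytes starting at `offset` and returns the copied count
  read(fd: number, buffer: Uint8Array, size: number, offset: number): number {
    const data = this.openFiles.get(fd);
    if (!data) throw new Error(`EBADF: ${fd}`);
    const chunk = data.subarray(offset, offset + Math.min(size, buffer.length));
    buffer.set(chunk);
    return chunk.length;
  }

  release(fd: number): void {
    this.openFiles.delete(fd);
  }
}

const tinyFs = new TinyFs(
  new Map<string, Uint8Array>([['/random.png', new Uint8Array(98)]]),
);
const attrs = tinyFs.getattr('/random.png'); // { size: 98 }
const fd = tinyFs.open('/random.png'); // 10
const buffer = new Uint8Array(50);
const firstRead = tinyFs.read(fd, buffer, 50, 0); // 50
const secondRead = tinyFs.read(fd, buffer, 50, 50); // 48
tinyFs.release(fd);
```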

Let's Write a Minimum-Viable Product and Check Postman’s Reaction to It

Our next step is to develop a minimal file system based on libfuse to test how Postman will interact with a custom filesystem.

Acceptance requirements for the FS are straightforward: The root of the FS should contain a random.txt file, whose content should be unique each time it is read (let's call this "always unique read"). The content should contain a random UUID and a current time in ISO format, separated by a new line. For example:

 3790d212-7e47-403a-a695-4d680f21b81c
 2012-12-12T04:30:30

The minimal product will consist of two parts. The first is a simple web service that will accept HTTP POST requests and print a request body to the terminal. The code is quite simple and isn't worth our time, mainly because the article is about FUSE, not Express. The second part is the implementation of the file system that meets the requirements. It has only 83 lines of code.

For the code, we will use the node-fuse-bindings library, which provides bindings to the high-level API of libfuse.

You can skip the code below; a summary of it follows the listing.

const crypto = require('crypto');
const fuse = require('node-fuse-bindings');

// MOUNT_PATH is the path where our filesystem will be available. For Windows, this will be a path like 'D://'
const MOUNT_PATH = process.env.MOUNT_PATH || './mnt';

function getRandomContent() {
  const txt = [crypto.randomUUID(), new Date().toISOString(), ''].join('\n');
  return Buffer.from(txt);
}

function main() {
  // fdCounter is a simple counter that increments each time a file is opened
  // using this we can get the file content, which is unique for each opening
  let fdCounter = 0;

  // fd2ContentMap is a map that stores file content by fd
  const fd2ContentMap = new Map();

  // Postman does not work reliably if we give it a file with size 0 or just the wrong size,
  // so we precompute the file size
  // it is guaranteed that the file size will always be the same within one run, so there will be no problems with this
  const randomTxtSize = getRandomContent().length;

  // fuse.mount is a function that mounts the filesystem
  fuse.mount(
    MOUNT_PATH,
    {
      readdir(path, cb) {
        console.log('readdir(%s)', path);

        if (path === '/') {
          return cb(0, ['random.txt']);
        }

        return cb(0, []);
      },
      getattr(path, cb) {
        console.log('getattr(%s)', path);

        if (path === '/') {
          return cb(0, {
            // mtime is the file modification time
            mtime: new Date(),
            // atime is the file access time
            atime: new Date(),
            // ctime is the metadata or file content change time
            ctime: new Date(),
            size: 100,
            // mode is a bit mask that defines the type of the file
            // and the access rights to it for different types of users
            // 16877 is 0o40755: a directory with rwxr-xr-x permissions
            mode: 16877,
            // file owners
            // in our case, it will be the owner of the current process
            uid: process.getuid(),
            gid: process.getgid(),
          });
        }

        if (path === '/random.txt') {
          return cb(0, {
            mtime: new Date(),
            atime: new Date(),
            ctime: new Date(),
            size: randomTxtSize,
            // 33188 is 0o100644: a regular file with rw-r--r-- permissions
            mode: 33188,
            uid: process.getuid(),
            gid: process.getgid(),
          });
        }

        cb(fuse.ENOENT);
      },
      open(path, flags, cb) {
        console.log('open(%s, %d)', path, flags);

        if (path !== '/random.txt') return cb(fuse.ENOENT, 0);

        const fd = fdCounter++;
        fd2ContentMap.set(fd, getRandomContent());
        cb(0, fd);
      },
      read(path, fd, buf, len, pos, cb) {
        console.log('read(%s, %d, %d, %d)', path, fd, len, pos);

        const buffer = fd2ContentMap.get(fd);
        if (!buffer) {
          return cb(fuse.EBADF);
        }

        // subarray returns a view without copying (Buffer#slice is deprecated)
        const slice = buffer.subarray(pos, pos + len);
        slice.copy(buf);

        return cb(slice.length);
      },
      release(path, fd, cb) {
        console.log('release(%s, %d)', path, fd);

        fd2ContentMap.delete(fd);
        cb(0);
      },
    },
    function (err) {
      if (err) throw err;
      console.log('filesystem mounted on ' + MOUNT_PATH);
    },
  );
}

// Handle the SIGINT signal separately to correctly unmount the filesystem
// Without this, the filesystem will not be unmounted and will hang in the system
// If for some reason unmount was not called, you can forcibly unmount the filesystem using the command
// fusermount -u ./MOUNT_PATH
process.on('SIGINT', function () {
  fuse.unmount(MOUNT_PATH, function () {
    console.log('filesystem at ' + MOUNT_PATH + ' unmounted');
    process.exit();
  });
});

main();

I suggest refreshing our knowledge about permission bits in a file. Permission bits are a set of bits that are associated with a file; they are a binary representation of who is allowed to read/write/execute the file. "Who" includes three groups: the owner, the owner group, and others.

Permissions can be set for each group separately. Each group's permissions are represented by a three-bit number: read (4, or '100' in binary), write (2, or '010'), and execute (1, or '001'). Adding these numbers together gives a combined permission. For example, 4 + 2 (or '100' + '010') makes 6 ('110'), which means read + write (RW) permission.

Suppose the file owner has an access mask of 7 (111 in binary, meaning read, write, and execute), the group has 5 (101, meaning read and execute), and others have 4 (100, meaning read-only). The complete access mask for the file is then 754 in octal. Bear in mind that for directories, the execute bit means permission to enter (search) the directory rather than to run it.
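The mode values 16877 and 33188 from the MVP code are easier to read in octal. The sketch below decodes them; permissionString and describeMode are hypothetical helper names of mine, and the S_IF* constants are the standard POSIX file type bits.

```typescript
// Standard POSIX file type bits
const S_IFMT = 0o170000; // mask that selects the file type
const S_IFDIR = 0o040000; // directory
const S_IFREG = 0o100000; // regular file

// Renders the lowest nine bits as the familiar rwxrwxrwx string
function permissionString(mode: number): string {
  const letters = 'rwxrwxrwx';
  let result = '';
  for (let i = 0; i < 9; i++) {
    const bit = 1 << (8 - i);
    result += (mode & bit) !== 0 ? letters.charAt(i) : '-';
  }
  return result;
}

// Prefixes the permissions with a type character, like `ls -l` does
function describeMode(mode: number): string {
  const type = mode & S_IFMT;
  const prefix = type === S_IFDIR ? 'd' : type === S_IFREG ? '-' : '?';
  return prefix + permissionString(mode);
}

const rootMode = describeMode(16877); // 'drwxr-xr-x'
const fileMode = describeMode(33188); // '-rw-r--r--'
const example754 = permissionString(0o754); // 'rwxr-xr--'
const rootModeOctal = (16877).toString(8); // '40755'
```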

Let's go back to the file system implementation and make a text version of this: Each time a file is opened (via an open call), the integer counter increments, producing the file descriptor returned by the open call. Random content is then created and saved in a key-value store with the file descriptor as the key. When a read call is made, the corresponding content portion is returned.

Upon a release call, the content is deleted. Remember to handle SIGINT to unmount the filesystem after pressing Ctrl+C. Otherwise, we'll have to do it manually in the terminal using fusermount -u ./MOUNT_PATH.

Now, jump into testing. We run the web server, then create an empty folder as a root folder for the upcoming FS, and run the main script. After the "Server listening on port 3000" line prints, open Postman, and send a couple of requests to the web-server in a row without changing any parameters.
Left side is the FS, right one is the web-server

Everything looks good! Each request has unique file content, as we foresaw. The logs also prove that the flow of file open calls described above in the "Deep dive into FUSE" section is correct.

The GitHub repo with MVP: https://github.com/pinkiesky/node-fuse-mvp. You can run this code on your local environment or use this repo as a boilerplate for your own file system implementation.

The Core Idea

The approach is validated; now it’s time for the primary implementation.

Before the "always unique read" implementation, the first thing we should implement is create and delete operations for original files. We will implement this interface through a directory within our virtual filesystem. The user will put original images that they want to make "always unique" or "randomized" into that directory, and the filesystem will prepare the rest.

Here and in the following sections, "always unique read", "random image," or "random file" refers to a file that returns unique content in a binary sense each time it is read, while visually, it remains as similar as possible to the original.

The file system's root will contain two directories: Image Manager and Images. The first one is a folder for managing the user's original files (you can think of it as a CRUD repository). The second one is the unmanaged directory from the user's point of view that contains random images.
User interaction with the file system

FS tree as terminal output

As you can see in the image above, we will implement not only "always unique" images but also a file converter! That's an added bonus.

The core idea of our implementation is that the program will contain an object tree, with each node and leaf providing common FUSE methods. When the program receives an FS call, it should find a node or a leaf in the tree by the corresponding path. For example, the program gets the getattr(/Images/1/original/) call and then tries to find the node to which the path is addressed.

Something like this: FS tree example
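A minimal sketch of such a lookup is shown below. It is not the code from the repository: the TreeNode shape, the file and dir helpers, findNode, and the example tree are assumptions made for illustration, and children is synchronous here to keep the example short.

```typescript
interface TreeNode {
  name: string;
  isLeaf: boolean;
  // A method rather than an array, so children can be produced on demand
  children(): TreeNode[];
}

// Hypothetical helpers for building an example tree
const file = (name: string): TreeNode => ({ name, isLeaf: true, children: () => [] });
const dir = (name: string, makeChildren: () => TreeNode[]): TreeNode => ({
  name,
  isLeaf: false,
  children: makeChildren,
});

// '/Images/1/original/' -> ['Images', '1', 'original']
function splitPath(path: string): string[] {
  return path.split('/').filter((part) => part !== '');
}

// Walks the tree one path segment at a time; returns null if nothing matches
function findNode(root: TreeNode, path: string): TreeNode | null {
  let current = root;
  for (const part of splitPath(path)) {
    if (current.isLeaf) return null; // a file has no children to descend into
    const next = current.children().find((child) => child.name === part);
    if (!next) return null;
    current = next;
  }
  return current;
}

// Example tree (the contents are made up)
const root = dir('', () => [
  dir('Image Manager', () => [file('image.jpg')]),
  dir('Images', () => [dir('1', () => [dir('original', () => [])])]),
]);

const found = findNode(root, '/Images/1/original/'); // the 'original' node
const missing = findNode(root, '/Images/2'); // null
```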

The next question is how we will store the original images. An image in the program will consist of binary data and meta information (a meta includes an original filename, file mime-type, etc.). Binary data will be stored in binary storage. Let's simplify it and build binary storage as a set of binary files in the user (or the host) file system. Meta information will be stored similarly: JSON inside text files in the user file system.

As you may remember, in the "Let's write a minimum-viable product" section, we created a file system that returns a text file by a template. It contains a random UUID plus a current date, so the data's uniqueness wasn't the problem—uniqueness was achieved by the data's definition. However, from this point, the program should work with preloaded user images. So, how can we create images that are similar but always unique (in terms of bytes and consequently hashes) based on the original one?

The solution I suggest is quite simple. Let's put an RGB noise square in the top-left corner of an image. The noise square should be 16x16 pixels. This provides almost the same picture but guarantees a unique sequence of bytes. Will it be enough to ensure a lot of different images? Let's do some math. A 16×16 square contains 256 RGB pixels. Each pixel has 256×256×256 = 16,777,216 variants.

Thus, the count of unique squares is 16,777,216^256 -- a number with 1,850 digits, which is much more than the number of atoms in the observable universe. Does that mean we can reduce the square size? Unfortunately, lossy compression like JPEG would significantly reduce the number of unique squares, so I keep the square at 16x16 to stay on the safe side.
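The digit count is easy to check with logarithms: a positive integer N has floor(log10(N)) + 1 decimal digits, and log10(16,777,216^256) = 256 × log10(16,777,216). The helper name below is my own.

```typescript
// Number of decimal digits in (256^3)^(side*side), computed via logarithms
// so that the huge number itself never has to be constructed
function digitsInVariantCount(side: number): number {
  const pixels = side * side;
  const variantsPerPixel = 256 ** 3; // 16,777,216
  return Math.floor(pixels * Math.log10(variantsPerPixel)) + 1;
}

const digitsFor16 = digitsInVariantCount(16); // 1850
const digitsFor4 = digitsInVariantCount(4); // 116
```

For comparison, the commonly quoted estimate for the number of atoms in the observable universe is around 10^80, a number with 81 digits. Even a 4x4 square exceeds that on paper, so the raw count is not what limits the square size.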

Example of images with noise squares
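As an illustration of the idea, here is a sketch that overwrites the top-left square of a raw, tightly packed RGB pixel array. It is not the code from the repository: applyNoiseSquare is a hypothetical function, and a real implementation would first decode the image and re-encode it afterwards with an image library.

```typescript
// pixels: tightly packed RGB data, 3 bytes per pixel, row by row
function applyNoiseSquare(
  pixels: Uint8Array,
  width: number,
  height: number,
  side: number = 16,
  // Injectable so the behavior can be made deterministic in tests
  randomByte: () => number = () => Math.floor(Math.random() * 256),
): Uint8Array {
  if (pixels.length !== width * height * 3) {
    throw new Error('expected tightly packed RGB data');
  }

  // Work on a copy so the original image stays untouched
  const result = pixels.slice();
  const squareWidth = Math.min(side, width);
  const squareHeight = Math.min(side, height);

  for (let y = 0; y < squareHeight; y++) {
    for (let x = 0; x < squareWidth; x++) {
      const offset = (y * width + x) * 3;
      result[offset] = randomByte(); // R
      result[offset + 1] = randomByte(); // G
      result[offset + 2] = randomByte(); // B
    }
  }
  return result;
}

// A black 32x32 image gets a random 16x16 square in its top-left corner
const blackImage = new Uint8Array(32 * 32 * 3);
const noisyImage = applyNoiseSquare(blackImage, 32, 32);
```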

Passthrough Over Classes

The Tree
UML class diagram showing interfaces and classes for a FUSE-based system. Includes interfaces IFUSEHandler, ObjectTreeNode, and IFUSETreeNode, with FileFUSETreeNode and DirectoryFUSETreeNode implementing IFUSETreeNode. Each interface and class lists attributes and methods, illustrating their relationships and hierarchy

IFUSEHandler is an interface that serves common FUSE calls. You can see that I replaced read/write with readAll/writeAll, respectively. I did this to simplify read and write operations: since IFUSEHandler reads and writes a file as a whole, we can move the partial read/write logic to another place. This means IFUSEHandler does not need to know anything about file descriptors, binary data, etc.

The same thing happened with the open FUSE method as well. A notable aspect of the tree is that it is generated on demand. Instead of storing the entire tree in memory, the program creates nodes only when they are accessed. This behavior allows the program to avoid a problem with tree rebuilding in case of node creation or removal.

Check the ObjectTreeNode interface, and you will find that children is not an array but a method, so this is how they are generated on demand. FileFUSETreeNode and DirectoryFUSETreeNode are abstract classes where some methods throw a NotSupported error (obviously, FileFUSETreeNode should never implement readdir).
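A simplified sketch of that arrangement follows. The class names, the reduced set of methods, and the synchronous signatures are my assumptions; the real interfaces in the repository are larger.

```typescript
class NotSupportedError extends Error {
  constructor(operation: string) {
    super(`${operation} is not supported by this node`);
    this.name = 'NotSupportedError';
  }
}

interface TreeNode {
  readonly name: string;
  readonly isLeaf: boolean;
  children(): TreeNode[];
  readAll(): Uint8Array;
}

// Base class for files: listing children makes no sense for them
abstract class FileNode implements TreeNode {
  readonly isLeaf = true;
  constructor(readonly name: string) {}
  children(): TreeNode[] {
    throw new NotSupportedError('children');
  }
  abstract readAll(): Uint8Array;
}

// Base class for directories: reading content makes no sense for them
abstract class DirectoryNode implements TreeNode {
  readonly isLeaf = false;
  constructor(readonly name: string) {}
  abstract children(): TreeNode[];
  readAll(): Uint8Array {
    throw new NotSupportedError('readAll');
  }
}

class StaticFile extends FileNode {
  constructor(name: string, private readonly content: Uint8Array) {
    super(name);
  }
  readAll(): Uint8Array {
    return this.content;
  }
}

// Children are rebuilt from the current id list on every call
class NumberedDirectory extends DirectoryNode {
  constructor(name: string, private readonly ids: () => string[]) {
    super(name);
  }
  children(): TreeNode[] {
    return this.ids().map((id) => new StaticFile(`${id}.bin`, new Uint8Array(0)));
  }
}

const currentIds = ['1'];
const imagesDir = new NumberedDirectory('Images', () => currentIds);
const namesBefore = imagesDir.children().map((child) => child.name); // ['1.bin']
currentIds.push('2'); // no tree rebuild needed
const namesAfter = imagesDir.children().map((child) => child.name); // ['1.bin', '2.bin']
```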

FUSEFacade

UML class diagram showing interfaces and their relationships for a FUSE system. The diagram includes the IFUSEHandler, IFUSETreeNode, IFileDescriptorStorage interfaces, and the FUSEFacade class. IFUSEHandler has attributes name and methods checkAvailability, create, getattr, readAll, remove, and writeAll. IFileDescriptorStorage has methods get, openRO, openWO, and release. IFUSETreeNode extends IFUSEHandler. FUSEFacade includes constructor, create, getattr, open, read, readdir, release, rmdir, safeGetNode, unlink, and write methods, and interacts with both IFUSETreeNode and IFileDescriptorStorage.

FUSEFacade is the most crucial class that implements the program's main logic and binds different parts together. node-fuse-bindings has a callback-based API, but FUSEFacade methods are Promise-based. To address this inconvenience, I used code like this:

  const handleResultWrapper = <T>(
    promise: Promise<T>,
    cb: (err: number, result: T) => void,
  ) => {
    promise
      .then((result) => {
        cb(0, result);
      })
      .catch((err) => {
        if (err instanceof FUSEError) {
          fuseLogger.info(`FUSE error: ${err}`);
          return cb(err.code, null as T);
        }

        fuseLogger.warn(err);
        cb(fuse.EIO, null as T);
      });
  };

// Ex. usage: 
// open(path, flags, cb) {
//   handleResultWrapper(fuseFacade.open(path, flags), cb);
// },

The FUSEFacade methods are wrapped in handleResultWrapper. Each method of FUSEFacade that uses a path simply parses the path, finds a node in the tree, and calls the requested method.

Consider a couple of methods from the FUSEFacade class.

async create(path: string, mode: number): Promise<number> {
  this.logger.info(`create(${path})`);

  // Convert the path `/Image Manager/1/image.jpg` into
  //   `['Image Manager', '1', 'image.jpg']`
  // splitPath will throw an error if something goes wrong
  const parsedPath = this.splitPath(path); // `['Image Manager', '1', 'image.jpg']`
  const name = parsedPath.pop()!; // 'image.jpg'

  // Get node by path (`/Image Manager/1` after `pop` call)
  //   or throw an error if node not found
  const node = await this.safeGetNode(parsedPath);

  // Call the IFUSEHandler method. Pass only a name, not a full path!
  await node.create(name, mode);
  // Create a file descriptor
  const fdObject = this.fdStorage.openWO();

  return fdObject.fd;
}

async readdir(path: string): Promise<string[]> {
  this.logger.info(`readdir(${path})`);

  const node = await this.safeGetNode(path);
  // As you see, the tree is generated on the fly
  return (await node.children()).map((child) => child.name);
}

async open(path: string, flags: number): Promise<number> {
  this.logger.info(`open(${path}, ${flags})`);

  const node = await this.safeGetNode(path);

  // Only a leaf node is a file; a non-leaf node is a directory and cannot be opened
  if (!node.isLeaf) {
    throw new FUSEError(fuse.EACCES, 'invalid path');
  }

  // Usually checkAvailability checks access
  await node.checkAvailability(flags);

  // Get node content and put it in created file descriptor
  const fileData: Buffer = await node.readAll();
  // fdStorage is IFileDescriptorStorage, we will consider it below
  const fdObject = this.fdStorage.openRO(fileData);

  return fdObject.fd;
}

A File Descriptor

Before taking the next step, let’s take a closer look at what a file descriptor is in the context of our program.

UML class diagram showing interfaces and their relationships for file descriptors in a FUSE system. The diagram includes the IFileDescriptor, IFileDescriptorStorage interfaces, and the ReadWriteFileDescriptor, ReadFileDescriptor, and WriteFileDescriptor classes. IFileDescriptor has attributes binary, fd, size, and methods readToBuffer, writeToBuffer. IFileDescriptorStorage has methods get, openRO, openWO, and release. ReadWriteFileDescriptor implements IFileDescriptor with additional constructor, readToBuffer, and writeToBuffer methods. ReadFileDescriptor and WriteFileDescriptor extend ReadWriteFileDescriptor, with ReadFileDescriptor having a writeToBuffer method and WriteFileDescriptor having a readToBuffer method

ReadWriteFileDescriptor is a class that stores a file descriptor as a number and its binary data as a buffer. The class has readToBuffer and writeToBuffer methods that provide the ability to read and write data into a file descriptor buffer. ReadFileDescriptor and WriteFileDescriptor are implementations of read-only and write-only descriptors.

IFileDescriptorStorage is an interface that describes file descriptor storage. The program has only one implementation for this interface: InMemoryFileDescriptorStorage. As you can tell from the name, it stores file descriptors in memory because we don't need persistence for descriptors.
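To give a feel for how little is needed here, below is a reduced sketch of an in-memory descriptor storage. It is not the repository code: it covers only read-only descriptors (openWO is left out), uses Uint8Array instead of Buffer, and the class names are mine.

```typescript
class ReadOnlyDescriptor {
  constructor(readonly fd: number, readonly binary: Uint8Array) {}

  // Copies up to `len` bytes starting at `pos` into `target`
  // and returns the number of bytes actually copied
  readToBuffer(target: Uint8Array, len: number, pos: number): number {
    const wanted = Math.min(len, target.length);
    const chunk = this.binary.subarray(pos, pos + wanted);
    target.set(chunk);
    return chunk.length;
  }
}

class InMemoryDescriptorStorage {
  private nextFd = 1;
  private readonly descriptors = new Map<number, ReadOnlyDescriptor>();

  // Every open gets a fresh number, even for the same content
  openRO(binary: Uint8Array): ReadOnlyDescriptor {
    const descriptor = new ReadOnlyDescriptor(this.nextFd++, binary);
    this.descriptors.set(descriptor.fd, descriptor);
    return descriptor;
  }

  get(fd: number): ReadOnlyDescriptor | undefined {
    return this.descriptors.get(fd);
  }

  release(fd: number): void {
    this.descriptors.delete(fd);
  }
}

const storage = new InMemoryDescriptorStorage();
const content = new Uint8Array([1, 2, 3, 4, 5]);
const firstOpen = storage.openRO(content);
const secondOpen = storage.openRO(content);

const target = new Uint8Array(4);
// Ask for 4 bytes at position 3: only bytes 4 and 5 are left
const bytesRead = firstOpen.readToBuffer(target, 4, 3); // 2
storage.release(firstOpen.fd);
```

Because each open call gets its own descriptor object with its own data, two clients that read the same path at the same time never see each other's content.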

Let's check how FUSEFacade uses file descriptors and storage:

async read(
  fd: number,     // File descriptor to read from
  buf: Buffer,    // Buffer to store the read data
  len: number,    // Length of data to read
  pos: number,    // Position in the file to start reading from
): Promise<number> {
  // Retrieve the file descriptor object from storage
  const fdObject = this.fdStorage.get(fd);
  if (!fdObject) {
    // If the file descriptor is invalid, throw an error
    throw new FUSEError(fuse.EBADF, 'invalid fd');
  }
  // Read data into the buffer and return the number of bytes read
  return fdObject.readToBuffer(buf, len, pos);
}
async write(
  fd: number,     // File descriptor to write to
  buf: Buffer,    // Buffer containing the data to write
  len: number,    // Length of data to write
  pos: number,    // Position in the file to start writing at
): Promise<number> {
  // Retrieve the file descriptor object from storage
  const fdObject = this.fdStorage.get(fd);
  if (!fdObject) {
    // If the file descriptor is invalid, throw an error
    throw new FUSEError(fuse.EBADF, 'invalid fd');
  }
  // Write data from the buffer and return the number of bytes written
  return fdObject.writeToBuffer(buf, len, pos);
}
async release(path: string, fd: number): Promise<0> {
  // Retrieve the file descriptor object from storage
  const fdObject = this.fdStorage.get(fd);
  if (!fdObject) {
    // If the file descriptor is invalid, throw an error
    throw new FUSEError(fuse.EBADF, 'invalid fd');
  }
  // Safely get the node corresponding to the file path
  const node = await this.safeGetNode(path);
  // Write all the data from the file descriptor object to the node
  await node.writeAll(fdObject.binary);
  // Release the file descriptor from storage
  this.fdStorage.release(fd);
  // Return 0 indicating success
  return 0;
}

The code above is straightforward. It defines methods to read from, write to, and release file descriptors, ensuring the file descriptor is valid before performing operations. The release method also writes data from a file descriptor object to the filesystem node and frees the file descriptor.

We are done with the code around libfuse and the tree. It’s time to dive into the image-related code.

Images: "Data Transfer Object" Part
UML class diagram showing interfaces and their relationships for image handling. The diagram includes the ImageBinary, ImageMeta, Image, and IImageMetaStorage interfaces. ImageBinary has attributes buffer and size. ImageMeta has attributes id, name, originalFileName, and originalFileType. Image has attributes binary and meta, where binary is of type ImageBinary and meta is of type ImageMeta. IImageMetaStorage has methods create, get, list, and remove

ImageMeta is an object that stores meta information about an image. IImageMetaStorage is an interface that describes a storage for meta. The program has only one implementation of the interface: the FSImageMetaStorage class, which manages image metadata stored in a single JSON file.

It uses a cache to store metadata in memory and ensures the cache is hydrated by reading from the JSON file when needed. The class provides methods to create, retrieve, list, and delete image metadata, and it writes changes back to the JSON file to persist updates. The cache improves performance by reducing the number of I/O operations.
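The caching behavior can be sketched as follows. To keep the example independent of the real file system, the JSON file is replaced by a small TextFile interface that I made up; Meta, CachedMetaStorage, and the synchronous methods are likewise simplifications, not the actual FSImageMetaStorage.

```typescript
interface Meta {
  id: string;
  name: string;
}

// Hypothetical stand-in for a JSON file on disk
interface TextFile {
  read(): string | null; // null means the file does not exist yet
  write(text: string): void;
}

class CachedMetaStorage {
  private cache: Map<string, Meta> | null = null;

  constructor(private readonly file: TextFile) {}

  // Reads the file once, on first use; later calls reuse the in-memory cache
  private hydrate(): Map<string, Meta> {
    if (this.cache === null) {
      const text = this.file.read();
      const items: Meta[] = text === null ? [] : JSON.parse(text);
      this.cache = new Map(items.map((meta): [string, Meta] => [meta.id, meta]));
    }
    return this.cache;
  }

  // Writes the whole cache back after every change
  private persist(): void {
    this.file.write(JSON.stringify(Array.from(this.hydrate().values())));
  }

  create(meta: Meta): void {
    this.hydrate().set(meta.id, meta);
    this.persist();
  }

  get(id: string): Meta | undefined {
    return this.hydrate().get(id);
  }

  list(): Meta[] {
    return Array.from(this.hydrate().values());
  }

  remove(id: string): void {
    this.hydrate().delete(id);
    this.persist();
  }
}

// In-memory "disk" that counts how often it is read
const disk = { text: null as string | null, reads: 0 };
const metaFile: TextFile = {
  read: () => {
    disk.reads++;
    return disk.text;
  },
  write: (text) => {
    disk.text = text;
  },
};

const metaStorage = new CachedMetaStorage(metaFile);
metaStorage.create({ id: '1', name: 'banner' });
metaStorage.create({ id: '2', name: 'sticker' });
const allMeta = metaStorage.list(); // two entries, and the "file" was read only once
```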

ImageBinary, obviously, is an object that has binary image data. The Image interface is the composition of ImageMeta and ImageBinary.

Images: Binary Storage and Generators

UML class diagram showing interfaces and their relationships for image generation and binary storage. The diagram includes the IBinaryStorage, IImageGenerator interfaces, and FSBinaryStorage, ImageGeneratorComposite, PassThroughImageGenerator, TextImageGenerator, and ImageLoaderFacade classes. IBinaryStorage has methods load, remove, and write. FSBinaryStorage implements IBinaryStorage and has an additional constructor. IImageGenerator has a method generate. PassThroughImageGenerator and TextImageGenerator implement IImageGenerator. ImageGeneratorComposite has methods addGenerator and generate. ImageLoaderFacade has a constructor and a load method, and interacts with IBinaryStorage and IImageGenerator

IBinaryStorage is an interface for binary data storage. Binary storage should be decoupled from images and be able to store any data: images, video, JSON, or text. This fact is important to us, and you will see why.

IImageGenerator is an interface that describes a generator. The generator is an important part of the program. It takes raw binary data plus meta and generates an image based on it. Why does the program need generators? Can the program work without them?

It can, but generators will add flexibility to the implementation. Generators allow users to upload pictures, text data, and broadly speaking, any data for which you write a generator.

Diagram showing the process of converting a text file to an image using the IImageGenerator interface. On the left, there is an icon for a text file labeled 'myfile.txt' with the content 'Hello, world!'. An arrow labeled 'IImageGenerator' points to the right, where there is an icon for an image file labeled 'myfile.png' with the same text 'Hello, world!' displayed in the image

The flow is as follows: binary data is loaded from storage (myfile.txt in the picture above), and then the binary passes to a generator. It generates an image "on the fly." You can perceive it as a converter from one format to another which is more convenient for us.

Let’s check out an example of a generator:

import { createCanvas } from 'canvas';  // Import createCanvas function from the canvas library to create and manipulate images

const IMAGE_SIZE_RE = /(\d+)x(\d+)/;  // Regular expression to extract width and height dimensions from a string

export class TextImageGenerator implements IImageGenerator {

  // method to generate an image from text
  async generate(meta: ImageMeta, rawBuffer: Buffer): Promise<Image | null> {
    // Step 1: Verify the MIME type is text
    if (meta.originalFileType !== MimeType.TXT) {
      // If the file type is not text, return null indicating no image generation
      return null;
    }
    // Step 2: Determine the size of the image
    const imageSize = {
      width: 800,  // Default width
      height: 600,  // Default height
    };

    // Extract dimensions from the name if present
    const imageSizeRaw = IMAGE_SIZE_RE.exec(meta.name);
    if (imageSizeRaw) {
      // Update the width and height based on extracted values, or keep defaults
      imageSize.width = Number(imageSizeRaw[1]) || imageSize.width;
      imageSize.height = Number(imageSizeRaw[2]) || imageSize.height;
    }
    // Step 3: Convert the raw buffer to a string to get the text content
    const imageText = rawBuffer.toString('utf-8');
    // Step 4: Create a canvas with the determined size
    const canvas = createCanvas(imageSize.width, imageSize.height);
    const ctx = canvas.getContext('2d');  // Get the 2D drawing context
    // Step 5: Prepare the canvas background
    ctx.fillStyle = '#000000';  // Set fill color to black
    ctx.fillRect(0, 0, imageSize.width, imageSize.height);  // Fill the entire canvas with the background color
    // Step 6: Draw the text onto the canvas
    ctx.textAlign = 'start';  // Align text to the start (left)
    ctx.textBaseline = 'top';  // Align text to the top
    ctx.fillStyle = '#ffffff';  // Set text color to white
    ctx.font = '30px Open Sans';  // Set font style and size
    ctx.fillText(imageText, 10, 10);  // Draw the text with a margin
    // Step 7: Convert the canvas to a PNG buffer and create the Image object
    return {
      meta,  // Include the original metadata
      binary: {
        buffer: canvas.toBuffer('image/png'),  // Convert canvas content to a PNG buffer
      },
    };
  }
}

The ImageLoaderFacade class is a facade that logically combines the storage and the generator. In other words, it implements the flow described above.
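The sketch below shows one way the facade and the composite generator could fit together. Treat it as an illustration under assumptions: the type shapes are inferred from the generator example, the composite is assumed to return the first non-null result, and keying the storage by meta.name is my guess rather than something the article states.

```typescript
// Shapes inferred from the generator example; the real project types may carry more fields.
interface ImageMeta { name: string; originalFileType: string; }
interface ImageBinary { buffer: Buffer; }
interface Image { meta: ImageMeta; binary: ImageBinary; }

interface IImageGenerator {
  generate(meta: ImageMeta, rawBuffer: Buffer): Promise<Image | null>;
}

// Hypothetical signature: only the method name "load" comes from the diagram.
interface IBinaryStorage {
  load(id: string): Promise<Buffer | null>;
}

// Assumed behavior: ask each registered generator in turn and return the first image produced.
class ImageGeneratorComposite implements IImageGenerator {
  private readonly generators: IImageGenerator[] = [];

  addGenerator(generator: IImageGenerator): void {
    this.generators.push(generator);
  }

  async generate(meta: ImageMeta, rawBuffer: Buffer): Promise<Image | null> {
    for (const generator of this.generators) {
      const image = await generator.generate(meta, rawBuffer);
      if (image) {
        return image;
      }
    }
    return null; // No generator understands this file type
  }
}

class ImageLoaderFacade {
  private readonly storage: IBinaryStorage;
  private readonly generator: IImageGenerator;

  constructor(storage: IBinaryStorage, generator: IImageGenerator) {
    this.storage = storage;
    this.generator = generator;
  }

  async load(meta: ImageMeta): Promise<Image | null> {
    // Assumption: the binary is stored under the same name as its metadata
    const raw = await this.storage.load(meta.name);
    if (!raw) {
      return null;
    }
    return this.generator.generate(meta, raw);
  }
}
```

With this arrangement, supporting a new input type only means writing one more generator and registering it with addGenerator; the facade and the storage stay untouched.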

Images: Variants


IImageVariant is an interface for creating various image variants. In this context, a variant is an image generated "on the fly" that will be displayed to the user when viewing files in our filesystem. The main difference from generators is that it takes an image as input rather than raw data.

The program has three variants: ImageAlwaysRandomVariant, ImageOriginalVariant, and ImageWithText. ImageAlwaysRandomVariant returns the original image with a 16x16-pixel square of random RGB noise drawn in its top-left corner.

import sharp from 'sharp';  // Image processing library used to decode and re-encode images

export class ImageAlwaysRandomVariant implements IImageVariant {
  // Define a constant for the size of the random square edge in pixels
  private readonly randomSquareEdgeSizePx = 16;
  // Constructor takes the desired output format for the image
  constructor(private readonly outputFormat: ImageFormat) {}
  // Asynchronous method to generate a random variant of an image
  async generate(image: Image): Promise<ImageBinary> {
    // Step 1: Load the image using the sharp library
    const sharpImage = sharp(image.binary.buffer);
    // Step 2: Retrieve metadata and raw buffer from the image
    const metadata = await sharpImage.metadata(); // Get image metadata
    const buffer = await sharpImage.raw().toBuffer(); // Get raw pixel data
    // The buffer is a flat array of width * height * channels bytes (channels is 3 or 4)
    // Step 3: Apply random pixel values to a small square region in the image
    for (let y = 0; y < this.randomSquareEdgeSizePx; y++) {
      for (let x = 0; x < this.randomSquareEdgeSizePx; x++) {
        // Calculate the buffer offset for the current pixel
        const offset = y * metadata.width! * metadata.channels! + x * metadata.channels!;

        // Set random values for RGB channels
        buffer[offset + 0] = randInt(0, 255); // Red channel
        buffer[offset + 1] = randInt(0, 255); // Green channel
        buffer[offset + 2] = randInt(0, 255); // Blue channel
        // If the image has an alpha channel, set it to 255 (fully opaque)
        if (metadata.channels === 4) {
          buffer[offset + 3] = 255; // Alpha channel
        }
      }
    }
    // Step 4: Create a new sharp image from the modified buffer and convert it to the desired format
    const result = await sharp(buffer, {
      raw: {
        width: metadata.width!,
        height: metadata.height!,
        channels: metadata.channels!,
      },
    })
      .toFormat(this.outputFormat) // Convert to the specified output format
      .toBuffer(); // Get the final image buffer
    // Step 5: Return the generated image binary data
    return {
      buffer: result, // Buffer containing the generated image
    };
  }
}

I use the sharp library as the most convenient way to work with images in Node.js: https://github.com/lovell/sharp.
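The offset arithmetic from the variant above can also be isolated into a small pure function, which makes it easy to check without sharp. This is only a sketch: fillRandomSquare is a hypothetical helper, not part of the project, and unlike the original loop it clamps the square to the image dimensions so that images smaller than the square are handled predictably.

```typescript
// Hypothetical helper: overwrite the top-left square of a raw pixel buffer with random colors.
// `pixels` is a flat array of width * height * channels bytes, stored row by row.
function fillRandomSquare(
  pixels: Uint8Array,
  width: number,
  height: number,
  channels: number, // Assumed to be 3 (RGB) or 4 (RGBA), as in the article
  edgePx: number,
): void {
  // Clamp the square so that it never runs past the image borders
  const maxX = Math.min(edgePx, width);
  const maxY = Math.min(edgePx, height);

  for (let y = 0; y < maxY; y++) {
    for (let x = 0; x < maxX; x++) {
      // Skip y full rows, then x pixels within the row
      const offset = (y * width + x) * channels;

      pixels[offset + 0] = Math.floor(Math.random() * 256); // Red
      pixels[offset + 1] = Math.floor(Math.random() * 256); // Green
      pixels[offset + 2] = Math.floor(Math.random() * 256); // Blue
      if (channels === 4) {
        pixels[offset + 3] = 255; // Alpha: fully opaque
      }
    }
  }
}
```

Since Buffer is a subclass of Uint8Array, the raw buffer returned by sharp could be passed to such a function directly.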

ImageOriginalVariant returns the image unchanged (although it can return it in a different compression format). ImageWithText returns the image with text drawn over it. This will be helpful when we create predefined variants of a single image. For example, if we need 10 random variations of one image, we must be able to tell these variations apart.

The solution here is to create 10 pictures based on the original, where we render a sequential number from 0 to 9 in the top-left corner of each image.

A sequence of images showing a white and black cat with wide eyes. The images are labeled with numbers starting from 0 on the left, incrementing by 1, and continuing with ellipses until 9 on the right. The cat's expression remains the same in each image

The ImageCacheWrapper has a different purpose from the variants: it wraps a particular IImageVariant and caches its results. It will be used to wrap entities whose output does not change, like an image converter, text-to-image generators, and so on. This caching mechanism enables faster data retrieval, especially when the same images are read multiple times.
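Here is a rough sketch of how such a wrapper might work. The real class may be organized differently; in particular, using the image name as the cache key and caching the promise itself (so that parallel reads share a single encoding run) are my assumptions.

```typescript
// Minimal shapes for the sketch; the project types may carry more fields.
interface ImageBinary { buffer: Buffer; }
interface Image { meta: { name: string }; binary: ImageBinary; }
interface IImageVariant { generate(image: Image): Promise<ImageBinary>; }

class ImageCacheWrapper implements IImageVariant {
  // Store promises rather than results, so concurrent callers wait for the same work
  private readonly cache = new Map<string, Promise<ImageBinary>>();
  private readonly variant: IImageVariant;

  constructor(variant: IImageVariant) {
    this.variant = variant;
  }

  generate(image: Image): Promise<ImageBinary> {
    // Assumption: the image name uniquely identifies the source image
    const key = image.meta.name;

    let cached = this.cache.get(key);
    if (!cached) {
      cached = this.variant.generate(image);
      this.cache.set(key, cached);
      // Forget failed attempts so that the next read can retry
      cached.catch(() => this.cache.delete(key));
    }
    return cached;
  }
}
```

Note that a wrapper like this never invalidates entries, so it only suits variants whose output is fully determined by the input image. Wrapping the always-random variant in it would defeat the purpose of that variant.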

Well, we have covered all primary parts of the program. It's time to combine everything together.

The Tree Structure

UML class diagram showing the hierarchy and relationships between various FUSE tree nodes related to image management. Classes include ImageVariantFileFUSETreeNode, ImageCacheWrapper, ImageItemAlwaysRandomDirFUSETreeNode, ImageItemOriginalDirFUSETreeNode, ImageItemCounterDirFUSETreeNode, ImageManagerItemFileFUSETreeNode, ImageItemDirFUSETreeNode, ImageManagerDirFUSETreeNode, ImagesDirFUSETreeNode, and RootDirFUSETreeNode. Each class has attributes and methods relevant to image metadata, binary data, and file operations like create, readAll, writeAll, remove, and getattr

The class diagram above shows how the tree classes are combined with their image counterparts. The diagram should be read from bottom to top. RootDir (let me omit the FUSETreeNode postfix in names) is the root directory of the filesystem that the program implements. Moving to the row above, you see two directories: ImagesDir and ImageManagerDir. ImageManagerDir contains the list of user images and allows you to manage them. Then, ImageManagerItemFile is a node for a particular file. This class implements CRUD operations.

Consider ImageManagerDir as a typical implementation of a node:

class ImageManagerDirFUSETreeNode extends DirectoryFUSETreeNode {
  name = 'Image Manager';  // Name of the directory

  constructor(
    private readonly imageMetaStorage: IImageMetaStorage,
    private readonly imageBinaryStorage: IBinaryStorage,
  ) {
    super();  // Call the parent class constructor
  }

  async children(): Promise<IFUSETreeNode[]> {
    // Dynamically create child nodes
    // In some cases, dynamic behavior can be problematic, requiring a cache of child nodes
    // to avoid redundant creation of IFUSETreeNode instances
    const list = await this.imageMetaStorage.list();
    return list.map(
      (meta) =>
        new ImageManagerItemFileFUSETreeNode(
          this.imageMetaStorage,
          this.imageBinaryStorage,
          meta,
        ),
    );
  }

  async create(name: string, mode: number): Promise<void> {
    // Create a new image metadata entry
    await this.imageMetaStorage.create(name);
  }

  async getattr(): Promise<Stats> {
    return {
      // File modification date
      mtime: new Date(),
      // File last access date
      atime: new Date(),
      // File creation date
      // We do not store dates for our images,
      // so we simply return the current date
      ctime: new Date(),
      // Number of links
      nlink: 1,
      size: 100,
      // File access flags
      mode: FUSEMode.directory(
        FUSEMode.ALLOW_RWX, // Owner access rights
        FUSEMode.ALLOW_RX,  // Group access rights
        FUSEMode.ALLOW_RX,  // Access rights for all others
      ),
      // User ID of the file owner
      uid: process.getuid ? process.getuid() : 0,
      // Group ID for which the file is accessible
      gid: process.getgid ? process.getgid() : 0,
    };
  }

  // Explicitly forbid deleting the 'Image Manager' folder
  async remove(): Promise<void> {
    throw FUSEError.accessDenied();
  }
}

Moving on, ImagesDir contains subdirectories named after the user's images. ImageItemDir is responsible for each of these directories. It includes all available variants; as you remember, there are three. Each variant is a directory that contains the final image files in different formats (currently JPEG, PNG, and WebP). ImageItemOriginalDir and ImageItemCounterDir wrap all spawned ImageVariantFile instances in a cache.

This is necessary to avoid constant re-encoding of the original images, because encoding is CPU-intensive. At the top of the diagram is ImageVariantFile. It is the crown jewel of the implementation: a composition of the previously described IFUSEHandler and IImageVariant. This is the file that all our efforts have been building towards.
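To illustrate the composition, here is a stripped-down sketch of what reading such a file could involve. Only the method name readAll comes from the diagram; the constructor shape, the loadImage callback, and the plain Error are assumptions made for the example (the real node would presumably report a FUSE error instead).

```typescript
// Minimal shapes for the sketch; the project types may carry more fields.
interface ImageBinary { buffer: Buffer; }
interface Image { binary: ImageBinary; }
interface IImageVariant { generate(image: Image): Promise<ImageBinary>; }

class ImageVariantFile {
  readonly name: string;
  private readonly loadImage: () => Promise<Image | null>;
  private readonly variant: IImageVariant;

  constructor(
    name: string,
    loadImage: () => Promise<Image | null>, // Hypothetical: e.g. a bound ImageLoaderFacade.load call
    variant: IImageVariant,
  ) {
    this.name = name;
    this.loadImage = loadImage;
    this.variant = variant;
  }

  // Called when a user process reads the file: the content is produced at read time
  async readAll(): Promise<Buffer> {
    const image = await this.loadImage();
    if (!image) {
      throw new Error(`No image behind ${this.name}`);
    }
    const result = await this.variant.generate(image);
    return result.buffer;
  }
}
```

The file has no stored content of its own. Every read goes through the variant, which is why an uncached random variant can hand out different bytes on each read, while a cached one returns the same bytes every time.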

Testing

Let’s test how the final filesystem handles parallel requests to the same file. To do this, we will run several md5sum processes in parallel; each reads the same file from the filesystem and calculates its hash. Then, we’ll compare these hashes. If everything is working correctly, the hashes should all be different.

#!/bin/bash
# Loop to run the md5sum command 5 times in parallel
for i in {1..5}
do
  echo "Run $i..."
  # `&` at the end of the command runs it in the background
  md5sum ./mnt/Images/2020-09-10_22-43/always_random/2020-09-10_22-43.png &
done
echo 'wait...'
# Wait for all background processes to finish
wait

I ran the script and checked the following output (cleaned up a bit for clarity):

Run 1...
Run 2...
Run 3...
Run 4...
Run 5...
wait...
bcdda97c480db74e14b8779a4e5c9d64
0954d3b204c849ab553f1f5106d576aa
564eeadfd8d0b3e204f018c6716c36e9
73a92c5ef27992498ee038b1f4cfb05e
77db129e37fdd51ef68d93416fec4f65

Excellent! All the hashes are different, meaning the filesystem returns a unique image each time!
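If you would rather automate this check than compare hashes by eye, the same test can be sketched in Node.js. The helpers collectHashes and allUnique are hypothetical names introduced for this example; the reader callback is injected so that the logic can be tried without a mounted filesystem.

```typescript
import { createHash } from 'node:crypto';

// Read the same source `runs` times in parallel and return the MD5 hash of each result
async function collectHashes(
  read: () => Promise<Buffer>,
  runs: number,
): Promise<string[]> {
  const buffers = await Promise.all(
    Array.from({ length: runs }, () => read()),
  );
  return buffers.map((buffer) =>
    createHash('md5').update(buffer).digest('hex'),
  );
}

// True when no value appears more than once
function allUnique(values: string[]): boolean {
  return new Set(values).size === values.length;
}

// Possible usage against the mounted filesystem (path taken from the shell script above):
//
//   import { readFile } from 'node:fs/promises';
//
//   const path = './mnt/Images/2020-09-10_22-43/always_random/2020-09-10_22-43.png';
//   const hashes = await collectHashes(() => readFile(path), 5);
//   console.log(allUnique(hashes) ? 'all unique' : 'duplicates found');
```

A check like this could run as a smoke test after mounting, although the shell script remains the quicker option for a one-off look.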

Conclusion

I hope this article has inspired you to write your own FUSE implementation. Remember, the source code for this project is available here: https://github.com/pinkiesky/node-fuse-images.

The filesystem we’ve built is simplified to demonstrate the core principles of working with FUSE and Node.js. For example, it does not track real file timestamps and simply returns the current date instead. There's plenty of room for enhancement. Imagine adding functionality like frame extraction from user GIF files, video transcoding, or even parallelizing tasks through workers.

However, perfect is the enemy of good. Start with what you have, get it working, and then iterate. Happy coding!