Presenting an easy-to-use, easy-to-understand, and reliable Few-Shot Learning code base, for newcomers and experienced few-shot learners alike.
Few-Shot Learning is the sub-field of Machine Learning focusing on the problem of learning from a few examples. Few-Shot Image Recognition consists of designing a model that can recognize an object after seeing only a handful of labeled instances of that particular class of objects.
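To make the setting concrete, here is what a single few-shot task looks like (a toy sketch with stand-in data, not EasyFSL code):

```python
import random

# Toy sketch of a single 5-way 1-shot classification task:
# 5 classes, 1 labeled example ("shot") per class,
# and 10 unlabeled query images per class to classify.
n_way, n_shot, n_query = 5, 1, 10

def fake_image():
    return [random.random() for _ in range(4)]  # stand-in for a real image

# The support set is the only labeled data the model gets for these classes.
support_set = [(fake_image(), label) for label in range(n_way) for _ in range(n_shot)]
# The model must predict one of the 5 classes for each query image.
query_set = [fake_image() for _ in range(n_way * n_query)]
```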
I’m not going to elaborate any further on why we care about Few-Shot Learning (FSL for those in the know) or what its basic concepts and methods are, because I have already covered that in this other article (definitely a must-read if this is the first time you’re reading about FSL).
From this point further, I’m going to assume that 1) you have a basic knowledge of FSL and the most common methods to tackle it; and 2) you are (or want to be) an FSL practitioner.
Easy and reliable code for Few-Shot Learning
Since 2020 I have been developing, maintaining, and improving EasyFSL. It’s a PyTorch library into which I’ve put a lot of thought and effort, so you don’t need to. It’s got:
- Everything you need to download and handle standard Few-Shot Learning datasets because there is nothing more important than starting from clean data;
- Implementations for 8 (and counting) state-of-the-art Few-Shot Learning methods, so you can make nice and comprehensive benchmarks;
- Example Few-Shot Learning code in the shape of tutorial notebooks for all the most essential gestures, like training or evaluating a model;
Every class and method is documented, so you need never ask: “What the hell did they try to do here?”.
OK then, enough self-promotion. If you want, you can stop reading and go play with EasyFSL. In the rest of this article, I will focus on the challenges of developing a unified framework for Few-Shot Learning research.
Building code that’s useful to others
As a researcher, I use two kinds of Machine Learning and Few-Shot Learning code:
- Libraries like scikit-learn, PyTorch, or pyod. They provide the standard tools that everyone will use. If I design a new convolutional network, I know that the convolution layers I use will follow the same implementations as everyone else’s.
- Research code released to let the community reproduce the results of a published paper. Researchers develop this kind of code with the sole purpose of running the experiments they need. These implementations usually come with many specificities that are not necessarily relevant for any other project.
Let’s not sugarcoat it: using other people’s research code is the dirtiest part of my job. When you’re designing and running experiments for your paper, you want to iterate fast. When you have a new idea, you want to test it in the fastest possible way. You don’t care if you’re breaking things (you can always fix it later?). You don’t care if you’re choosing the smartest and most seamless way to integrate it into your existing code. And you definitely don’t have time to document it (maybe an inline comment, but that’s it).
The following gist comes from the implementation of PT-MAP, which is state-of-the-art in Few-Shot Image Classification.
It’s quite clear that the main goal of this code is not to allow other researchers to run their own experiments based on PT-MAP. Otherwise:
- the `tmp` argument would have a more informative name;
- the requirement for `params.model` to have its value in `('WideResNet28_10', 'ResNet18')` wouldn’t be defined in the scope of this particular function;
- the optimizer would be an argument of the function, so future users could easily switch to Stochastic Gradient Descent, for instance.
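To illustrate that last point, here is a minimal sketch (hypothetical code, not PT-MAP’s actual implementation) of what injecting the optimizer looks like:

```python
import torch
from torch import nn

# Hypothetical sketch, not PT-MAP's actual code: taking the optimizer as an
# argument (instead of instantiating it inside the function) lets a future
# user switch to SGD, or anything else, without touching the training logic.
def train_step(model: nn.Module, batch, optimizer: torch.optim.Optimizer) -> float:
    images, labels = batch
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

model = nn.Linear(8, 5)
batch = (torch.randn(4, 8), torch.tensor([0, 1, 2, 3]))

# The caller, not the function, decides how to optimize:
loss_sgd = train_step(model, batch, torch.optim.SGD(model.parameters(), lr=0.01))
loss_adam = train_step(model, batch, torch.optim.Adam(model.parameters(), lr=0.001))
```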
It’s a fair assumption that this code was designed to iterate fast in order to crush previous methods on most few-shot learning benchmarks. In that respect, it was very effective, since PT-MAP did exactly that.
However, when I needed to reproduce these results and iterate on this method for my own research, I literally spent hours trying to understand the code: what was represented in objects like `tmp`, what was necessary and what was a legacy of abandoned experiments, what the side effects of every function were...
This motivates the need for a Few-Shot Learning library: a code base that is not intended for specific experiments but rather offers a toolkit for a wide range of experiments that no one has thought of yet.
The Specification / Modularity Trade-Off
Once I decided to work on a library to facilitate other researchers’ experiments, I had to face a critical question: how specific should my Few-Shot Learning code be?
In EasyFSL’s v0, there was only one `Dataset` object, called `EasySet`. I meant it to be a single class that could be used for all existing and future datasets used in Few-Shot Learning. The user just needs to feed it a JSON specification file listing the classes of the dataset and the folders where each class’s images are. `EasySet` finds all files in these folders and builds the dataset.
After a year of using my own product, I realized that I rarely used `EasySet`. Instead, I always created a specific class for each dataset I used. In most of my projects, I end up having a class `TieredImageNet`, a class `MiniImageNet`, a class `CUB`, a class `Omniglot`, and a class `CifarFewShot`, because all datasets are very different from one another and come with different challenges. For instance, some datasets like miniImageNet are small enough to be entirely stored in RAM, so you want an implementation of this dataset that will actually do that instead of reading images from disk during training. In that respect, `EasySet` is too general and therefore brings little value.
Interestingly enough, in some respects, `EasySet` was too specific. In the above gist, you can see that the dataset’s transformations are defined in an `EasySet.compose_transforms()` method.
So you have a very specific and arbitrary transformation that will be applied to your dataset. It is not parameterizable: there is no method or constructor argument to choose which transformations you’d like. Although it might be good practice to provide standard default transformations, here we force the user either to create a new class overriding this method or to update the transformations manually:
```python
dataset = EasySet("specs.json")
dataset.transform = my_transform
```
This just feels like there could be a better way. A library that forces you to use those workarounds is doing something wrong.
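One way out of this workaround is to accept the transform in the constructor, with a sensible default, so users never have to mutate the object after building it. This is a design sketch, not EasyFSL’s exact API:

```python
# Design sketch (hypothetical class, not EasyFSL's actual API): the transform
# is a constructor argument with a default, instead of a hard-coded method.
class ConfigurableDataset:
    def __init__(self, specs_file: str, transform=None):
        self.specs_file = specs_file
        # Fall back to a standard default pipeline only when nothing is given.
        self.transform = transform if transform is not None else self.default_transform()

    @staticmethod
    def default_transform():
        return lambda image: image  # placeholder for a real transform pipeline

# The user picks a custom transform at construction time, no mutation needed:
dataset = ConfigurableDataset("specs.json", transform=str.upper)
```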
This is why in EasyFSL 1.0, I tried to find a new balance between specification and modularity. There is now a separate class for each commonly used Few-Shot Learning dataset. For those who want to build new datasets following the standard file structure, `EasySet` is still there. And all these classes extend an abstract class called `FewShotDataset`, which only defines the signature of datasets intended to be used in a few-shot setting. To sum up, you have:
- An abstract class to standardize the minimal requirements of an object;
- A modular class to allow fast adaptation to new use cases;
- And specific classes to address the use cases that we know will often be encountered.
This is the balance that I have found for Few-Shot Learning datasets, and I try to apply this same principle to the rest of the library.
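The three-layer structure above can be condensed into a sketch (class and method names are simplified and partly guessed; EasyFSL’s actual signatures may differ):

```python
from abc import ABC, abstractmethod
from typing import List

# Abstract layer: the minimal contract for any few-shot dataset.
class FewShotDataset(ABC):
    @abstractmethod
    def __getitem__(self, index): ...
    @abstractmethod
    def __len__(self) -> int: ...
    @abstractmethod
    def get_labels(self) -> List[int]: ...

# Modular layer: builds a dataset from a spec file, for new use cases.
class EasySet(FewShotDataset):
    def __init__(self, specs_file: str):
        self.items = []  # would be populated by parsing specs_file
    def __getitem__(self, index): return self.items[index]
    def __len__(self): return len(self.items)
    def get_labels(self): return [label for _, label in self.items]

# Specific layer: a tailored implementation, e.g. holding all images in RAM
# because miniImageNet is small enough for that.
class MiniImageNet(FewShotDataset):
    def __init__(self):
        self.images, self.labels = [], []  # loaded entirely in memory
    def __getitem__(self, index): return self.images[index], self.labels[index]
    def __len__(self): return len(self.images)
    def get_labels(self): return self.labels
```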
Adapting to a fast-evolving research area
When I wrote the first lines of code for EasyFSL, the state of the art in Few-Shot Learning was built around the concept of Meta-Learning and, more specifically, the technique of episodic training. As a result, all Few-Shot classifiers implemented in EasyFSL 0.1 extended the abstract class `AbstractMetaLearner`. It’s in the name: to solve Few-Shot Learning, you need Meta-Learning. All methods related to episodic training were implemented in `AbstractMetaLearner`. As a user, if I wanted to design a new classifier, it had to extend `AbstractMetaLearner`, and therefore it had to be compatible with episodic training.
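For readers unfamiliar with episodic training: instead of ordinary batches, each training step samples a full few-shot task (an “episode”), so the model trains under the same conditions it will be evaluated on. A minimal sketch of episode sampling, with toy data (not EasyFSL’s sampler):

```python
import random

# Sample one N-way K-shot episode from a dataset mapping class name -> examples.
def sample_episode(dataset, n_way: int, n_shot: int, n_query: int):
    classes = random.sample(sorted(dataset), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        examples = random.sample(dataset[cls], n_shot + n_query)
        support += [(x, label) for x in examples[:n_shot]]
        query += [(x, label) for x in examples[n_shot:]]
    return support, query

# Toy dataset: 10 classes with 20 (fake) examples each.
dataset = {f"class_{i}": [f"img_{i}_{j}" for j in range(20)] for i in range(10)}
support, query = sample_episode(dataset, n_way=5, n_shot=1, n_query=10)
# A meta-learner would now compute its loss on the query set,
# given the support set, and backpropagate through the episode.
```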
However, in recent years, some works have cast doubt on the relevance of episodic training for few-shot classification and inspired the research community to look for solutions outside the prism of episodic training. Recent state-of-the-art methods such as Finetune, Transductive Fine-tuning, or BD-CSPN are agnostic to the training method: you can use absolutely any network to extract features from the images; these methods then classify queries using the information from the support set.
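A simplified example of why such methods are training-agnostic: given features from any frozen backbone, classification only needs the support set. The sketch below uses prototype-based nearest-mean classification (a simplified version of the logic underlying methods like BD-CSPN, not EasyFSL’s actual implementation):

```python
import torch

def classify_with_prototypes(support_features, support_labels, query_features):
    # One prototype per class: the mean of that class's support features.
    prototypes = torch.stack([
        support_features[support_labels == label].mean(dim=0)
        for label in support_labels.unique()
    ])
    # Each query is assigned to the class of its nearest prototype.
    distances = torch.cdist(query_features, prototypes)
    return distances.argmin(dim=1)

# Works with features extracted by absolutely any network:
support_features = torch.randn(5, 64)  # 5-way, 1-shot
support_labels = torch.arange(5)
predictions = classify_with_prototypes(support_features, support_labels, torch.randn(50, 64))
```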
This makes it irrelevant to have the episodic training logic implemented in `AbstractMetaLearner`, and even irrelevant for the base class of all few-shot classifiers to be named `AbstractMetaLearner`. I chose to remove this logic and rename the abstract class to `FewShotClassifier` in EasyFSL 1.0. Still, it raises an interesting question: how much should the code in my Few-Shot Learning library mirror the latest trends in the research area? If good practices in Few-Shot Learning change every year, do I need to completely rethink EasyFSL every year?
Let’s meet again in 2023 and find out!
Thanks for reading! I hope this little dissertation was instructive. It reflects my current questions, so don’t hesitate to reach out to me (on Twitter, for instance) if you want to discuss it.
In the meantime, enjoy EasyFSL!