
Best Machine Learning Research of 2021 So Far

The start of 2021 saw many prominent research groups pushing the state of machine learning to new heights. In my efforts to keep pace with this accelerated progress, I’ve noticed a number of hot topics gaining researchers’ attention: explainable/interpretable ML, federated learning, gradient boosting, causal inference, ROC analysis, and many others. In this article, we’ll take a journey through my top picks of papers from the first half of 2021 that I found compelling and worthwhile, representing directions I consider very promising. I hope you enjoy my selections as much as I have. (Check out my lists from 2019 and 2020.)

Scaling Hierarchical Agglomerative Clustering to Billion-sized Datasets

Hierarchical Agglomerative Clustering (HAC) is one of the oldest but still most widely used clustering methods. However, HAC is notoriously hard to scale to large data sets, as the underlying complexity is at least quadratic in the number of data points and many algorithms for solving HAC are inherently sequential. This paper proposes Reciprocal Agglomerative Clustering (RAC), a distributed algorithm for HAC that uses a novel strategy to efficiently merge clusters in parallel. The paper proves theoretically that RAC recovers the exact solution of HAC. Furthermore, under clusterability and balancedness assumptions, RAC is shown to achieve provable speedups in total runtime thanks to this parallelism, and these speedups are shown to hold for certain probabilistic data models. Extensive experiments show that the parallelism is achieved on real-world data sets and that RAC can recover the HAC hierarchy on billions of data points connected by trillions of edges in less than an hour.
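
For reference, here is a minimal single-machine HAC example using SciPy’s average-linkage implementation. It illustrates the sequential baseline that RAC parallelizes, not the distributed reciprocal-merge strategy from the paper, and the data is synthetic.

```python
# Minimal single-machine HAC baseline (not the paper's distributed RAC algorithm).
# Illustrates the sequential bottleneck that RAC is designed to parallelize.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Synthetic data: three Gaussian blobs in 2-D.
points = np.vstack([
    rng.normal(loc=center, scale=0.5, size=(200, 2))
    for center in [(0, 0), (5, 5), (0, 5)]
])

# Average-linkage HAC; cost grows at least quadratically with the number of points,
# which is why scaling to billions of points requires a distributed approach like RAC.
Z = linkage(points, method="average")

# Cut the dendrogram into three flat clusters.
labels = fcluster(Z, t=3, criterion="maxclust")
print(np.bincount(labels)[1:])  # roughly 200 points per cluster
```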

Deep ROC Analysis and AUC as Balanced Average Accuracy to Improve Model Selection, Understanding and Interpretation

Optimal performance is critical for decision-making tasks from medicine to autonomous driving, yet common performance measures may be too general or too specific. For binary classifiers, diagnostic tests, or prognosis at a timepoint, measures such as the area under the receiver operating characteristic (ROC) curve or the area under the precision-recall curve are too general because they include unrealistic decision thresholds. On the other hand, measures such as accuracy, sensitivity, or the F1 score are computed at a single threshold and therefore reflect a single probability or predicted risk, rather than a range of individuals or risks. This paper proposes a method in between: deep ROC analysis, which examines groups of probabilities or predicted risks for more insightful analysis. The research translates esoteric measures into familiar terms: AUC and the normalized concordant partial AUC are balanced average accuracy (a new finding); the normalized partial AUC is average sensitivity; and the normalized horizontal partial AUC is average specificity.
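
To make the idea concrete, below is a small illustrative sketch (not the authors’ released implementation) that computes the usual AUC and then looks at average sensitivity and specificity within groups of predicted risk. The synthetic data and the quantile-based grouping scheme are assumptions for illustration only.

```python
# Illustrative sketch of group-wise ROC analysis (not the authors' official code).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
risk = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

print("overall AUC:", roc_auc_score(y_te, risk))

# Split the predicted-risk range into three groups and report average
# sensitivity and specificity per group, using each group's risk quantiles
# as decision thresholds (an assumed, simplified grouping scheme).
for lo, hi in [(0.0, 1/3), (1/3, 2/3), (2/3, 1.0)]:
    thresholds = np.quantile(risk, np.linspace(lo, hi, 20))
    sens = [(risk[y_te == 1] >= t).mean() for t in thresholds]
    spec = [(risk[y_te == 0] < t).mean() for t in thresholds]
    print(f"risk quantiles {lo:.2f}-{hi:.2f}: "
          f"avg sensitivity={np.mean(sens):.3f}, avg specificity={np.mean(spec):.3f}")
```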

Hardware Acceleration of Explainable Machine Learning using Tensor Processing Units

Machine learning (ML) is successful in achieving human-level performance in various fields. However, it lacks the ability to explain an outcome due to its black-box nature. While existing explainable ML is promising, almost all of these methods focus on formulating interpretability as an optimization problem. Such a mapping leads to numerous iterations of time-consuming complex computations, which limits their applicability in real-time applications. This paper proposes a novel framework for accelerating explainable ML using Tensor Processing Units (TPUs). The proposed framework exploits the synergy between matrix convolution and the Fourier transform, and takes full advantage of TPUs’ natural ability to accelerate matrix computations. Specifically, this paper makes three important contributions: (i) the proposed work is the first attempt at enabling hardware acceleration of explainable ML using TPUs; (ii) the proposed approach is applicable across a wide variety of ML algorithms, and effective utilization of TPU-based acceleration can lead to real-time outcome interpretation; and (iii) extensive experimental results demonstrate that the proposed approach can provide an order-of-magnitude speedup in both classification time (25x on average) and interpretation time (13x on average) compared to state-of-the-art techniques.
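
The framework itself targets TPU hardware, but the mathematical identity it exploits, namely that convolution can be computed via the Fourier transform as an elementwise product in the frequency domain, can be checked in a few lines of NumPy. This sketch only illustrates that identity, not the TPU implementation.

```python
# Check the convolution theorem: circular_conv(a, b) == IFFT(FFT(a) * FFT(b)),
# the identity the TPU-based framework exploits to cast interpretation
# computations as hardware-friendly matrix/FFT operations.
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=256)
b = rng.normal(size=256)
N = len(a)

# Direct circular convolution: c[n] = sum_m a[m] * b[(n - m) % N]
direct = np.array([sum(a[m] * b[(n - m) % N] for m in range(N)) for n in range(N)])

# Same result via the FFT (elementwise product in the frequency domain).
via_fft = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

print(np.allclose(direct, via_fft))  # True
```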


Learning to Optimize: A Primer and A Benchmark

Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods, aiming to reduce the laborious iterations of hand engineering. It automates the design of an optimization method based on its performance on a set of training problems. This data-driven procedure generates methods that can efficiently solve problems similar to those seen in training. In sharp contrast, the typical and traditional designs of optimization methods are theory-driven, so they obtain performance guarantees over the classes of problems specified by the theory. The difference makes L2O suitable for repeatedly solving a certain type of optimization problem over a specific distribution of data, though it typically fails on out-of-distribution problems. The practicality of L2O depends on the type of target optimization, the chosen architecture of the method to learn, and the training procedure. This new paradigm has motivated a community of researchers to explore L2O and report their findings. This paper is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization. The GitHub repo associated with this paper can be found HERE.
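
As a toy illustration of the data-driven flavor of L2O (not the benchmark or any particular method from the paper), the sketch below “learns” a single gradient-descent step size by evaluating candidates on a training distribution of random quadratic problems and then reuses it on unseen problems from the same distribution; the problem setup is an assumption for illustration.

```python
# Toy "learning to optimize" sketch: pick the step size that minimizes average
# loss on a training distribution of quadratics, then reuse it on new problems.
# Illustrative only; not one of the L2O methods surveyed in the paper.
import numpy as np

rng = np.random.default_rng(0)
dim, n_train, n_steps = 10, 50, 20

def random_quadratic():
    """Sample f(x) = 0.5 * x^T A x with A positive semi-definite."""
    M = rng.normal(size=(dim, dim))
    return M @ M.T / dim

def run_gd(A, step, n_steps):
    """Run gradient descent on f(x) = 0.5 x^T A x and return the final loss."""
    x = np.ones(dim)
    for _ in range(n_steps):
        x = x - step * (A @ x)          # gradient of 0.5 x^T A x is A x
    return 0.5 * x @ A @ x

train_problems = [random_quadratic() for _ in range(n_train)]

# "Training": choose the step size with the best average performance on the
# training problems (a crude stand-in for learning an update rule).
candidates = np.linspace(0.01, 1.0, 100)
avg_loss = [np.mean([run_gd(A, s, n_steps) for A in train_problems]) for s in candidates]
learned_step = candidates[int(np.argmin(avg_loss))]

# "Test": the learned step transfers to fresh problems from the same distribution.
test_loss = np.mean([run_gd(random_quadratic(), learned_step, n_steps) for _ in range(20)])
print(f"learned step={learned_step:.3f}, mean test loss={test_loss:.4f}")
```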

Federated Quantum Machine Learning

Distributed training across several quantum computers could significantly improve training time, and if we could share the learned model rather than the data, it could potentially improve data privacy, as training would happen where the data is located. To the authors’ knowledge, no prior work has addressed quantum machine learning (QML) in a federated setting. This paper presents federated training of hybrid quantum-classical machine learning models, although the framework could be generalized to a purely quantum model. Specifically, the authors consider a quantum neural network (QNN) coupled with a classical pre-trained convolutional model. The federated learning scheme achieved almost the same trained-model accuracy while training significantly faster in the distributed setting, pointing to a promising research direction for both scaling and privacy.
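
The quantum components are beyond a short snippet, but the federated pattern the paper builds on, where each client trains locally and only model parameters are shared and averaged, can be shown classically. The sketch below is a deliberately simplified classical stand-in with synthetic data, not the hybrid quantum-classical framework from the paper.

```python
# Minimal classical federated-averaging sketch (stand-in for the federated
# setting in the paper; no quantum components, synthetic data assumed).
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim = 4, 5
true_w = rng.normal(size=dim)

# Each client holds its own local data; raw data never leaves the client.
client_data = []
for _ in range(n_clients):
    X = rng.normal(size=(200, dim))
    y = X @ true_w + 0.1 * rng.normal(size=200)
    client_data.append((X, y))

def local_update(w, X, y, lr=0.05, epochs=5):
    """A few epochs of local gradient descent on squared error."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

global_w = np.zeros(dim)
for rnd in range(10):
    # Clients train locally starting from the current global model...
    local_models = [local_update(global_w.copy(), X, y) for X, y in client_data]
    # ...and the server averages only the parameters (federated averaging).
    global_w = np.mean(local_models, axis=0)

print("parameter error:", np.linalg.norm(global_w - true_w))
```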

Towards Causal Representation Learning

The two fields of machine learning and graphical causality arose and developed separately. However, there is now cross-pollination and increasing interest in both fields to benefit from the advances of the other. This paper reviews fundamental concepts of causal inference and relates them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: it’s noted that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning, the discovery of high-level causal variables from low-level observations. Finally, the paper delineates some implications of causality for machine learning and proposes key research areas at the intersection of both communities.
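
One of the core distinctions the paper builds on, the difference between observing a variable and intervening on it, can be illustrated with a toy structural causal model; the model below is a made-up example, not one from the paper.

```python
# Toy structural causal model: Z -> X, Z -> Y, X -> Y.
# Conditioning on X and intervening on X (do(X = x)) give different answers
# because intervention cuts the Z -> X edge. Illustrative example only.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def sample(do_x=None):
    z = rng.normal(size=n)                                     # confounder
    x = z + rng.normal(size=n) if do_x is None else np.full(n, do_x)
    y = 2.0 * x + 3.0 * z + rng.normal(size=n)
    return x, y

# Observational: E[Y | X ≈ 1] is biased by the confounder Z.
x, y = sample()
print("E[Y | X≈1]    :", y[np.abs(x - 1.0) < 0.05].mean())    # ≈ 3.5

# Interventional: E[Y | do(X=1)] reflects only the causal effect of X.
_, y_do = sample(do_x=1.0)
print("E[Y | do(X=1)]:", y_do.mean())                          # ≈ 2.0
```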

QuPeL: Quantized Personalization with Applications to Federated Learning

Traditionally, federated learning (FL) aims to train a single global model collaboratively across multiple clients and a server. Two natural challenges that FL algorithms face are heterogeneity in data across clients and collaboration of clients with diverse resources. This machine learning research paper introduces QuPeL, a quantized and personalized FL algorithm that facilitates collective training with heterogeneous clients while respecting resource diversity. For personalization, clients are allowed to learn compressed personalized models with different quantization parameters depending on their resources. Towards this, an algorithm is proposed for learning quantized models through a relaxed optimization problem, in which the quantization values are also optimized over. When each client participating in the (federated) learning process has different requirements for the quantized model (both in value and precision), a quantized personalization framework is formulated by introducing a penalty term into the local client objectives that pulls each personalized model toward a globally trained model, encouraging collaboration.
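
To give a feel for the two ingredients, per-client quantization and a penalty that pulls personalized models toward a shared global model, here is a simplified sketch; the uniform quantizer, penalty form, and training loop are assumptions for illustration and not the relaxed optimization actually used in QuPeL.

```python
# Simplified sketch of quantized personalization: each client trains a local
# model at its own bit width, with an L2 penalty toward the global model to
# encourage collaboration. Not the QuPeL algorithm itself; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
dim = 8
global_w = rng.normal(size=dim)            # assume a globally trained model exists

def quantize(w, bits):
    """Uniform quantization of weights to 2**bits levels over their range."""
    levels = 2 ** bits
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (levels - 1)
    return lo + np.round((w - lo) / step) * step

def personalize(X, y, bits, lam=0.5, lr=0.01, steps=300):
    """Train a local model with a proximal penalty toward the global model,
    then quantize it to the client's bit width."""
    w = global_w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * lam * (w - global_w)
        w = w - lr * grad
    return quantize(w, bits)

# Clients with heterogeneous data and heterogeneous precision budgets.
for client_id, bits in enumerate([2, 4, 8]):
    X = rng.normal(size=(300, dim))
    y = X @ (global_w + 0.3 * rng.normal(size=dim)) + 0.1 * rng.normal(size=300)
    w_q = personalize(X, y, bits)
    print(f"client {client_id}: {bits}-bit model, "
          f"distance to global = {np.linalg.norm(w_q - global_w):.3f}")
```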

Explainable Artificial Intelligence Approaches: A Survey

The lack of explainability of decisions from Artificial Intelligence (AI) based “black box” systems/models, despite their superiority in many real-world applications, is a key stumbling block for adopting AI in high-stakes applications across different domains and industries. While many popular Explainable Artificial Intelligence (XAI) methods and approaches are available to facilitate human-friendly explanations of a decision, each has its own merits and demerits, with a plethora of open challenges. This machine learning research paper demonstrates popular XAI methods on a common case study/task (credit default prediction), analyzes their competitive advantages from multiple perspectives (e.g., local vs. global explanations), provides meaningful insight on quantifying explainability, and recommends paths towards responsible or human-centered AI using XAI as a medium. Practitioners can use this work as a catalog to understand, compare, and correlate the competitive advantages of popular XAI methods. In addition, this survey elicits future research directions towards responsible or human-centric AI systems, which is crucial for adopting AI in high-stakes applications.
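
As a minimal, generic illustration of the kind of post-hoc explanation the survey compares (not one of the specific XAI methods it catalogs), the sketch below fits a classifier on a synthetic credit-default-style dataset and reports global permutation feature importances; the dataset and feature names are made up for illustration.

```python
# Minimal post-hoc explanation example on a synthetic credit-default-style task:
# global feature importance via permutation importance. Illustrative only; the
# survey covers many other XAI methods (e.g. local explanations) not shown here.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

feature_names = ["income", "debt_ratio", "age", "n_open_credit_lines", "late_payments"]
X, y = make_classification(n_samples=3000, n_features=5, n_informative=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Permutation importance: drop in accuracy when one feature is shuffled.
result = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda p: -p[1]):
    print(f"{name:22s} {score:.4f}")
```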

tf.data: A Machine Learning Data Processing Framework

Training machine learning models requires feeding input data for models to ingest. Input pipelines for machine learning jobs are often challenging to implement efficiently as they require reading large volumes of data, applying complex transformations, and transferring data to hardware accelerators while overlapping computation and communication to achieve optimal performance. This machine learning research paper presents tf.data, a framework for building and executing efficient input pipelines for machine learning jobs. The tf.data API provides operators which can be parameterized with user-defined computation, composed, and reused across different machine learning domains. These abstractions allow users to focus on the application logic of data processing, while tf.data’s runtime ensures that pipelines run efficiently. The paper demonstrates that input pipeline performance is critical to the end-to-end training time of state-of-the-art machine learning models. tf.data delivers the high performance required, while avoiding the need for manual tuning of performance knobs.
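
A representative tf.data pipeline looks like the following; the dataset and preprocessing function are placeholders, but the map/batch/prefetch composition with AUTOTUNE is the pattern the paper describes for overlapping input processing with accelerator computation.

```python
# A typical tf.data input pipeline: user-defined preprocessing composed with
# parallel map, batching, and prefetching so data loading overlaps training.
# The dataset and preprocessing function are placeholders for illustration.
import tensorflow as tf

def preprocess(image, label):
    # User-defined transformation applied to each element.
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()

dataset = (
    tf.data.Dataset.from_tensor_slices((x_train, y_train))
    .shuffle(10_000)
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)  # parallel preprocessing
    .batch(128)
    .prefetch(tf.data.AUTOTUNE)  # overlap the input pipeline with model execution
)

for images, labels in dataset.take(1):
    print(images.shape, labels.shape)  # (128, 28, 28) (128,)
```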

Boost then Convolve: Gradient Boosting Meets Graph Neural Networks

Graph neural networks (GNNs) are powerful models that have been successful in various graph representation learning tasks. Gradient boosted decision trees (GBDT), meanwhile, often outperform other machine learning methods on heterogeneous tabular data. But what approach should be used for graphs with tabular node features? Previous GNN models have mostly focused on networks with homogeneous sparse features and are suboptimal in the heterogeneous setting. This paper proposes a novel architecture that trains GBDT and GNN jointly to get the best of both worlds: the GBDT model deals with heterogeneous features, while the GNN accounts for the graph structure. The model benefits from end-to-end optimization by allowing new trees to fit the gradient updates of the GNN. With an extensive experimental comparison to the leading GBDT and GNN models, the researchers demonstrate a significant increase in performance on a variety of graphs with tabular features. The GitHub repo associated with this paper can be found HERE.
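
The gist of the combination, GBDT handling the tabular node features while the graph structure contributes neighborhood information, can be hinted at without the paper’s jointly trained architecture. The sketch below simply augments each node’s features with neighborhood averages before boosting, which is a much weaker stand-in than the proposed end-to-end model but shows why graph information helps on tabular node features; the graph and targets are synthetic.

```python
# Simplified illustration (not the paper's jointly trained GBDT+GNN model):
# a GBDT on raw tabular node features vs. the same GBDT when each node's
# features are augmented with neighborhood-averaged features, a crude way
# to let the boosted model see graph structure.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n_nodes, n_feats = 1000, 6

# Synthetic tabular node features and a random undirected graph with self-loops.
X = rng.normal(size=(n_nodes, n_feats))
adj = (rng.random((n_nodes, n_nodes)) < 0.01).astype(float)
adj = np.maximum(adj, adj.T)
np.fill_diagonal(adj, 1.0)
neighbor_mean = (adj @ X) / adj.sum(axis=1, keepdims=True)

# Node targets depend on both a node's own features and its neighborhood.
y = X[:, 0] + 2.0 * neighbor_mean[:, 1] + 0.1 * rng.normal(size=n_nodes)

train = rng.random(n_nodes) < 0.7
mse = lambda pred: float(np.mean((pred[~train] - y[~train]) ** 2))

tabular_only = GradientBoostingRegressor(random_state=0).fit(X[train], y[train])
X_aug = np.hstack([X, neighbor_mean])
with_graph = GradientBoostingRegressor(random_state=0).fit(X_aug[train], y[train])

print("tabular features only :", mse(tabular_only.predict(X)))
print("plus graph aggregation:", mse(with_graph.predict(X_aug)))
```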

Notes from the editor:

How to Learn More about Machine Learning Research

At our upcoming event this November 16th-18th in San Francisco, ODSC West 2021 will feature a plethora of talks, workshops, and training sessions on machine learning and machine learning research. You can register now for 60% off all ticket types before the discount drops to 40% in a few weeks. Some highlighted sessions on machine learning include:

  • Towards More Energy-Efficient Neural Networks? Use Your Brain!: Olaf de Leeuw | Data Scientist | Dataworkz
  • Practical MLOps: Automation Journey: Evgenii Vinogradov, PhD | Head of DHW Development | YooMoney
  • Applications of Modern Survival Modeling with Python: Brian Kent, PhD | Data Scientist | Founder The Crosstab Kite
  • Using Change Detection Algorithms for Detecting Anomalous Behavior in Large Systems: Veena Mendiratta, PhD | Adjunct Faculty, Network Reliability and Analytics Researcher | Northwestern University

Sessions on MLOps:

  • Tuning Hyperparameters with Reproducible Experiments: Milecia McGregor | Senior Software Engineer | Iterative
  • MLOps… From Model to Production: Filipa Peleja, PhD | Lead Data Scientist | Levi Strauss & Co
  • Operationalization of Models Developed and Deployed in Heterogeneous Platforms: Sourav Mazumder | Data Scientist, Thought Leader, AI & ML Operationalization Leader | IBM
  • Develop and Deploy a Machine Learning Pipeline in 45 Minutes with Ploomber: Eduardo Blancas | Data Scientist | Fidelity Investments

Sessions on Deep Learning:

  • GANs: Theory and Practice, Image Synthesis With GANs Using TensorFlow: Ajay Baranwal | Center Director | Center for Deep Learning in Electronic Manufacturing, Inc
  • Machine Learning With Graphs: Going Beyond Tabular Data: Dr. Clair J. Sullivan | Data Science Advocate | Neo4j
  • Deep Dive into Reinforcement Learning with PPO using TF-Agents & TensorFlow 2.0: Oliver Zeigermann | Software Developer | embarc Software Consulting GmbH
  • Get Started with Time-Series Forecasting using the Google Cloud AI Platform: Karl Weinmeister | Developer Relations Engineering Manager | Google
