Bayesian machine learning • neurosymbolic AI • AI for science and scientific reasoning
Since July 2024: Chancellor's Fellow @ University of Edinburgh, School of Informatics
Member of Institute for Adaptive and Neural Computation
Affiliate member of Institute for Language, Cognition and Computation
Affiliate member of Institute for Action, Perception and Behaviour
Generative AI Laboratory Fellow
Call for students
I am looking for motivated students to fill fully-funded PhD positions in the School of Informatics, University of Edinburgh, ideally to begin in September 2025.
The research topics are in the areas of Bayesian machine learning (including probabilistic reasoning in language, generative models, and neurosymbolic methods) and applications in the sciences. I am especially happy to work with students who have a strong background in mathematics and an interest in robust and interpretable AI. Please see my research interests and recent publications for examples of the kind of work we might do together.
The studentships include:
The typical length of a PhD in the UK is 3-3.5 years. All students have at least one secondary supervisor – please let me know if you have a specific person or lab in mind – and there are many opportunities for collaboration.
Interested students should contact me to discuss research directions and the application process. Those with backgrounds and identities underrepresented in maths/ML/AI are particularly encouraged and welcome.
(I regret that I cannot engage in individual discussions with everyone who contacts me, despite my best intentions. If I do not respond, please remind me. However, note that I am not likely to respond to messages that contain information hallucinated by a language model. No need for polished essays: I just want to know about you, your interests, and how you think we might work together.)
Research
I work on algorithms for deep-learning-based reasoning and their applications. I am specifically interested in the following subjects:
- Machine learning for generative models, in particular, induction of compositional structure in generative models and modeling of posteriors over high-dimensional explanatory variables (including with continuous-time (diffusion) generative models). Much of my recent work is on generative flow networks, which are a path towards inference machines that build structured, uncertainty-aware explanations for observed data.
- Applications to natural language processing and reasoning in language: what large language models can do, what they cannot do, and how to overcome their limitations with improved inference procedures. I view human-like symbolic, formal, and mathematical reasoning via Bayesian neurosymbolic methods as a long-term aspiration for artificial intelligence.
- Applications to computer vision: notably, below you can find my work on AI for remote sensing (land cover mapping and change detection), which can be used for tracking land use patterns over time and monitoring the effects of climate change.
The prior distribution
Before moving to Edinburgh, I was a postdoc at Mila – Québec AI Institute and the Department of Informatics and Operations Research, Université de Montréal, where I was fortunate to work with Prof. Yoshua Bengio and also to collaborate with Profs. Aaron Courville, Gauthier Gidel, and Guillaume Lajoie, among others.
I have also been lucky to work with many fellow postdoctoral researchers (including Kilian Fatras, Alex Hernández-García, Pablo Lemos, Alex Tong) and M.S./Ph.D. students and interns (including Tristan Deleu, Edward Hu, Moksh Jain, Minsu Kim, Salem Lahlou, Jarrid Rector-Brooks, Alexandra Volokhova, Dinghuai Zhang, among many others).
I was formally trained as a pure mathematician: at the University of Washington (Seattle) (B.S., 2015) and Yale University (M.S. and Ph.D., 2021). In addition, I believe many individuals and societies could benefit from a dose of friendly mathematical education. Some organizations I have been involved in: UW Math Circles (Seattle), Math-M-Addicts (New York City). I am also a coauthor of this collection of problems and puzzles.
Publications and preprints
(also here and here)
In submission / preparation
-
From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training
Julius Berner*, Lorenz Richter*, Marcin Sendera*, Jarrid Rector-Brooks, Nikolay Malkin
preprint TBA
-
Mixtures of in-context learners
Giwon Hong, Emile Van Krieken, Nikolay Malkin, Edoardo Ponti, Pasquale Minervini
preprint
-
Action abstractions for amortized sampling
Oussama Boussif, Léna Néhale Ezzine, Joseph Viviano, Michał Koziarski, Moksh Jain, Nikolay Malkin, Emmanuel Bengio, Rim Assouel, Yoshua Bengio
preprint
-
Adaptive teachers for amortized samplers
Minsu Kim*, Sanghyeok Choi*, Taeyoung Yun, Emmanuel Bengio, Leo Feng, Jarrid Rector-Brooks, Sungsoo Ahn, Jinkyoo Park, Nikolay Malkin, Yoshua Bengio
preprint
-
Discrete, compositional, and symbolic representations through attractor dynamics
Andrew J. Nam, Eric Elmoznino, Nikolay Malkin, James L. McClelland, Yoshua Bengio, Guillaume Lajoie
preprint
-
Can a Bayesian oracle prevent harm from an agent?
Yoshua Bengio*, Michael K. Cohen*, Nikolay Malkin*, Matt MacDermott, Damiano Fornasiere, Pietro Greiner, Younesse Kaddar
preprint
-
Learning diverse attacks on large language models for robust red-teaming and safety tuning
Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain
preprint
-
PQMass: Probabilistic assessment of the quality of generative models using probability mass estimation
Pablo Lemos, Sammy Nasser Sharief, Nikolay Malkin, Laurence Perreault-Levasseur, Yashar Hezaveh
preprint
2024
Accepted / published
-
Amortizing intractable inference in diffusion models for vision, language, and control
Siddarth Venkatraman*, Moksh Jain*, Luca Scimeca*, Minsu Kim*, Marcin Sendera*, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yoshua Bengio, Glen Berseth, Nikolay Malkin
NeurIPS 2024
-
Improved off-policy training of diffusion samplers
Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin
NeurIPS 2024
-
Proof Flow: Preliminary study on generative flow network language model tuning for formal reasoning
Matthew Ho, Vincent Zhu, Xiaoyin Chen, Moksh Jain, Nikolay Malkin, Edwin Zhang
NeurIPS 2024 “System-2 Reasoning at Scale” workshop
-
Amortizing intractable inference in diffusion models for Bayesian inverse problems [extension of conference paper]
Siddarth Venkatraman, Moksh Jain, Luca Scimeca, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yashar Hezaveh, Laurence Perreault-Levasseur, Yoshua Bengio, Glen Berseth, Nikolay Malkin
NeurIPS 2024 “Machine Learning and the Physical Sciences” workshop
-
Path-filtering in path-integral simulations of open quantum systems using GFlowNets
Jeremy Lackman-Mincoff, Moksh Jain, Nikolay Malkin, Yoshua Bengio, Lena Simine
Journal of Chemical Physics 161(14), 2024
-
V-STaR: Training verifiers for self-taught reasoners
Arian Hosseini, Xingdi Yuan, Nikolay Malkin, Aaron Courville, Alessandro Sordoni, Rishabh Agarwal
COLM 2024
-
Machine learning and information theory concepts towards an AI Mathematician
Yoshua Bengio, Nikolay Malkin
Bulletin of the American Mathematical Society, 2024
-
Iterated denoising energy matching for sampling from Boltzmann densities
Tara Akhound-Sadegh*, Jarrid Rector-Brooks*, Joey Bose*, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong
ICML 2024
-
Improving gradient-guided nested sampling for posterior inference
Pablo Lemos, Nikolay Malkin, Will Handley, Yoshua Bengio, Yashar Hezaveh, Laurence Perreault-Levasseur
ICML 2024
-
Discrete probabilistic inference as control in multi-path environments
Tristan Deleu, Padideh Nouri, Nikolay Malkin, Doina Precup, Yoshua Bengio
UAI 2024
-
Amortizing intractable inference in large language models
Edward Hu*, Moksh Jain*, Eric Elmoznino, Younesse Kaddar, Guillaume Lajoie, Yoshua Bengio, Nikolay Malkin
ICLR 2024; best paper honourable mention
-
Delta-AI: Local objectives for amortized inference in sparse graphical models
Jean-Pierre Falet*, Hae-Beom Lee*, Nikolay Malkin*, Chen Sun, Dragos Secrieru, Dinghuai Zhang, Guillaume Lajoie, Yoshua Bengio
ICLR 2024
-
Expected flow networks in stochastic environments and two-player zero-sum games
Marco Jiralerspong*, Bilun Sun*, Danilo Vucetic*, Tianyu Zhang, Yoshua Bengio, Gauthier Gidel, Nikolay Malkin
ICLR 2024
-
PhyloGFN: Phylogenetic inference with GFlowNets
Ming Yang Zhou, Zichao Yan, Elliot Layne, Nikolay Malkin, Dinghuai Zhang, Moksh Jain, Mathieu Blanchette, Yoshua Bengio
ICLR 2024
-
Simulation-free Schrödinger bridges via score and flow matching
Alexander Tong*, Nikolay Malkin*, Kilian Fatras*, Lazar Atanackovic, Yanlei Zhang, Guillaume Huguet, Hananeh Aliee, Guy Wolf, Yoshua Bengio
AISTATS 2024
-
Improving and generalizing flow-based generative models with minibatch optimal transport
Alexander Tong*, Kilian Fatras*, Nikolay Malkin*, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, Yoshua Bengio
TMLR, 2024
Preprints / notes
-
On generalization for generative flow networks
Anas Krichel, Nikolay Malkin, Salem Lahlou, Yoshua Bengio
preprint
2023
Accepted / published
-
Joint Bayesian inference of graphical structure and parameters with a single generative flow network
Tristan Deleu, Mizu Nishikawa-Toomey, Jithendaraa Subramanian, Nikolay Malkin, Laurent Charlin, Yoshua Bengio
NeurIPS 2023
-
Let the flows tell: Solving graph combinatorial problems with GFlowNets
Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan
NeurIPS 2023
-
Discrete, compositional, and symbolic representations through attractor dynamics [workshop version]
Andrew J. Nam, Eric Elmoznino, Nikolay Malkin, Chen Sun, Yoshua Bengio, Guillaume Lajoie
NeurIPS 2023 “Information-Theoretic Principles in Cognitive Systems” workshop
-
Donor activity is associated with US legislators’ attention to political issues
Pranav Goel, Nikolay Malkin*, SoRelle Gaynor*, Nebojsa Jojic, Kristina Miler, Philip Resnik
PLOS One, 2023
-
GFlowNet-EM for learning compositional latent variable models
Edward Hu*, Nikolay Malkin*, Moksh Jain, Katie Everett, Alexandros Graikos, Yoshua Bengio
ICML 2023
-
A theory of continuous generative flow networks
Salem Lahlou, Tristan Deleu, Pablo Lemos, Dinghuai Zhang, Alexandra Volokhova, Alex Hernández-García, Léna Néhale Ezzine, Yoshua Bengio, Nikolay Malkin
ICML 2023
-
Learning GFlowNets from partial episodes for improved convergence and stability
Kanika Madan, Jarrid Rector-Brooks*, Maksym Korablyov*, Emmanuel Bengio, Moksh Jain, Andrei Nica, Tom Bosc, Yoshua Bengio, Nikolay Malkin
ICML 2023
-
Better training of GFlowNets with local credit and incomplete trajectories
Ling Pan, Nikolay Malkin, Dinghuai Zhang, Yoshua Bengio
ICML 2023
-
GFlowOut: Dropout with generative flow networks
Dianbo Liu, Moksh Jain, Bonaventure Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio
ICML 2023
-
Thompson sampling for improved exploration in GFlowNets
Jarrid Rector-Brooks, Kanika Madan, Moksh Jain, Maksym Korablyov, Cheng-Hao Liu, Sarath Chandar, Nikolay Malkin, Yoshua Bengio
ICML 2023 “Structured Probabilistic Inference and Generative Modeling” workshop
-
BatchGFN: Generative flow networks for batch active learning
Shreshth Malik, Salem Lahlou, Andrew Jesson, Moksh Jain, Nikolay Malkin, Tristan Deleu, Yoshua Bengio, Yarin Gal
ICML 2023 “Structured Probabilistic Inference and Generative Modeling” workshop
-
Probabilistic reasoning over sets using large language models
Batu Ozturkler, Nikolay Malkin, Zhen Wang, Nebojsa Jojic
ACL 2023
-
GFlowNets and variational inference
Nikolay Malkin*, Salem Lahlou*, Tristan Deleu*, Xu Ji, Edward Hu, Katie Everett, Dinghuai Zhang, Yoshua Bengio
ICLR 2023
Preprints / notes
-
Unifying generative models with GFlowNets and beyond
Dinghuai Zhang, Ricky T. Q. Chen, Nikolay Malkin, Yoshua Bengio
preprint
2022
Accepted / published
-
Diffusion models as plug-and-play priors
Alexandros Graikos, Nikolay Malkin, Nebojsa Jojic, Dimitris Samaras
NeurIPS 2022
-
Trajectory balance: Improved credit assignment in GFlowNets
Nikolay Malkin, Moksh Jain, Emmanuel Bengio, Chen Sun, Yoshua Bengio
NeurIPS 2022
-
Posterior samples of source galaxies in strong gravitational lenses with score-based priors
Alexandre Adam, Adam Coogan, Nikolay Malkin, Ronan Legin, Laurence Perreault-Levasseur, Yashar Hezaveh, Yoshua Bengio
NeurIPS 2022 “Machine Learning and the Physical Sciences” workshop
-
Resolving label uncertainty with implicit posterior models
Esther Rolf*, Nikolay Malkin*, Alexandros Graikos, Ana Jojic, Caleb Robinson, Nebojsa Jojic
UAI 2022
-
Generative flow networks for discrete probabilistic modeling
Dinghuai Zhang, Nikolay Malkin, Zhen Liu, Alexandra Volokhova, Aaron Courville, Yoshua Bengio
ICML 2022
-
Unifying generative models with GFlowNets
Dinghuai Zhang, Ricky T. Q. Chen, Nikolay Malkin, Yoshua Bengio
ICML 2022 “Beyond Bayes: Paths Towards Universal Reasoning Systems” workshop
-
Coherence boosting: When your pretrained language model is not paying enough attention
Nikolay Malkin, Zhen Wang, Nebojsa Jojic
ACL 2022
-
The outcome of the 2021 IEEE GRSS Data Fusion Contest - Track MSD: Multitemporal semantic change detection
Zhuohong Li, Fangxiao Lu, Hongyan Zhang, Lilin Tu, Jiayi Li, Xin Huang, Caleb Robinson, Nikolay Malkin, Nebojsa Jojic, Pedram Ghamisi, Ronny Hänsch, Naoto Yokoya
JSTARS vol.15
2021
Accepted / published
-
Studying word order through iterative shuffling
Nikolay Malkin, Sameera Lanka, Pranav Goel, Nebojsa Jojic
EMNLP 2021
-
GPT Perdetry Test: Generating new meanings for new words
Nikolay Malkin, Sameera Lanka, Pranav Goel, Sudha Rao, Nebojsa Jojic
NAACL 2021
-
From local algorithms to global results: Human-machine collaboration for robust analysis of geographically diverse imagery
Nebojsa Jojic, Nikolay Malkin, Caleb Robinson, Anthony Ortiz
IGARSS 2021
-
On the Galois action on motivic fundamental groups of punctured elliptic and rational curves
Nikolay Malkin; thesis advisor A.B. Goncharov
PhD thesis
-
Global land cover mapping with weak supervision: Outcome of the 2020 IEEE GRSS Data Fusion Contest
Caleb Robinson, Nikolay Malkin, Nebojsa Jojic, Huijun Chen, Rongjun Qin, Changlin Xiao, Michael Schmitt, Pedram Ghamisi, Ronny Hänsch, Naoto Yokoya
JSTARS vol.14
Preprints / notes
-
High-resolution land cover change from low-resolution labels: Simple baselines for the 2021 IEEE GRSS Data Fusion Contest
Nikolay Malkin, Caleb Robinson, Nebojsa Jojic
preprint
2020
Accepted / published
-
Weakly supervised semantic segmentation in the 2020 IEEE GRSS Data Fusion Contest
Caleb Robinson, Nikolay Malkin, Lucas Hu, Bistra Dilkina, Nebojsa Jojic
IGARSS 2020; contest winner
-
Mining self-similarity: Label super-resolution with epitomic representations
Nikolay Malkin, Anthony Ortiz, Nebojsa Jojic
ECCV 2020
-
Human-machine collaboration for fast land cover mapping
Caleb Robinson, Anthony Ortiz, Nikolay Malkin, Blake Elias, Andi Peng, Dan Morris, Bistra Dilkina, Nebojsa Jojic
AAAI 2020
Preprints / notes
-
Learning intersecting representations of short random walks on graphs
Nikolay Malkin, Nebojsa Jojic
preprint
-
Motivic fundamental groups of CM elliptic curves and geometry of Bianchi hyperbolic threefolds
Nikolay Malkin
preprint
-
Shuffle relations for Hodge and motivic correlators
Nikolay Malkin
preprint
2019
Accepted / published
-
Large scale high-resolution land cover mapping with multi-resolution data
Caleb Robinson, Le Hou, Nikolay Malkin, Rachel Soobitsky, Jacob Czawlytko, Bistra Dilkina, Nebojsa Jojic
CVPR 2019
-
Label super-resolution networks
Nikolay Malkin, Caleb Robinson, Le Hou, Rachel Soobitsky, Jacob Czawlytko, Dimitris Samaras, Joel Saltz, Lucas Joppa, Nebojsa Jojic
ICLR 2019
Preprints / notes
-
Label super-resolution with inter-instance loss
Maozheng Zhao, Le Hou, Han Le, Dimitris Samaras, Nebojsa Jojic, Danielle Fassler, Tahsin Kurc, Rajarsi Gupta, Nikolay Malkin, Shroyer Kenneth, Joel Saltz
preprint