I am a chief engineer at Pasteur Labs, where we apply ML to engineering design and simulations. I am also a visiting researcher in the ML@CL group at the University of Cambridge. My interests lie somewhere between machine learning and software systems, leaning towards the latter.
I received my PhD in 2024 as a member of the same research group, supervised by Neil Lawrence. Somehow I could not keep still, and while working on my thesis I also spent time with Secondmind, Seldon, and DELVE.
Before jumping into the world of academia I spent more than a decade as a software engineer, developing everything from small webapps to data center network software.
software
Emukit is a Python package for all kinds of sequential decision making methods under uncertainty: optimization, quadrature, experiment design, sensitivity analysis. It was developed and released by our research group at Amazon Cambridge, but now lives on neutral territory. I am the lead developer and main maintainer of Emukit.
[website]
[github]
TTI Explorer is the simulation package we developed in DELVE to study the effects of "test-trace-isolate" (TTI) strategies on the spread of COVID-19. Our report on it was released shortly before TTI was deployed in the UK, and received wide press coverage.
[github]
GPyOpt is one of the first Python packages for Bayesian optimization. GPyOpt was initially developed mostly by Javier González while he was with Neil Lawrence's group at the University of Sheffield. I took over ownership of GPyOpt from Javier, and led the package development for a few years. GPyOpt is now archived.
[website]
[github]
Trieste is a Bayesian optimization package built on TensorFlow. I became involved in Trieste during my placement at Secondmind, added several major features, and became one of the main overall contributors to the project.
[website]
[github]
Seldon Core is an ML inference platform. I co-led the design of its second version, focusing on the dataflow architecture and data streaming.
[announcement]
[website]
[github]
I was fortunate enough to contribute to many different parts of Amazon. Some highlights:
Data center network (my code likely powers your internet!)
AWS CloudWatch
AWS SageMaker
Supply chain optimization technologies
Prime Air
papers
Selected papers are mentioned here. For a complete list please check out the Google Scholar profile.
Andrei Paleyes, Neil D. Lawrence
3rd Workshop on Machine Learning and Systems (EuroMLSys), EuroSys 2023
[video]
[paper]
[code]
Andrei Paleyes, Siyuan Guo, Bernhard Schölkopf, Neil D. Lawrence
2nd International Conference on AI Engineering - Software Engineering for AI (CAIN), ICSE 2023
[paper]
[code]
Andrei Paleyes, Christian Cabrera-Jojoa, Neil D. Lawrence
1st International Conference on AI Engineering - Software Engineering for AI (CAIN), ICSE 2022
[paper (IEEE library)]
[paper (ACM library)]
[code]
Andrei Paleyes, Mark Pullin, Maren Mahsereci, Cliff McCollum, Neil D. Lawrence, Javier González
Second workshop on machine learning and the physical sciences, NeurIPS, 2019
[paper]
Brendan Avent, Javier González, Tom Diethe, Andrei Paleyes, Borja Balle
Proceedings on Privacy Enhancing Technologies, 2020 (Andreas Pfitzmann Best Student Paper Award)
Privacy Preserving Machine Learning Workshop, 2019
[video]
[code]
[paper]
Andrei Paleyes, Raoul-Gabriel Urma, Neil D. Lawrence
The ML-Retrospectives, Surveys & Meta-Analyses Workshop, NeurIPS, 2020
ACM Computing Surveys (CSUR), 2022
[paper]
Virginia Aglietti, Xiaoyu Lu, Andrei Paleyes, Javier González
International Conference on Artificial Intelligence and Statistics, 2020
[paper]
Bobby He, Sheheryar Zaidi, Bryn Elesedy, Michael Hutchinson, Andrei Paleyes, Guy Harling, Anne M. Johnson, Yee Whye Teh, Royal Society's DELVE group
Royal Society open science 8 (3), 2021
[paper]
[code]
talks
AI Cafe, St Edmund's College, Cambridge, November 2024
[event]
[slides]
PyData meetup, Cambridge, January 2024
[slides]
AI for the study of Environmental Risks (AI4ER), UKRI CDT, Cambridge, November 2023
[slides]
Pasteur Labs Invited Speaker Series, October 2023
[slides]
SciPy 2023
[slides]
SAP Inspiration Sessions (online), June 2023
[slides]
Together with Alex Rakowski
Kafka Summit, London, May 2023
[event]
[video]
[slides]
Data Science Africa Summer School, Kigali, Rwanda, May 2023
[event]
[slides]
Causal Digital Twins workshop, ELLIS Unconference, January 2023
[event]
[slides]
The Ocean Cleanup Challenge (online), Kili Technology, December 2022
[event]
[slides]
Together with Neil Lawrence
Data Science Africa Summer School, Arusha, Tanzania, July 2022
[event]
[slides]
Industry Expert Insights (online), Cambridge Spark, August 19, 2021
[slides]
RSE Lunch Bytes (online), University of Sheffield, July 05, 2021
[video]
[event]
[slides]
DSAIL Research Day at DeKUT (online), June 19, 2021
[event]
[slides]
Gaussian Processes meetup (online), January 21, 2021
[event]
[slides]
Together with Bryn Elesedy
ML and the Physical World course at the University of Cambridge (online), November 12, 2020
[video]
[event]
[slides]
Data Science Africa COVID-19 Webinar (online), April 8, 2020
[event]
ITShare: High-load projects on .NET, Minsk, December 8, 2012
[event]
misc
During the initial phase of the COVID-19 pandemic I became a member of the action team of the DELVE group (thanks to Neil Lawrence for the invite!). DELVE (Data Evaluation and Learning for Viral Epidemics) is a multi-disciplinary group, convened by the Royal Society, to support a data-driven approach to learning from the different approaches countries are taking to managing the pandemic. Over the course of 2020 we produced a number of reports, software and datasets, and provided advice to SAGE and ultimately the UK Government.
[website]
I am co-organizing a series of workshops (led by Alessandra Tosi) where we discuss all aspects surrounding running ML in production.
[ICML 2021]
[NeurIPS 2022]
I often participate in the DSA summer school, where my colleagues and I deliver lectures and practicals on the design of ML systems.
[website 2020]
[website 2021]
[website 2022]
[website 2023]
I have been a member of PCs for the following events: ECML 2021, ECML 2022, ECML 2023, SciPy 2023, SciPy 2024, EuroMLSys workshop at EuroSys 2024, ML-RSA workshop at NeurIPS 2020, DMML workshop at ICML 2020, PPAI workshop at AAAI 2023. I have reviewed articles for Nature Communications, Journal of Decision Systems, and Natural Language Processing Journal. I semi-regularly review ML and software engineering books for Manning.