Daniele Paliotta

Hi 👋! I’m Daniele. I am a PhD candidate in the Machine Learning Group at the University of Geneva, under the supervision of François Fleuret. My research focuses on transformers, LLM systems and efficiency, and alternative architectures.
Recently, I was a research intern at Cartesia working on multimodal and TTS foundation models (transformers, state space models, linear RNNs) with a strong focus on efficiency.
Previously, I was a researcher at Together AI, supervised by Tri Dao, where I worked on LLM distillation and efficient inference, and developed speculative decoding for Mamba and linear RNNs.
In a previous life, I worked as a software engineer, did machine learning at Truelayer, and played in Capture the Flag competitions.
I also love sailing, reading, writing fiction, cooking, and playing guitar.
selected publications
- The Mamba in the Llama: Distilling and Accelerating Hybrid Models. In Advances in Neural Information Processing Systems 38 (NeurIPS 2024), Vancouver, BC, Canada, December 10-15, 2024.
- Fast Causal Attention with Dynamic Sparsity. In Workshop on Efficient Systems for Foundation Models @ ICML 2023.
- Fast Attention Over Long Sequences With Dynamic Sparse Flash Attention. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, LA, USA, December 10-16, 2023.
- Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners, 2025.
- Understanding and Minimising Outlier Features in Transformer Training. In Advances in Neural Information Processing Systems 38 (NeurIPS 2024), Vancouver, BC, Canada, December 10-15, 2024.
- Leveraging the true depth of LLMs, 2025.
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models, 2025.