publications
Publications by category in reverse chronological order. Generated by jekyll-scholar.
2025
- Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners. 2025
- Leveraging the true depth of LLMs. 2025
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models. 2025
2024
- The Mamba in the Llama: Distilling and Accelerating Hybrid Models. In Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024, 2024
- Understanding and Minimising Outlier Features in Transformer Training. In Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10 - 15, 2024, 2024
2023
- Graph Neural Networks Go Forward-Forward. 2023
- Fast Causal Attention with Dynamic Sparsity. In Workshop on Efficient Systems for Foundation Models @ ICML2023, 2023
- Fast Attention Over Long Sequences With Dynamic Sparse Flash Attention. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023, 2023
- SUPA: A Lightweight Diagnostic Simulator for Machine Learning in Particle Physics. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023, 2023