Hi! 👋 I am Philippe Bich, a Research Scientist at Huawei Research in Zürich, where I am part of the AI Team and focus on model compression and quantization for LLMs/VLMs.

Before joining Huawei, I completed my Ph.D. at Politecnico di Torino under the supervision of Prof. Gianluca Setti, working on AI model compression and on making deep neural networks more efficient for resource-constrained platforms. While working on my Master’s thesis at the Boston University Robotics Lab with Prof. John Baillieul, I developed a strong interest in AI at the edge, which later guided my doctoral research.

On this page, I keep track of my most recent work, talks, and publications. Feel free to reach out if any of it sparks your curiosity!

🔥 News

  • Jun 2026
    🚀 Coming soon: “KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks”. Stay tuned!
  • May 2026
    🎉 My paper “SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights” has been accepted at ICML 2026!
  • Feb 2026
    🤗 SINQ is now integrated into Hugging Face Transformers! Check out the code and docs on GitHub.

📝 Selected Publications

  • KVarN · Coming soon
    KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks
    Lorenz K. Mueller, Philippe Bich, Chiara Boretti, Hyun Min Chang, Jiawei Zhuang, Lukas Cavigelli
    Preprint, 2026.

    TL;DR: A novel, state-of-the-art variance-normalized KV-cache quantization scheme that limits compounding error in long reasoning traces and outperforms Google Research's TurboQuant, achieving higher accuracy at lower bit widths.

  • SINQ · ICML 2026
    SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights
    Lorenz K. Mueller, Philippe Bich, Jiawei Zhuang, Ahmet Çalik, Luca Benfenati, Lukas Cavigelli
    International Conference on Machine Learning (ICML), 2026.

    TL;DR: A calibration-free quantization method based on Sinkhorn normalization that delivers strong low-bit LLM weights out of the box. Integrated into 🤗 Hugging Face Transformers.

  • IEEE TPAMI 2025
    On the Universal Approximation Properties of Deep Neural Networks using MAM Neurons
    Philippe Bich, Andriy Enttsel, Luciano Prono, Alex Marchioni, Fabio Pareschi, Mauro Mangia, Gianluca Setti, Riccardo Rovatti
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.

    TL;DR: Theoretical foundations showing that Multiply-And-Max/min (MAM) neurons preserve the universal approximation property while enabling aggressive structured pruning.

Browse all publications

📖 Education

  • Politecnico di Torino
    • Ph.D. in Electrical, Electronics and Communications Engineering · Nov 2021 – 2025 · Advisor: Prof. Gianluca Setti · AI model compression and quantization
    • M.Sc. in Mechatronics Engineering, 110/110 cum laude · 2018 – 2021 · Master's thesis at the Boston University Robotics Lab with Prof. John Baillieul
    • B.Sc. in Computer Engineering, 109/110 · 2015 – 2018

📚 All Publications

2026
  • ICML
    SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights
    International Conference on Machine Learning (ICML), 2026.
  • J-STARS
    FOREST-GC: A conFOrmable Rendering Engine for Synthetic Tree Generation and Counting
    IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2026.
2025
  • TNNLS
    A Multiply-And-Max/min Neuron Paradigm for Aggressively Prunable Deep Neural Networks
    IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 8, pp. 14414–14427, 2025.
  • TPAMI
    On the Universal Approximation Properties of Deep Neural Networks using MAM Neurons
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
  • MLJ
    Linearly-Interpretable Concept Embedding Models for Text Analysis
    Machine Learning, vol. 114, no. 10, art. 224, 2025.
  • xAI
    V-CEM: Bridging Performance and Intervenability in Concept-based Models
    World Conference on Explainable Artificial Intelligence (xAI), pp. 48–67, 2025.
  • ECML PKDD
    Towards Better Generalization and Interpretability in Unsupervised Concept-Based Models
    Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), 2025.
  • TCAS-II
    MESA: A Dynamical Attention-based Pre-processing Pipeline for High-throughput Event-based Computer Vision Tasks
    IEEE Transactions on Circuits and Systems II: Express Briefs, 2025.
2024
  • CVPRW
    Event-based Eye Tracking: AIS 2024 Challenge Survey
    CVPR 2024 Workshops (AIS: Vision, Graphics and AI for Streaming).
  • BioCAS
    Memory in Motion: Exploring Leaky Integration of Time Surfaces for Event-Based Eye-Tracking
    IEEE Biomedical Circuits and Systems Conference (BioCAS), pp. 1–5, 2024.
  • AICAS
    Optimizing Vision Transformers: Leveraging Max and Min Operations for Efficient Pruning
    IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2024.
2023
  • CVPRW
    Pedro: an Event-based Dataset for Person Detection in Robotics
    CVPR 2023 Workshops (4th International Workshop on Event-Based Vision).
  • MWSCAS
    Multiply-and-Max/min Neurons at the Edge: Pruned Autoencoder Implementation
    IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), 2023.
2022
  • ICRA
    Visual Navigation Using Sparse Optical Flow and Time-to-Transit
    IEEE International Conference on Robotics and Automation (ICRA), pp. 9397–9403, 2022.
  • BioCAS
    Aggressively Prunable MAM²-based Deep Neural Oracle for ECG Acquisition by Compressed Sensing
    IEEE Biomedical Circuits and Systems Conference (BioCAS), pp. 163–167, 2022.