pytorch · transformer · docker · LLM · machine learning · engineering · research

  • LLMs Efficient Inference with KV-cache

    KV-cache in Large Language Models and its implementation (using the GPT-2 model as an example)

    29 min read   ·   January 16, 2025

    2025   ·   Transformer · LLM   ·   research

  • Container for Deep Learning Environment

    containerizing and deploying deep learning environments (e.g., Docker, Apptainer)

    45 min read   ·   September 10, 2024

    2024   ·   Docker   ·   engineering

  • ViT model from scratch

    a step-by-step implementation of the Vision Transformer (ViT) model introduced in the paper ``An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale`` (ICLR 2021) with PyTorch.

    28 min read   ·   May 21, 2022

    2022   ·   Transformer   ·   research

  • Transformer model from scratch

    a step-by-step implementation of the Transformer model introduced in the paper ``Attention Is All You Need`` (NeurIPS 2017) with PyTorch.

    17 min read   ·   April 21, 2022

    2022   ·   Transformer   ·   research

  • Distributed Training with PyTorch

    a summary of practical lessons learned from distributed training with PyTorch.

    21 min read   ·   November 23, 2021

    2021   ·   PyTorch   ·   engineering
