aplux/Facial-Landmark-Detection

Facial-Landmark-Detection: Pose Estimation

Facial-Landmark-Detection is a lightweight deep learning model for real-time facial keypoint detection (e.g., eyes, nose tip, mouth corners). It is optimized with multi-task learning and attention mechanisms for robustness in complex scenarios. A hybrid backbone (e.g., MobileNetV3-HRNet) with dynamic coordinate regression handles occlusion, lighting variations, and extreme poses, and supports high-precision 68-point and 106-point localization. Knowledge distillation compresses the model to under 1 MB of parameters; it achieves NRMSE below 4.5% on the 300W and WFLW datasets and runs at 30+ FPS on mobile devices, 10x faster than traditional Dlib. The model suits AR virtual makeup, expression analysis, face alignment, and medical facial assessment, balancing edge deployment efficiency with sub-millimeter accuracy, and it supports INT8 quantization for ultra-low latency.
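
For readers unfamiliar with the NRMSE figure quoted above, the following is a minimal sketch of one common formulation used in landmark benchmarks (sometimes also called normalized mean error): the mean point-to-point Euclidean error divided by a normalizing distance. The function name `nrmse` and the choice of normalizing distance (for example inter-ocular distance) are assumptions for illustration; the card does not state which normalization its numbers use.

```python
import numpy as np


def nrmse(pred, truth, norm_dist):
    """Mean per-landmark Euclidean error divided by a normalizing distance.

    pred, truth: arrays of shape (N, 2) holding (x, y) landmark coordinates.
    norm_dist:   positive scalar, e.g. an inter-ocular distance (assumed here;
                 the normalization used by the model card is not specified).
    """
    pred = np.asarray(pred, dtype=np.float64)
    truth = np.asarray(truth, dtype=np.float64)
    if pred.shape != truth.shape or pred.ndim != 2 or pred.shape[1] != 2:
        raise ValueError("pred and truth must both have shape (N, 2)")
    if norm_dist <= 0:
        raise ValueError("norm_dist must be positive")
    # Euclidean distance between each predicted and ground-truth point.
    per_point = np.linalg.norm(pred - truth, axis=1)
    return float(per_point.mean() / norm_dist)
```

Under this definition, a value of 0.045 corresponds to the 4.5% threshold mentioned above.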

Source model

Input shape: 1x3x128x128
Number of parameters: 5.17M
Model size: 20.95 MB
Output shape: 1x265
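
The shapes listed above can be illustrated with a small preprocessing sketch that turns an image into a 1x3x128x128 tensor and checks that a result has shape 1x265. This is only a sketch under stated assumptions: the RGB channel order, the scaling of pixels to [0, 1], the nearest-neighbour resize, and the helper names `preprocess` and `check_output` are all hypothetical and not confirmed by this card. The layout of the 265 output values is not documented here, so the sketch validates the shape only.

```python
import numpy as np

INPUT_SHAPE = (1, 3, 128, 128)  # from the model card
OUTPUT_SHAPE = (1, 265)         # from the model card


def preprocess(image_hwc: np.ndarray, size: int = 128) -> np.ndarray:
    """Convert an HxWx3 uint8 image into a 1x3x128x128 float32 tensor.

    Assumptions (not confirmed by the model card): RGB channel order,
    nearest-neighbour resizing, and pixel values scaled to [0, 1].
    """
    if image_hwc.ndim != 3 or image_hwc.shape[2] != 3:
        raise ValueError("expected an HxWx3 image")
    h, w, _ = image_hwc.shape
    # Nearest-neighbour resize via index lookup (no external dependency).
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image_hwc[rows[:, None], cols[None, :], :]
    # HWC -> CHW, scale to [0, 1], then add the batch dimension.
    chw = resized.transpose(2, 0, 1).astype(np.float32) / 255.0
    return chw[np.newaxis, ...]


def check_output(output: np.ndarray) -> np.ndarray:
    """Validate the 1x265 output shape and return the flat 265-value vector."""
    if output.shape != OUTPUT_SHAPE:
        raise ValueError(f"expected {OUTPUT_SHAPE}, got {output.shape}")
    return output[0]


if __name__ == "__main__":
    dummy = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
    tensor = preprocess(dummy)
    print(tensor.shape, tensor.dtype)
```

In practice the exact normalization and resize method should be taken from the source model's own preprocessing code, since a mismatch there typically degrades landmark accuracy.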

The source model can be found here

Performance Reference

Please search for this model by name in Model Farm.

Inference & Model Conversion

Please search for this model by name in Model Farm.

License

Source Model: BSD-3-CLAUSE
Deployable Model: APLUX-MODEL-FARM-LICENSE