Excited to announce PatientSeek (whyhow-ai/PatientSeek), the first open-source fine-tuned DeepSeek reasoning model for the MED-LEGAL space, designed to run securely and privately on local systems, and trained on one of the largest accessible datasets of patient records.
It is purpose-built for MED-LEGAL workflows, focusing on disease and diagnosis identification and correlation reasoning: critical tasks that sit at the intersection of healthcare and legal expertise.
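For reference, here is a minimal local-inference sketch. It assumes the weights are published on the Hugging Face Hub under the whyhow-ai/PatientSeek ID mentioned above and that they load through the standard Transformers causal-LM classes; the prompt, dtype, and device settings are illustrative only.

```python
# Minimal local-inference sketch (assumptions: Hub ID "whyhow-ai/PatientSeek",
# standard causal-LM loading; prompt, dtype, and device settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "whyhow-ai/PatientSeek"  # repo ID from the announcement

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # keep memory use manageable on local hardware
    device_map="auto",           # place layers automatically across available devices
)

prompt = "Summarize the diagnoses documented in the following patient record:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Running everything locally in this way keeps patient records on your own machine, which is the privacy property the release emphasizes.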
My latest project is the outcome of the last 2+ years working with TPUs from the amazing TPU Research Cloud (TRC) program and training Encoder-only LMs with the TensorFlow Model Garden library.
- A cheatsheet for setting up a TPU VM Pod (with all necessary dependencies) to pretrain LMs with TF Model Garden
- Conversion scripts that convert TF Model Garden weights to Hugging Face Transformers-compatible models (see the loading sketch after this list)
- Supported architectures include BERT, BERT with Token Dropping, and TEAMS
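As a rough illustration of the conversion point, here is a minimal sketch of loading a converted checkpoint with Hugging Face Transformers. It assumes the conversion scripts emit a standard Transformers checkpoint directory (config, weights, and tokenizer files); the local path is a placeholder.

```python
# Sketch: load a converted TF Model Garden BERT checkpoint with Transformers.
# Assumption: the conversion produced a standard checkpoint directory;
# "./converted-bert" is a placeholder path.
from transformers import AutoModel, AutoTokenizer

checkpoint_dir = "./converted-bert"

tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
model = AutoModel.from_pretrained(checkpoint_dir)

inputs = tokenizer("TPU pretraining with TF Model Garden.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```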
I also released BERT-based models pretrained on the great Hugging Face FineWeb and FineWeb-Edu datasets (10BT subset), with more to come!
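A hedged usage sketch for one of these encoder models: the Hub ID below is a placeholder, so substitute the actual FineWeb or FineWeb-Edu checkpoint name.

```python
# Usage sketch with a placeholder model ID -- swap in the real Hub ID of the
# FineWeb / FineWeb-Edu pretrained BERT checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="<user>/bert-fineweb-10bt")  # placeholder ID
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```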