metadata
license: mit
A semi-custom network based on the paper Simpler Diffusion (SiD2).
This network uses the optimal transport flow matching objective outlined in Flow Matching for Generative Modeling.
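
For readers unfamiliar with the objective, here is a minimal NumPy sketch of the optimal transport conditional flow matching loss as described in the Flow Matching paper. It is not this repository's training code. The function names, the `sigma_min` default, and the convention that `t = 0` is noise and `t = 1` is data are assumptions for illustration, and `velocity_fn` stands in for the network.

```python
import numpy as np

def ot_flow_matching_pair(x1, x0, t, sigma_min=0.0):
    """Build the interpolated sample x_t and its regression target.

    x1: data batch, x0: Gaussian noise of the same shape, t: times in [0, 1]
    with shape (batch,). Names and conventions here are illustrative only.
    """
    # Reshape t so it broadcasts over all non-batch dimensions.
    t = t.reshape(-1, *([1] * (x1.ndim - 1)))
    # Straight-line (optimal transport) path from noise to data.
    x_t = (1.0 - (1.0 - sigma_min) * t) * x0 + t * x1
    # Target velocity is constant along the path.
    target = x1 - (1.0 - sigma_min) * x0
    return x_t, target

def flow_matching_loss(velocity_fn, x1, rng, sigma_min=0.0):
    """Mean squared error between predicted and target velocity."""
    x0 = rng.standard_normal(x1.shape)   # noise sample
    t = rng.uniform(size=x1.shape[0])    # one time per batch element
    x_t, target = ot_flow_matching_pair(x1, x0, t, sigma_min)
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = rng.standard_normal((4, 3, 8, 8))
    # Placeholder "model" that always predicts zero velocity.
    loss = flow_matching_loss(lambda x, t: np.zeros_like(x), batch, rng)
    print(loss)
```

With `sigma_min = 0` the path reduces to a plain linear interpolation between noise and data, so the target is simply `x1 - x0`.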
xATGLU layers are used instead of linear layers for entry into the transformer MLP layer, following Expanded Gating Ranges Improve Activation Functions.
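
As a rough illustration of the idea, the sketch below shows one way an xATGLU-style layer could look in NumPy. It follows one reading of the expanded gating paper: an arctan gate normalized to (0, 1), then stretched to the range (-alpha, 1 + alpha). The function name, the weight layout, the choice of which half of the projection acts as the gate, and treating `alpha` as a plain float rather than a trainable parameter are all assumptions, so this may differ from the layer used in this network.

```python
import numpy as np

def xatglu(x, w, b, alpha=0.0):
    """Gated linear unit with an expanded arctan gate (illustrative sketch).

    x: (..., d_in), w: (d_in, 2 * d_out), b: (2 * d_out,).
    alpha widens the gate range from (0, 1) to (-alpha, 1 + alpha).
    """
    # One projection produces both the gate input and the value.
    h = x @ w + b
    gate_in, value = np.split(h, 2, axis=-1)
    # Arctan squashed into (0, 1).
    gate = (np.arctan(gate_in) + np.pi / 2) / np.pi
    # Expand the gate range to (-alpha, 1 + alpha).
    gate = gate * (1.0 + 2.0 * alpha) - alpha
    return value * gate

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 16))
    w = rng.standard_normal((16, 64)) * 0.1
    b = np.zeros(64)
    print(xatglu(x, w, b, alpha=0.1).shape)
```

In a transformer MLP block, a layer like this would replace the first linear projection and its activation, with a plain linear layer projecting back to the model width afterwards.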