# IRRA/pages/losses.py
import streamlit as st
from st_pages import add_indentation
add_indentation()
st.title('Loss functions')
st.subheader('SDM Loss')
st.markdown('''
The similarity distribution matching (SDM) loss measures the KL divergence between
the predicted image-to-text and text-to-image similarity distributions and the true label distribution.
We define $f^v$ and $f^t$ to be the global representations of the visual and textual features, respectively.
The cosine similarity $sim(u, v) = \\frac{u \\cdot v}{\\lVert u \\rVert \\lVert v \\rVert}$ is used to compute the predicted matching probabilities.
We define $y_{i, j}=1$ if the visual feature $f^v_i$ matches the textual feature $f^t_j$, and $y_{i, j}=0$ otherwise.
The predicted label distribution can be formulated as''')
st.latex(r'''
\mathbf{p}_i = \sigma(\mathrm{sim}(f^v_i, f^t))
''')
st.markdown('''
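Here $\\sigma$ denotes the softmax over the text features in the batch, so the result is a probability distribution.
We can define the image-to-text loss as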
''')
st.latex(r'''
\mathcal{L}_{i2t} = \mathrm{KL}(\mathbf{p}_i \,\|\, \mathbf{q}_i)
''')
st.markdown('where $\\mathbf{q}_i$, the true probability distribution, is defined as')
st.latex(r'''
q_{i, j} = \frac{y_{i, j}}{\sum_{k=1}^{N} y_{i, k}}
''')
st.markdown('This normalization is needed because an image can match multiple texts in a batch, so the labels $y_{i, j}$ must be rescaled to sum to 1 over $j$.')
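# A minimal sketch of the image-to-text term described above, in plain Python
# so it runs without extra dependencies. It is illustrative only and is not
# the implementation used by this project. Assumptions not stated on the page:
# the helper names (softmax, cosine, sdm_i2t_loss) are hypothetical, the loss
# is averaged over the images in the batch, a small eps keeps the logarithm
# finite where q_{i,j} = 0, no temperature scaling is applied, and every image
# has at least one matching text.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity sim(u, v) = (u . v) / (||u|| ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sdm_i2t_loss(image_feats, text_feats, labels, eps=1e-8):
    """Mean over images of KL(p_i || q_i).

    image_feats: list of N image vectors f^v_i
    text_feats:  list of N text vectors f^t_j
    labels:      N x N list with labels[i][j] = y_{i,j} in {0, 1}
    eps:         assumed stabiliser for entries where q_{i,j} = 0
    """
    total = 0.0
    for f_v, y in zip(image_feats, labels):
        # p_i = softmax over j of sim(f^v_i, f^t_j)
        p = softmax([cosine(f_v, f_t) for f_t in text_feats])
        # q_{i,j} = y_{i,j} / sum_k y_{i,k}
        n_match = sum(y)
        q = [y_j / n_match for y_j in y]
        # KL(p_i || q_i) = sum_j p_{i,j} * log(p_{i,j} / q_{i,j})
        total += sum(p_j * math.log(p_j / (q_j + eps)) for p_j, q_j in zip(p, q))
    return total / len(image_feats)
```

# Usage: with matching labels the loss should be lower than with mismatched
# labels, e.g. compare
#   sdm_i2t_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]], [[1, 0], [0, 1]])
# against the same features with labels [[0, 1], [1, 0]].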
st.subheader('IRR (MLM) Loss')
st.subheader('ID Loss')