lorenpe2/next-sentence-probability · Apply for community grant: Academic project

PhD project: I study out-of-domain detection in conversational systems. Many conversational agents (setting the LLM hype aside) rely on text classification as part of dialogue management. These agents work under a closed-world assumption: we define all the kinds of responses we expect and cluster them into intents. Interaction with real users, however, is an open-world setting, where we need to detect whether a given response belongs to one of our predefined classes or to no class at all. This is called out-of-domain (OOD) detection, or out-of-scope detection, and it is mainly solved by two approaches: thresholding the confidence of the classifier (or variants of that), or using an LM to generate possible OOD responses and using them as an additional class during training.

I focus instead on an unsupervised approach to detecting these OOD queries: I want to explore a BERT model trained on the next sentence prediction (NSP) task. This task is very similar to OOD detection and is even better suited to the conversational domain, because it also takes the context into account. Currently, the model is trained on the DailyDialog and Commonsense-Dialogues datasets, following the NSP task.

I created this Space so that other doctoral students can also interact with the model and test whether it can successfully detect OOD responses during a dialogue. Unfortunately, computation on CPU is very slow, as I am trying different context sizes and entering conversations of about 20 turns. Also be aware that this is only a first step in this direction: we detect OOD without taking the defined in-domain classes into account, so the approach described (and tried) here is closer to unsupervised dialogue evaluation, and I am using that as an umbrella term for the work that follows.
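
For readers unfamiliar with the confidence-thresholding baseline mentioned above, here is a minimal sketch in plain Python. It is not the code used in this Space; the intent names, the logits, and the threshold of 0.7 are all hypothetical values chosen for illustration.

```python
import math

def softmax(logits):
    """Turn raw classifier scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_ood(logits, intents, threshold=0.7):
    """Return (label, confidence).

    If the highest class probability falls below `threshold`, the input is
    treated as out-of-domain instead of being forced into an intent.
    The threshold value is an assumption; in practice it would be tuned
    on held-out data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "out_of_domain", probs[best]
    return intents[best], probs[best]

# Hypothetical intents and classifier outputs, for illustration only.
INTENTS = ["book_flight", "cancel_booking", "check_status"]
confident = predict_with_ood([4.0, 0.5, 0.2], INTENTS)
uncertain = predict_with_ood([1.0, 1.1, 0.9], INTENTS)
```

With a peaked score vector the classifier's top intent is kept, while a nearly flat one is rejected as out-of-domain. The NSP-based approach replaces this per-class confidence with a score for how well the response follows the dialogue context.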

To show that I have been working on OOD detection for some time, here are two of my papers:

https://arxiv.org/pdf/2204.10849.pdf
https://arxiv.org/pdf/2109.07968.pdf (section 7)