- LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play Large language models (LLMs) have shown exceptional proficiency in natural language processing but often fall short of generating creative and original responses to open-ended questions. To enhance LLM creativity, our key insight is to emulate the human process of inducing collective creativity through engaging discussions with participants from diverse backgrounds and perspectives. To this end, we propose LLM Discussion, a three-phase discussion framework that facilitates vigorous and diverging idea exchanges and ensures convergence to creative answers. Moreover, we adopt a role-playing technique by assigning distinct roles to LLMs to combat the homogeneity of LLMs. We evaluate the efficacy of the proposed framework with the Alternative Uses Test, Similarities Test, Instances Test, and Scientific Creativity Test through both LLM evaluation and human study. Our proposed framework outperforms single-LLM approaches and existing multi-LLM frameworks across various creativity metrics. 6 authors · May 10, 2024
- AXNav: Replaying Accessibility Tests from Natural Language Developers and quality assurance testers often rely on manual testing to test accessibility features throughout the product lifecycle. Unfortunately, manual testing can be tedious, often has an overwhelming scope, and can be difficult to schedule amongst other development milestones. Recently, Large Language Models (LLMs) have been used for a variety of tasks including automation of UIs, however to our knowledge no one has yet explored their use in controlling assistive technologies for the purposes of supporting accessibility testing. In this paper, we explore the requirements of a natural language based accessibility testing workflow, starting with a formative study. From this we build a system that takes as input a manual accessibility test (e.g., ``Search for a show in VoiceOver'') and uses an LLM combined with pixel-based UI Understanding models to execute the test and produce a chaptered, navigable video. In each video, to help QA testers we apply heuristics to detect and flag accessibility issues (e.g., Text size not increasing with Large Text enabled, VoiceOver navigation loops). We evaluate this system through a 10 participant user study with accessibility QA professionals who indicated that the tool would be very useful in their current work and performed tests similarly to how they would manually test the features. The study also reveals insights for future work on using LLMs for accessibility testing. 6 authors · Oct 3, 2023
- Artifact: Measuring and Mitigating Gaps in Structural Testing The artifact used for evaluating the experimental results of Measuring and Mitigating Gaps in Structural Testing is publicly available on GitHub, Software Heritage and figshare, and is reusable. The artifact consists of necessary data, tools, scripts, and detailed documentation for running the experiments and reproducing the results shown in the paper. We have also provided a VirtualBox VM image allowing users to quickly setup and reproduce the results. Users are expected to be familiar using the VirtualBox software and Linux platform for evaluating or reusing the artifact. 4 authors · Aug 1, 2023