--- title: Paper Hero emoji: πŸ’ͺ colorFrom: indigo colorTo: yellow sdk: docker app_port: 7860 pinned: true license: apache-2.0 --- # πŸ’ͺ Paper Hero A toolkit to help search for papers from aclanthology, arXiv and dblp. ## 🌴 Setup 1. Make sure you have [Git](https://git-scm.com/) and [Python](https://www.python.org/downloads/) 3.10.8 installed (or Python >= 3.9). 2. Install dependencies: `pip install -r requirements.txt` ## πŸš€ QuickStart Run the example in `run.py`: ```bash $ # clone this repo $ git clone https://github.com/Spico197/paper-hero.git $ cd paper-hero $ # get ready for the acl data, since it is cache-based $ bash scripts/get_aclanthology.sh $ python run.py $ # the results will be saved into `results/`, check them out πŸŽ‰ $ ls results ``` ```python from src.interfaces.aclanthology import AclanthologyPaperList from src.utils import dump_paper_list_to_markdown_checklist if __name__ == "__main__": # use `bash scripts/get_aclanthology.sh` to download and prepare anthology data first paper_list = AclanthologyPaperList("cache/aclanthology.json") ee_query = { "title": [ # Any of the strings below is matched ["information extraction"], ["event", "extraction"], # title must include `event` and `extraction` ["event", "argument", "extraction"], ["event", "detection"], ["event", "classification"], ["event", "tracking"], ["event", "relation", "extraction"], ], # Besides the title constraint, venue must also meet the needs "venue": [ ["acl"], ["emnlp"], ["naacl"], ["coling"], ["findings"], ["tacl"], ["cl"], ], } ee_papers = paper_list.search(ee_query) dump_paper_list_to_markdown_checklist(ee_papers, "results/ee-paper-list.md") ``` ## πŸ—ΊοΈ Roadmap - [x] aclanthology - [x] arXiv - [x] dblp - [x] add frontend support for building a demo - [x] year and month searching