---
license: cc-by-nc-4.0
datasets:
  - rfcx/frugalai
language:
  - en
metrics:
  - accuracy
pipeline_tag: audio-classification
tags:
  - acoustics
  - lgbm
  - frugality
  - signal-processing
  - climate
  - chainsaw
---

# Quefrency Guardian: Chainsaw Noise Detector

An efficient model to detect chainsaw activity in forest soundscapes using spectral and cepstral audio features. The model is designed for environmental conservation and is based on a LightGBM classifier, capable of low-energy inference on both CPU and GPU devices.

This repository provides the complete code and configuration for feature extraction, model implementation, and deployment.

## Installation

You can install and use the model in two different ways:

### Option 1: Clone the repository

To download the entire repository containing the code, model, and associated files, run:

```bash
git clone https://huggingface.co/tlmk22/QuefrencyGuardian
cd QuefrencyGuardian
pip install -r requirements.txt
```

Once installed, you can import the files directly into your existing project and use the model.

---

### Option 2: Dynamically load from the Hub

If you only want to download the files required to use the model (without cloning the full repository), you can use the `hf_hub_download` function provided by Hugging Face. This method downloads only what is necessary directly from the Hub.
Here's an example:

```python
import os
import sys
import importlib.util

from huggingface_hub import hf_hub_download

# Specify the repository
repo_id = "tlmk22/QuefrencyGuardian"

# Download the Python file containing the model class and add it to your path
model_path = hf_hub_download(repo_id=repo_id, filename="model.py")
model_dir = os.path.dirname(model_path)
if model_dir not in sys.path:
    sys.path.append(model_dir)

# Dynamically load the class from the downloaded file
spec = importlib.util.spec_from_file_location("model", model_path)
model_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(model_module)

# Import the FastModelHuggingFace class
FastModelHuggingFace = model_module.FastModelHuggingFace

# Load the pre-trained model
fast_model = FastModelHuggingFace.from_pretrained(repo_id)

# Perform predictions
result = fast_model.predict("path/to/audio.wav", device="cpu")
map_labels = {0: "chainsaw", 1: "environment"}
print(f"Prediction Result: {map_labels[result[0]]}")
```

Depending on your needs, you can either clone the repository for a full installation or use Hugging Face's dynamic download utilities for lightweight, direct usage.

---

## Model Overview

### Features

The model uses:

- **Spectrogram Features**
- **Cepstral Features**: Computed as the FFT of the log spectrogram between [`f_min`-`f_max`], restricted to a filtered quefrency range [`fc_min`-`fc_max`].
- **Time Averaging**: Both feature sets are averaged across the entire audio clip for robustness in noisy settings (Welch-style averaging).

---

### LightGBM Model

The model is a **binary classifier** (chainsaw vs. environment) trained on the `rfcx/frugalai` dataset. Key model parameters are included in `model/lgbm_params.json`.

---

## Usage

Two example scripts demonstrating how to use the repository or the model downloaded from Hugging Face are available in the `examples` directory.
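For intuition, the spectral/cepstral feature pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's actual extraction code: the FFT size, hop, band limits `f_min`/`f_max`, and quefrency indices `fc_min`/`fc_max` below are placeholder defaults, not values read from the model configuration.

```python
import numpy as np

def cepstral_features(audio, sr, n_fft=1024, hop=512,
                      f_min=200.0, f_max=5000.0,
                      fc_min=2, fc_max=64):
    """Sketch of band-limited log-spectral + cepstral features.

    All parameter defaults are illustrative, not the model's config.
    """
    # Frame the signal and compute a windowed magnitude spectrogram
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack([audio[i * hop: i * hop + n_fft] for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))

    # Restrict the log spectrogram to the [f_min, f_max] band
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    band = (freqs >= f_min) & (freqs <= f_max)
    log_spec = np.log(spec[:, band] + 1e-10)

    # Cepstrum: FFT of the log spectrum, kept in a filtered quefrency range
    cepstrum = np.abs(np.fft.rfft(log_spec, axis=1))[:, fc_min:fc_max]

    # Average both feature sets over time (Welch-style) for noise robustness
    return np.concatenate([log_spec.mean(axis=0), cepstrum.mean(axis=0)])

# Example on synthetic audio: 3 s of noise at 12 kHz
sr = 12000
audio = np.random.default_rng(0).standard_normal(3 * sr)
feats = cepstral_features(audio, sr)
```

The time averaging collapses the per-frame features into a single fixed-length vector per clip, which is what makes a lightweight tabular classifier like LightGBM applicable.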
---

### Performance

- **Accuracy**: 95% on the test set, with a 4.5% false-positive rate at the default threshold during the challenge.
- **Environmental Impact**: Inference energy consumption was measured at **0.21 Wh**, tracked using CodeCarbon. This figure depends on the challenge's infrastructure, as the code was executed within a Docker container provided by the platform.

---

### License

This project is licensed under the [Creative Commons Attribution Non-Commercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) license. You are free to share and adapt the work for non-commercial purposes, provided attribution is given.

---

## Dataset

The model was trained and evaluated on the [Rainforest Connection (RFCx) Frugal AI](https://huggingface.co/datasets/rfcx/frugalai) dataset.

#### Labels:

- `0`: Chainsaw
- `1`: Environment

---

## Limitations

- **Audio Length**: The classifier is designed for 1 to 3 seconds of audio sampled at either 12 kHz or 24 kHz.
- **Environmental Noise**: The model may misclassify noisy recordings or machinery that sounds similar to a chainsaw.
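If your recordings do not match these constraints, they can be resampled and trimmed before prediction. A minimal sketch using SciPy's polyphase resampler; the target rate and clip length here are assumptions taken from the limitations above, and `prepare_clip` is a hypothetical helper, not part of this repository:

```python
import numpy as np
from scipy.signal import resample_poly

def prepare_clip(audio, sr, target_sr=12000, max_seconds=3):
    """Resample a mono clip to target_sr and trim it to max_seconds.

    target_sr and max_seconds follow the stated limitations (12 kHz,
    1-3 s); treat them as assumptions, not model configuration values.
    """
    if sr != target_sr:
        # Polyphase resampling with reduced integer up/down factors
        g = np.gcd(sr, target_sr)
        audio = resample_poly(audio, target_sr // g, sr // g)
    return audio[: target_sr * max_seconds]

# Example: a 5 s clip at 44.1 kHz becomes a 3 s clip at 12 kHz
sr = 44100
clip = np.random.default_rng(0).standard_normal(5 * sr)
out = prepare_clip(clip, sr)
```

The trimmed array can then be written to a temporary WAV file (e.g. with `soundfile`) before being passed to `predict`.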