Audio-To-MIDI-And-Advanced-Renderer

avans06 committed 6 days ago

Commit 0676790

1 Parent(s): 8adc333

feat(synth): Add advanced frequency management for 8-bit synth MIDI and effects

This commit introduces a suite of new tools specifically for the **8-bit synthesizer**, giving fine-grained control over the frequency spectrum to tame both excessive harshness in the high end and muddiness in the low end.

These features empower users to create cleaner, more balanced, and professional-sounding mixes directly within the synthesizer.

**1. MIDI Pre-processing (For 8-bit Synth): Low-Pitch Management**
Complements the existing high-pitch taming by adding a new "Low-Pitch Attenuation" rule.
Users can now define a low pitch threshold (e.g., C2) and a velocity scale.
The pre-processor will automatically reduce the velocity of any notes falling below this threshold before they are sent to the **8-bit synthesizer**.
This is a preventative measure to control excessive sub-bass energy, reduce muddiness, and prevent clipping.
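
The rule can be sketched without `pretty_midi` as below. This is a minimal illustration, not the code from `app.py`: the `Note` dataclass and the `attenuate_low_notes` helper are hypothetical stand-ins, and the defaults simply mirror `s8bit_low_pitch_threshold` (36, C2) and `s8bit_low_pitch_velocity_scale` (0.9).

```python
from dataclasses import dataclass


@dataclass
class Note:
    """Hypothetical stand-in for a pretty_midi note."""
    pitch: int     # MIDI note number, 0-127
    velocity: int  # MIDI velocity, 1-127


def attenuate_low_notes(notes, threshold=36, scale=0.9):
    """Scale the velocity of every note strictly below `threshold`, in place.

    Returns the number of notes that fell below the threshold.
    """
    tamed = 0
    for note in notes:
        if note.pitch < threshold:
            # Never let the velocity drop to 0, which would silence the note.
            note.velocity = max(1, int(note.velocity * scale))
            tamed += 1
    return tamed


notes = [Note(pitch=28, velocity=100), Note(pitch=36, velocity=100), Note(pitch=60, velocity=100)]
tamed_count = attenuate_low_notes(notes)
velocities = [n.velocity for n in notes]
```

Only the note at pitch 28 is scaled here. The comparison is strict, so a note exactly at the threshold (36) is left untouched.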

**2. Delay/Echo Effect (For 8-bit Synth): Full Frequency Spectrum Control**
The Delay/Echo effect has been upgraded with a comprehensive "Frequency Management" toolkit, allowing users to shape the timbre of the echoes generated by the **8-bit synthesizer**.
- **High-Pass Filter:** A configurable high-pass filter can be applied to the echo layer, cleanly removing low frequencies to prevent echoes from adding mud to the mix.
- **Low-Pass Filter:** A new, corresponding low-pass filter can be applied to the echo layer, removing high frequencies to make echoes sound darker, warmer, and less harsh.
- **Flexible Pitch Shifting (Transposer):** The previous fixed octave checkboxes have been replaced with powerful pitch-shifting sliders for both bass and treble notes, allowing for the creation of complex harmonies in the echo trails.
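
As a rough sketch of the transposer logic: echoes of notes below C3 (MIDI 48) take the bass shift and echoes of notes above C6 (MIDI 84) take the treble shift, matching the thresholds used in the diff. The helper name `shift_echo_pitch` is hypothetical, and the final clamp to the valid MIDI range 0-127 is an assumption added here for safety; it is not part of the commit.

```python
BASS_NOTE_THRESHOLD = 48    # C3, as in the commit
TREBLE_NOTE_THRESHOLD = 84  # C6, as in the commit


def shift_echo_pitch(pitch, bass_shift=0, treble_shift=0):
    """Return the echo pitch for an original MIDI pitch.

    Shifts are in semitones; a shift of 0 disables that branch.
    """
    if bass_shift and pitch < BASS_NOTE_THRESHOLD:
        pitch += bass_shift
    elif treble_shift and pitch > TREBLE_NOTE_THRESHOLD:
        pitch += treble_shift
    # Assumption (not in the commit): keep the result a valid MIDI note number.
    return min(127, max(0, pitch))


bass_echo = shift_echo_pitch(40, bass_shift=12, treble_shift=-12)    # low note, one octave up
treble_echo = shift_echo_pitch(96, bass_shift=12, treble_shift=-12)  # high note, one octave down
mid_echo = shift_echo_pitch(60, bass_shift=12, treble_shift=-12)     # mid-range note, unchanged
```

Without a clamp, a positive treble shift on a very high note could push the echo past 127, which is outside the MIDI note range.
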
The audio filtering pipeline in `Render_MIDI` was refactored to support this multi-filter audio processing on the dedicated echo layer.
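
The refactored pipeline avoids re-rendering the main layer by subtracting the unfiltered echo from the combined render and adding the filtered echo back. The sketch below illustrates that swap on mono sample lists, assuming the combined render is a plain sum of its layers. The helpers `pad_to` and `replace_echo` are hypothetical, and a simple one-pole low-pass stands in for the Butterworth filters the commit applies through SciPy.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, fs):
    """First-order low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    out, y = [], 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out


def pad_to(samples, length):
    """Zero-pad (or truncate) a sample list to exactly `length` samples."""
    return (list(samples) + [0.0] * max(0, length - len(samples)))[:length]


def replace_echo(mix, raw_echo, filtered_echo):
    """Swap the raw echo inside `mix` for its filtered version."""
    n = len(mix)
    raw = pad_to(raw_echo, n)
    filt = pad_to(filtered_echo, n)
    return [m - r + f for m, r, f in zip(mix, raw, filt)]


fs = 44100
dry = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # main layer only
raw_echo = [0.0, 0.0, 0.5, -0.5]      # echo render, shorter than the mix
mix = [d + e for d, e in zip(dry, pad_to(raw_echo, len(dry)))]

filtered_echo = one_pole_lowpass(raw_echo, cutoff_hz=5000, fs=fs)
result = replace_echo(mix, raw_echo, filtered_echo)
expected = [d + f for d, f in zip(dry, pad_to(filtered_echo, len(dry)))]
```

The swap is exact only when the synthesizer mixes layers linearly. If rendering involves normalization or clipping of the combined signal, the subtraction would leave residue of the unfiltered echo.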

Files changed (1)

app.py +180 -17

@@ -182,6 +182,10 @@ class AppParameters:
     s8bit_enable_midi_preprocessing: bool = True       # Master switch for this feature
     s8bit_high_pitch_threshold: int = 84               # Pitch (C6) above which velocity is scaled
     s8bit_high_pitch_velocity_scale: float = 0.8       # Velocity multiplier for high notes (e.g., 80%)
     s8bit_chord_density_threshold: int = 4             # Min number of notes to be considered a dense chord
     s8bit_chord_velocity_threshold: int = 100          # Min average velocity for a chord to be tamed
     s8bit_chord_velocity_scale: float = 0.75           # Velocity multiplier for loud, dense chords
@@ -202,6 +206,12 @@ class AppParameters:
     s8bit_delay_division: str = "Dotted 8th Note"
     s8bit_delay_feedback: float = 0.5                  # Velocity scale for each subsequent echo (50%)
     s8bit_delay_repeats: int = 3                       # Number of echoes to generate
 # =================================================================================================
 # === Helper Functions ===
@@ -341,32 +351,42 @@ def format_params_for_metadata(params: AppParameters, transcription_log: dict =
 def preprocess_midi_for_harshness(midi_data: pretty_midi.PrettyMIDI, params: AppParameters):
     """
     Analyzes and modifies a PrettyMIDI object in-place to reduce characteristics
-    that can cause harshness in simple synthesizers.
     Args:
         midi_data: The PrettyMIDI object to process.
         params: The AppParameters object containing the control thresholds.
     """
-    print("Running MIDI pre-processing to reduce harshness...")
-    notes_modified = 0
     chords_tamed = 0
-    # Rule 1: High-Pitch Attenuation
     for instrument in midi_data.instruments:
         for note in instrument.notes:
             if note.pitch > params.s8bit_high_pitch_threshold:
-                original_velocity = note.velocity
                 note.velocity = int(note.velocity * params.s8bit_high_pitch_velocity_scale)
                 if note.velocity < 1: note.velocity = 1
-                notes_modified += 1
-    if notes_modified > 0:
-        print(f"  - Tamed {notes_modified} individual high-pitched notes.")
-    # Rule 2: Chord Compression
     # This is a simplified approach: group notes by near-simultaneous start times
     all_notes = sorted([note for instrument in midi_data.instruments for note in instrument.notes], key=lambda x: x.start)
     time_window = 0.02  # 20ms window to group notes into a chord
     i = 0
     while i < len(all_notes):
@@ -660,12 +680,21 @@ def create_delay_effect(midi_data: pretty_midi.PrettyMIDI, params: AppParameters
         print("  - No notes found to apply delay to. Skipping.")
         return processed_midi
-    # --- Step 3: Generate echo notes using the calculated delay time ---
     echo_notes = []
     for i in range(1, params.s8bit_delay_repeats + 1):
         for original_note in notes_to_echo:
             # Create a copy of the note for the echo
             echo_note = copy.copy(original_note)
             # Use the tempo-synced time and velocity
             time_offset = i * delay_time_s
@@ -696,6 +725,42 @@ def create_delay_effect(midi_data: pretty_midi.PrettyMIDI, params: AppParameters
     return processed_midi
 def one_pole_lowpass(x, cutoff_hz, fs):
     """Simple one-pole lowpass filter (causal), stable and cheap."""
     if cutoff_hz <= 0 or cutoff_hz >= fs/2:
@@ -1832,18 +1897,83 @@ def Render_MIDI(*, input_midi_path: str, params: AppParameters, progress: gr.Pro
                 arpeggiated_midi = arpeggiate_midi(base_midi, params)
             # --- Step 2: Render the main (original) layer ---
-            print("  - Rendering main synthesis layer...")
             # Synthesize the waveform, passing new FX parameters to the synthesis function
-            main_waveform = synthesize_8bit_style(
                 midi_data=base_midi,
                 fs=srate,
                 params=params,
                 progress=progress
             )
-            final_waveform = main_waveform
-            # --- Step 3: Render the arpeggiator layer (if enabled) ---
             if arpeggiated_midi and arpeggiated_midi.instruments:
                 print("  - Rendering and mixing arpeggiator layer...")
                 # Temporarily override panning for the arpeggiator synth call
@@ -3764,6 +3894,28 @@ if __name__ == "__main__":
                                         label="Number of Repeats",
                                         info="The total number of echoes to generate for each note."
                                     )
                         # --- Section 2: MIDI Pre-processing (Corrective Tool) ---
                         with gr.Accordion("MIDI Pre-processing (Corrective Tool)", open=False):
@@ -3783,6 +3935,17 @@ if __name__ == "__main__":
                                     label="High Pitch Velocity Scale",
                                     info="Multiplier for high notes' velocity (e.g., 0.8 = 80% of original velocity)."
                                 )
                                 s8bit_chord_density_threshold = gr.Slider(
                                     2, 10, value=4, step=1,
                                     label="Chord Density Threshold",

     s8bit_enable_midi_preprocessing: bool = True       # Master switch for this feature
     s8bit_high_pitch_threshold: int = 84               # Pitch (C6) above which velocity is scaled
     s8bit_high_pitch_velocity_scale: float = 0.8       # Velocity multiplier for high notes (e.g., 80%)
+    # --- Low-pitch management parameters ---
+    s8bit_low_pitch_threshold: int = 36                # Low pitch threshold (C2)
+    s8bit_low_pitch_velocity_scale: float = 0.9        # Low pitch velocity scale
     s8bit_chord_density_threshold: int = 4             # Min number of notes to be considered a dense chord
     s8bit_chord_velocity_threshold: int = 100          # Min average velocity for a chord to be tamed
     s8bit_chord_velocity_scale: float = 0.75           # Velocity multiplier for loud, dense chords
     s8bit_delay_division: str = "Dotted 8th Note"
     s8bit_delay_feedback: float = 0.5                  # Velocity scale for each subsequent echo (50%)
     s8bit_delay_repeats: int = 3                       # Number of echoes to generate
+    # --- NEW: Low-End Management for Delay ---
+    s8bit_delay_highpass_cutoff_hz: int = 100          # High-pass filter frequency for delay echoes (removes low-end rumble from echoes)
+    s8bit_delay_bass_pitch_shift: int = 0              # Pitch shift (in semitones) applied to low notes in delay echoes
+    # --- High-End Management for Delay ---
+    s8bit_delay_lowpass_cutoff_hz: int = 5000          # Lowpass filter frequency for delay echoes (removes harsh high frequencies from echoes)
+    s8bit_delay_treble_pitch_shift: int = 0            # Pitch shift (in semitones) applied to high notes in delay echoes
 # =================================================================================================
 # === Helper Functions ===
 def preprocess_midi_for_harshness(midi_data: pretty_midi.PrettyMIDI, params: AppParameters):
     """
     Analyzes and modifies a PrettyMIDI object in-place to reduce characteristics
+    that can cause harshness or muddiness in simple synthesizers.
+    Now includes both high and low pitch attenuation.
     Args:
         midi_data: The PrettyMIDI object to process.
         params: The AppParameters object containing the control thresholds.
     """
+    print("Running MIDI pre-processing to reduce harshness and muddiness...")
+    high_notes_tamed = 0
+    low_notes_tamed = 0
     chords_tamed = 0
+    # Rule 1 & 2: High and Low Pitch Attenuation
     for instrument in midi_data.instruments:
         for note in instrument.notes:
+            # Tame very high notes to reduce harshness/aliasing
             if note.pitch > params.s8bit_high_pitch_threshold:
                 note.velocity = int(note.velocity * params.s8bit_high_pitch_velocity_scale)
                 if note.velocity < 1: note.velocity = 1
+                high_notes_tamed += 1
+            # Tame very low notes to reduce muddiness/rumble
+            if note.pitch < params.s8bit_low_pitch_threshold:
+                note.velocity = int(note.velocity * params.s8bit_low_pitch_velocity_scale)
+                if note.velocity < 1: note.velocity = 1
+                low_notes_tamed += 1
+    if high_notes_tamed > 0:
+        print(f"  - Tamed {high_notes_tamed} individual high-pitched notes.")
+    if low_notes_tamed > 0:
+        print(f"  - Tamed {low_notes_tamed} individual low-pitched notes.")
+    # Rule 3: Chord Compression
     # This is a simplified approach: group notes by near-simultaneous start times
     all_notes = sorted([note for instrument in midi_data.instruments for note in instrument.notes], key=lambda x: x.start)
     time_window = 0.02  # 20ms window to group notes into a chord
     i = 0
     while i < len(all_notes):
         print("  - No notes found to apply delay to. Skipping.")
         return processed_midi
+    # --- Step 3: Generate echo notes with optional pitch shift using the calculated delay time ---
     echo_notes = []
+    bass_note_threshold = 48 # MIDI note for C3
+    treble_note_threshold = 84 # MIDI note for C6
     for i in range(1, params.s8bit_delay_repeats + 1):
         for original_note in notes_to_echo:
             # Create a copy of the note for the echo
             echo_note = copy.copy(original_note)
+            # --- Pitch shift (in semitones) for echoes of bass and treble notes ---
+            if params.s8bit_delay_bass_pitch_shift and original_note.pitch < bass_note_threshold:
+                echo_note.pitch += params.s8bit_delay_bass_pitch_shift
+            elif params.s8bit_delay_treble_pitch_shift and original_note.pitch > treble_note_threshold:
+                echo_note.pitch += params.s8bit_delay_treble_pitch_shift
             # Use the tempo-synced time and velocity
             time_offset = i * delay_time_s
     return processed_midi
+def butter_highpass(cutoff, fs, order=5):
+    nyq = 0.5 * fs
+    normal_cutoff = cutoff / nyq
+    b, a = signal.butter(order, normal_cutoff, btype='high', analog=False)
+    return b, a
+def apply_butter_highpass_filter(data, cutoff, fs, order=5):
+    """Applies a Butterworth highpass filter to a stereo audio signal."""
+    if cutoff <= 0:
+        return data
+    b, a = butter_highpass(cutoff, fs, order=order)
+    # Apply filter to each channel independently
+    filtered_data = np.zeros_like(data)
+    for channel in range(data.shape[1]):
+        filtered_data[:, channel] = signal.lfilter(b, a, data[:, channel])
+    return filtered_data
+def butter_lowpass(cutoff, fs, order=5):
+    nyq = 0.5 * fs
+    normal_cutoff = cutoff / nyq
+    b, a = signal.butter(order, normal_cutoff, btype='low', analog=False)
+    return b, a
+def apply_butter_lowpass_filter(data, cutoff, fs, order=5):
+    """Applies a Butterworth lowpass filter to a stereo audio signal."""
+    # A cutoff at or above Nyquist frequency is pointless
+    if cutoff >= fs / 2:
+        return data
+    b, a = butter_lowpass(cutoff, fs, order=order)
+    filtered_data = np.zeros_like(data)
+    for channel in range(data.shape[1]):
+        filtered_data[:, channel] = signal.lfilter(b, a, data[:, channel])
+    return filtered_data
 def one_pole_lowpass(x, cutoff_hz, fs):
     """Simple one-pole lowpass filter (causal), stable and cheap."""
     if cutoff_hz <= 0 or cutoff_hz >= fs/2:
                 arpeggiated_midi = arpeggiate_midi(base_midi, params)
             # --- Step 2: Render the main (original) layer ---
+            print("  - Rendering main synthesis layer (including echoes)...")
             # Synthesize the waveform, passing new FX parameters to the synthesis function
+            main_and_echo_waveform = synthesize_8bit_style(
                 midi_data=base_midi,
                 fs=srate,
                 params=params,
                 progress=progress
             )
+            # --- Isolate and filter the echo part if it exists ---
+            echo_instrument = None
+            for inst in base_midi.instruments:
+                if inst.name == "Echo Layer":
+                    echo_instrument = inst
+                    break
+            # --- Step 3: Render the delay layers (if enabled) ---
+            if echo_instrument:
+                print("  - Processing echo layer audio effects...")
+                # Create a temporary MIDI object with ONLY the echo instrument
+                echo_only_midi = pretty_midi.PrettyMIDI()
+                echo_only_midi.instruments.append(echo_instrument)
+                # Render ONLY the echo layer to an audio waveform
+                echo_waveform_raw = synthesize_8bit_style(midi_data=echo_only_midi, fs=srate, params=params)
+                # --- Start of the Robust Filtering Block ---
+                # Apply both High-Pass and Low-Pass filters
+                unfiltered_echo = echo_waveform_raw
+                filtered_echo = echo_waveform_raw
+                # --- Apply filters if requested ---
+                # The filter helpers expect (samples, channels), so work on a transposed view
+                # and transpose back once filtering is done.
+                temp_filtered_echo = echo_waveform_raw.T
+                should_filter = False
+                # Apply High-Pass Filter
+                if params.s8bit_delay_highpass_cutoff_hz > 0:
+                    print(f"    - Applying high-pass filter at {params.s8bit_delay_highpass_cutoff_hz} Hz...")
+                    temp_filtered_echo = apply_butter_highpass_filter(temp_filtered_echo, params.s8bit_delay_highpass_cutoff_hz, srate)
+                    should_filter = True
+                # Apply Low-Pass Filter
+                if params.s8bit_delay_lowpass_cutoff_hz < srate / 2:
+                    print(f"    - Applying low-pass filter at {params.s8bit_delay_lowpass_cutoff_hz} Hz...")
+                    temp_filtered_echo = apply_butter_lowpass_filter(temp_filtered_echo, params.s8bit_delay_lowpass_cutoff_hz, srate)
+                    should_filter = True
+                # Convert back and get the difference
+                if should_filter:
+                    filtered_echo = temp_filtered_echo.T
+                # To avoid re-rendering the main layer, subtract the unfiltered echo and add the filtered one.
+                # --- Ensure all waveforms have the same length before the math ---
+                target_length = main_and_echo_waveform.shape[1]
+                # Pad the unfiltered echo if it's shorter
+                len_unfiltered = unfiltered_echo.shape[1]
+                if len_unfiltered < target_length:
+                    unfiltered_echo = np.pad(unfiltered_echo, ((0, 0), (0, target_length - len_unfiltered)))
+                # Pad the filtered echo if it's shorter
+                len_filtered = filtered_echo.shape[1]
+                if len_filtered < target_length:
+                    filtered_echo = np.pad(filtered_echo, ((0, 0), (0, target_length - len_filtered)))
+                # Now that all shapes are guaranteed to be identical, perform the operation.
+                main_and_echo_waveform -= unfiltered_echo[:, :target_length]
+                main_and_echo_waveform += filtered_echo[:, :target_length]
+            final_waveform = main_and_echo_waveform
+            # --- Step 4: Render the arpeggiator layer (if enabled) ---
             if arpeggiated_midi and arpeggiated_midi.instruments:
                 print("  - Rendering and mixing arpeggiator layer...")
                 # Temporarily override panning for the arpeggiator synth call
                                         label="Number of Repeats",
                                         info="The total number of echoes to generate for each note."
                                     )
+                                    # --- UI controls for low-end management ---
+                                    s8bit_delay_highpass_cutoff_hz = gr.Slider(
+                                        0, 500, value=100, step=10,
+                                        label="Echo High-Pass Filter (Hz)",
+                                        info="Filters out low frequencies from the echoes to prevent muddiness. Set to 0 to disable. 80-120Hz is a good range to clean up bass."
+                                    )
+                                    s8bit_delay_bass_pitch_shift = gr.Slider(
+                                        -12, 24, value=12, step=1,
+                                        label="Echo Pitch Shift for Low Notes (Semitones)",
+                                        info="Shifts the pitch of echoes for very low notes (below C3). +12 is one octave up, +7 is a perfect fifth. 0 to disable."
+                                    )
+                                    # --- UI controls for high-end management ---
+                                    s8bit_delay_lowpass_cutoff_hz = gr.Slider(
+                                        1000, 20000, value=5000, step=500,
+                                        label="Echo Low-Pass Filter (Hz)",
+                                        info="Filters out high frequencies from the echoes to reduce harshness. Set to 20000 to disable. 4k-8kHz is a good range to make echoes sound 'darker'."
+                                    )
+                                    s8bit_delay_treble_pitch_shift = gr.Slider(
+                                        -24, 12, value=-12, step=1,
+                                        label="Echo Pitch Shift for High Notes (Semitones)",
+                                        info="Shifts the pitch of echoes for very high notes (above C6). -12 is one octave down. 0 to disable."
+                                    )
                         # --- Section 2: MIDI Pre-processing (Corrective Tool) ---
                         with gr.Accordion("MIDI Pre-processing (Corrective Tool)", open=False):
                                     label="High Pitch Velocity Scale",
                                     info="Multiplier for high notes' velocity (e.g., 0.8 = 80% of original velocity)."
                                 )
+                                # --- UI controls for low-pitch management ---
+                                s8bit_low_pitch_threshold = gr.Slider(
+                                    21, 60, value=36, step=1,
+                                    label="Low Pitch Threshold (MIDI Note)",
+                                    info="Notes below this pitch will have their velocity reduced to prevent muddiness. 36 = C2."
+                                )
+                                s8bit_low_pitch_velocity_scale = gr.Slider(
+                                    0.1, 1.0, value=0.9, step=0.05,
+                                    label="Low Pitch Velocity Scale",
+                                    info="Multiplier for low notes' velocity. Use this to gently tame excessive sub-bass."
+                                )
                                 s8bit_chord_density_threshold = gr.Slider(
                                     2, 10, value=4, step=1,
                                     label="Chord Density Threshold",