Dr. Marc A. Kastner

Transformer-Based Audio Generation Conditioned by 2D Latent Maps: A Demonstration

Authors: Christian Limberg, Zhe Zhang, Marc A. Kastner

Abstract:

This paper presents a demonstration of an improved framework for audio sample generation using interactive 2D latent maps. Building upon the foundational work "Mapping the Audio Landscape for Innovative Music Sample Generation", we enhance the framework by introducing visualization techniques for exploring the 2D audio landscape through different audio features such as energy and bandwidth. Additionally, we train a t-SNE embedding over these features to create a more abstract visualization of the audio samples on the map. This demo also significantly improves usability and user interactivity, allowing for a more intuitive and efficient exploration of the generated audio samples. The demo, remotely accessible via https://limchr.github.io/gesam_demo/, showcases these improvements in real time, providing users with an enhanced interface for generating high-quality audio samples.
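
To make the two features named in the abstract concrete, here is a minimal sketch of how energy and bandwidth might be computed for a single audio sample. This is not the authors' implementation: the page does not say how the features are defined, so the sketch assumes RMS energy and a spectral bandwidth defined as the magnitude-weighted standard deviation of frequency around the spectral centroid. The function names, the synthetic sine inputs, and the sample rate are all hypothetical, and the t-SNE step is left out.

```python
import cmath
import math


def rms_energy(signal):
    """Root-mean-square energy of a list of samples (assumed definition)."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def spectral_bandwidth(signal, sample_rate):
    """Magnitude-weighted std. dev. of frequency around the centroid, in Hz.

    Uses a naive O(n^2) DFT so the sketch needs only the standard library.
    """
    n = len(signal)
    half = n // 2 + 1  # non-negative frequency bins only
    freqs = [k * sample_rate / n for k in range(half)]
    mags = []
    for k in range(half):
        acc = sum(
            signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return 0.0  # silent input: no spectrum to measure
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    variance = sum(((f - centroid) ** 2) * m for f, m in zip(freqs, mags)) / total
    return math.sqrt(variance)


def sine(freq, sample_rate, n, amplitude=1.0):
    """Synthetic test tone (hypothetical stand-in for a generated sample)."""
    return [
        amplitude * math.sin(2 * math.pi * freq * t / sample_rate)
        for t in range(n)
    ]


SAMPLE_RATE = 8000  # Hz, assumed for illustration
N = 256             # samples per test signal

pure_tone = sine(500, SAMPLE_RATE, N)
two_tones = [
    a + b
    for a, b in zip(
        sine(500, SAMPLE_RATE, N, 0.5), sine(3000, SAMPLE_RATE, N, 0.5)
    )
]

# One (energy, bandwidth) pair per sample; such values could color map points.
features = {
    "pure_tone": (rms_energy(pure_tone), spectral_bandwidth(pure_tone, SAMPLE_RATE)),
    "two_tones": (rms_energy(two_tones), spectral_bandwidth(two_tones, SAMPLE_RATE)),
}

for name, (energy, bandwidth) in features.items():
    print(f"{name}: energy={energy:.3f}, bandwidth={bandwidth:.1f} Hz")
```

A single tone concentrates its spectrum in one bin and so has a bandwidth near zero, while the two-tone mixture spreads it widely. That contrast is the kind of difference a feature-based coloring of the map could expose. In practice an audio library would replace the naive DFT.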

Type: 31st Intl. Conf. on MultiMedia Modeling (MMM2025)

Publication date: To be published in Jan 2025

Links: [ demo ]


If you have any questions or comments about this research, feel free to leave a comment or send me an email. I will get back to you promptly.