Dr. Marc A. Kastner

Transformer-Based Audio Generation Conditioned by 2D Latent Maps: A Demonstration

Authors: Christian Limberg, Zhe Zhang, Marc A. Kastner

Abstract:

This paper presents a demonstration of an improved framework for audio sample generation using interactive 2D latent maps. Building upon the foundational work "Mapping the Audio Landscape for Innovative Music Sample Generation", we enhance the framework by introducing visualization techniques for exploring the 2D audio landscape through different audio features such as energy and bandwidth. Additionally, we train a t-SNE embedding over these features to create a more abstract visualization of the audio samples on the map. This demo also significantly improves usability and user interactivity, allowing for a more intuitive and efficient exploration of the generated audio samples. The demo, remotely accessible via https://limchr.github.io/gesam_demo/, showcases these improvements in real time, providing users with an enhanced interface for generating high-quality audio samples.

Type: 31st Intl. Conf. on MultiMedia Modeling (MMM2025)

Publication date: To be published in Jan 2025

Links: [ demo ]

