A two-stage generative adversarial network that generates images of guitarists playing guitar from audio.
To be updated.
More information in this blog post.
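The two-stage pipeline (stage 1: audio to pose; stage 2: pose to image) can be sketched as follows. This is a minimal illustration with dummy generators and made-up dimensions, not the repository's actual models or API; the function names, feature sizes, and keypoint count are all assumptions.

```python
import numpy as np

# Illustrative dimensions (assumptions, not the repo's real values):
N_MELS = 128        # audio feature bins per frame
N_KEYPOINTS = 18    # pose keypoints, one (x, y) pair each
IMG_SHAPE = (64, 64, 3)

rng = np.random.default_rng(0)

def stage1_audio_to_pose(audio_features):
    """Stage 1 stand-in: map per-frame audio features to pose keypoints
    in [-1, 1]. In the real system this is the first GAN's generator."""
    w = rng.standard_normal((N_MELS, N_KEYPOINTS * 2))
    return np.tanh(audio_features @ w).reshape(-1, N_KEYPOINTS, 2)

def stage2_pose_to_image(pose):
    """Stage 2 stand-in: render an image conditioned on a pose skeleton.
    Here we just paint each keypoint as a white dot."""
    img = np.zeros(IMG_SHAPE)
    h, w, _ = IMG_SHAPE
    for x, y in pose:
        col = int((x + 1) / 2 * (w - 1))   # map [-1, 1] -> pixel column
        row = int((y + 1) / 2 * (h - 1))   # map [-1, 1] -> pixel row
        img[row, col] = 1.0
    return img

# Run the full pipeline on dummy audio frames.
audio = rng.standard_normal((10, N_MELS))          # 10 frames of features
poses = stage1_audio_to_pose(audio)                # -> (10, 18, 2)
frames = [stage2_pose_to_image(p) for p in poses]  # one image per frame
print(poses.shape, frames[0].shape)
```

At inference time the stage-1 output is fed frame by frame into stage 2, so the generated video stays synchronized with the input audio.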
- Case 1: Test on the same guitarist, 南澤大介.
- Video 1: To Be With You (acoustic guitar solo)
- Source YouTube video here.
- Video 2: 愛はかげろうのように (Love Is Like a Heat Haze; acoustic guitar solo)
- Source YouTube video here.
- Case 2: Test on a different guitarist, 伍々慧 (Satoshi Gogo), whose playing style and recording conditions differ from the training data.
- Video 3: Autumn Leaves (early version) / Satoshi Gogo
- Source YouTube video here.
- Video 4: I got rhythm / Satoshi Gogo
- Source YouTube video here.
- Case 3: Test on different instruments and the human voice.
- [TBU]
The following GIFs are results generated from audio that the model had never seen.
- Source video (audio): tupliのテーマ (Theme of tupli; acoustic guitar solo), composed/arranged by 南澤大介
- Top to bottom: audio visualization, stage-1 output, stage-2 output, ground truth.
The following GIFs show outputs of the stage-2 model given conditional pose inputs.
- Source video (audio): John Pizzarelli - "I Got Rhythm" (solo) at the Fretboard Journal
- Top: reference video; Middle: conditional hand-pose input; Bottom: stage-2 output.
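Driving stage 2 with externally supplied poses, as in the clips above, amounts to replacing the stage-1 output with keypoints extracted from a reference video. A minimal sketch of this motion-transfer setup follows; the pose extractor and generator are dummy stand-ins, not the repository's real components.

```python
import numpy as np

rng = np.random.default_rng(1)

def extract_pose(frame):
    """Hypothetical pose-estimator stand-in: returns 18 (x, y)
    keypoints in [-1, 1] for one reference video frame."""
    return np.tanh(rng.standard_normal((18, 2)))

def stage2_generator(pose):
    """Stand-in for the trained stage-2 generator: produces an
    image conditioned only on the pose, not on any audio."""
    return np.clip(pose.mean() + np.zeros((64, 64, 3)), 0.0, 1.0)

# Motion transfer: condition stage 2 on poses from a reference
# video, bypassing stage 1 (the audio-to-pose model) entirely.
reference_video = [np.zeros((64, 64, 3)) for _ in range(5)]
outputs = [stage2_generator(extract_pose(f)) for f in reference_video]
print(len(outputs), outputs[0].shape)
```

Because stage 2 only ever sees a pose, any pose source works: stage-1 predictions at test time, or hand keypoints lifted from another performer's video as shown here.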
- Audio to Body Dynamics
- Pose guided person image generation
- Deep Video Generation, Prediction and Completion of Human Action Sequences
- Deformable GANs for Pose-based Human Image Generation
- Skeleton-aided Articulated Motion Generation
- Assessment of Student Music Performances Using Deep Neural Networks
- Dance Dance Convolution