Interactive Digital Violin Tutor: Setup and Demonstration at NUS

Learn how to set up and use the Interactive Digital Violin Tutor (iDVT) in a home environment, featuring hardware, software, and calibration steps.

iDVT Demo Automatic Audio-Visual Violin Transcription - 22

Added on 09/06/2024

Speaker 1: Next, let me show you the iDVT setup in the home environment. In this part, we are going to introduce the system setup for the Interactive Digital Violin Tutor. We are going to do this in four steps. First, we show you the whole environment in the room; second, we introduce the whole hardware system; then the software system; and last, the video camera calibration. Okay. First, let me show you the room. This is a very ordinary office in the School of Computing, National University of Singapore. Even though we put some curtains on the wall, that is not really necessary. We can set up the system in any ordinary home environment. Okay. Next, the hardware system. The Interactive Digital Violin Tutor comprises, first, a laptop PC for the processing or recording of our violin playing; then a microphone to record the audio part; and two normal video cameras to record the hand motion and the finger motion. Okay. So, later I will act as a beginner violin learner, and my partner, Huang Huang, will show you the software part. Finally, we will show you the showcase of the iDVT in use.

Speaker 2: This is the interface of our system. The first window is for reference piece display. You can right-click in this window and play a reference piece using the pop-up menu. The second window is for student piece display. You can right-click in this window and conduct student piece recording, transcription, display, and playback here. The third window is for video processing display. You can right-click in this window and conduct video processing, namely fingering analysis and hand tracking. To use the system, the first thing to do is to calibrate the two cameras. Click Options, Input Source, Live Recording. Click here to start video calibration. The views of the two cameras are shown in this window. The calibration of the cameras is really flexible. The only requirement is that the camera capturing the fingers should capture a bird's-eye view of the violin from neck to bridge. The camera capturing the hand should capture the movement of the right hand. After calibration, right-click here and select start record audio-visual to begin recording. After live recording, we will have one audio file and two video files saved on the hard disk. Now we can do the transcription using these three files as inputs. For better demonstration, here we use audio and video files captured from professionals. Open student audio. Choose the recorded audio file. Now you have two choices, audio-only transcription and audio-video transcription. Let's click transcription, audio-only, and start with audio-only transcription first. Processing complete. The audio-only transcription is displayed in this window. We can open the reference piece to see the difference and find out whether the player played correctly. Now we can use video processing to improve the audio-only transcription. Choose transcribe audio-visual. Audio processing and video processing can run concurrently, but since we have already done audio processing, we just need to start video processing. Open finger track. Open hand track. Hit play to start processing. The finger tracking result and hand tracking result are shown in this window. Video processing complete. Wait a few seconds for audio-visual fusion. Fusion complete. Now the display has been updated. If you switch between the two transcription modes, you can see the difference. If you think the difference seems minute, let's listen. Play the reference piece. Then the audio-only result. And the audio-visual result. Hear the difference? This is the showcase of the iDVT in use. By introducing the visual information, that is, finger and hand motion, the iDVT can produce more accurate feedback for violin learners.
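
[Editor's illustration, not part of the spoken demo] Speaker 2 mentions opening the reference piece to see where the student's transcription differs. The following is a minimal Python sketch of that comparison step only, under the assumption that both pieces are available as lists of MIDI pitch numbers. The function name `compare_to_reference` and the note data are hypothetical; the demo does not describe how the iDVT itself performs this comparison, and this sketch swaps in a generic sequence alignment from Python's standard library.

```python
from difflib import SequenceMatcher


def compare_to_reference(reference, student):
    """Align two note sequences and return the spans where they differ.

    Both arguments are lists of MIDI pitch numbers (an assumed
    representation). Each result is (operation, reference_notes,
    student_notes), where operation is 'replace', 'delete', or 'insert'.
    """
    # autojunk=False disables difflib's heuristic that ignores very
    # frequent elements, which repeated notes could otherwise trigger.
    matcher = SequenceMatcher(a=reference, b=student, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, reference[i1:i2], student[j1:j2]))
    return differences


# Hypothetical data: the student plays B-flat (70) where the score has B (71).
reference_notes = [67, 69, 71, 72, 74]
student_notes = [67, 69, 70, 72, 74]

for operation, expected, played in compare_to_reference(reference_notes, student_notes):
    print(operation, "expected:", expected, "played:", played)
```

A comparison like this only looks at pitch order. Timing, bowing, and fingering, which the demo's video processing addresses, would need additional data per note.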

Speaker 1: Here are the references mentioned in this video demo. Thank you for your attention. Here is the iDVT demonstration, automatic violin transcription using audio-visual fusion.