# Introduction

The Sound blocks are undoubtedly very cool. But wait, you can only play specific sounds that you have added beforehand? That's pretty boring. What if you could play any sound, stored in something like a list?

This demo shows what may be possible on Scratch in terms of sound reproduction. The music in this project is generated solely from a list (named `freqs`). It produces a recognizable sound (although with a lot of artifacts).

# How it works

## The limitations

As discussed above, we can't play arbitrary custom sounds on Scratch, so instead we need to somehow recreate sounds from other sounds. You might have seen those MIDI players on Scratch that play music using Scratch's Music extension, but those usually aren't enough to recreate arbitrary sounds. This demo does something similar, but instead of instruments, it uses sine waves to reconstruct the sound.

A sine wave (or pure tone) is the simplest building block of sound. You may have heard one before: the beep used when speech is censored is a 1 kHz sine wave. It is as simple as it sounds, with each tone occupying just a single frequency. With such a simple sound, you may wonder how to convert normal sounds into sine waves.

## Fourier Transform

In the early 19th century (that's 18xx), this guy called Joseph Fourier found something that turns a function of time into a function of frequency (and phase, for those pedantic people). This way, we can break something like an audio signal down into its frequencies (or, in this context, its pitches). The Fourier transform (and its variants) is used everywhere: physics, audio processing, image compression, etc. For this demo, we will use a tool that converts audio into a spectrogram, which is a diagram of frequency content over time.

## The ARSS

The Analysis & Resynthesis Sound Spectrograph is a really ancient tool that generates spectrograms from audio. I use the ARSS since it allows frequency analysis on a log scale, while other tools like SciPy can't, at least not directly.

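To make "log scale" concrete: on a log scale the analysis bands are spaced by a constant ratio, whereas on a linear scale they are spaced by a constant difference. Here is a minimal Python sketch of that distinction. The function names and the example values (a 50 Hz start, 16 bands per octave, a 100 Hz linear step) are assumptions chosen for illustration and are not taken from the ARSS source.

```python
def log_spaced(start_hz, bands_per_octave, count):
    """Frequencies spaced by a constant ratio of 2 ** (1 / bands_per_octave)."""
    return [start_hz * 2 ** (i / bands_per_octave) for i in range(count)]


def linear_spaced(start_hz, step_hz, count):
    """Frequencies spaced by a constant difference of step_hz."""
    return [start_hz + i * step_hz for i in range(count)]


def neighbour_ratios(freqs):
    """Ratio between each frequency and the one before it."""
    return [b / a for a, b in zip(freqs, freqs[1:])]


log_freqs = log_spaced(50.0, 16, 5)
lin_freqs = linear_spaced(50.0, 100.0, 5)

# Every neighbouring pair on the log scale has the same ratio (about 1.0443),
# while the ratios on the linear scale keep changing (3.0, 1.667, 1.4, ...).
print([round(r, 4) for r in neighbour_ratios(log_freqs)])
print([round(r, 4) for r in neighbour_ratios(lin_freqs)])
```
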
The log scale is important because while the ARSS does return the amplitudes of the frequencies, it doesn't return the phases, which on a linear scale results in audio that is dreadful to listen to. Besides, I don't think Scratch can recreate the phases anyway. I conjecture that this doesn't happen on a log scale because there the ratios between neighboring frequencies are consistent, while a linear scale creates inconsistent frequency ratios, including some odd and dissonant ones like 23:24 and 56:33.

(Future Gilbert here: It turns out this type of transform is called the CQT (constant-Q transform). Look it up on Wikipedia to learn more.)

The cool thing about the ARSS is that it can also try to reconstruct audio from spectrograms. Remember, the ARSS doesn't output the phases of the transform, so it also generates audio from sine waves. And it does that... about as well as this demo does.

## Conversion

The ARSS outputs an image, which I turn into ASCII text using a custom Python script. I set the ARSS parameters to generate a spectrogram over the frequency range 50 Hz to 12800 Hz, with 16 bands per octave and 30 pixels per second. I also set the gamma to 1 (meaning that the amplitude of the song is linearly proportional to the brightness of the image) and the log base to 2. This resulted in an image with a height of 129 pixels (8 octaves × 16 bands, plus one). I trim the first column of pixels, since it's not really needed.

## Playback

To play back the sound, I use 128 clones to play the tones. While Scratch can't overlay the same sound on itself, it can overlay multiple different sounds, even on the same sprite. The clones adjust the volume of their tones according to the `current` variable, which represents the frequency content at the current time. (I'm glad audio volumes aren't global.) Currently this is driven by a timer, but with some tweaks it could seek to a specific point in time.

# Results

The synthesized sound contains a lot of artifacts.
This may be because the spectrogram has a low resolution: only about 128 tones. The gamma parameter may also play a role here, since it limits the dynamic range of the sound.

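The playback idea described above, one sine tone per frequency band with its volume taken from the spectrogram, can be sketched offline in Python. This is only an illustration under stated assumptions: the helper names, the 44100 Hz sample rate, and the single-band example column are hypothetical, and this is not how the Scratch project or the ARSS is implemented internally.

```python
import math


def band_frequencies(min_hz=50.0, max_hz=12800.0, bands_per_octave=16):
    """Log-spaced band frequencies from min_hz up to and including max_hz."""
    octaves = math.log2(max_hz / min_hz)
    count = round(octaves * bands_per_octave) + 1
    return [min_hz * 2 ** (i / bands_per_octave) for i in range(count)]


def synthesize_frame(amplitudes, freqs, start_sample, num_samples, sample_rate=44100):
    """Sum one sine wave per band, each scaled by that band's amplitude (0..1).

    Phase information is not available, so every tone simply starts at phase 0.
    """
    samples = []
    for n in range(start_sample, start_sample + num_samples):
        t = n / sample_rate
        value = sum(a * math.sin(2 * math.pi * f * t)
                    for a, f in zip(amplitudes, freqs) if a > 0)
        samples.append(value)
    return samples


freqs = band_frequencies()
print(len(freqs))  # 129 bands: 8 octaves * 16 bands per octave, plus one

# One hypothetical spectrogram column: only the 400 Hz band is lit.
column = [0.0] * len(freqs)
column[48] = 1.0  # 50 * 2 ** (48 / 16) = 400 Hz
frame = synthesize_frame(column, freqs, 0, 44100 // 30)  # 1/30 s of audio
print(len(frame), round(max(frame), 3))
```

With only 129 fixed frequencies and amplitudes that change 30 times per second, anything between two bands or faster than one frame cannot be represented, which fits the guess that low resolution contributes to the artifacts.
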
# Discussion

This demo is purely a proof of concept. It doesn't compress the spectrogram; it doesn't use the assets efficiently; it doesn't convert an array of samples into a spectrogram, etc. In fact, I had to save this thrice because I was being rate-limited by Scratch :/

All of these limitations may be addressed in a later revision.

## Other implementations

Looks like I'm not the first one to make this type of audio player :P

Kouzeru made a similar project with an eerily similar synthesis method here: https://scratch.mit.edu/projects/472178490/

I guess great minds think alike.

# FAQ

## The audio seems to click a lot.

That is a flaw in my implementation.

## The audio seems to "drop out" more over time. How can I fix this?

That seems to be a JavaScript limitation (probably too many audio objects loaded?). Refreshing the page eliminates it, though.

## I hear the tones in the project and for some reason they're noisy.

I've encoded the WAV files with an 8-bit sample size to reduce their size.

## The cymbal sounds really tinny.

That's a flaw in my implementation. But the ARSS also resynthesizes cymbals and noises incorrectly.

## What are the exact ARSS parameters you've chosen?

`-g 1 --log-base 2 -min 50 -max 12800 --bpo 16 --pps 30`

## Why do you start your lines with hashes?

Those are Markdown for headers. ~~also, stop calling them hashtags~~

# Credits

- The music is Four Beers Polka by Kevin MacLeod, which is licensed under the CC-BY 4.0 license: https://creativecommons.org/licenses/by/4.0/ (You wouldn't expect a credit that professional on Scratch, would you?)
- The ARSS (the program that makes the spectrogram)

If you use this project in your publication (for some reason), it would be nice if you cited my work :) The "thank you" message when you remix my project is enough for Scratch, but on other sites please consider a citation. (I was about to write this formally like a real researcher but I've decided not to)