Li : Dev Actionscript Development Blog

8 Aug 2009

WarpPlayer

I've recently become involved in a very interesting audio project with Jason McGuire. Jason is an extremely talented guitarist based in San Francisco. He hosts an online learning center, www.flamenco-lessons.com, which I've been a part of for the last few months. The site offers registered users hundreds of videos, scores and detailed information about Flamenco guitar playing, and it is a truly impressive learning resource. I've had several guitar teachers in my life, but few have taught me as well as Jason is doing through this site... for me, another clear indication of the power of e-education.

So Jason asked me if I could think of ways to improve his site, and that's how all this audio research started. The first thing that came to my mind was the ability to slow down the playback of his videos while preserving the audio's frequency characteristics as much as possible, i.e. keeping the audio's pitch. Slowing down the videos is great for understanding what's going on in the playing, and even for playing along with them, but if the pitch changes in the process, playing over the video sounds simply awful and becomes impossible... Hence, audio time-stretching would be ideal here. We are planning to produce a pretty advanced online application for the site with this feature, amongst many others, that will hopefully enhance the music learning experience.

What is relevant to the Flash side of things is that I am producing a media player engine which I'm calling 'WarpPlayer'. This engine should hold the core functionality of the application we are building with Jason. The player should be able to handle multimedia in an advanced manner: use dynamic audio, mix several audio tracks, time-stretch audio, and sync the playback to multiple FLVs or F4Vs in Flash. An application based on WarpPlayer could hence play video in slow motion and time-stretch its audio, much like QuickTime does with its A/V controls.

Here is a demo of the first capabilities of WarpPlayer: DEMO.

The player currently needs to pre-load all its assets (multiple sound tracks and multiple videos) before it can start playing, so please be patient. My next goal is to have it start playback as soon as a small portion of its assets has been loaded.

Try playing with the small audio mixer on the left, switching video views, or using the tempo controller to the right of the player.

Anyone interested in WarpPlayer becoming open-source, please leave a comment below!

Filed under: Audio
4 Aug 2009

Pitch shifting and time stretching in AS3

Audio time stretching is the process of slowing down or speeding up a piece of audio without altering its pitch. Its complement is pitch shifting, which is the process of altering the pitch of a piece of audio without altering its tempo.
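To see why the two have to be separated deliberately, here is a minimal TypeScript sketch (not AS3, and not taken from any library mentioned in this post) of the naive approach: simply reading the samples faster. The helper names `resample` and `countCycles` are made up for illustration. Playing a one-second tone at double speed halves its length while keeping the same number of cycles, so the pitch doubles along with the tempo.

```typescript
// Naive speed change by linear-interpolation resampling (illustrative only).
// Reading the input `speed` times faster shortens the clip AND raises its pitch.
function resample(input: Float32Array, speed: number): Float32Array {
  const outLength = Math.floor(input.length / speed);
  const output = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = i * speed; // fractional read position in the input
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    output[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return output;
}

// Counts upward zero crossings, a rough stand-in for "cycles in the clip".
function countCycles(signal: Float32Array): number {
  let cycles = 0;
  for (let i = 1; i < signal.length; i++) {
    if (signal[i - 1] < 0 && signal[i] >= 0) cycles++;
  }
  return cycles;
}

const SAMPLE_RATE = 44100;

// One second of a 441 Hz sine wave (the 0.5 phase offset keeps the zero
// crossings away from exact sample positions).
const tone = new Float32Array(SAMPLE_RATE);
for (let n = 0; n < tone.length; n++) {
  tone[n] = Math.sin((2 * Math.PI * 441 * n) / SAMPLE_RATE + 0.5);
}

const fast = resample(tone, 2);

// Same number of cycles squeezed into half the time means double the frequency.
const toneHz = countCycles(tone) / (tone.length / SAMPLE_RATE);
const fastHz = countCycles(fast) / (fast.length / SAMPLE_RATE);
console.log(tone.length, fast.length, toneHz, fastHz);
```

Time stretching aims to change the length without changing the cycle rate, and pitch shifting aims to change the cycle rate without changing the length.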

These transformations are not as straightforward as they seem. This is primarily because any transformation applied to an audio signal in the time domain usually affects its frequency domain, and vice versa. I've been researching this a lot lately and I must admit I bumped into a pretty big and mean realm. As soon as you try to preserve audio quality, very complex algorithms (made by some very smart people) start showing up. Different DSP (digital signal processing) techniques exist for this purpose, such as the Phase Vocoder, TDHS, WSOLA, PSOLA, etc... You can read more about the topic on Wikipedia.

So, what I did was grab Ryan Berdeen's AS3 port of the C++ SoundTouch library (soundtouchAS3) and modify it to act as a StretchFilter in Joe Berkovitz's StandingWave2 (standingwave) dynamic audio library for Flash 10. After adding this filter to standingwave, it was just a question of combining it with the lib's ResamplingFilter, and voilà, pitch shifting.
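The arithmetic behind that combination can be sketched as follows. This is TypeScript rather than AS3, and `planPitchShift` and `finalDuration` are hypothetical helpers, not part of StandingWave or soundtouchAS3. The assumption is that the stretch step lengthens the audio by some ratio without touching the pitch, and that the resampling step then plays it back faster by the same ratio, which restores the original length and raises the pitch.

```typescript
// Hypothetical helper types and functions; they only illustrate the arithmetic.
interface PitchShiftPlan {
  stretchFactor: number; // how much longer the stretch step makes the audio
  resampleSpeed: number; // how much faster the resample step then reads it
}

function planPitchShift(semitones: number): PitchShiftPlan {
  // Equal-tempered frequency ratio: 12 semitones doubles the frequency.
  const ratio = Math.pow(2, semitones / 12);
  return { stretchFactor: ratio, resampleSpeed: ratio };
}

function finalDuration(seconds: number, plan: PitchShiftPlan): number {
  // Stretching multiplies the duration, resampling divides it again.
  return (seconds * plan.stretchFactor) / plan.resampleSpeed;
}

const octaveUp = planPitchShift(12);
const fifthUp = planPitchShift(7);
console.log(octaveUp.stretchFactor, fifthUp.stretchFactor, finalDuration(10, fifthUp));
```

Shifting down works the same way with a negative number of semitones, which gives a ratio below 1.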

DEMO: here.

****WARNING****: This is quite CPU intensive (the implementation is far from optimized), so be prepared to turn the volume down! Also, please be patient with the sound loading, and sorry for the crappy interface! This is just an experiment. Avoid touching the controls before a sound is loaded (:P). When the sound loads, try altering the pitch or the tempo, one at a time.

Clearly, my adaptation of the time stretching algorithm needs to be optimized, and perhaps even ported back to C/C++ (for use with Alchemy)! But the main purpose of this experiment is just to show that time stretching and pitch shifting in AS3 are possible... which is pretty good news for us Flash geeks, right?

This should have a few uses. I can think of a couple: dynamically altering the tempo or pitch of media the way QuickTime Player does with its A/V controls (which would be pretty good for tutorials, especially music tutorials), warping audio sources in some sort of online audio editor, etc.

Filed under: Audio
23 Jul 2009

Understanding dynamic sound in Flash

I made a little test app to help me understand how dynamic sound in Flash works. I had done a few of the basic tutorials/experiments and I kind of got the idea, but quite a lot of mystery still remained in the topic... After doing this experiment, though, I feel like I have a much more thorough understanding of the way dynamic sound works in Flash, and so I decided to share it, 'cause I would have loved to find something like this on someone else's blog. Besides, I feel like the more people are familiar with this the better, because this feature of FP10 is simply awesome!

So, how does dynamic sound work? The new method in Sound called "extract" can read audio data from any position of a Sound object. It extracts the data into a ByteArray, that is, into a bunch of 0s and 1s. But the hardest thing for me to understand was: What is the nature of this data? How does it describe a sound?

Well, byte arrays are highly efficient arrays of raw binary data, that is, of bits. These 0s and 1s are packed in groups of 8, each group being a byte. So byteArray.position = 256 would not place the array pointer at bit 256; it would place it at byte 256, that is, at bit 2048. Now, the extract method populates the byte array with a bunch of bytes, but in order to make sense of them, we must view them as groups of 8 bytes (64 bits). Grouped in this way, each group is what we call a sample in audio...

Real life sound is nothing more than oscillations or vibrations that travel through air. The amplitude of these oscillations corresponds to volume, and their frequency (how quickly the cycle repeats) corresponds to what we interpret as pitch. The way digital sound works is to "sample" the instantaneous amplitude of these vibrations at a very fast rate (say 44100Hz, that is, 44100 samples each second). Such amplitude readings can be stored in 32-bit "float" values. Flash works at 44100Hz, stereo, so the first 4 bytes of each sample correspond to an instantaneous value in the left channel and the next 4 to the amplitude of the right channel: 32 bits per channel. So if we use the extract method and reset the byte array's position to 0, we can call the readFloat method of ByteArray in pairs, and hence extract the amplitude values of the entire sound, one by one. Doing this, it's crucial to keep in mind that each time you read a part of the byte array, the position marker is shifted to the end of what you've just read. It's just the way byte arrays work, unlike regular arrays.
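The layout described above can be sketched outside Flash. The following is TypeScript, not AS3: a `DataView` over an `ArrayBuffer` stands in for the ByteArray, `getFloat32` stands in for readFloat, and the `position` variable is advanced by hand to mimic the ByteArray's moving position marker. The function name `readStereoSamples` is hypothetical, and the big-endian byte order used here (the `DataView` default) is an assumption about the data, not something this post verifies.

```typescript
const BYTES_PER_FLOAT = 4;
const BYTES_PER_STEREO_SAMPLE = 2 * BYTES_PER_FLOAT; // left + right = 8 bytes

interface StereoSample {
  left: number;
  right: number;
}

// Reads interleaved 32-bit floats: left, right, left, right, ...
function readStereoSamples(bytes: ArrayBuffer): StereoSample[] {
  const view = new DataView(bytes);
  const samples: StereoSample[] = [];
  let position = 0; // plays the role of the byte array's position marker
  while (position + BYTES_PER_STEREO_SAMPLE <= view.byteLength) {
    const left = view.getFloat32(position); // big-endian unless told otherwise
    position += BYTES_PER_FLOAT; // each read moves the marker forward
    const right = view.getFloat32(position);
    position += BYTES_PER_FLOAT;
    samples.push({ left, right });
  }
  return samples;
}

// Build a tiny two-sample buffer by hand, then decode it again.
const buffer = new ArrayBuffer(2 * BYTES_PER_STEREO_SAMPLE);
const writer = new DataView(buffer);
writer.setFloat32(0, 0.5);    // sample 0, left
writer.setFloat32(4, -0.5);   // sample 0, right
writer.setFloat32(8, 0.25);   // sample 1, left
writer.setFloat32(12, -0.25); // sample 1, right

const decoded = readStereoSamples(buffer);
console.log(decoded);
```

At 8 bytes per stereo sample, one second of audio at 44100Hz takes 352800 bytes.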

In this manner, we can read large chunks of samples and give them to the data object in the SampleDataEvent. Doing so, we are delivering a chunk of audio to the computer's audio card so it can make some noise. The beauty in this is that we can read and understand the sound data, and before giving it to the sound card, we can process the audio with complex filters and DSP analysis, use the information for visual display, etc... This is why the audio is handled in chunks: so that we can process the sound in real time.

It's important to note that when we extract a chunk from the Sound object, we don't know whether we are going to get the number of samples we asked for (imagine that we are arriving at the end of a sound), so the extract method returns a value indicating how many samples it was able to extract.
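The chunked reading described above, including the short final chunk, can be sketched as a simple loop. This is TypeScript rather than AS3, mono instead of stereo for brevity, and `makeSource` is a made-up stand-in for a sound object whose `extract` returns how many samples it actually delivered. It is not the Flash Sound class.

```typescript
const SAMPLE_RATE = 44100;

interface SampleSource {
  // Copies up to `count` samples into `target`; returns how many were copied.
  extract(target: Float32Array, count: number): number;
}

// Hypothetical in-memory source, used only to exercise the loop below.
function makeSource(samples: Float32Array): SampleSource {
  let position = 0;
  return {
    extract(target: Float32Array, count: number): number {
      const available = Math.min(count, samples.length - position);
      target.set(samples.subarray(position, position + available));
      position += available;
      return available;
    },
  };
}

// Pulls chunks of `bufferSize` samples until the source runs dry.
function playInChunks(
  source: SampleSource,
  bufferSize: number,
  process: (chunk: Float32Array) => void
): number[] {
  const chunkSizes: number[] = [];
  const buffer = new Float32Array(bufferSize);
  while (true) {
    const got = source.extract(buffer, bufferSize);
    if (got === 0) break; // nothing left at all
    process(buffer.subarray(0, got)); // only the valid part of the buffer
    chunkSizes.push(got);
    if (got < bufferSize) break; // short chunk: we reached the end
  }
  return chunkSizes;
}

function chunkDurationMs(bufferSize: number): number {
  return (bufferSize / SAMPLE_RATE) * 1000;
}

let processed = 0;
const sizes = playInChunks(makeSource(new Float32Array(10000)), 4096, (chunk) => {
  processed += chunk.length; // a filter or visualizer would work on `chunk` here
});
console.log(sizes, processed, chunkDurationMs(4096));
```

With 10000 samples and a buffer of 4096, the last chunk holds only 1808 samples, which is the case the return value exists to report. At 44100Hz a 4096-sample chunk covers roughly 93 milliseconds of audio.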

So enough talking... see it yourself in the demo by clicking on the image above (view source enabled). In this experiment, I am not doing anything to the original sounds, just using the info extracted each cycle to visualize what's going on. The boxes plot the amplitude of the samples against time, and from the pixelated look of them, you can easily grasp how many samples these MP3s have... A lot of them. The upper box shows the waveform of the entire sound clip (note that it is not a frequency spectrum), and the lower box shows the instantaneous waveform of the chunk that has just been extracted from the Sound object and delivered to the sound card. You can change the playback speed or drag the playback head to see the process in slow motion. You can also change the buffer size, which is nothing more than the size of the chunks (in samples) that are processed on each cycle. And you can change the sounds too, just because waveforms are beautiful... be patient with the loading though, my server is sloooooow...

Processing sounds can be quite CPU intensive, but luckily we have Pixel Bender and Alchemy to give us a little more juice in this area. I'd like to post more about this topic in the future; it's just too interesting. Perhaps a spectrum analyzer, pitch shifting, time scale modification, etc... The possibilities are endless. Just look at what guys like Andre Michelle are doing!

Filed under: Audio, Tutorials
10 Apr 2009

AS3 BeatMachine – Timer Inaccuracy

I've always thought about experimenting with Flash in the realm of audio. Lucky me, I finally got a chance! Before discovering the fun in programming (nerd!) I used to spend a lot of my time creating digital music... Music and maths have always been passions of mine. So, with this being said, you can understand how I could think of a lot of audio-related Flash web applications. I say "could" because after this experiment, I'm not sure how capable Flash is of doing such things.

The experiment consists of a very simple step sequencer, or drum machine. You can download the source if you want to have a look at it. It is a very simple API that can load tracks into a sequencer. You can then load a sound on each track and place notes at specific locations in the bar, so that when you press play, a rhythm starts playing. I planned on building a user interface for it, but I gave up on that given the bad results I got.

As you can hear in the demo, the player doesn't seem to be able to keep a good tempo. The reason for this is that the sequencer needs a clock to trigger beats at a given BPM (beats per minute), and the Flash player can't do this with enough accuracy. Both enterFrame-based timers and the Timer class itself are very imprecise when dealing with small intervals of time such as 60ms (around 120bpm). The timer fires at intervals like 70ms, 55ms, 87ms, 120ms, 293ms, etc. Even though the Clock class in the BeatMachine API tries to compensate for these failures, it is not enough to keep a precise tempo. I also made an effort not to use events, so that the garbage collector doesn't mess with the rhythm when it comes around, but again, the results are unsatisfactory.
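The general idea of compensating for a jittery timer can be sketched like this. It is TypeScript rather than AS3, and it is not the BeatMachine Clock class, whose source is not reproduced here. `stepIntervalMs` and `simulateClock` are hypothetical names. Instead of counting one step per timer tick, each tick compares the current time against an absolute schedule and fires every step that has come due. For reference, a beat at 120bpm lasts 500ms, so a step of 62.5ms corresponds to 8 steps per beat, which is an assumed resolution chosen to land near the 60ms mentioned above.

```typescript
// Length of one sequencer step in milliseconds.
function stepIntervalMs(bpm: number, stepsPerBeat: number): number {
  return 60000 / bpm / stepsPerBeat;
}

interface FiredStep {
  step: number;
  scheduledMs: number; // when the step should have sounded
  firedMs: number;     // when the timer actually let us trigger it
}

// Replays a list of measured timer delays and fires every step that is due
// at each tick, based on absolute time rather than on the tick count.
function simulateClock(tickDelaysMs: number[], intervalMs: number): FiredStep[] {
  const fired: FiredStep[] = [];
  let nowMs = 0;
  let nextStep = 0;
  for (const delay of tickDelaysMs) {
    nowMs += delay;
    while (nextStep * intervalMs <= nowMs) {
      fired.push({ step: nextStep, scheduledMs: nextStep * intervalMs, firedMs: nowMs });
      nextStep++;
    }
  }
  return fired;
}

const interval = stepIntervalMs(120, 8); // 62.5ms per step
const fired = simulateClock([70, 55, 87, 120, 293], interval);
const worstLateness = Math.max(...fired.map((s) => s.firedMs - s.scheduledMs));
console.log(interval, fired.length, worstLateness);
```

With the delays quoted above, the schedule stays correct over time (11 steps are triggered in 625ms instead of 5), but several steps are bunched into the same late tick, the worst of them 250ms behind schedule. Compensation of this kind fixes long-term drift, not the audible jitter of individual notes.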

One big TO-DO with this would be to test it in FP10. Apparently the updated player handles sound in a different thread, so that would be a big help. Anyway, this is still discouraging, because even if the sound routines were set aside so that the timers fire closer to when they are supposed to, heavy visual activity could also slow down the timers. I think the only solution would be to move the timers to a different thread too.

I hope I'm wrong, so if anybody would like to have a look at the source and post any suggestions, be my guest! If the problems could be overcome, it would be nice to enhance the API so that it supports more features, and perhaps even adapt it so it can easily be connected to a graphical UI.

Any ideas?

[UPDATE] See comments below for additional info on the subject.

Filed under: Audio (14 comments)