Live Voice Chat with Manipulated Audio over RTMFP
We are building an RTMFP voice chat application with Cumulus. Basic voice transmission works quite simply with NetStreams, but we have one big problem:
There seems to be no way to manipulate the microphone data that the NetStream sends, nor to manipulate the data that the NetStream receives before it is played.
However, this is exactly what we need. We don't want to transmit the plain recorded microphone audio; we want to manipulate it first, then send it and play it back. Or send it first, then manipulate it and play it back. But it seems that audio recording, Speex encoding, Speex decoding, and audio playback are all completely wrapped inside the NetStream class.
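To make the "manipulate first, then send" ordering concrete, here is a minimal sketch in Python (not ActionScript). It assumes the microphone delivers mono samples as floats in the range [-1, 1]. The gain effect, the big-endian 32-bit float wire format, and the names `apply_gain`, `pack_samples` and `unpack_samples` are all hypothetical stand-ins, not part of any Flash or Cumulus API.

```python
import struct

def apply_gain(samples, gain):
    """Hypothetical manipulation: scale each sample, then clip to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

def pack_samples(samples):
    """Serialize samples as big-endian 32-bit floats (assumed wire format)."""
    return struct.pack(">%df" % len(samples), *samples)

def unpack_samples(payload):
    """Inverse of pack_samples: 4 bytes per sample."""
    count = len(payload) // 4
    return list(struct.unpack(">%df" % count, payload))

# Sender side: capture -> manipulate -> serialize.
captured = [0.0, 0.25, -0.5, 0.75]
processed = apply_gain(captured, 2.0)
payload = pack_samples(processed)

# Receiver side: deserialize -> play (playback itself is out of scope here).
received = unpack_samples(payload)
```

For the other ordering, the same `apply_gain` step would simply run on the receiver after `unpack_samples` instead of on the sender.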
The only ways to achieve what we want (all of which drop NetStream entirely) are:
1. Send raw audio data. This does work, but there is a lot of data to send, and it will most likely not be fast enough outside of local LAN testing.
2. Take the audio data, convert it to Ogg/MP3 using existing Flash encoders, send it, then decode the Ogg/MP3 and play it. But that would mean encoding every sample packet received from the microphone, adding header data, and so on, so it probably wouldn't bring much benefit over the raw audio data.
2.1. This would be a really good approach if there were a Speex encoder/decoder for Flash. Ironically, there is none apart from the built-in one (used to encode/decode audio in NetStreams), which cannot be used explicitly. Yeah, thanks for not exposing it, Adobe...
3. Send the data to the Cumulus server, manipulate (and possibly convert) it there, and forward it to the recipient. This probably won't be much faster than 1. and also throws away the very advantage of RTMFP, P2P communication.
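As a rough back-of-the-envelope check on how much data option 1 involves, the arithmetic can be sketched as follows (in Python, for illustration only). The capture format of 44.1 kHz mono 32-bit float samples and the 32 kbit/s codec bitrate are assumptions chosen for the comparison, not figures measured in this application.

```python
def raw_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed audio bitrate in kbit/s, ignoring packet overhead."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0

# Assumed capture format: 44.1 kHz, mono, 32-bit float samples.
raw_float = raw_bitrate_kbps(44100, 32)   # 1411.2 kbit/s

# Downsampling to 16 kHz and 16-bit PCM helps, but is still uncompressed.
raw_pcm16 = raw_bitrate_kbps(16000, 16)   # 256.0 kbit/s

# Assumed bitrate for a speech codec, picked only for comparison.
ASSUMED_CODEC_KBPS = 32.0
ratio = raw_float / ASSUMED_CODEC_KBPS

print("raw float32 @ 44.1 kHz: %.1f kbit/s" % raw_float)
print("raw PCM16 @ 16 kHz:     %.1f kbit/s" % raw_pcm16)
print("raw / assumed codec:    %.1fx" % ratio)
```

Under these assumptions the raw stream is well over an order of magnitude larger than a compressed one, which is why raw transmission is unlikely to hold up outside a LAN.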
Is there any solution to this problem that will work better than the ones I have listed here, perhaps a way to actually manipulate the microphone data before it is passed to the NetStream?
To get something viable, the audio data must be converted to a compressed format; raw data is simply too much data. I think the second option is the better one ;-)
I've already developed an Ogg Vorbis decoder/encoder in Flash using Alchemy, and it always used less than 10% of the CPU! It is possible.
If you prefer the Speex format, I think that with a comparable effort you can achieve the same thing by compiling the Speex code with Alchemy.
If I can help you further, please contact me at cumulus.dev@gmail.com ;-)