Sound normalization: how to convert a float array to a byte array?
Hi everyone. I'm playing an audio file. I read it as a byte[], and then I need to normalize the audio by scaling the values to the range [-1, 1]. Then I want to write each float value back into the array as byte[i] and return a byte[] to the audio player.
I've tried this:
byte[] data = ar.ReadData();
byte[] temp = new byte[data.Length];
float biggest = 0;
for (int i = 0; i < data.Length; i++)
{
    if (data[i] > biggest)
    {
        biggest = data[i];
    }
}
This part of the code should, for example, store 0.43 into the byte[], if that's possible. I tried this, but it didn't work:
for (int i = 0; i < data.Length; i++)
{
    temp = BitConverter.GetBytes(data[i] * (1 / biggest));
}
In a comment, you said: "I am playing an audio file ... I read it as byte[] and then I need to normalize the sound by putting the values in the range [-1, 1], and then I need to put that byte[] back to the audio player."
I'm making a big guess here, but I suspect the data returned by ar.ReadData() is a byte array of two-channel, 16-bit, 44.1 kHz PCM data. (Side note: are you using the Alvas.Audio library?) If so, here's how to do what you want.
Background
First, a little background. A two-channel 16-bit PCM data stream looks like this:
byte    |  01 02  |  03 04  |  05 06  |  07 08  |  09 10  |  11 12  | ...
channel |  Left   |  Right  |  Left   |  Right  |  Left   |  Right  | ...
frame   |       First       |       Second      |       Third       | ...
sample  |  1st L  |  1st R  |  2nd L  |  2nd R  |  3rd L  |  3rd R  | ...
It is important to note several things here:
- Since the audio data is 16-bit, one sample from one channel is a short (2 bytes), not an int (4 bytes), with a value in the range -32768 to 32767.
- The data is stored in little-endian byte order. .NET's BitConverter uses whatever byte order the machine runs on, so relying on it is only safe if your architecture is also little-endian; the helpers below avoid that dependency.
- We do not need to split the data into one stream per channel, because we normalize both channels against a single maximum value taken from either channel.
- Converting a floating-point value back to an integer introduces quantization error, so you probably want to apply some kind of dithering (a whole topic in itself).
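To make that last point concrete, here is one hedged sketch of TPDF (triangular) dither applied while quantizing a floating-point sample to 16 bits. The class name, the ~1 LSB dither amplitude, and the use of System.Random are my own illustrative choices, not a fixed recipe:

```csharp
using System;

static class DitherExample
{
    static readonly Random rng = new Random();

    // Quantize one floating-point sample (already scaled to the 16-bit
    // range) to a short, adding roughly one LSB of triangular (TPDF)
    // dither so the quantization error is decorrelated from the signal.
    public static short QuantizeWithDither(float sample)
    {
        // The sum of two uniform [-0.5, 0.5) values has a triangular
        // distribution over (-1, 1).
        float dither = (float)(rng.NextDouble() - 0.5)
                     + (float)(rng.NextDouble() - 0.5);
        double dithered = sample + dither;

        // Clamp so rounding near the rails cannot overflow a short.
        if (dithered > 32767.0) dithered = 32767.0;
        if (dithered < -32768.0) dithered = -32768.0;

        return (short)Math.Round(dithered);
    }
}
```

Each call can round the same input up or down by one step; averaged over many samples that randomness is exactly what hides the quantization distortion.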
Helper functions
Before getting into the actual normalization, make things easier for yourself by writing a pair of helper functions to get a short out of a byte[] and vice versa:
short GetShortFromLittleEndianBytes(byte[] data, int startIndex)
{
    return (short)((data[startIndex + 1] << 8) | data[startIndex]);
}

byte[] GetLittleEndianBytesFromShort(short data)
{
    byte[] b = new byte[2];
    b[0] = (byte)data;
    b[1] = (byte)((data >> 8) & 0xFF);
    return b;
}
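A quick sanity check for the two helpers: encoding a short and decoding it again should give back the original value. The wrapper class and the test value here are mine, just so the snippet compiles on its own:

```csharp
using System;

static class HelperCheck
{
    // The two helpers from above, repeated so this snippet is self-contained.
    public static short GetShortFromLittleEndianBytes(byte[] data, int startIndex) =>
        (short)((data[startIndex + 1] << 8) | data[startIndex]);

    public static byte[] GetLittleEndianBytesFromShort(short data) =>
        new byte[] { (byte)data, (byte)((data >> 8) & 0xFF) };

    static void Main()
    {
        short original = -1234; // arbitrary test value
        byte[] bytes = GetLittleEndianBytesFromShort(original);
        short roundTripped = GetShortFromLittleEndianBytes(bytes, 0);
        Console.WriteLine(roundTripped == original); // True
    }
}
```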
Normalization
An important distinction needs to be made here: audio normalization is not the same as statistical normalization. Here we will perform peak normalization on our audio data: amplifying the whole signal by a constant factor so that its loudest sample sits at the upper limit. To peak-normalize the data, we first find the sample with the largest absolute value, divide the upper limit (32767 for 16-bit PCM) by that value to get a gain factor, and then multiply every sample by that gain.
So, to normalize our audio data, first scan it to find the loudest sample:
byte[] input = ar.ReadData(); // the function you used above
float biggest = 0F;
float sample;
for (int i = 0; i < input.Length; i += 2)
{
    sample = Math.Abs((float)GetShortFromLittleEndianBytes(input, i));
    if (sample > biggest) biggest = sample;
}
At this point, biggest contains the largest absolute sample value in our audio data. Now, to do the actual normalization, we divide 32767 by biggest to get the gain, then multiply each audio sample by that gain, raising the volume of every sample so that the loudest one peaks exactly at the limit.
float gain = 32767F / biggest;
float[] data = new float[input.Length / 2];
for (int i = 0; i < input.Length; i += 2)
{
    data[i / 2] = (float)GetShortFromLittleEndianBytes(input, i) * gain;
}
The last step is to convert the samples from floating-point values back to integers and store them as little-endian shorts.
byte[] output = new byte[input.Length];
for (int i = 0; i < output.Length; i += 2)
{
    byte[] tmp = GetLittleEndianBytesFromShort(Convert.ToInt16(data[i / 2]));
    output[i] = tmp[0];
    output[i + 1] = tmp[1];
}
And you're done! Now you can send the output byte array, containing normalized PCM data, to your audio player.
As a final note, keep in mind that this code is not the most efficient: you could combine several of these loops, you could probably use Buffer.BlockCopy() to copy the array, and you could modify the short-to-byte[] helper to take a target byte array as a parameter and write the value directly into it. I didn't do any of that, to make it easier to see what is going on.
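For what it's worth, here is one possible way the loops could be folded into a single function. Treat it as a sketch rather than a drop-in replacement: the class and method names are mine, it inlines the byte/short helpers, and it scales by a multiplicative gain of 32767 divided by the largest absolute sample, which is the standard form of peak normalization:

```csharp
using System;

static class PeakNormalizer
{
    // Peak-normalize a buffer of 16-bit little-endian PCM: returns a new
    // buffer whose loudest sample has been scaled up to the 16-bit limit.
    public static byte[] Normalize(byte[] input)
    {
        // Pass 1: find the largest absolute sample value.
        float biggest = 0f;
        for (int i = 0; i < input.Length; i += 2)
        {
            short s = (short)((input[i + 1] << 8) | input[i]);
            float magnitude = Math.Abs((float)s);
            if (magnitude > biggest) biggest = magnitude;
        }
        if (biggest == 0f) return (byte[])input.Clone(); // silence: nothing to do

        // Pass 2: scale every sample by the gain and re-encode it.
        float gain = 32767f / biggest;
        byte[] output = new byte[input.Length];
        for (int i = 0; i < input.Length; i += 2)
        {
            short s = (short)((input[i + 1] << 8) | input[i]);
            short scaled = Convert.ToInt16(s * gain);
            output[i] = (byte)scaled;
            output[i + 1] = (byte)((scaled >> 8) & 0xFF);
        }
        return output;
    }

    static void Main()
    {
        byte[] quiet = { 0x00, 0x10 }; // one sample: +4096
        byte[] loud = Normalize(quiet);
        Console.WriteLine((short)((loud[1] << 8) | loud[0])); // 32767
    }
}
```

It still makes two passes over the data, because the gain cannot be known until every sample has been examined.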
And as I mentioned earlier, you should absolutely read up on dithering, as it will noticeably improve the quality of your audio output.
I was working on an audio project myself, so I figured all of this out through trial and error; I hope it helps someone.
You can use Buffer.BlockCopy like this:
float[] floats = new float[] { 0.43f, 0.45f, 0.47f };
byte[] result = new byte[sizeof(float) * floats.Length];
Buffer.BlockCopy(floats, 0, result, 0, result.Length);
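Going the other way works the same, since Buffer.BlockCopy just copies raw bytes between primitive arrays with no conversion, so the original values come back bit-for-bit. A sketch (variable names mine, using C# top-level statements):

```csharp
using System;

float[] floats = new float[] { 0.43f, 0.45f, 0.47f };
byte[] result = new byte[sizeof(float) * floats.Length];
Buffer.BlockCopy(floats, 0, result, 0, result.Length);

// Copy the raw bytes back into a float array of the matching length.
float[] back = new float[result.Length / sizeof(float)];
Buffer.BlockCopy(result, 0, back, 0, result.Length);
Console.WriteLine(back[1] == 0.45f); // True
```

Note that the length arguments are always byte counts, not element counts, for both source and destination.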
You can change temp to a list of byte arrays, so you don't keep overwriting it on every iteration:
byte[] data = new byte[] { 1, 3, 5, 7, 9 }; // sample data
IList<byte[]> temp = new List<byte[]>(data.Length);
float biggest = 0;
for (int i = 0; i < data.Length; i++)
{
    if (data[i] > biggest)
        biggest = data[i];
}
for (int i = 0; i < data.Length; i++)
{
    temp.Add(BitConverter.GetBytes(data[i] * (1 / biggest)));
}
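If the player ultimately needs one contiguous byte[], the list of 4-byte chunks can be flattened afterwards, for example with LINQ's SelectMany (the snippet below repeats the setup above so it runs on its own; the flattening step is my addition):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

byte[] data = new byte[] { 1, 3, 5, 7, 9 }; // same sample data as above
float biggest = 9; // the maximum the scanning loop above would find
IList<byte[]> temp = new List<byte[]>(data.Length);
for (int i = 0; i < data.Length; i++)
{
    temp.Add(BitConverter.GetBytes(data[i] * (1 / biggest)));
}

// Flatten the list of 4-byte float encodings into one byte[].
byte[] flat = temp.SelectMany(b => b).ToArray();
Console.WriteLine(flat.Length); // 20 (5 floats x 4 bytes each)
```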