Sound normalization, how to convert float array to byte array?

Hi everyone, I'm playing an audio file. I read it as a byte[], and then I need to normalize the audio by scaling the values to the range [-1,1]. Then I want to store each float value back into a byte[] and return that byte[] to the audio player.

I've tried this:

byte[] data = ar.ReadData();
byte[] temp=new byte[data.Length];
float biggest= 0; ;
for (int i = 0; i < data.Length; i++)
{
    if (data[i] > biggest)
    {
        biggest= data[i];
    }
}

      

This part of the code should put a value such as 0.43 into the byte[], if possible. I tried this, but it didn't work:

for (int i = 0; i < data.Length; i++)
{
    temp = BitConverter.GetBytes(data[i] * (1 / biggest));
}

      

+3




5 answers


In the comment, you said, "I am playing an audio file ... I read it as byte[] and then I need to normalize the sound by putting the values in the range [-1,1] and then I need to put that byte[] back to play in the audio player."

I'm making a big guess here, but I'm guessing the data returned by ar.ReadData() is a byte array of dual-channel 16-bit / 44.1 kHz PCM data. (Note: are you using the Alvas.Audio library?) If so, here's how to do what you want.

Background

First, a little background. A dual channel 16-bit PCM data stream looks like this:

   byte | 01 02 | 03 04 | 05 06 | 07 08 | 09 10 | 11 12 | ...
channel | Left  | Right | Left  | Right | Left  | Right | ...
  frame |     First     |    Second     |     Third     | ...
 sample | 1st L | 1st R | 2nd L | 2nd R | 3rd L | 3rd R | ... etc.

      

It is important to pay attention to several things here:

  • Since the audio data is 16 bits, one sample from one channel is a short (2 bytes), not an int (4 bytes), with a value in the range of -32768 to 32767.
  • This data is stored in little-endian byte order, and unless your architecture is also little-endian, you cannot use the .NET BitConverter class to convert it.
  • We do not need to split the data into separate streams for each channel, because we normalize both channels based on one maximum value taken from either channel.
  • Converting a floating point value to an integer value introduces quantization error, so you probably want to use some kind of dithering (that's a whole topic in itself).

Helper functions

Before getting into the actual normalization, make it easier for yourself by writing a couple of helper functions to get a short from a byte[] and vice versa:

short GetShortFromLittleEndianBytes(byte[] data, int startIndex)
{
    return (short)((data[startIndex + 1] << 8)
         | data[startIndex]);
}

byte[] GetLittleEndianBytesFromShort(short data)
{
    byte[] b = new byte[2];
    b[0] = (byte)data;
    b[1] = (byte)(data >> 8 & 0xFF);
    return b;
}

      

Normalization



An important distinction needs to be made here: audio normalization is not the same as statistical normalization. Here we will perform peak normalization on our audio data, amplifying the signal by a constant gain so that its peak sits at the upper limit. To do that, we first find the largest absolute sample value, divide the upper limit (32767 for 16-bit PCM data) by it to get the gain, and then multiply every sample by that gain. Note that the gain has to be a multiplier: adding a constant to every sample would only shift the waveform away from zero (a DC offset) without making it any louder.

So, to normalize our audio data, first scan it to find the peak value:

byte[] input = ar.ReadData();  // the function you used above
float biggest = 0F;
float sample;
for (int i = 0; i < input.Length; i += 2)
{
    // Use the magnitude so that a loud negative sample also counts as the peak.
    sample = Math.Abs((float)GetShortFromLittleEndianBytes(input, i));
    if (sample > biggest) biggest = sample;
}

At this point, biggest contains the largest absolute value in our audio data. Now, to do the actual normalization, we divide 32767 by biggest to get the gain that brings the loudest sample to full scale. We then multiply each audio sample by this gain, raising the volume of every sample by the same factor until the loudest sample peaks.

// Guard against pure silence, where there is no peak to scale to.
float gain = biggest > 0 ? 32767F / biggest : 1F;

float[] data = new float[input.Length / 2];
for (int i = 0; i < input.Length; i += 2)
{
    data[i / 2] = (float)GetShortFromLittleEndianBytes(input, i) * gain;
}

      

The last step is to convert the samples from floating point values to integers and store them as little-endian shorts.

byte[] output = new byte[input.Length];
for (int i = 0; i < output.Length; i += 2)
{
    byte[] tmp = GetLittleEndianBytesFromShort(Convert.ToInt16(data[i / 2]));
    output[i] = tmp[0];
    output[i + 1] = tmp[1];
}

      

And you're done! Now you can send the byte array output, which contains the normalized PCM data, to your audio player.

As a final note, keep in mind that this code is not the most efficient; you could combine several of these loops, you could probably use Buffer.BlockCopy() to copy the array, and you could modify the short-to-byte[] helper function to take a byte array as a parameter and copy the value directly into the array. I didn't do any of this, to make it easier to see what was going on.
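
A minimal sketch of what such a combined version might look like, assuming 16-bit little-endian PCM and a multiplicative peak gain (32767 divided by the largest absolute sample). The names PcmPeakNormalizer and NormalizeInPlace are made up for illustration and are not part of any library:

```csharp
using System;

static class PcmPeakNormalizer
{
    // Hypothetical helper: peak-normalizes 16-bit little-endian PCM in place.
    public static void NormalizeInPlace(byte[] pcm)
    {
        // Pass 1: find the largest absolute sample value.
        int peak = 0;
        for (int i = 0; i + 1 < pcm.Length; i += 2)
        {
            int sample = (short)((pcm[i + 1] << 8) | pcm[i]);
            peak = Math.Max(peak, Math.Abs(sample));
        }
        if (peak == 0) return; // pure silence: nothing to scale

        // Pass 2: scale each sample and write it straight back into the buffer.
        float gain = 32767f / peak;
        for (int i = 0; i + 1 < pcm.Length; i += 2)
        {
            int sample = (short)((pcm[i + 1] << 8) | pcm[i]);
            int scaled = (int)Math.Round(sample * gain);
            // Clamp defensively against floating point rounding at full scale.
            scaled = Math.Max(-32767, Math.Min(32767, scaled));
            pcm[i] = (byte)scaled;            // low byte first (little-endian)
            pcm[i + 1] = (byte)(scaled >> 8); // then the high byte
        }
    }
}
```

Calling PcmPeakNormalizer.NormalizeInPlace(input) would overwrite the buffer returned by ar.ReadData(), so this only fits if you don't need the original samples afterwards.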

And as I mentioned earlier, you should absolutely read up on dithering, as it can noticeably improve the quality of your audio output.

I was working on an audio project myself, so I figured all this out through some trial and error; I hope this helps someone.
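
As a rough illustration of the dithering idea mentioned in the notes above, here is a small sketch that adds triangular (TPDF) noise of roughly one quantization step before rounding a float sample to a short. The choice of TPDF dither, and the names DitherSketch and QuantizeWithDither, are assumptions made for this example; this is only one of several possible dithering schemes:

```csharp
using System;

static class DitherSketch
{
    // Quantizes a float sample (already scaled to the 16-bit range) to a short,
    // adding triangular (TPDF) noise of up to about one step before rounding.
    public static short QuantizeWithDither(float sample, Random rng)
    {
        // The difference of two uniform values in [0,1) is triangular in (-1,1).
        double noise = rng.NextDouble() - rng.NextDouble();
        double rounded = Math.Round(sample + noise);
        // Clamp so the noise cannot push a full-scale sample out of range.
        return (short)Math.Max(short.MinValue, Math.Min(short.MaxValue, rounded));
    }
}
```

In the conversion loop, Convert.ToInt16(data[i / 2]) could then be swapped for DitherSketch.QuantizeWithDither(data[i / 2], rng), with a single Random instance created once outside the loop.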

+14




It works:

float number = 0.43f;
byte[] array = BitConverter.GetBytes(number);

      



What's not working for you?

+2




You can use Buffer.BlockCopy like this:

float[] floats = new float[] { 0.43f, 0.45f, 0.47f };
byte[] result = new byte[sizeof(float) * floats.Length];
Buffer.BlockCopy(floats, 0, result, 0, result.Length);
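
For completeness, a small sketch of the round trip in both directions using Buffer.BlockCopy. The class and method names (FloatBytes, ToBytes, ToFloats) are made up for this example. Keep in mind that the resulting bytes are the raw 4-byte float representations in the machine's byte order, not 16-bit PCM samples, so a player expecting 16-bit PCM will not play them correctly:

```csharp
using System;

static class FloatBytes
{
    // Copies the raw 4-byte representation of each float into a byte array.
    public static byte[] ToBytes(float[] floats)
    {
        byte[] result = new byte[sizeof(float) * floats.Length];
        Buffer.BlockCopy(floats, 0, result, 0, result.Length);
        return result;
    }

    // Reverses ToBytes; any trailing bytes that don't form a whole float are ignored.
    public static float[] ToFloats(byte[] bytes)
    {
        float[] result = new float[bytes.Length / sizeof(float)];
        Buffer.BlockCopy(bytes, 0, result, 0, result.Length * sizeof(float));
        return result;
    }
}
```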

      

0




You can change temp to a list of byte arrays so that you don't overwrite it on every iteration.

    byte[] data = new byte[] { 1, 3, 5, 7, 9 };  // sample data
    IList<byte[]> temp = new List<byte[]>(data.Length);
    float biggest = 0;

    for (int i = 0; i < data.Length; i++)
    {
        if (data[i] > biggest)
            biggest = data[i];
    }

    for (int i = 0; i < data.Length; i++)
    {
        temp.Add(BitConverter.GetBytes(data[i] * (1 / biggest)));
    }

      

0




if (Math.Abs(sample) > biggest) biggest = sample;

      

I would change this to:

if (Math.Abs(sample) > biggest) biggest = Math.Abs(sample);

      

Because if the sample with the largest magnitude is negative, you would otherwise end up multiplying all values by a negative number.

0

