FFT in Python with explanations

I have a WAV file that I would like to render in the frequency domain. Then I would like to write a simple script that takes in a WAV file and outputs if the energy at a certain frequency "F" exceeds the threshold "Z" (does a certain tone have a strong presence in the WAV file). There are tons of code snippets on the internet that show how to plot an FFT spectrum in Python but I don't understand many steps.

  • I know wavfile.read(myfile) returns the sample rate (fs) and an array of data (data), but when I run an FFT on it (y = numpy.fft.fft(data)), what units do I have?
  • To get an array of frequencies for the x-axis, some posters do this, where n = len(data):

    X = numpy.linspace(0.0, 1.0 / (2.0 * T), n // 2)

    while others do this:

    X = (numpy.fft.fftfreq(n) * fs)[range(n // 2)]

    Is there a difference between the two methods and is there a good online explanation of what these operations do conceptually?

  • Some online FFT tutorials mention a window, but not many posters use windows in their code snippets. I see that numpy has numpy.hamming(N), but what should I pass as N, and how do I "apply" the resulting window to my FFT arrays?
  • For my threshold calculation, is it correct to find the frequency in X that is closest to my desired tone / frequency and check if the corresponding element (same index) in Y has an amplitude greater than the threshold?

1 answer


  • The FFT output is indexed in normalized frequency: the first point is 0 Hz and one past the last point is fs Hz. You can build the frequency axis yourself with linspace(0.0, (1.0 - 1.0/n)*fs, n). You can also use fftfreq(n, 1.0/fs), but it reports the upper half of the bins as negative frequencies.

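  • They are essentially the same if n is even (assuming T = 1/fs), although linspace includes its endpoint by default, so the first version's spacing comes out as fs/(n - 2) rather than the exact bin spacing fs/n that fftfreq gives. You can also use rfftfreq, I think. Note that both give only the "positive half" of your frequencies, which is probably what you want for audio (which is real-valued). You can also use rfft to compute just the positive half of the spectrum and then get the matching frequencies with rfftfreq(n, 1.0/fs).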

  • Windowing reduces the sidelobe levels at the cost of widening the main lobe of whatever frequencies are present. N is the length of your signal, and you multiply your signal by the window before taking the FFT. However, if you are looking at a long signal, you may want to "chop" it into chunks, window each one, and then add the absolute values of their spectra.

  • "Is it correct" is difficult to answer. The simple approach, as you said, is to find the bin closest to your frequency and check its amplitude.
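
To make the points above concrete, here is a minimal sketch of the "closest bin" check, assuming mono data. The function names tone_amplitude and tone_present are made up for illustration, and the synthetic sine stands in for what wavfile.read would return. The 2/sum(window) scaling is one common convention that makes an on-bin sine of amplitude A read roughly A; it is a choice on my part, not something the FFT gives you.

```python
import numpy as np

def tone_amplitude(data, fs, target_hz):
    """Return (bin frequency in Hz, estimated amplitude) for the bin nearest target_hz."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    window = np.hamming(n)                    # window length equals signal length
    spectrum = np.fft.rfft(data * window)     # positive-frequency half only
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)    # bin centre frequencies in Hz
    # Assumed scaling: undo the window gain so an on-bin sine of amplitude A
    # reads about A (not valid for the DC and Nyquist bins).
    amplitude = 2.0 * np.abs(spectrum) / window.sum()
    k = int(np.argmin(np.abs(freqs - target_hz)))  # index of the closest bin
    return freqs[k], amplitude[k]

def tone_present(data, fs, target_hz, threshold):
    """True if the estimated amplitude near target_hz exceeds threshold."""
    _, amp = tone_amplitude(data, fs, target_hz)
    return bool(amp > threshold)

# Synthetic stand-in for WAV data: 1 s of a 440 Hz sine with amplitude 0.5.
fs = 8000
t = np.arange(fs) / fs
data = 0.5 * np.sin(2 * np.pi * 440 * t)

bin_hz, amp = tone_amplitude(data, fs, 440.0)
print(bin_hz, amp)
print(tone_present(data, fs, 440.0, 0.25), tone_present(data, fs, 1000.0, 0.25))
```

The threshold is in the same units as the samples because of the scaling, so with integer WAV data it would be in raw sample counts. If the tone falls between two bins, the reading drops somewhat, so a threshold with some margin is safer.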

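For a long recording, the "chop, window, add" idea from the windowing bullet could look roughly like this. It is only a sketch: summed_spectrum and chunk_len are hypothetical names, the chunks do not overlap, and any leftover samples at the end are simply dropped.

```python
import numpy as np

def summed_spectrum(data, fs, chunk_len):
    """Split data into equal chunks, window each, and sum the magnitude spectra."""
    data = np.asarray(data, dtype=float)
    n_chunks = len(data) // chunk_len          # a trailing partial chunk is dropped
    if n_chunks == 0:
        raise ValueError("signal is shorter than one chunk")
    chunks = data[:n_chunks * chunk_len].reshape(n_chunks, chunk_len)
    window = np.hamming(chunk_len)             # one window, broadcast over every chunk
    magnitudes = np.abs(np.fft.rfft(chunks * window, axis=1))
    freqs = np.fft.rfftfreq(chunk_len, d=1.0 / fs)
    return freqs, magnitudes.sum(axis=0)       # divide by n_chunks for an average

# Synthetic stand-in for a long recording: 4 s of a 440 Hz sine.
fs = 8000
t = np.arange(4 * fs) / fs
data = np.sin(2 * np.pi * 440 * t)

freqs, total = summed_spectrum(data, fs, chunk_len=1000)
peak_hz = freqs[np.argmax(total)]
print(peak_hz)
```

Shorter chunks give coarser frequency resolution (fs / chunk_len Hz per bin) but a steadier estimate, since noise averages out across chunks while a steady tone adds up.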

