FFT for Pitch Detection

Posted 2019-05-30 10:57

I've recently been using an FFT for pitch detection, and I notice that although the note names are correct (e.g. C, D#, etc.), a lot of notes land in the wrong octave (e.g. E2 is categorized as E3, C3 is categorized as C4; always an octave up).

Why is this the case? My algorithm is: after calculating the FFT bins, I take the bin with the greatest magnitude and convert its index to a frequency.
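A minimal sketch of that peak-picking approach, to make the symptom reproducible. Everything here is an assumption for illustration: the sample rate, the frame length, the test tone (an 80 Hz fundamental with a louder second harmonic), and the helper names `magnitude_spectrum` and `peak_frequency`. A naive DFT stands in for a real FFT so the example needs only the standard library.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (fine for a small demo)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags

def peak_frequency(samples, sample_rate):
    """Frequency (Hz) of the strongest non-DC bin: bin_index * sample_rate / N."""
    mags = magnitude_spectrum(samples)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate / len(samples)

SAMPLE_RATE = 2048  # Hz (assumed); with N = 256 each bin is 8 Hz wide
N = 256
F0 = 80.0           # assumed fundamental, sits exactly on bin 10

# Synthetic tone whose second harmonic is louder than its fundamental.
signal = [
    0.6 * math.sin(2 * math.pi * F0 * t / SAMPLE_RATE)
    + 1.0 * math.sin(2 * math.pi * 2 * F0 * t / SAMPLE_RATE)
    for t in range(N)
]

detected = peak_frequency(signal, SAMPLE_RATE)
print(detected)  # 160.0, one octave above the 80 Hz fundamental
```

The strongest bin is the second harmonic, so the note name comes out right but the octave is one too high.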

Any help on this? Thanks!

Tags: fft pitch
4 answers
你好瞎i
Reply #2 · 2019-05-30 11:37

Sounds like harmonics to me. Greg's pointed question seems to be on the right track.

If that is true, you could try finding the statistical median of all buckets and picking the closest one, rather than taking the statistical mode (as you are currently doing).

If you are seeing variation in your output, you could also do temporal smoothing (average over time).
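A rough sketch of temporal smoothing over per-frame pitch estimates. Note the swap: it uses a sliding median rather than a plain average, on the assumption that occasional octave jumps would drag a mean to a frequency between the two octaves. The function name `smooth_pitch`, the window size, and the frame values are all made up for illustration.

```python
import statistics

def smooth_pitch(estimates, window=5):
    """Sliding median over the most recent `window` per-frame estimates (Hz)."""
    smoothed = []
    for i in range(len(estimates)):
        start = max(0, i - window + 1)
        smoothed.append(statistics.median(estimates[start:i + 1]))
    return smoothed

# Hypothetical per-frame estimates: mostly E2 (82.4 Hz) with two octave-up glitches.
frames = [82.4, 82.4, 164.8, 82.4, 82.4, 164.8, 82.4]
smoothed = smooth_pitch(frames)
print(smoothed)
```

This only helps when the errors are intermittent; if every frame is an octave high, the median will be an octave high too.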

I know that guitar tuners do several of these things, and still come up intermittently wrong. It's a messy business :)

Speaking of live sampling, depending on your sample source, there are a lot of anomalies to consider that could be giving you unexpected results:

  • Overtones in the sound
  • Inaudible tones in the sound

These will show up in your data, but you likely won't be able to hear them. And if you're trying to match against multiple tones or chords, your job will be even more complicated.

我欲成王,谁敢阻挡
Reply #3 · 2019-05-30 11:45

Octave Detection can be very tricky, especially on a polyphonic signal where the fundamental harmonic and/or other harmonics are missing. Assuming that you are correctly detecting 'pitch' and not just 'harmonics' (see Wikipedia link below), then you could use an Octave Detection algorithm that I developed.

In order to do pitch detection for PitchScope Player, I decided on a 2 Stage Algorithm that works like this: a) First the ScalePitch of a note is detected; 'ScalePitch' has 12 possible pitch values: { E, F, F#, G, G#, A, A#, B, C, C#, D, D# }. b) Then, once the ScalePitch and Time-Width of the note have been determined, the Octave (fundamental) of that note is calculated by examining ALL the harmonics of 4 possible Octave-Candidate notes.
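To make the two stages concrete, here is a small sketch of the bookkeeping only: folding a frequency onto one of the 12 ScalePitch values, then listing 4 octave-spaced candidate fundamentals for it. This is not taken from the PitchScope Player source. The E2 reference frequency, the 70 Hz floor, and the function names `scale_pitch` and `octave_candidates` are assumptions, and the harmonic scoring that picks the winning candidate is left out.

```python
import math

# Same order as the list in the text above.
SCALE_PITCHES = ["E", "F", "F#", "G", "G#", "A", "A#", "B", "C", "C#", "D", "D#"]
E_REFERENCE_HZ = 82.41  # assumed reference: E2 in equal temperament

def scale_pitch(freq_hz):
    """Fold a frequency onto one of 12 pitch classes, discarding the octave."""
    semitones_above_e = round(12 * math.log2(freq_hz / E_REFERENCE_HZ))
    return SCALE_PITCHES[semitones_above_e % 12]

def octave_candidates(freq_hz, lowest_hz=70.0, count=4):
    """Frequencies one octave apart that share freq_hz's pitch class,
    starting from the lowest one at or above lowest_hz (assumed floor)."""
    f = freq_hz
    while f / 2 >= lowest_hz:
        f /= 2
    while f < lowest_hz:
        f *= 2
    return [f * 2 ** i for i in range(count)]

print(scale_pitch(329.64))        # E
print(octave_candidates(329.64))  # E2, E3, E4, E5 in Hz
```

Whatever the detected peak was, every candidate in the list has the same note name; the second stage only has to decide which of them is the true fundamental.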

The complete C++ source code and executable for my pitch detection application, PitchScope Player, is on GitHub (link below), and you could compile and step through it to see how my Octave Detection Algorithm works.

You would want to focus on the function FundCandidCalcer::Calc_Best_Octave_Candidate() in the file FundCandidCalcer.cpp to see that algorithm in C++. The diagram below also gives a rough idea of how to calculate the Octave.

https://en.wikipedia.org/wiki/Transcription_(music)#Pitch_detection

https://github.com/CreativeDetectors/PitchScope_Player

The diagram below demonstrates the Octave Detection algorithm which I developed to pick the correct Octave-Candidate note (that is, the correct Fundamental), once the ScalePitch for that note has been determined.

[Diagram: Octave Detection algorithm for choosing among the Octave-Candidate notes (image not available)]

骚年 ilove
Reply #4 · 2019-05-30 11:46

Two thoughts:

  1. If your input and your algorithm's output are always exactly 1 octave apart, can't you just accept that you're calibrated like that and always subtract an octave?

  2. When you pluck a guitar string you always get a very loud harmonic (the 2nd harmonic) exactly one octave higher, about as loud as the natural (the 1st harmonic). Next comes the 3rd harmonic, one octave and 7 semitones above, but the octave harmonic is the really noticeable one.
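The intervals in the second point follow from 12 * log2(n) semitones for the nth harmonic. A quick check, assuming equal temperament with A4 = 440 Hz and a low E string at 82.41 Hz (the helper `note_name` is made up for this example):

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(freq_hz, a4_hz=440.0):
    """Nearest equal-tempered note name with octave (A4 = MIDI note 69)."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

E2 = 82.41  # Hz, low E string of a guitar
for harmonic in (1, 2, 3):
    semitones = 12 * math.log2(harmonic)
    print(harmonic, note_name(E2 * harmonic), round(semitones, 2))
# 1 E2 0.0
# 2 E3 12.0
# 3 B3 19.02
```

So a loud 2nd harmonic reads as the same note one octave up (E3), while the 3rd harmonic reads as a different note (B3, 12 + 7 semitones above).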

三岁会撩人
Reply #5 · 2019-05-30 11:49

When deciding which octave to place a pitch in, try adding to each bucket some fraction of the amplitude present at 3x its frequency (e.g. add to the 440Hz bucket a fraction of the amplitude of the 1320Hz bucket). On most instruments, an A440 is likely to have significant components at 880Hz, 1320Hz, 1760Hz, 2200Hz, 2640Hz, etc. An A880 would likely have 880Hz, 1760Hz, and 2640Hz, but would not have a significant 1320Hz component (nor 2200Hz for that matter). So if your code is trying to decide whether a note is an A440 or an A880, looking at the third-harmonic bucket (or other odd harmonics) may provide a useful clue.
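A small sketch of that third-harmonic weighting. The spectra are hypothetical magnitudes keyed by frequency in Hz, and the 0.5 weight and the names `octave_score` and `pick_octave` are arbitrary choices for illustration; a real spectrum would need bin lookup and tuning of the weight.

```python
def octave_score(spectrum, freq_hz, weight=0.5):
    """Magnitude at freq_hz plus a fraction of the magnitude at 3x that frequency."""
    return spectrum.get(freq_hz, 0.0) + weight * spectrum.get(3 * freq_hz, 0.0)

def pick_octave(spectrum, candidates, weight=0.5):
    """Return the candidate fundamental with the highest boosted score."""
    return max(candidates, key=lambda f: octave_score(spectrum, f, weight))

# Hypothetical A440 whose second harmonic (880 Hz) is the loudest component.
a440 = {440: 0.8, 880: 1.0, 1320: 0.8, 1760: 0.4, 2200: 0.3, 2640: 0.2}
# Hypothetical A880: nothing at 440, 1320, or 2200 Hz.
a880 = {880: 1.0, 1760: 0.5, 2640: 0.3}

print(max(a440, key=a440.get))         # 880: plain peak picking is an octave up
print(pick_octave(a440, [440, 880]))   # 440
print(pick_octave(a880, [440, 880]))   # 880
```

For the A440 spectrum the 1320 Hz component lifts the 440 Hz score (0.8 + 0.4) above the 880 Hz score (1.0 + 0.1); for the A880 spectrum there is nothing at 440 Hz or 1320 Hz, so 880 Hz still wins.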
