.NET Library to Identify Pitches [closed]

Posted 2019-01-21 22:58

I'd like to write a simple program (preferably in C#) that listens to me sing a pitch into a mic and identifies which musical note that pitch corresponds to.


Thank you very much for your prompt responses. To clarify:

I'd like a (preferably .NET) library that identifies the notes I sing. Such a library should:

  1. Identify the note I sing (a note from the chromatic scale).
  2. Tell me how far off I am from the closest note.

I intend to use such a library to sing one note at a time.

14 answers
#2 · 2019-01-21 22:58

Performing a Fourier transform will give you a value for each frequency found in the sample. The more prominent the frequency, the higher the value. If you look for the largest value, you'll usually find your fundamental frequency, but overtones will also be present.

If you're looking for a specific frequency, the Goertzel algorithm can be very effective.
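To illustrate the Goertzel suggestion, here is a minimal sketch in Python (not C#, just to keep it short; it should port directly). The function name `goertzel_power` and the 8 kHz test tone are my own assumptions, not part of any library. It measures the energy at one chosen frequency without computing a full transform.

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the signal at target_hz (Goertzel recurrence)."""
    # Generalized form: target_hz does not have to sit exactly on a DFT bin.
    omega = 2.0 * math.pi * target_hz / sample_rate
    coeff = 2.0 * math.cos(omega)
    s_prev = 0.0   # s[n-1]
    s_prev2 = 0.0  # s[n-2]
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

if __name__ == "__main__":
    rate = 8000  # assumed sample rate for the demo
    tone = [math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(800)]
    # Compare the energy at A4 with its two chromatic neighbours (G#4, A#4).
    for hz in (415.30, 440.00, 466.16):
        print(hz, goertzel_power(tone, rate, hz))
```

For a sung note you would run this once per candidate note and pick the strongest, keeping in mind the caveat from other answers that the strongest component is not always the fundamental.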

做自己的国王
#3 · 2019-01-21 23:03

You're looking for a frequency estimation or pitch-detection algorithm. Most people suggest finding the maximum value of the FFT, but this is overly simplistic and doesn't work as well as you might think. If the fundamental is missing (a timpani, for instance), or one of the harmonics is larger than the fundamental (a trumpet, for instance), it won't detect the correct frequency. Trumpet spectrum:

Trumpet spectrum http://www.eng.cam.ac.uk/DesignOffice/mdp/electric_web/AC/02284.jpg

Also, you're wasting processor cycles calculating the FFT if you're only looking for a specific frequency. You can use things like the Goertzel algorithm to find tones in a specific frequency band more efficiently.

You really need to find "the first significant frequency" or "the first frequency with strong harmonic components", which is more ambiguous than just finding the maximum.

Autocorrelation and the harmonic product spectrum are better at finding the true fundamental for real instruments. However, if the instrument is inharmonic (most are), the wave shape changes over time, and I suspect these methods won't work as well if you try to measure more than a few cycles at a time, which limits your accuracy.

神经病院院长
#4 · 2019-01-21 23:07

Since you're dealing with a monophonic source, most of the pitches you detect with an FFT should be harmonically related, but you're not guaranteed that the fundamental is the strongest one. For many instruments, and for some voice registers, it probably won't be. It should, however, be the lowest of the harmonically related pitches detected (the others being integer multiples of the fundamental).

成全新的幸福
#5 · 2019-01-21 23:08

You'll want to capture your raw input, accumulate some samples, and then do an FFT on them. The FFT will convert your samples from time domain to frequency domain, so what it produces is a bit like a histogram of how much energy the signal contained at various frequencies.

Getting from that to "the" frequency may be a bit difficult though -- a human voice is not going to just contain a single, clean frequency of sound. Instead you'll normally have energy at a pretty fair number of different frequencies. What you'll typically do is start from about the lowest voice range, and work your way up, looking for the first (lowest) frequency at which the energy is significantly higher than the background noise.
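The "work your way up from the lowest voice range" idea can be sketched roughly as follows, in Python. Everything here is an assumption on my part: the 80 to 1000 Hz search band, the median as a crude noise-floor estimate, the factor of 5 for "significantly higher", and the helper names. A naive DFT stands in for the FFT so the sketch has no dependencies; a real program would call an FFT routine.

```python
import cmath
import math
import random

def band_magnitudes(samples, k_lo, k_hi):
    """Magnitudes of DFT bins k_lo..k_hi (naive DFT standing in for an FFT)."""
    n = len(samples)
    mags = {}
    for k in range(k_lo, k_hi + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags[k] = abs(acc)
    return mags

def lowest_significant_hz(samples, sample_rate, lo_hz=80.0, hi_hz=1000.0, ratio=5.0):
    """Lowest local spectral peak that stands well above the estimated noise floor."""
    n = len(samples)
    bin_hz = sample_rate / n
    k_lo = max(1, math.ceil(lo_hz / bin_hz))
    k_hi = min(n // 2 - 1, math.floor(hi_hz / bin_hz))
    mags = band_magnitudes(samples, k_lo, k_hi)
    ordered = sorted(mags.values())
    noise_floor = ordered[len(ordered) // 2]  # median magnitude (assumed noise proxy)
    threshold = ratio * noise_floor
    for k in range(k_lo, k_hi + 1):           # scan upward from the lowest bin
        is_peak = (mags[k] >= mags.get(k - 1, 0.0)
                   and mags[k] >= mags.get(k + 1, 0.0))
        if is_peak and mags[k] > threshold:
            return k * bin_hz
    return None

def demo_signal(sample_rate=4000, n=400):
    """Hypothetical input: weak 220 Hz fundamental, strong 440 Hz harmonic, light noise."""
    rng = random.Random(0)
    return [0.3 * math.sin(2.0 * math.pi * 220.0 * i / sample_rate)
            + 1.0 * math.sin(2.0 * math.pi * 440.0 * i / sample_rate)
            + rng.uniform(-0.05, 0.05)
            for i in range(n)]

if __name__ == "__main__":
    print(lowest_significant_hz(demo_signal(), 4000))
```

The demo signal is built so that the 440 Hz harmonic is the largest peak; scanning upward for the first significant peak still lands on the 220 Hz fundamental, which is the point of the approach.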

看我几分像从前
#6 · 2019-01-21 23:11

The crucial piece of this problem is the Fast Fourier Transform. This algorithm turns a waveform (your sung note) into a frequency distribution. Once you've computed the FFT you identify the fundamental frequency (usually the frequency with the highest amplitude in the FFT, but this depends somewhat on your microphone's frequency response curve and exactly what type of sound your mic is listening to).

Once you've found the fundamental frequency, you need to look it up in a list that maps frequencies to notes. Here you'll need to deal with the in-betweens (so if the fundamental frequency of your sung note is 452 Hz, which note does that actually correspond to, A or A#?).
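Instead of a lookup table, the mapping can be computed directly. This is a small sketch in Python, assuming twelve-tone equal temperament with A4 = 440 Hz and MIDI-style octave numbering (those conventions, and the name `nearest_note`, are my assumptions). It returns the closest note and how far off the input is in cents (hundredths of a semitone), which covers both items in the question.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name with octave, offset in cents) for freq_hz."""
    # Distance from A4 in (fractional) semitones.
    semitones_from_a4 = 12.0 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones_from_a4)
    # Positive cents means the input is sharp of the named note.
    cents_off = 100.0 * (semitones_from_a4 - nearest)
    midi = 69 + nearest  # MIDI note number, where A4 = 69
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents_off

if __name__ == "__main__":
    for hz in (440.0, 452.0, 261.63):
        print(hz, nearest_note(hz))
```

For the 452 Hz example, this gives A (A4), a little under half a semitone sharp, so it is closer to A than to A#.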

This guy on CodeProject has an example of FFT in C#. I'm sure there are others out there...

▲ chillily
#7 · 2019-01-21 23:12

I'm amazed by all the answers here suggesting the use of FFT, given that FFT isn't generally precise enough for pitch detection. It can be, but only with an impractically large FFT window. For example, in order to determine the fundamental with 1/100th of a semi-tone accuracy (which is about what you need for accurate pitch detection) when the fundamental is around concert A (440 Hz), you need an FFT window with 524,288 elements. 1024 is a much more typical FFT size, and the computation time becomes progressively worse the larger the window.
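The arithmetic behind that window size can be checked with a few lines of Python. The 44.1 kHz sample rate is my assumption (it isn't stated above), as are the function names. FFT bins are spaced `sample_rate / fft_size` Hz apart, while one cent at 440 Hz is only about 0.25 Hz wide. Under that assumption, 524,288 is the smallest power-of-two size whose bin spacing is within half a cent; a 1024-point window has bins about 43 Hz apart.

```python
def cents_to_hz(freq_hz, cents):
    """Width in Hz of an interval of `cents` measured upward from freq_hz."""
    return freq_hz * (2.0 ** (cents / 1200.0) - 1.0)

def bin_spacing_hz(sample_rate, fft_size):
    """Distance in Hz between adjacent FFT bins."""
    return sample_rate / fft_size

def min_pow2_fft_size(sample_rate, freq_hz, cents):
    """Smallest power-of-two FFT size whose bin spacing is at most `cents` wide at freq_hz."""
    target = cents_to_hz(freq_hz, cents)
    size = 1
    while bin_spacing_hz(sample_rate, size) > target:
        size *= 2
    return size

if __name__ == "__main__":
    rate = 44100  # assumed sample rate
    print(bin_spacing_hz(rate, 1024))           # spacing for a typical window
    print(cents_to_hz(440.0, 1.0))              # one cent at concert A, in Hz
    print(min_pow2_fft_size(rate, 440.0, 0.5))  # → 524288
```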

I have to identify the fundamental pitch of WAV files in my software synthesizer (where a "miss" is immediately audible as an out-of-tune instrument) and I've found that autocorrelation does by far the best job. Basically, I iterate through each note in the 12-tone scale over an 8-octave range, compute the frequency and the wavelength of each note, and then perform an autocorrelation using that wavelength as the lag (an autocorrelation is where you measure the correlation between a set of data and the same set of data offset by some lag amount).

The note with the highest autocorrelation score is thus roughly the fundamental pitch. I then "hone in" on the true fundamental by iterating from one semi-tone down to one semi-tone up by 1/1000ths of a semi-tone, to find the local peak autocorrelation value. This method works very accurately, and more importantly it works for a wide variety of instrument files (strings, guitar, human voices etc.).
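A rough Python sketch of this two-stage search is below. It is not the actual synthesizer code, and several details are my own assumptions: a 4-octave default range instead of 8, a refinement step of 1/100 rather than 1/1000 of a semi-tone (for speed), linear interpolation for fractional lags, a normalized correlation score, and breaking near-ties in favor of the higher note so that a lag of two periods doesn't win over one period.

```python
import math

def midi_to_hz(midi, a4_hz=440.0):
    """Equal-tempered frequency of a (possibly fractional) MIDI note number."""
    return a4_hz * 2.0 ** ((midi - 69) / 12.0)

def correlation_at_lag(samples, lag, window):
    """Normalized correlation of samples[0:window] with the signal `lag` samples later."""
    whole = int(lag)
    frac = lag - whole
    dot = energy_a = energy_b = 0.0
    for i in range(window):
        a = samples[i]
        # Linear interpolation handles fractional lags (assumption of this sketch).
        b = (1.0 - frac) * samples[i + whole] + frac * samples[i + whole + 1]
        dot += a * b
        energy_a += a * a
        energy_b += b * b
    if energy_a == 0.0 or energy_b == 0.0:
        return 0.0
    return dot / math.sqrt(energy_a * energy_b)

def detect_pitch(samples, sample_rate, midi_lo=36, midi_hi=84, step=0.01, tie_tol=0.01):
    """Coarse search over chromatic notes, then a fine search one semi-tone either side."""
    max_lag = sample_rate / midi_to_hz(midi_lo - 1)  # longest lag the fine search can need
    window = len(samples) - int(max_lag) - 2
    if window <= 0:
        raise ValueError("not enough samples for the lowest candidate note")

    def score(midi):
        return correlation_at_lag(samples, sample_rate / midi_to_hz(midi), window)

    coarse = {m: score(m) for m in range(midi_lo, midi_hi + 1)}
    best = max(coarse.values())
    # Among near-ties, keep the highest note to avoid octave-below errors.
    note = max(m for m, s in coarse.items() if s >= best - tie_tol)

    steps = int(round(1.0 / step))
    fine = [note + k * step for k in range(-steps, steps + 1)]
    return midi_to_hz(max(fine, key=score))

if __name__ == "__main__":
    rate = 8000  # assumed sample rate for the demo
    f0 = 225.0   # somewhat sharp of A3 (220 Hz)
    signal = [math.sin(2.0 * math.pi * f0 * n / rate)
              + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0 * n / rate)
              for n in range(1200)]
    print(detect_pitch(signal, rate))
```

The nested loops make the cost obvious: every candidate pitch is a full pass over the window.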

This process is extremely slow, however, especially for long WAV files, so it could not be used as is for a realtime application. However, if you used FFT to get a rough estimate of the fundamental, and then used autocorrelation to zero in on the true value (and you were content with being less accurate than 1/1000th of a semi-tone, which is absurdly over-accurate), you would have a method that is both relatively fast and extremely accurate.
