Guitar Tuner

The Firmware sketch

What should the device firmware do?

acquire the analog signal
compute the FFT and therefore the frequency spectrum
determine the fundamental frequency out of the spectrum
from the fundamental frequency, determine the note and the deviation from the perfect pitch
display the note and the deviation

First of all, I have to give credit to Enrique Condes for the FFT library he created, which this sketch uses. Thank you, Enrique.

If you want to use the sketch, you also need to install the arduinoFFT library.

The FW is a sketch for the Arduino IDE. I am not going to explain how to install and use the IDE; there are plenty of tutorials on the web and on YouTube. If installing software on your computer is new to you, start there and come back afterwards.

The FW sketch can be found here. It is very straightforward. It also has comments inside, but I think that some additional comments will do no harm.

Here we go:

At what sample rate do we have to acquire the signal? As we have already decided that the highest note we want to tune is B5, with a frequency of 987.8 Hz, the sampling frequency must be higher than 2 x 987.8 ≈ 1976 Hz. We choose 2000 Hz as the sample rate. We have already seen that this is no problem for the ESP8266 ADC.

How many samples? We want as many samples as possible in order to get better precision in the frequency computation. With a sampling rate F (in our case 2000 samples/sec, or 2000 Hz) and a number of samples N (for instance 1000), the FFT gives us a spectrum stretching from 0 to F (2000 Hz). For a real input signal only the lower half, from 0 to F/2, carries independent information; the upper half mirrors it. As it is a discrete spectrum, it has a finite number of discrete frequencies, and this number is equal to the number of samples N: the FFT transposes N discrete values in the amplitude/time domain into N discrete values in the amplitude/frequency domain.

The discrete frequencies are equally spaced. The “distance” between two consecutive frequencies is F/N (in our case 2000/1000 = 2 Hz). So, for instance, value number 55 out of the 1000 (counting from 0) represents the amplitude of the 55 x 2 = 110 Hz component in the spectrum. Increasing the number of samples decreases the distance between consecutive frequencies, and as we want the frequency to be as precise as possible, this is very good.

On the other hand, the time between two computations of the fundamental frequency (and therefore of the note) is determined by the number of samples, as we can do the computation only after all samples have been acquired. With a sample rate of 2000 Hz it takes 0.5 sec to take 1000 samples, so the note can be computed (and the display refreshed) only every 0.5 sec, or 2 times/sec.

You may say that all we have to do is to increase the sample rate. Then the N = 1000 samples will be acquired in less time and the note can be computed more often. But increasing the sample rate F increases the distance between two adjacent frequencies (= F/N), and this lowers the precision of the measurement. With a sampling rate F (samples/sec) and N samples, the distance f_delta between two consecutive frequencies is F/N, and the time T_acquire needed to get all the samples is N/F. These two are simply inverses of each other: f_delta = 1/T_acquire. In conclusion: if you want a specific spacing between frequencies, for instance 2 Hz, you also get a specific acquire time, in this case 1/2 = 0.5 sec.
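To make the trade-off concrete, here is a small sketch of the two relations in plain C++. The function names are mine, for illustration only; they are not taken from the sketch.

```cpp
#include <cstddef>

// Spacing between two adjacent FFT frequencies, in Hz: f_delta = F / N.
double binSpacingHz(double sampleRateHz, std::size_t numSamples) {
    return sampleRateHz / static_cast<double>(numSamples);
}

// Time needed to collect one full set of samples, in seconds: T_acquire = N / F.
double acquireTimeSec(double sampleRateHz, std::size_t numSamples) {
    return static_cast<double>(numSamples) / sampleRateHz;
}

// Frequency represented by value number k of the spectrum (k counted from 0).
double binFrequencyHz(std::size_t k, double sampleRateHz, std::size_t numSamples) {
    return static_cast<double>(k) * binSpacingHz(sampleRateHz, numSamples);
}
```

With F = 2000 and N = 1000 this gives a spacing of 2 Hz, an acquire time of 0.5 sec, and 110 Hz for value number 55, the same figures as above.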

So we have to compromise: as many samples as possible in order to get the best precision in computing the fundamental, and as few as possible in order to refresh the display often enough, or tuning becomes very difficult or even impossible. The figures I presented above (sampling rate 2000 Hz and 1000 samples) are the best compromise I could come up with. The only adjustment is that the FFT works well only with a number of samples that is a power of two. Therefore we need to take 1024 = 2^10 samples instead of 1000. The “distance” between two consecutive frequencies is then not exactly 2 Hz, but 2000/1024 = 1.953125 Hz, and one acquisition takes 1024/2000 = 0.512 sec.
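The rounding up to a power of two can be written as a tiny helper. This is only an illustration with a made-up name; in the sketch the value 1024 can simply be a constant.

```cpp
#include <cstddef>

// Smallest power of two that is greater than or equal to n (returns 1 for n == 0).
std::size_t nextPowerOfTwo(std::size_t n) {
    std::size_t p = 1;
    while (p < n) {
        p <<= 1;  // double until we reach or pass n
    }
    return p;
}
```

For n = 1000 the helper returns 1024, which at 2000 Hz gives the 1.953125 Hz spacing mentioned above.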

The acquisition has to be done at a constant sampling rate (constant time intervals), otherwise the FFT will be fooled and will fool us in return with wrong results.

We have to do the sampling in an interrupt, called at fixed time intervals of 1/F = 1/2000 sec = 500 µs.
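The bookkeeping around the interrupt can be sketched independently of the hardware. The structure below is an assumption of mine about how such a buffer could look, not a copy of the sketch: `SampleBlock`, `onTick` and `restart` are hypothetical names, and the timer setup and the ADC read are left out on purpose because they depend on the board and core you use.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumSamples = 1024;   // block size chosen in the text
constexpr unsigned kSamplePeriodUs = 500;   // 1 / 2000 Hz, the timer interval

struct SampleBlock {
    std::array<std::uint16_t, kNumSamples> data{};
    volatile std::size_t count = 0;  // written in the interrupt, read in the main loop

    // Call exactly once per timer tick with the latest ADC reading.
    // Returns true on the tick that completes the block.
    bool onTick(std::uint16_t adcValue) {
        if (count >= kNumSamples) {
            return false;  // block is full: ignore samples until restart()
        }
        data[count] = adcValue;
        count = count + 1;
        return count == kNumSamples;
    }

    bool full() const { return count == kNumSamples; }

    // Call from the main loop once the FFT has consumed the block.
    void restart() { count = 0; }
};
```

The main loop would wait until `full()` is true, run the FFT on `data`, and then call `restart()`. Samples arriving in between are dropped, so every block is taken at evenly spaced instants.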

As soon as all 1024 samples are acquired we will compute the FFT.

At this point we may cut off all frequencies below a specific one. The reason for implementing this “high pass filter” is to get rid of the hum component (mains frequency, 50 Hz or 60 Hz depending on country/continent). Depending on the HW construction, the guitar, the room itself, the electrical wiring, etc., a big hum component may be induced in the guitar signal. It may fool the device into thinking that this component is the fundamental frequency, or that it is the signal itself when there is no real guitar signal. By cutting all the frequencies below a frequency just above the mains (hum) one, we get rid of the hum, but also of everything else in this low range. For a bass guitar this is not OK, therefore we should do it only if, due to the HW and the other factors, the hum is too big and confuses the device.
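In the frequency domain this cut-off amounts to zeroing the first few values of the spectrum. A minimal sketch, assuming the amplitudes are already in a vector; the function name and the 70 Hz cut-off used in the example are my own choices, not values from the sketch.

```cpp
#include <cstddef>
#include <vector>

// Zero every amplitude whose frequency lies below cutoffHz.
// magnitudes[k] is the amplitude at k * sampleRateHz / numSamples.
void cutBelow(std::vector<double>& magnitudes, double cutoffHz,
              double sampleRateHz, std::size_t numSamples) {
    const double binHz = sampleRateHz / static_cast<double>(numSamples);
    for (std::size_t k = 0; k < magnitudes.size(); ++k) {
        if (static_cast<double>(k) * binHz >= cutoffHz) {
            break;  // reached the cut-off: everything above is kept
        }
        magnitudes[k] = 0.0;
    }
}
```

A cut-off of 70 Hz, for example, sits above 60 Hz mains and still below the low E of a guitar (about 82.4 Hz). With 2000 Hz and 1024 samples it clears values 0 to 35.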

Out of the resulting frequency amplitudes (the spectrum), we look for the biggest one overall. This may be the fundamental, but it may also be one of the harmonics. In order to determine which it is, we divide its frequency by 2, by 3, ... up to 7, and among these “under-harmonics” we look for the lowest one whose amplitude is still high enough compared to the biggest one. This has to be the fundamental. If none of them has a big enough amplitude, then the biggest one is itself the fundamental.

Why all this, and not simply look for the first local peak bigger than some threshold? Because 1. sometimes the harmonics are stronger than the fundamental and 2. sometimes not only the string we pluck is oscillating but the others too, so that the first local peak may come from another string and not from the one we want to tune.

We don’t have to look further than the 7th “under-harmonic”, because usually only the second or the third harmonic may be stronger than the fundamental, rarely the fourth. Beginning with the fourth or fifth harmonic the amplitude decays rapidly, and none of the higher harmonics is significant relative to the fundamental. We allow for anything up to the seventh harmonic being the strongest one. This way we won’t miss anything, and we get an “under-harmonic” value: the number by which the strongest frequency has to be divided to get the fundamental.

If all amplitudes are lower than a threshold, we conclude that there is no signal at the input and we switch off the display and the LEDs.
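The search described above can be sketched as follows. This is my own reading of the procedure, not the code from the sketch: the names are hypothetical, the relative threshold is a parameter whose value you have to choose, and the tolerance of one value on each side of the expected position is an addition of mine to absorb rounding.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct PeakResult {
    std::size_t peakBin;  // position of the largest amplitude overall
    int divisor;          // "under-harmonic" value: 1 if the peak is the fundamental, else 2..7
};

// mag must be non-empty and hold the amplitudes of the spectrum from 0 to F/2.
// relThreshold is the fraction of the peak amplitude that still counts as "high enough".
PeakResult findFundamental(const std::vector<double>& mag, double relThreshold) {
    const std::size_t peak = static_cast<std::size_t>(
        std::max_element(mag.begin(), mag.end()) - mag.begin());
    PeakResult result{peak, 1};
    for (int d = 2; d <= 7; ++d) {
        const std::size_t divisor = static_cast<std::size_t>(d);
        const std::size_t centre = (peak + divisor / 2) / divisor;  // peak / d, rounded
        // Skip candidates that touch the DC value or run into the peak itself.
        if (centre < 2 || centre + 1 >= peak) {
            continue;
        }
        double local = 0.0;
        for (std::size_t k = centre - 1; k <= centre + 1; ++k) {
            local = std::max(local, mag[k]);
        }
        if (local >= relThreshold * mag[peak]) {
            result.divisor = d;  // a larger d means a lower frequency, so keep the last hit
        }
    }
    return result;
}
```

If the second harmonic of a 110 Hz tone is the strongest line, the function reports the position of that harmonic together with a divisor of 2.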

As we have discrete frequencies placed at a distance of 2000/1024 Hz from each other, in order to get the real frequency where the amplitude is highest we need to interpolate between the three frequencies around the peak we have found and their amplitudes. For the interpolation we assume that the amplitude in the continuous spectrum is described near the peak by a second order function (a parabola). We use the known values (the three frequencies and the corresponding amplitudes) to compute the coefficients of the second order function. As soon as we know these coefficients, and therefore the function itself, we set its derivative to zero to get the maximum amplitude and the frequency corresponding to this maximum. The actual note frequency is this frequency divided by the “under-harmonic” value we have found before.

In practice I discovered that the resulting frequency is not exactly the correct one and needs small adjustments. I have empirically determined adjustment coefficients for the frequencies representing the notes. The reason for these adjustments is most probably that the function whose maximum we are trying to find is not really a second order one. It is slightly more complex.

Now that we know the fundamental of the input signal, we look in a table to find the corresponding note and the perfect pitch for this note. We display the note on the 7-segment display and, based on the deviation between the actual frequency and the perfect pitch, we switch the corresponding LEDs on or off.
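The sketch uses a lookup table for this step. As an illustration only, the same information can be computed from the equal temperament formula, assuming A4 = 440 Hz. Note that this swaps the table for a formula, and that expressing the deviation in cents is my choice; the names below are hypothetical.

```cpp
#include <cmath>
#include <string>

struct NoteMatch {
    std::string name;   // note name, e.g. "A"
    int octave;         // scientific pitch notation (A4 = 440 Hz)
    double perfectHz;   // equal-tempered frequency of that note
    double cents;       // deviation of the input from perfectHz (negative = flat)
};

// Nearest equal-tempered note for a positive frequency in the audible range.
NoteMatch nearestNote(double freqHz) {
    static const char* const kNames[12] = {"C", "C#", "D", "D#", "E", "F",
                                           "F#", "G", "G#", "A", "A#", "B"};
    // Distance from A4 in semitones, rounded to the nearest note.
    const double semitonesFromA4 = 12.0 * std::log2(freqHz / 440.0);
    const int midi = static_cast<int>(std::lround(semitonesFromA4)) + 69;
    NoteMatch m;
    m.name = kNames[midi % 12];
    m.octave = midi / 12 - 1;
    m.perfectHz = 440.0 * std::pow(2.0, (midi - 69) / 12.0);
    m.cents = 1200.0 * std::log2(freqHz / m.perfectHz);
    return m;
}
```

For an input of 445 Hz this yields the note A4 and a deviation of a little under 20 cents sharp.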

We have three LEDs, but by varying their intensity we get more indications. For instance, only the left red one at 100%, or 75% left red and 25% green, or 50-50%, and so on: 9 intervals in total. The intensity is varied in the sampling interrupt (and therefore at a rate of 2000 Hz) by cycling through four states and setting each LED ON or OFF in each of these states, depending on what we need to display.
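One way to get the 9 intervals out of four states is to number them from -4 to +4 and let the red LED take that many of the four states, with the green one taking the rest. This is my reconstruction of the idea, not the code from the sketch. In particular, which side means flat and which means sharp is an assumption here.

```cpp
struct LedState {
    bool leftRed;
    bool green;
    bool rightRed;
};

// level: -4 (only left red) .. 0 (only green) .. +4 (only right red).
// phase: a counter incremented on every interrupt; only phase % 4 matters.
LedState ledsForPhase(int level, unsigned phase) {
    if (level < -4) level = -4;
    if (level > 4) level = 4;
    const unsigned slot = phase % 4;
    const unsigned redSlots = static_cast<unsigned>(level < 0 ? -level : level);
    LedState s{false, false, false};
    if (slot < redSlots) {
        // Red takes the first |level| of the four states.
        if (level < 0) {
            s.leftRed = true;
        } else {
            s.rightRed = true;
        }
    } else {
        s.green = true;  // green takes the remaining states
    }
    return s;
}
```

With level = -3 the left red LED is on in three of the four states and the green one in the fourth, which the eye sees as 75% red and 25% green.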

Then we wait for a new set of 1024 samples and we start from the beginning. That’s all.

There is a small drawback, but it is more of a theoretical nature than a practical one. If we feed in a signal with a fundamental frequency f higher than 1000 Hz (and below 2000 Hz), the device will (wrongly) determine that the fundamental is 2000-f and will (wrongly) display the corresponding note. This is called aliasing. The FFT requires that there are no components in the spectrum higher than half the sampling rate. If necessary, one should use an external low pass filter to ensure this. As we don’t have any low pass filter at the input, the results are wrong for frequencies higher than half the sampling rate (2000/2 = 1000 Hz). But we only want to use the device for tuning a guitar, and the highest note in normal, standard tuning is E4, with a fundamental of 329.6 Hz. So we don’t have to bother, because there should be no fundamentals higher than 1000 Hz.
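The folding can be written down in a few lines. The helper below (a hypothetical name) returns the frequency at which a pure tone would show up in the spectrum for a given sampling rate.

```cpp
#include <cmath>

// Frequency at which a tone of trueHz appears after sampling at sampleRateHz.
double apparentFrequencyHz(double trueHz, double sampleRateHz) {
    // Aliases repeat every sampleRateHz, so reduce to one period first.
    const double folded = std::fmod(trueHz, sampleRateHz);
    // The upper half of the period mirrors down onto the lower half.
    return folded > sampleRateHz / 2.0 ? sampleRateHz - folded : folded;
}
```

At 2000 Hz a 1200 Hz tone shows up at 800 Hz, while the 329.6 Hz of E4 is left untouched.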

With the firmware defined, all we have to do is to implement all of this stuff.