The range of the ACF depends on the amplitude of the signal, so it is usually not known in advance. To obtain a measure whose range is limited to [-1, 1], we can use the following NSDF (normalized squared difference function) formula:
$$nsdf(\tau)=\frac{2\sum s(i)s(i+\tau)}{\sum s^2(i)+\sum s^2(i+\tau)}$$
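All the summations in the above equation should have the same lower and upper bounds. The range of NSDF is [-1, 1] because any real numbers $x$ and $y$ (not both zero) satisfy the following inequality:
$$-1 \leq \frac{2xy}{x^2+y^2} \leq 1$$
This follows from $(x \pm y)^2 \geq 0$, that is, $2|xy| \leq x^2+y^2$. Setting $x=s(i)$ and $y=s(i+\tau)$ and summing over $i$ gives $2\left|\sum s(i)s(i+\tau)\right| \leq \sum s^2(i)+\sum s^2(i+\tau)$, which bounds the NSDF.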
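The formula can be sketched in a few lines of NumPy. This is only an illustrative sketch, not a reference implementation: the function name `nsdf`, the `max_lag` parameter, and the choice to let all three sums run over $i = 0, \dots, n-\tau-1$ are assumptions made here, as is returning 0 when the denominator vanishes.

```python
import numpy as np

def nsdf(frame, max_lag=None):
    """NSDF of one frame for lags 0..max_lag (hypothetical helper).

    For each lag tau, all three sums use the same bounds:
    i = 0 .. n - tau - 1.
    """
    s = np.asarray(frame, dtype=float)
    n = len(s)
    if max_lag is None:
        max_lag = n - 1
    out = np.zeros(max_lag + 1)
    for tau in range(max_lag + 1):
        x = s[:n - tau]   # s(i)
        y = s[tau:]       # s(i + tau), same index range as x
        denom = np.dot(x, x) + np.dot(y, y)
        # An all-zero overlap (digital silence) would give 0/0; use 0 there.
        out[tau] = 2.0 * np.dot(x, y) / denom if denom > 0 else 0.0
    return out

# Numerical check of the [-1, 1] range on a random frame.
rng = np.random.default_rng(0)
values = nsdf(rng.standard_normal(256))
print(values.min(), values.max())
```

At lag 0 the numerator and denominator coincide, so the NSDF equals 1 there for any frame that is not all zeros.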
If the selected pitch point is $\tau=\tau_0$, then we define the clarity of this frame as
$$clarity=nsdf(\tau_0)$$
A higher clarity indicates that the frame is closer to a purely periodic waveform. A lower clarity indicates that the frame is less periodic, which is typically the case for unvoiced speech or silence. The following is a typical example:
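As a rough illustration with synthetic data rather than recorded speech, the sketch below compares the clarity of a pure sinusoid with that of white noise. Several things here are assumptions of the sketch: clarity is taken to be the NSDF value at the selected lag, the lag $\tau_0$ is chosen simply as the maximum of the NSDF within a lag range (a simplification, not necessarily the selection rule used elsewhere in this text), and the names `nsdf_at` and `clarity`, the 8 kHz sample rate, and the 16 to 160 sample lag range are all hypothetical.

```python
import numpy as np

def nsdf_at(s, tau):
    """NSDF at a single lag; all sums run over i = 0 .. len(s) - tau - 1."""
    x, y = s[:len(s) - tau], s[tau:]
    denom = np.dot(x, x) + np.dot(y, y)
    return 2.0 * np.dot(x, y) / denom if denom > 0 else 0.0

def clarity(frame, min_lag, max_lag):
    """Return (tau0, clarity), taking tau0 as the NSDF maximum in the
    lag range [min_lag, max_lag] (a simplifying assumption)."""
    s = np.asarray(frame, dtype=float)
    lags = np.arange(min_lag, max_lag + 1)
    values = np.array([nsdf_at(s, int(t)) for t in lags])
    best = int(np.argmax(values))
    return int(lags[best]), float(values[best])

fs = 8000                                    # assumed sample rate in Hz
t = np.arange(512) / fs
sine_frame = np.sin(2 * np.pi * 200 * t)     # periodic, period of 40 samples
noise_frame = np.random.default_rng(0).standard_normal(512)

for name, frame in [("sine", sine_frame), ("noise", noise_frame)]:
    tau0, c = clarity(frame, min_lag=16, max_lag=160)
    print(name, "tau0 =", tau0, "clarity =", round(c, 3))
```

The sinusoid yields a clarity very close to 1, while the noise frame yields a much smaller value. Because the sinusoid repeats every 40 samples, lags of 40, 80, 120, and 160 all score almost identically, so a plain maximum may land on any of them; a practical pitch tracker needs an additional rule to prefer the shortest such lag.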
The following example uses NSDF to perform pitch tracking:
We can increase the frame size to reduce pitch-halving errors: