From reading so many sources online, I still cannot grasp why different waveforms have harmonics.
For example: when designing a silly amplitude modulation (AM) circuit that puts a square wave from a microcontroller into an antenna, how are harmonics generated? The signal is just "on" or "off", so how can there be first, third, and fifth harmonics, and why do they get weaker?
I've heard that it is important for an oscilloscope to be able to measure up to the fifth harmonic of a square wave (or something similar), but why would that make the reading different? Are these harmonics irrelevant in things such as data transfer (high=1, low=0) and only matter in situations such as audio or RF?
Why do sinusoidal waves not have as many harmonics? Because the waveform is always moving and not flat going up (triangle) or horizontal (square), but circular with an always changing value?
Sinusoidal waves don't have harmonics because sine waves are exactly the building blocks that, when combined, construct other waveforms. The fundamental is itself a sine, so you don't need to add anything to it to get a sinusoidal signal.
About the oscilloscope: many signals have a large number of harmonics, and some, like a square wave, in theory have infinitely many.
This is a partial construction of a square wave. The blue sine which shows 1 period is the fundamental. Then there's the third harmonic (square waves don't have even harmonics), the purple one. Its amplitude is 1/3 of the fundamental, and you can see it's three times the fundamental's frequency, because it shows 3 periods. Same for the fifth harmonic (brown). Its amplitude is 1/5 of the fundamental and it shows 5 periods. Adding these gives the green curve. This is not yet a good square wave, but you already see the steep edges, and the wavy horizontal line will ultimately become completely horizontal if we add more harmonics. So this is how you will see a square wave on the scope if only the harmonics up to the fifth are shown. This is really the minimum; for a better reconstruction you'll need more harmonics.
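The construction described above can be reproduced numerically. The following is only a minimal sketch: the function name `square_partial_sum`, the 1 Hz fundamental and the sample points are assumptions made for illustration, not part of the original figure. It adds the odd harmonics with amplitudes 1, 1/3, 1/5, and so on, then compares the sum up to the fifth harmonic with a sum that goes much further.

```python
import math

def square_partial_sum(t, f0=1.0, highest_harmonic=5):
    """Sum of the odd sine harmonics 1, 3, ..., highest_harmonic, each scaled by 1/n."""
    total = 0.0
    for n in range(1, highest_harmonic + 1, 2):   # odd harmonics only
        total += math.sin(2 * math.pi * n * f0 * t) / n
    return total

# Sample one period of a (hypothetical) 1 Hz wave at 16 points.
for k in range(16):
    t = k / 16
    print(f"t={t:.4f}  up to 5th: {square_partial_sum(t):+.3f}  "
          f"up to 99th: {square_partial_sum(t, highest_harmonic=99):+.3f}")
```

With more harmonics the flat parts settle towards ±π/4 (about ±0.785 for a fundamental of amplitude 1), while the sum up to the fifth harmonic still wobbles around that level.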
Like every non-sinusoidal signal the AM modulated signal will create harmonics. Fourier proved that every repeating signal can be deconstructed into a fundamental (same frequency as the wave form), and harmonics which have frequencies that are multiples of the fundamental. It even applies to non-repeating waveforms. So even if you don't readily see what they would look like, the analysis is always possible.
This is a basic AM signal, and the modulated signal is the product of the carrier and the baseband signal. Now
\$ \sin(f_C) \cdot \sin(f_M) = \dfrac{\cos(f_C - f_M) - \cos(f_C + f_M)}{2} \$
So you can see that even a product of sines can be expressed as a sum of sinusoids, here two cosines (the harmonics can have their phase shifted, in this case by 90°). The frequencies \$(f_C - f_M)\$ and \$(f_C + f_M)\$ are the sidebands left and right of the carrier frequency \$f_C\$.
Even if your baseband signal is a more complex looking signal, you can break the modulated signal apart into separate sines.
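The product-to-sum identity for the AM signal can be checked numerically. This is a small sketch only; the carrier and modulation frequencies (10 Hz and 1 Hz) are hypothetical values chosen for illustration, and the time dependence \$2\pi f t\$ is written out explicitly.

```python
import math

f_c = 10.0   # hypothetical carrier frequency in Hz
f_m = 1.0    # hypothetical modulating frequency in Hz

def product(t):
    """Carrier multiplied by the baseband sine."""
    return math.sin(2 * math.pi * f_c * t) * math.sin(2 * math.pi * f_m * t)

def sidebands(t):
    """The same signal written as lower and upper sideband cosines."""
    lower = math.cos(2 * math.pi * (f_c - f_m) * t)
    upper = math.cos(2 * math.pi * (f_c + f_m) * t)
    return (lower - upper) / 2

# Compare both forms at 1000 points spread over one second.
max_error = max(abs(product(k / 1000) - sidebands(k / 1000)) for k in range(1000))
print(f"largest difference over 1 s: {max_error:.2e}")
```

The difference stays at the level of floating-point rounding, which is what you would expect if the two expressions describe the same signal.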
Pentium100's answer is quite complete, but I'd like to give a much simpler (though less accurate) explanation.
The reason why sinewaves have (ideally) only one harmonic is that the sine is the "smoothest" periodic signal that you can have, and it's therefore the "best" in terms of continuity, differentiability and so on. For this reason it is convenient to express waveforms in terms of sinewaves (you can do it with other waves as well, as long as they are \$C^{\infty}\$, that is, infinitely differentiable).
Just an example: why do you usually see curved waves on water? (for the sake of the example, ignore the effects of the beach or wind) Again, it's because it's the shape that requires the least energy to form, since all the ramps are smooth.
In some cases, like the Hammond organ [1], sinewaves are actually used to compose the signal, because by combining them it is possible to synthesize a lot of (virtually all) sounds.
There is a beautiful animation by LucasVB [2] explaining the Fourier decomposition of a square wave:
These images explain better the square wave decomposition in harmonics:
[1] http://en.wikipedia.org/wiki/Hammond_organ
You can decompose any waveform into an infinite series of sine waves added together. This is called a Fourier series (if the original waveform is repeating) or a Fourier transform (for any waveform).
In case of a repeating waveform (like a square wave), when you do Fourier analysis you find that all the sines that compose the waveform have frequencies that are an integer multiple of the frequency of the original waveform. These are called "harmonics".
A sine wave will only have one harmonic, the fundamental (it already is a sine, so it is made up of one sine). A square wave will have an infinite series of odd harmonics (that is, to make a square wave out of sines you need to add sines at every odd multiple of the fundamental frequency).
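The claim that a square wave contains only odd harmonics can be tested by measuring how strongly the wave correlates with a sine at each multiple of its frequency. The sketch below is an illustration under stated assumptions: an ideal ±1 square wave with a 1 s period, and hypothetical helper names `square` and `sine_coefficient`. It evaluates the Fourier sine coefficient with a simple midpoint sum.

```python
import math

def square(t):
    """Ideal +/-1 square wave with a period of 1 s."""
    return 1.0 if (t % 1.0) < 0.5 else -1.0

def sine_coefficient(n, samples=10000):
    """Amplitude of the n-th sine harmonic: 2 * integral of square(t)*sin(2*pi*n*t)
    over one period, approximated with the midpoint rule."""
    dt = 1.0 / samples
    total = 0.0
    for k in range(samples):
        t = (k + 0.5) * dt
        total += square(t) * math.sin(2 * math.pi * n * t)
    return 2 * total * dt

harmonics = {n: sine_coefficient(n) for n in range(1, 8)}
for n, b in harmonics.items():
    print(f"harmonic {n}: amplitude {b:+.4f}   (4/(pi*n) = {4 / (math.pi * n):.4f})")
```

The odd harmonics come out close to 4/(πn), so each is 1/n of the fundamental, and the even ones come out essentially zero.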
The harmonics are generated by distorting the sine wave (though you can generate them separately).
Why is this important:
The derivative - rate of change - of a sinusoid is another sinusoid at the same frequency, but phase-shifted. Real components - wires, antennas, capacitors - can follow the changes (of voltage, current, field-strength, etc.) of the derivatives as well as they can follow the original signal. The rates of change of the signal, of the rate-of-change of the signal, of the rate-of-change of the rate-of-change of the signal, etc., all exist and are finite.
The harmonics of a square wave exist because the rate of change (first derivative) of a square wave consists of very high, sudden peaks; infinitely high spikes, in the limit-case of a so-called perfect square wave. Real physical systems can't follow such high rates, so the signals get distorted. Capacitance and inductance simply limit their ability to respond rapidly, so they ring.
Just as a bell can neither be displaced nor distorted at the speed with which it is struck, and so stores and releases energy (by vibrating) at slower rates, so a circuit doesn't respond at the rate with which it is struck by the spikes which are the edges of the square wave. It too rings or oscillates as the energy is dissipated.
One conceptual block may come from the concept of the harmonics being higher in frequency than the fundamental. What we call the frequency of the square wave is the number of cycles it completes per unit time. But go back to those derivatives: the rates of change the signal makes are huge compared to the rates of change in a sinusoid at that same frequency. Here is where we encounter the higher component frequencies: those high rates of change have the attributes of higher frequency sine waves. The high frequencies are implied by the high rates of change in the square (or other non-sinusoid) signal.
The fast rising edge is not typical of a sinusoid at frequency f, but of a much higher frequency sinusoid. The physical system follows it the best it can but being rate limited, responds much more to the lower frequency components than to the higher ones. So we slow humans see the larger amplitude, lower frequency responses and call that f!
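To put rough numbers on a rate-limited system responding "much more to the lower frequency components", one can model the system as a first-order RC low-pass filter. That model is an assumption made here for illustration, as are the 1 MHz square wave and the 5 MHz cutoff; real circuits and probes may behave differently.

```python
import math

def lowpass_gain(f, f_cutoff):
    """Magnitude response of a first-order RC low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (f / f_cutoff) ** 2)

f0 = 1e6          # hypothetical 1 MHz square wave
f_cutoff = 5e6    # hypothetical bandwidth of the circuit or probe

for n in (1, 3, 5, 7, 9, 11):
    ideal = 4 / (math.pi * n)                       # harmonic of an ideal +/-1 square wave
    seen = ideal * lowpass_gain(n * f0, f_cutoff)   # what survives the filter
    print(f"harmonic {n:2d} at {n * f0 / 1e6:4.0f} MHz: ideal {ideal:.3f}, after filter {seen:.3f}")
```

The harmonics above the cutoff are already small in the ideal wave and are attenuated further, which rounds off the edges of what you observe.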
In practical terms, the reason harmonics "appear" is that linear filtering circuits (as well as many non-linear filtering circuits) which are designed to detect certain frequencies will perceive certain lower-frequency waveforms as being the frequencies they're interested in. To understand why, imagine a large spring with a very heavy weight which is attached to a handle via a fairly loose spring. Pulling on the handle will not directly move the heavy weight very much, but the large spring and weight will have a certain resonant frequency, and if one moves the handle back and forth at that frequency, one can add energy to the large weight and spring, increasing the amplitude of oscillation until it's much larger than could be produced "directly" by pulling on the loose spring.
The most efficient way to transfer energy into the large spring is to pull in a smooth pattern corresponding to a sine wave--the same movement pattern as the large spring. Other movement patterns will work, however. If one moves the handle in other patterns, some of the energy that gets put into the spring-weight assembly during parts of the cycle will be taken out during others. As a simple example, suppose one simply jams the handle to the extreme ends of travel at a rate corresponding to the resonant frequency (equivalent to a square wave). Moving the handle from one end to the other just as the weight reaches end of travel will require a lot more work than would waiting for the weight to move back some first, but if one doesn't move the handle at that moment, the spring on the handle will be fighting the weight's attempt to return to center. Nonetheless, moving the handle from one extreme position to the other would clearly work.
Suppose the weight takes one second to swing from left to right and another second to swing back. Now consider what happens if one moves the handle from one extreme of motion to the other as before, but lingers for three seconds on each side instead of one second. Each time one moves the handle from one extreme to the other, the weight and spring will have essentially the same position and velocity as they had two seconds earlier. Consequently, they will have about as much energy added to them as they would have two seconds before. On the other hand, such additions of energy will only be happening a third as often as they would have when the "linger time" was only one second. Thus, moving the handle back and forth at 1/6Hz will add a third as much energy per minute (power) to the weight as would moving it back and forth at 1/2Hz. A similar thing happens if one moves the handle back and forth at 1/10Hz, but since the motions will be 1/5 as often as at 1/2Hz, the power will be 1/5.
Now suppose that instead of having the linger time be an odd-numbered multiple, one makes it an even-numbered multiple (e.g. two seconds). In that scenario, the position of the weight and spring for each left-to-right move will be the same as its position on the next right-to-left move. Consequently, if the handle adds any energy to the spring in the former, such energy will be essentially cancelled out by the latter. Consequently, the spring won't move.
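The handle-and-weight picture can be tried out in a rough simulation. The sketch below makes several simplifying assumptions that are not in the description above: the two springs are lumped into one effective spring, the damping ratio of 0.02 is a hypothetical value, and the quantity reported is the steady-state swing amplitude rather than the power. The 1/2 Hz resonance follows from the one-second swing each way.

```python
import math

def steady_state_swing(drive_hz, resonant_hz=0.5, damping_ratio=0.02,
                       dt=0.001, run_time=200.0, measure_time=24.0):
    """Peak displacement of a damped mass-spring driven by a +/-1 square-wave handle."""
    w0 = 2 * math.pi * resonant_hz
    x = v = 0.0
    peak = 0.0
    steps = round(run_time / dt)
    measure_from = steps - round(measure_time / dt)   # skip the start-up transient
    for i in range(steps):
        t = i * dt
        handle = 1.0 if (t * drive_hz) % 1.0 < 0.5 else -1.0
        a = w0 * w0 * (handle - x) - 2 * damping_ratio * w0 * v
        v += a * dt   # semi-implicit Euler: update the velocity first,
        x += v * dt   # then the position using the new velocity
        if i >= measure_from:
            peak = max(peak, abs(x))
    return peak

full = steady_state_swing(0.5)        # linger 1 s per side (at resonance)
third = steady_state_swing(0.5 / 3)   # linger 3 s per side
half = steady_state_swing(0.25)       # linger 2 s per side
print(f"3 s linger / 1 s linger: {third / full:.2f}")
print(f"2 s linger / 1 s linger: {half / full:.2f}")
```

Under these assumptions the three-second linger still excites the weight to roughly a third of the swing, while the two-second linger leaves it barely moving beyond what the handle does directly.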
If, instead of doing extreme motions with the handle, one moves it more smoothly, then at lower frequencies of handle motion there are apt to be more times when one is fighting the motion of the weight/spring combo. If one moves the handle in a sine-wave pattern, but at a frequency substantially different from the resonant frequency of the system, the energy that one transfers into the system when pushing the "right" way will be pretty well balanced by the energy taken out of the system pushing the "wrong" way. Other motion patterns which aren't as extreme as the square wave will, at at least some frequencies, transfer more energy into the system than is taken out.
an even simpler analogy is to imagine a trampoline.
electrifying a conductor is analogous to stretching the trampoline membrane, doing so 'stretches' (distorts) energy fields linked to that wire.
go stand in the middle of the trampoline, reach down and grab the membrane of the trampoline floor. now stand up and pull/stretch it up as you go, so there is a peak about the height of your waist.
this has of course the effect of storing some energy in the membrane.
now if you just let it go, it will not simply float gently down and stop moving. it will snap down quickly and then VIBRATE... oscillating back and forth a bunch more times 'on its own' ... as it runs down its stored energy.
if instead you gradually lower it back into place... it can't violently snap anywhere and so nothing causes/allows it to vibrate 'on its own'. the only vibrating it's doing is from you moving it.
all frequencies (of any waveform) have mathematical harmonics, waveforms with sudden potential changes provide an easier opportunity for these harmonics to be expressed as real world oscillations.
Just a complement to this question,
Are these harmonics irrelevant in things such as data transfer (high=1, low=0) and only matter in situations such as audio or RF?
that I think nobody addressed: they are not irrelevant. Usually we are interested in transmitting pulses in digital circuits, so in most cases we don't take these wave phenomena into consideration. Even though a real square wave has a limited number of harmonics (not the infinite number of the ideal case) and therefore takes some time to rise and fall, your circuit design is usually "aware" of that. This is one of the greatest advantages of digital electronics and digital communication: above a given voltage the signal is interpreted as 1, and below a given voltage it is interpreted as 0. In most cases the precise shape of the square wave does not really matter, as long as it meets certain timing specifications.
But note that if the frequency of your square signal rises to the point where its wavelength is of the same order of magnitude as the length of its transmission line (which may be a conductive track on a PCB), then you may have to take these wave phenomena into consideration. You still have a circuit in your hands, but wave phenomena may occur. Depending on your line's impedance, different frequencies may propagate at different speeds. Since the square wave is composed of many harmonics (ideally infinitely many), you will probably see a distorted square wave at the end of your transmission line or conductive track, because each harmonic travels at a different speed.
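The threshold idea can be illustrated with a deliberately degraded signal. This is only a sketch: the single threshold at 0, the ±1 signal levels, the alternating 1010... pattern and the function names `band_limited` and `receive` are assumptions for illustration (real logic families define separate high and low thresholds). The signal is built from harmonics up to the fifth only, so it is visibly not square, yet sampling the middle of each bit still recovers the data.

```python
import math

def band_limited(t, bit_period=1.0, highest_harmonic=5):
    """Alternating 1010... signal built from odd harmonics only (ideal levels +/-1)."""
    f0 = 1.0 / (2 * bit_period)   # one full cycle spans two bits
    return sum(4 / (math.pi * n) * math.sin(2 * math.pi * n * f0 * t)
               for n in range(1, highest_harmonic + 1, 2))

def receive(num_bits, threshold=0.0):
    """Sample the middle of each bit and compare the value against the threshold."""
    return [1 if band_limited(k + 0.5) > threshold else 0 for k in range(num_bits)]

print(receive(8))   # [1, 0, 1, 0, 1, 0, 1, 0]
```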
A good example where this may happen is when we use USB data transmission in a circuit. Note that the data rate is very high (high frequency square waves) so you must take the impedance of your transmission line into consideration. Otherwise you probably will have problems in the communication.
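A back-of-the-envelope check of when a trace starts to behave like a transmission line could look like the sketch below. Every number in it is an assumption for illustration: a velocity factor of 0.5 (signals travelling at about half the speed of light, a common ballpark for PCB traces), the rule of thumb that a line longer than about a tenth of a wavelength needs attention, a 480 Mbit/s data rate (so a 240 MHz fundamental for alternating bits) and a 5 cm trace.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s

def wavelength_m(frequency_hz, velocity_factor=0.5):
    """Wavelength on a line where signals travel at velocity_factor * c."""
    return SPEED_OF_LIGHT * velocity_factor / frequency_hz

def needs_line_analysis(trace_length_m, frequency_hz, velocity_factor=0.5, fraction=0.1):
    """Rule of thumb (assumption): the trace matters once it exceeds fraction * wavelength."""
    return trace_length_m > fraction * wavelength_m(frequency_hz, velocity_factor)

fundamental = 240e6   # hypothetical: alternating bits at 480 Mbit/s
trace = 0.05          # hypothetical 5 cm PCB track

for n in (1, 3, 5):
    f = n * fundamental
    print(f"harmonic {n}: {f / 1e6:.0f} MHz, wavelength {wavelength_m(f) * 100:.1f} cm, "
          f"treat as transmission line: {needs_line_analysis(trace, f)}")
```

Even when the fundamental looks harmless for a given track length, the third and fifth harmonics have shorter wavelengths and can cross the threshold first.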
In short, it all does matter and it all works together, but it is up to you to analyze whether these things are important in your project/analysis or not.