There are so many problems with the generic idea of octaves as taught in Western music that I can't get into them all here. The question of why we treat octaves as equivalent has to be qualified by the question of when we do. In Bill Sethares' notable book Tuning, Timbre, Spectrum, Scale (the link includes summary and samples), he describes and provides audio examples of how equivalence and blend depend on the particular vibrations in a sound's spectrum. Among the simplest examples, as piano tuners know, the inharmonicity of hammered (or plucked) strings can lead to stretched octaves. Bill goes on to discuss harmony for completely inharmonic sounds as well, which is a whole further (and very interesting) issue.
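To put a rough number on the stretched-octave point, here is a minimal sketch using the common textbook model of a stiff string, in which the n-th partial sits at n * f0 * sqrt(1 + B * n^2). The model choice, the function names, and the values of F0 and B are my own illustrative assumptions, not figures taken from Sethares' book or measured from a real piano.

```python
import math

def partial_frequency(n, f0, b):
    """n-th partial of a stiff string under the textbook model
    f_n = n * f0 * sqrt(1 + b * n^2); b = 0 gives a pure harmonic series."""
    return n * f0 * math.sqrt(1 + b * n * n)

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents = a pure 2:1 octave)."""
    return 1200 * math.log2(ratio)

F0 = 220.0   # hypothetical nominal fundamental in Hz
B = 0.0004   # hypothetical inharmonicity coefficient, chosen for illustration

first = partial_frequency(1, F0, B)
second = partial_frequency(2, F0, B)

# How much wider than a pure octave the second partial sits above the first.
stretch = cents(second / first) - 1200
print(f"second partial is {stretch:.2f} cents wider than a pure 2:1 octave")
```

With these made-up values the second partial lands about one cent above a pure octave, so a note tuned to blend with that partial ends up slightly "stretched." Larger values of B give a larger stretch.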
There have been many attempts to explain why the octave is so significant, some based on carefully controlled scientific experiments. What we know about octaves is that they are, at the most basic, a doubling (or halving) of vibration frequency, or something close to it. Because of that, they blend very well and fit into the same overall periodicity. Men and women have likely been singing in octaves since prehistoric times. The vast majority of the world's music treats the octave specially, usually as an equivalence, giving the same note name to pitches an octave apart.
While the significance of octaves must be acknowledged, there is more than enough evidence to disprove any claim that octaves are absolutely universally equivalent. Octaves are not fully equivalent. They sound different. Harmony does not work the same way at all octaves. A close position major chord sounds great in middle to high ranges, but move it down some octaves into bass ranges and it sounds muddy even if the tuning is not tempered. The vast majority of guitar methods teach that octaves are equivalent and so any combination of C, E, and G makes a C chord; but the very same books near-universally teach students not to play the low E when holding a C chord. That E doesn't fit the harmonic series of the rest of the C chord as well, so it sounds rougher. If they explain at all, most books and teachers just say that while E is technically part of the chord, we don't play it because it sounds bad. Some books say it is because the lowest note should be C for a C chord. And yet there is widespread acceptance of the same chord with a low G bass note (which fits the harmonic series in that octave better than the E). Explaining all this is simple once we drop the idea that octaves are totally equivalent.
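One way to see the low E versus low G difference is with plain ratio arithmetic. The sketch below assumes an idealized just-intonation C chord whose C, E, and G sit at harmonics 4, 5, and 6 of a shared fundamental (a real guitar is roughly equal-tempered, so this is only an approximation), and the helper names are my own. It checks whether the added bass note is a whole-number harmonic of that same fundamental.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# Idealized just C major triad, in units of a shared fundamental two
# octaves below the chord's C (assumption: pure 4:5:6 tuning).
chord = {"C": Fraction(4), "E": Fraction(5), "G": Fraction(6)}

low_e = chord["E"] / 2   # the E one octave below the chord's E -> 5/2
low_g = chord["G"] / 2   # the G one octave below the chord's G -> 3

def is_harmonic(ratio):
    """True if the note is a whole-number multiple of the fundamental."""
    return ratio.denominator == 1

def common_fundamental(ratios):
    """Largest value that every ratio is a whole-number multiple of."""
    den = reduce(lambda a, b: a * b // gcd(a, b), (r.denominator for r in ratios))
    nums = [int(r * den) for r in ratios]
    return Fraction(reduce(gcd, nums), den)

with_low_e = [low_e] + list(chord.values())
with_low_g = [low_g] + list(chord.values())

print("low E is a harmonic:", is_harmonic(low_e))
print("low G is a harmonic:", is_harmonic(low_g))
print("fundamental with low E:", common_fundamental(with_low_e))
print("fundamental with low G:", common_fundamental(with_low_g))
```

In this idealized picture the low G is simply harmonic 3 of the same series, while the low E lands at 2.5 and only fits if the implied fundamental drops a further octave. That is one plausible reading of "fits the harmonic series better," not a full account of roughness.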
Another great example is Diana Deutsch's Mysterious Melody. She shows that if the octaves are mixed up in a melody, it becomes unrecognizable. Significantly, however, once the melody is known from hearing it normally, one can still hear it within the mixed-up-octaves version! So octaves have some equivalence: they can substitute for one another if our expectations are clear. But they aren't fully equivalent; the equivalence depends on expectations and context.
A common question asks, "If octaves are equivalent because they fit into the same periodicity and are a simple 1:2 ratio and part of the harmonic series, wouldn't the 1:3 ratio be comparable and therefore also be equivalent?"
Maybe octaves are more significant because of cultural reinforcement. Maybe it's because men's and women's ranges aren't different enough for them to sing at the even wider 1:3 ratio. Maybe it's because 1:3 can be divided by 1:2, leaving 1:1.5 (that is, 2:3), which is not as simple, so 1:2 is the more basic ratio... but maybe we shouldn't even assume that 1:3 can't be equivalent. Maybe it can be.
I was playing around with this today and decided to make a video showing how effective it is to play at a set 1:3 ratio (called a twelfth in standard Western music theory terms, which count letter names; also called a tritave in harmonic terms because it's a multiple of 3).
Notice how the final note in the video sounds fully resolved (it does to me anyway). The low and high notes each feel not just like part of a tonic harmony but like the main tonal center, even though one is E and the other is B. But this isn't bitonality to me, though it is arguably similar. I think this is more like twelfth/tritave equivalency and feels about the same as octave equivalency, just lacking the life-long cultural reinforcement. Maybe in a twelfth/tritave-based theory the E and B would actually get the same name, as notes an octave apart usually do (for example, the Bohlen-Pierce tuning is tritave-based). Sure, twelfths/tritaves don't really sound identical, but remember that octaves don't either...
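For reference, the size of the 1:3 interval can be worked out in cents. This is a small sketch with my own helper name; the only assumption beyond the text is the standard convention that an equal-tempered semitone is 100 cents, so a twelfth such as E up to the B an octave and a fifth above spans 19 semitones.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents = a pure 2:1 octave)."""
    return 1200 * math.log2(ratio)

pure_tritave = cents(3)        # the 1:3 ratio
pure_octave = cents(2)         # the 1:2 ratio
pure_fifth = cents(3 / 2)      # what is left of 1:3 after removing 1:2

# Equal-tempered twelfth: 12 semitones (octave) + 7 semitones (fifth).
tempered_twelfth = 19 * 100

print(f"pure 1:3 tritave:      {pure_tritave:.3f} cents")
print(f"octave + pure fifth:   {pure_octave + pure_fifth:.3f} cents")
print(f"equal-tempered 12th:   {tempered_twelfth} cents")
print(f"tempering difference:  {pure_tritave - tempered_twelfth:.3f} cents")
```

The pure 1:3 ratio is the same size as an octave plus a pure fifth, and the equal-tempered twelfth falls short of it by roughly two cents, so an instrument in standard tuning only approximates the exact 1:3 relationship.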
EDIT update 11/15/10:
I should mention that the technique I used (playing some music and then checking whether an isolated tone seems to fit) has technical terms. The preparation listening is called "priming" and the isolated tone is called a "probe tone." These are some of the standard methods used in empirical studies of music cognition. My simple demonstration could easily be repeated in controlled testing administered to a number of listeners from different backgrounds. Different priming and different probe tones could be used. The results could better clarify my hypotheses about the potential for listeners to learn 12ths-equivalence (or other alternate equivalences), though additional varieties of tests would be needed to truly be conclusive.