1. Molecular Model and Basic Physical Variables
The smallest wavelength within the audio spectrum is about 2 cm (the maximum possible frequency in air is discussed below). We can select a cube of air a thousand times smaller than this in width and it will still contain over 2×10^{11} molecules. Within this cube, the statistics of molecular velocity will be essentially uniform. The statistics do change over distances on the order of a wavelength, and this variation in the statistics is precisely the subject of interest.
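The molecule count quoted above can be checked with a short calculation. This is only a sketch: it assumes an ideal gas at standard atmospheric pressure and 20 °C, values the text does not state, and the function name is illustrative.

```python
# Rough count of molecules in a cube 1/1000 of the shortest audio wavelength.
# Assumed values (not from the text): standard atmospheric pressure, 20 C.
K_BOLTZMANN = 1.380649e-23   # J/K
PRESSURE = 101325.0          # Pa, assumed
TEMPERATURE = 293.0          # K, assumed


def molecules_in_cube(width_m, pressure=PRESSURE, temperature=TEMPERATURE):
    """Ideal-gas number density (P / kT) times the cube volume."""
    number_density = pressure / (K_BOLTZMANN * temperature)  # molecules per m^3
    return number_density * width_m ** 3


shortest_wavelength = 0.02                  # m, about 2 cm
cube_width = shortest_wavelength / 1000.0   # m, a thousand times smaller
count = molecules_in_cube(cube_width)
print(f"{count:.2e} molecules in the cube")
```

Under these assumptions the count comes out close to 2×10^{11}, in line with the figure in the text.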
For simplicity, we will consider a gas consisting of a single type of diatomic molecule. The location of the cube under consideration is defined by the vector r = x a_{x} + y a_{y} + z a_{z}, where a_{x}, a_{y}, and a_{z} are unit vectors in the Cartesian coordinate system. Except for the molecular mass, which is of course a constant, the basic quantities we will deal with are all functions of x, y, z, and time t. They are:
Table 1. Quantities Describing the Molecular Model
Symbol                                   Quantity [units]
m                                        Mass of one molecule [kg]
n                                        Molecules per cubic meter [m^{-3}]
v_{x}, v_{y}, v_{z}                      Cartesian components of molecular velocity [m s^{-1}]
u_{x}, u_{y}, u_{z}                      Mean values of v_{x}, v_{y}, v_{z} [m s^{-1}]
σ_{x}^{2}, σ_{y}^{2}, σ_{z}^{2}          Variances of v_{x}, v_{y}, v_{z} [m^{2} s^{-2}]
The product of molecular mass and number density equals the mass density, ρ(r,t) = m n(r,t) [kg m^{-3}]. The mean velocity will also be written as a vector u = u_{x} a_{x} + u_{y} a_{y} + u_{z} a_{z}.
We only need two additional assumptions: (1) that the statistics of the three Cartesian velocity components are described by three independent Gaussian distributions (actually any distribution symmetric around its mean is sufficient), and (2) that the principle of equipartition of energy, which says that energy distributes itself equally among all degrees of freedom, applies locally within our cube. Both assumptions are in fact true for a gas at equilibrium [Vincenti and Kruger, page 35]. A diatomic molecule has five degrees of freedom: translation in three dimensions plus rotation around two axes. (As noted below, the full story is a little more complicated.)
Equipartition of energy is a result of the frequent collisions between molecules. Collisions can otherwise be ignored, since all of their effects are reflected in the velocity statistics. (See the brief analysis of collision effects.) Without collisions a wave will not propagate much farther than a wavelength. This is the limiting factor for the maximum possible frequency in air: the minimum possible wavelength is on the order of the mean free path, corresponding to a frequency in the vicinity of 10^{9} Hz.
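The order of magnitude of this frequency limit can be sketched numerically. The hard-sphere mean free path formula, the effective molecular diameter, and the speed of sound used here are all assumptions brought in for illustration; none of them come from the text.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def mean_free_path(temperature, pressure, diameter):
    """Hard-sphere kinetic-theory estimate: kT / (sqrt(2) * pi * d^2 * P)."""
    return K_BOLTZMANN * temperature / (
        math.sqrt(2.0) * math.pi * diameter ** 2 * pressure
    )


# Assumed illustrative values, not taken from the text.
T = 293.0              # K
P = 101325.0           # Pa
D_EFFECTIVE = 3.7e-10  # m, rough effective molecular diameter for air
SOUND_SPEED = 343.0    # m/s, approximate speed of sound at 20 C

path = mean_free_path(T, P, D_EFFECTIVE)
f_max = SOUND_SPEED / path  # frequency whose wavelength is one mean free path
print(f"mean free path ~ {path:.1e} m, limiting frequency ~ {f_max:.1e} Hz")
```

With these inputs the mean free path is a few tens of nanometers and the corresponding frequency is a few times 10^{9} Hz, consistent with the rough figure given above.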
All of the equations we need can then be directly derived from the statistics of the molecular motion. It turns out that the resulting sound waves propagate without frictional loss. This surprised me, and it took a while before I figured out where the loss term got lost. For an analysis that includes the missing loss factor, and more on the effect of collisions and the upper limit on frequency, see Wave Attenuation.
One further note on equipartition. It might be thought that the equipartition should apply to the total energy in each degree of freedom, which would mean
E{v_{x}^{2}} = E{v_{y}^{2}} = E{v_{z}^{2}} = σ_{r}^{2}    (1)
where E{x} denotes the expected value of x [see for example Papoulis], and σ_{r} is the standard deviation of the spin velocity around either axis.
However, consider a set of molecules with u_{x} = 0, which has a series of collisions, resulting in all the variances being equal at equilibrium. Now take the same set with the same starting conditions, except add u_{x} to all the x-velocities. If we view the collisions in a coordinate system moving with the velocity u_{x}, the result will look exactly the same as the first case. (See the brief analysis of collision effects.) Thus the variances should be equal, independent of the velocity means, and we can use a common value σ^{2} for all variances.
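The moving-frame argument can be illustrated with a single collision. The sketch below assumes two equal-mass, perfectly smooth, elastic spheres, which is a simplification of the diatomic molecules discussed in the text. The function names and sample velocities are hypothetical.

```python
def collide(v1, v2, normal):
    """Elastic collision of two equal-mass smooth spheres.

    v1 and v2 are velocity tuples; normal is the unit vector along the line
    of centers at impact. Equal masses exchange their velocity components
    along the normal, and the tangential components are unchanged.
    """
    rel = sum((a - b) * n for a, b, n in zip(v1, v2, normal))
    new_v1 = tuple(a - rel * n for a, n in zip(v1, normal))
    new_v2 = tuple(b + rel * n for b, n in zip(v2, normal))
    return new_v1, new_v2


def shift(v, u):
    """Add the frame velocity u to the velocity v."""
    return tuple(a + b for a, b in zip(v, u))


# Hypothetical sample velocities in m/s.
v1, v2 = (300.0, -120.0, 40.0), (-250.0, 80.0, 10.0)
normal = (0.6, 0.8, 0.0)   # unit vector along the line of centers
u = (5.0, 0.0, 0.0)        # mean velocity added to every x-velocity

at_rest = collide(v1, v2, normal)
moving = collide(shift(v1, u), shift(v2, u), normal)

# Viewed from the frame moving with u, the outcome matches the u_x = 0 case.
minus_u = tuple(-c for c in u)
back_in_rest_frame = tuple(shift(v, minus_u) for v in moving)
print(at_rest)
print(back_in_rest_frame)
```

Only the relative velocity enters the collision rule, so adding the same u_{x} to both molecules shifts both outcomes by u_{x} and leaves the spread of velocities untouched.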
The particle velocities are the most basic characterization of a molecular model. The traditional variables, temperature and pressure, are one level of abstraction higher. However, it is useful to make the connection with temperature and pressure, and that is the next topic. The definitions given here differ from the standard statistical mechanics definitions, but are equivalent in most circumstances; for more on this see Temperature and Pressure in Statistical Mechanics.
Here we assume that the mean values of the velocity components are negligible compared to the random velocity values, i.e. u << σ. In fact this is true for even extremely loud sound levels, as will be verified later. The mean value of the kinetic energy per molecule, for each degree of freedom, is then
E{(1/2) m v_{x}^{2}} = (1/2) m σ^{2}    (2)
As noted, there are 5 degrees of freedom for a diatomic molecule, so the total energy per molecule is 5 times the above value. (For a more complete story, see more on degrees of freedom.)
Temperature T in kelvins is essentially defined by the equation
E{(1/2) m v_{x}^{2}} = (1/2) k T    (3)
where k is Boltzmann's constant (see Units, conversion factors, and physical constants), and again this applies to each degree of freedom.
Therefore, equating (2) and (3) relates temperature to the molecular mass and velocity variance:
m σ^{2} = k T,  or  σ^{2} = k T / m    (4)
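The total velocity of a molecule
v = (v_{x}^{2} + v_{y}^{2} + v_{z}^{2})^{1/2}    (5)
has a Maxwell probability density [Papoulis] with a mean value of
E{v} = 2σ (2/π)^{1/2} = (8 k T / (π m))^{1/2}    (6)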
Plugging in some numbers, at T = 293 K (20 °C), and for m = 4.65×10^{-26} kg (N_{2}), the resulting values are: σ = 295 m s^{-1} and a mean velocity of 471 m s^{-1}. This is pretty close to the handbook value I found of 463 m s^{-1} mean velocity for air.
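These two numbers can be reproduced in a few lines. The sketch assumes the relations m σ^{2} = k T for the velocity variance and 2σ(2/π)^{1/2} for the mean of the Maxwell speed distribution. The function names are illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K
M_N2 = 4.65e-26             # kg, mass of one N2 molecule (value from the text)


def velocity_sigma(temperature, mass):
    """Standard deviation of one velocity component, from m*sigma^2 = k*T."""
    return math.sqrt(K_BOLTZMANN * temperature / mass)


def mean_speed(sigma):
    """Mean of the Maxwell speed distribution: 2*sigma*sqrt(2/pi)."""
    return 2.0 * sigma * math.sqrt(2.0 / math.pi)


sigma = velocity_sigma(293.0, M_N2)
print(f"sigma = {sigma:.0f} m/s, mean speed = {mean_speed(sigma):.0f} m/s")
```

Rounded to the nearest meter per second, the results agree with the 295 m s^{-1} and 471 m s^{-1} quoted above.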
Particle Mass, Momentum, and Energy Fluxes
Before developing the equation for pressure, we will first consider the way that molecular motion carries mass, momentum, and kinetic energy through space. We will quantify the flux of these three quantities through one face of our small cube, a face parallel to the yz plane.
The fraction of particles with an x-component of velocity within the range of v_{x} to v_{x} + dv_{x} is:
f(v_{x}) dv_{x} = [1 / (σ (2π)^{1/2})] exp[−(v_{x} − u_{x})^{2} / (2σ^{2})] dv_{x}    (7)
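The molecules within this velocity band, a fraction f(v_{x}) dv_{x} of the total, flow across the face of the cube and create the following incremental fluxes:
Table 2. Incremental Fluxes
Mass                    ρ v_{x} f(v_{x}) dv_{x}
+x-directed momentum    ρ v_{x}^{2} f(v_{x}) dv_{x}
+y-directed momentum    ρ v_{x} v_{y} f(v_{x}) dv_{x}
+z-directed momentum    ρ v_{x} v_{z} f(v_{x}) dv_{x}
Kinetic energy          (1/2) ρ v_{x} (v_{x}^{2} + v_{y}^{2} + v_{z}^{2} + 2σ^{2}) f(v_{x}) dv_{x}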
A positive flux means there is a net positive flow across the face in the +x direction. All fluxes can be positive or negative, except for the x-directed momentum flux, which is never negative. A positive v_{x} causes an increase in the +x-directed momentum on the +x side of the face. A negative v_{x} causes a decrease in the −x-directed momentum on the +x side of the face. In either case, the net +x-directed momentum is increased on the +x side of the face.
The 2σ^{2} term in the kinetic energy flux is due to the rotational energy for each of the two spin axes. This term would be absent for monatomic molecules.
The total net fluxes across the face are obtained by integrating with respect to v_{x}, which is equivalent to calculating the mean values of the fluxes. The results are:
Table 3. Total Net Fluxes
Mass                    ρ u_{x}
+x-directed momentum    ρ (u_{x}^{2} + σ^{2})
+y-directed momentum    ρ u_{x} u_{y}
+z-directed momentum    ρ u_{x} u_{z}
Kinetic energy          (1/2) ρ u_{x} (u_{x}^{2} + u_{y}^{2} + u_{z}^{2} + 7σ^{2})
Equation (7) is a Gaussian probability distribution, but this form is actually only used for the pressure calculation below, which is not part of the basic derivation. All of the results in Table 3 are valid for any probability distribution, except for the energy flux, which also assumes a distribution symmetrical with respect to its mean value.
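The total net fluxes can be checked by brute force, by averaging the per-molecule flux terms over sampled velocities. This is a sketch under stated assumptions. It uses independent Gaussian components with a common σ. The rotational energy is represented by a fixed 2σ^{2} term. The closed forms ρ u_{x}, ρ(u_{x}^{2} + σ^{2}), ρ u_{x} u_{y}, and (1/2) ρ u_{x}(u^{2} + 7σ^{2}) were worked out from the definitions in this section, not quoted from a table. The mean velocity is made unrealistically large only to keep the sampling noise small, and the density is an assumed round number.

```python
import random


def sample_fluxes(rho, u, sigma, samples=200_000, seed=1):
    """Monte Carlo means of the flux terms through a face parallel to yz."""
    rng = random.Random(seed)
    totals = {"mass": 0.0, "mom_x": 0.0, "mom_y": 0.0, "energy": 0.0}
    for _ in range(samples):
        vx, vy, vz = (rng.gauss(mean, sigma) for mean in u)
        totals["mass"] += vx
        totals["mom_x"] += vx * vx
        totals["mom_y"] += vx * vy
        # 2*sigma^2 stands in for the mean rotational energy of two spin axes.
        totals["energy"] += 0.5 * vx * (
            vx * vx + vy * vy + vz * vz + 2.0 * sigma ** 2
        )
    return {name: rho * total / samples for name, total in totals.items()}


def expected_fluxes(rho, u, sigma):
    """Closed-form means for independent Gaussian velocity components."""
    ux, uy, uz = u
    speed_sq = ux * ux + uy * uy + uz * uz
    return {
        "mass": rho * ux,
        "mom_x": rho * (ux * ux + sigma ** 2),
        "mom_y": rho * ux * uy,
        "energy": 0.5 * rho * ux * (speed_sq + 7.0 * sigma ** 2),
    }


# Assumed illustrative values: density in kg/m^3, velocities in m/s.
rho, u, sigma = 1.2, (100.0, 50.0, 0.0), 295.0
estimate = sample_fluxes(rho, u, sigma)
exact = expected_fluxes(rho, u, sigma)
for name in exact:
    print(f"{name:7s} sampled {estimate[name]:.4g}  closed form {exact[name]:.4g}")
```

The factor 7 in the energy flux collects 3σ^{2} from the third moment of v_{x}, σ^{2} each from v_{y} and v_{z}, and 2σ^{2} from rotation.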
The pressure exerted by a gas is due to molecular collisions against a surface. Pressure is equal to the change in momentum caused by the collisions, per unit time and per unit area. (An interactive animation of molecular collisions, created by Fu-Kwun Hwang at National Taiwan Normal University, illustrates this.) We will assume at this point that the surface is immovable; a vibrating surface is considered later, after the sound equations are developed and solved. We also assume that the surface is flat, except at a molecular scale. In other words, we will (with one later qualification) allow molecules to bounce off the surface at odd angles, but otherwise treat the surface as perfectly flat. We will consider a surface parallel to the yz plane, with air on the left (−x direction), and a perfect vacuum on the right.
Consider a thin slice of air adjacent to the surface. Divide the molecules in this slice into two groups: all molecules with positive x-velocities go into group A, and all molecules with negative x-velocities go into group B. Group A is about to collide with the surface, and group B has just collided with the surface.
Consider the situation pictured below; on the left-hand side, a molecule strikes the surface with velocity v, at some angle, and bounces off with the same velocity but at a different angle. On the right-hand side the direction of travel is reversed. If the probabilities of the right-hand and left-hand cases are equal, then the statistics of the molecules in group B will be identical to the statistics of the molecules in group A, except with the sign of the x-velocity component reversed (the signs of the y- and z-velocity components are also reversed, but this doesn't change the statistics). In other words, statistically the molecules behave exactly as if they bounce like billiard balls off a perfectly smooth surface, with a change in sign of v_{x}, and an average momentum change of 2mv_{x}. The mean velocity at the surface is zero, and there is an equal probability of a molecule being in group A or group B. The two groups are therefore the two halves of a zero-mean Gaussian distribution.
For a lossless collision of a monatomic molecule, the velocity after the collision must equal the velocity before the collision, the angle with respect to the x-axis at impact is totally random, and clearly the above situation applies. For a diatomic molecule, the argument is a little more difficult, since the translation velocity before and after the collision is in general different. A numerical model in a separate subsection on dumbbell collisions indicates that despite the potential differences, diatomic and monatomic molecules in fact exert equal pressure at an equal temperature. The only qualification is that a perfectly flat surface is assumed in calculating the dumbbell collisions (no odd bounces). In any case this result is confirmed below by an alternative argument.
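We need to compute the mean value of the momentum change. With a factor of 2 due to the momentum change, and the restriction to molecules with positive x-velocities, the pressure is:
p = 2ρ ∫_{0}^{∞} v_{x}^{2} f(v_{x}) dv_{x} = ρσ^{2}    (8)
where f(v_{x}) is the zero-mean Gaussian probability density of the x-velocity at the surface.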
The same result for pressure can be obtained directly from the momentum flux given in Table 3. In the absence of a sound wave there is a constant momentum flux of ρσ^{2} flowing in the x-direction everywhere in the gas. At a wall the flux stops abruptly, and this change in momentum represents a force, equivalent to the pressure given by equation (8). This result applies to both monatomic and diatomic molecules, and is independent of the surface roughness. It also eliminates the assumption of a Gaussian velocity distribution, so it is a very direct and general result. If a sound wave is present, it carries additional momentum, also as given in Table 3. This momentum is evaluated for a plane wave in another section. Note that it is important to distinguish between momentum flux and momentum density (momentum per unit volume). In the case of a gas at equilibrium the expected value of momentum density is zero, while the momentum flux is quite large.
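The agreement between the wall-collision integral and the momentum flux ρσ^{2} can be checked numerically. The sketch assumes a zero-mean Gaussian x-velocity and the relation m σ^{2} = k T. The number density is chosen to correspond to one standard atmosphere at 293 K, an assumption the text does not make. The pressure integral is evaluated with a simple midpoint rule.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def gaussian_pdf(v, sigma):
    """Zero-mean Gaussian probability density for one velocity component."""
    return math.exp(-v * v / (2.0 * sigma * sigma)) / (
        sigma * math.sqrt(2.0 * math.pi)
    )


def wall_pressure(rho, sigma, steps=20_000, v_max_sigmas=10.0):
    """2*rho * integral from 0 to infinity of v^2 f(v) dv, midpoint rule.

    The upper limit is truncated at v_max_sigmas standard deviations.
    """
    dv = v_max_sigmas * sigma / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        total += v * v * gaussian_pdf(v, sigma) * dv
    return 2.0 * rho * total


# Assumed illustrative conditions.
T = 293.0                          # K
m = 4.65e-26                       # kg, N2 molecule (value from the text)
n = 101325.0 / (K_BOLTZMANN * T)   # molecules per m^3, assumed 1 atm
rho = m * n                        # kg/m^3
sigma = math.sqrt(K_BOLTZMANN * T / m)

p_integral = wall_pressure(rho, sigma)
p_flux = rho * sigma ** 2
print(f"collision integral: {p_integral:.1f} Pa, momentum flux: {p_flux:.1f} Pa")
```

Since ρσ^{2} = n m σ^{2} = n k T, both routes recover the assumed atmospheric pressure, which is the ideal gas law.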
This ends the connection with the traditional variables of temperature and pressure. In the next section we return to the main thrust of the derivation, and use the mass, momentum, and energy fluxes presented in Table 3 above to derive the basic equations of fluid mechanics.