I have performed non-trivial scientific calculations, in university and beyond, for almost 15 years.
Until the end of undergraduate school, this was mostly done by hand. Optimizing the trajectory of a heavy ball or solving the heat equation are problems with analytical solutions, and thus can be solved with pen(cil) and paper.
However, there were instances when numerical tools had to be used, rather than analytical ones. This occurred, for example, when replacing models of the physical world with measurements from the physical world in experimental physics classes. By the end of my B.Sc., the need to take measurements, transform them, and report results was made much simpler by using a computer.
Fast forward to graduate school, and the amount of measurements (and their complexity) require the use of some of the most powerful computers I have ever used, even to this day. When implementing numerical routines, I had to pore over the equations (translated to computer expression) countless times, to ensure that they were correctly applied. Most importantly, all units needed to be carefully checked by hand. Small mistakes went unnoticed and wasted valuable resources. Take a look at one of the computations from my Ph. D. dissertation (PDF, Git repository):
from scipy.constants import physical_constants
def population(temperature: float, frequency: float) -> float:
"""
Determine average phonon population at `temperature`.
Parameters
----------
temperature : float
Temperature in Kelvin
frequency : float
Phonon frequency in Hertz
"""
*_ = physical_constants["Planck constant over 2 pi in eV s"]
h, *_ = physical_constants["Boltzmann constant in eV/K"]
kb, return 0.5 * coth(h * frequency / (2 * kb * temperature)) - 0.5
This isn’t a complex equation, but it showcases the problem perfectly. temperature
, frequency
, h
, and kb
are all floating-point numbers, with their units attached as documentation. This is not robust.
These days, instead of meticulously going over the details of calculations ahead of time, I now defer to my computer to check the validity of my calculations as much as possible. That’s why computers exist: to free people from repetitive and error-prone work. In this post, I will describe a mechanism, typed dimensions, by which we can eliminate entire classes of bugs, and how the dimensional
Haskell package makes this easy.
Dimensions, units, and quantities
In the international system of units (also called SI units), there are 7 basic dimensions a physical value can have:
- time duration ;
- length ;
- mass ;
- electric current ;
- thermodynamic temperature ;
- amount of substance ;
- luminous intensity .
Using the symbols associated with these base dimensions, we can express any dimension using exponents. For example, velocity (length over time duration) has dimensions , while force (mass-length over time squared) has dimensions . Dimensionless quantities have all exponents being 0.
Why are dimensions important? There are mathematical rules associated with combining numbers with dimensions:
Two quantities with dimensions and can only be added or subtracted if (and only if) . For example, it does not make sense to add a length to a mass.
Any two quantities with dimensions and can be multiplied or divided, in which case the resulting dimension follows the usual rules of arithmetic. For example:
From these two facts, we can derive some surprising conclusions. For example, the argument to many functions (such as sine, cosine, exponential, etc.) must be dimensionless! Consider the Taylor series expansion of the sine function:
The argument must have the same dimensions as , which is only possible if and only if is a dimensionless quantity.
Units of measure are physical manifestations of dimensions. Dimensions are abstract, while units of measure are concrete. For example, the length dimension can be measured in units of meters. Each basic dimension has its own canonical unit of measure, the base unit:
- The second (s) for time duration ;
- The meter (m) for length ;
- The kilogram (kg) for mass ;
- The ampere (A) for electric current ;
- The kelvin (K) for thermodynamic temperature ;
- The mole (mol) for amount of substance ;
- The candela (cd) for luminous intensity .
Units for non-basic dimensions, so-called derived units, are combinations of base units using the usual multiplication () and division () operators. For example, velocity has dimensions of , and can be measured in units of meters per second (). Some derived units have names; for example, the Newton (N) is a derived unit corresponding to .
Based on their dimensions, units can only be combined in certain ways. You can only add/subtract units of the same dimensions, while the multiplication/division of units modifies the dimensions.
Units can be converted to any other units of the same dimensions by conversion base unit. For example, converting from units of imperial feet (ft) to kilometers (km) involves conversion from imperial feet to the base unit of length, the meter, and then conversion from the meter to the kilometer.
Finally, a quantity is a number attached to units. 3.15 meter/second is a quantity with magnitude 3.15, units of meters/second, and dimensions of length over time duration (or ).
Scientists take measurements of quantities, and run scientific computations to get answers in the form of other quantities. We can ensure correctness of these computations by realizing that dimensions are known ahead of time. This means that for statically- and strongly-typed languages (e.g. Haskell, Rust), we should enforce correctness using the type system.
Type-safe dimensions
People’s first contact with computational unit systems may come from pint
, a Python library to manipulate quantities. While pint
is better than nothing, it does not guarantee correctness at runtime due to Python’s dynamic nature.
Instead, I prefer the dimensional
package, a Haskell library that guarantees correctness by ensuring type-safe dimensions at compile-time. Other tools exist for other languages – for example, F# famously includes units of measure as a first-party feature.
We will work by example to compute the Maxwell-Boltzmann distribution. This is the distribution of velocities of identical particles of mass , given some thermodynamic temperature :
where is the magnitude of particle velocity in a three-dimensional space, and is the Boltzmann constant.
Without typed dimensions, the computation of this distribution can be done as follows:
-- Boltzmann constant in Joules/Kelvin
boltzmannConstant_JpK :: Double
= 1.380649e-23
boltzmannConstant_JpK
untypedMaxwellBoltzmannDist :: Double -- ^ Particle mass in kilogram
-> Double -- ^ Thermodynamic temperature in Kelvin
-> Double -- ^ Particle velocity in meters/second
-> Double -- ^ Probability density (dimensionless)
untypedMaxwellBoltzmannDist mass_kg temp_K velocity_mps = ( mass_kg / (2 * pi * boltzmannConstant_JpK * temp_K) ) ** (3/2)
* exp ( - (mass_kg * velocity_mps **2) / (2 * pi * boltzmannConstant_JpK * temp_K) )
It’s not too hard to check units by hand – but it is tedious and error-prone.
Now let’s do this again, using dimensional
.
Dear reader, be forewarned: the rest of this post is not for the faint-of-heart.
Yes, this is overkill for simple calculations. And yet, the technique you are about to see has saved me numerous times. There’s quite a bit more ceremony; in exchange, you get a lot more certainty about correctness. Software design is about trade-off, and this level of safety is required for some real-world applications.
Setup: I am using the GHC2024 language edition. We will also replace the (implicitly imported) Prelude
module to use Numeric.Units.Dimensional.Prelude
, which replaces the usual arithmetic operators to use dimension-aware ones, as well as importing the Unit
, Dimension
, and Quantity
types.
{-# LANGUAGE NoImplicitPrelude #-}
import Numeric.Units.Dimensional.Prelude
To get acquainted, let’s convert the definition of the Boltzmann constant. The dimensions of the Boltzmann constant are called DHeatCapacity
in dimensional
. Therefore, we can write:
boltzmannConstant :: Quantity DHeatCapacity Double
which means that boltzmannConstant
is a Quantity
of dimension DHeatCapacity
, whose magnitude is represented by a Double
. We use the (*~)
operator to combine a number (1.380649e-23
) with a compound unit, joule / kelvin
, to get:
boltzmannConstant :: Quantity DHeatCapacity Double
= 1.380649e-23 *~ (joule / kelvin) boltzmannConstant
We are not yet ready to implement the Maxwell-Boltzmann distribution function. One particular wrinkle is that exponents (such as the exponent of the first term) change the dimensions (i.e. the type) of the input. Therefore, some type arithmetic is required. To this end, dimensional
provides (^)
to raise to an integer power, and sqrt
to take the square root, while appropriately changing the dimensions at compile time. The type signatures are non-trivial to say the least, as it involves compile-time manipulation of types. The operation of “raising to the three-halfs power” becomes:
raiseToThreeHalfsPower :: Floating a
=> Quantity d a
-> Quantity (NRoot d Pos2 ^ Pos3) a
= (sqrt x) ^ pos3 raiseToThreeHalfsPower x
where pos3
is a type-level integer. The dimension (NRoot d Pos2 ^ Pos3)
will resolve to the appropriate dimension at compile-time via type families once d
is known.
Finally, we can implement the (dimensionally-typed) Maxwell-Boltzmann function:
maxwellBoltzmannDist :: Mass Double
-> ThermodynamicTemperature Double
-> Velocity Double
-> Dimensionless Double
maxwellBoltzmannDist mass temp velocity = raiseToThreeHalfsPower ( mass / (_2 * pi * boltzmannConstant * temp) )
* exp ( negate (mass * velocity ^ pos2) / (_2 * pi * boltzmannConstant * temp) )
where _2
is the dimensionless integer 2
, and (*)
, (/)
, exp
, negate
, and ^
are all dimensionally-aware operators from dimensional
(rather than the Prelude
module).
If you compile the program above, you’ll get an error message ಠ_ಠ:
Couldn't match type ‘Neg3’ with ‘Zero’
Expected: Dimensionless Double
Actual: Quantity
(NRoot
(DMass
/ ((Dim Zero Zero Zero Zero Zero Zero Zero * DHeatCapacity) * DThermodynamicTemperature))
Pos2 ^ Pos3
)
Double
(...snip...)
This is because I’m a sloppy physicist – sorry! The Maxwell-Boltzmann distribution is not a dimensionless quantity: it actually returns a velocity density with dimensions of , or .
Let’s fix this. There is no type synonym for velocity density in dimensional
, but we can create one ourselves.
type DVelocityCube = DVelocity ^ Pos3
type DVelocityDensity = Recip DVelocityCube -- dimension
type VelocityDensity = Quantity DVelocityDensity -- quantity
Our final version of maxwellBoltzmannDist
becomes:
maxwellBoltzmannDist :: Mass Double
-> ThermodynamicTemperature Double
-> Velocity Double
-> VelocityDensity Double
maxwellBoltzmannDist mass temp velocity = raiseToThreeHalfsPower ( mass / (_2 * pi * boltzmannConstant * temp) )
* exp ( negate (mass * velocity ^ pos2) / (_2 * pi * boltzmannConstant * temp) )
We can compute the result of maxwellBoltzmannDist
in any compatible unit. We will do this for a gas of diatomic nitrogen. In an interactive Haskell session:
> import Numeric.Units.Dimensional.NonSI ( unifiedAtomicMassUnit ) -- unifiedAtomicMassUnit = amu
ghci>
ghci> let n2_mass = 2 *~ unifiedAtomicMassUnit
ghci> let room_temperature = 300.0 *~ kelvin
ghci> let velocity = 400 *~ (meter / second)
ghci>
ghci> maxwellBoltzmannDist n2_mass room_temperature velocity
ghci4.466578950309018e-11 m^-3 s^3
We can easily request specific output units using the (/~)
operator. In this case, we convert result to (hour / nautical mile)3:
> import Numeric.Units.Dimensional.NonSI ( nauticalMile, degreeRankine, knot )
ghci>
ghci> let n2_mass = 2.6605E-27 *~ kilo gram
ghci> let room_temperature = 491.0 *~ degreeRankine
ghci> let velocity = 777 *~ knot
ghci>
ghci> (maxwellBoltzmannDist n2_mass room_temperature velocity) /~ ((hour / nauticalMile) ^ pos3)
ghci5.041390507268275e-12
Conclusion
This was a whirlwind tour of dimensional
with a practical example. If this was your first introduction to typed dimensions in Haskell, do not be alarmed – I bounced off of it the first time.
As mentioned, software design is about trade-off. Using dimensional
is complex, but it isn’t complicated1; it makes you address the complexity of dimensions and units up-front, rather than hoping for the best. Sometimes this doesn’t make sense, but for critical calculations, there is nothing like typed dimensions.
I am now a maintainer of dimensional
on GitHub, and I want nothing more than to help people use typed dimensions (when it makes sense). If you have any feedback, for example missing features or lacking documentation, feel free to raise an issue!
Code available in Haskell module.