Motivation
Recall Jaynesβ finite sets policy:
In principle, every problem must start with⦠finite-set probabilities; extension to infinite sets is permitted only when this is the result of a well-defined and well-behaved limiting process from a finite set.
In first-year calculus we learn about limiting processes in the realm of numbers; the real numbers themselves are obtained by starting with the finitely-representable rational numbers and filling in the gaps, so to speak, adding additional numbers such as β2 that are approximated arbitrarily closely by rational numbers but are not themselves rational. These limiting processes rely on a notion of distance between two values, with the distance between real numbers π₯ and π¦ being |π₯-π¦|. Our limiting processes are then sequences of values which become and remain arbitrarily close to each other, known as Cauchy sequences. (Iβll say more on this in the next article.)
(Pseudo-)metric spaces generalize these concepts to other sorts of mathematical objects for which we can define a suitable distance function. In particular, we will later define a pseudo-metric on premises that is related to how close the probabilities they define are: the distance between π and π will be defined in terms of the distances between ππ³(π΄ | π) and ππ³(π΄ | π) for arbitrary queries π΄. This will allow us to define exactly what we mean by a limiting process on premises. And, just as the real numbers are obtained by completing the rational numbers, adding in any missing limits of Cauchy sequences, weβll obtain a space of generalized premises that take us beyond the finite but can be approximated arbitrarily closely by finite premises.
Definition of a (pseudo-)metric space
A pseudo-metric space (π, π) is a set π together with a real-valued binary function π : πΓπ β β, called a pseudo-metric, that behaves like a distance function; that is, for all π₯,π¦,π§ β π,
π(π₯, π¦) β₯ 0 (distances are nonnegative);
π(π₯, π₯) = 0 (every point is at zero distance from itself);
π(π₯, π¦) = π(π¦, π₯) (the distance from π₯ to π¦ is the same as the distance from π¦ to π₯); and
π(π₯, π§) β€ π(π₯, π¦) + π(π¦, π§) (triangle inequality: going to an intermediate point first cannot shorten the total distance).
If π has the additional property that
π(π₯, π¦) = 0 only if π₯ = π¦ (any two distinct points are separated by some nonzero distance),
then we call π a metric and we call (π, π) a metric space.
A metric is a generalization of the familiar Euclidean distance between real-valued vectors, but it is a distance function appropriate to the set π. Here are some examples:
Discrete metric on the integers: π(π₯, π¦) = 1 if π₯ β π¦, 0 if π₯ = π¦.
Hamming distance: π is the set of strings of a given length π, and π(π₯, π¦) is the number of locations at which π₯ and π¦ differ. E.g., π(ππππππ, ππππππ) = 3 because the two strings differ at locations 1, 2, and 4.
Edit distance: π is the set of strings of any length, and π(π₯, π¦) is the number of single-character edits (insertion, deletion, or replacement) required to turn π₯ into π¦. E.g., if π₯ = πππ and π¦ = ππππππ, then π(π₯, π¦) = 4. (π₯ β π¦: insert π at the beginning, replace the π with a π, and inserting π and π at the end; π¦ β π₯: delete π and π from the end, replace the π by a π, and delete π from the beginning.)
Prefix metric on Cantor space: π is the set of infinite bit strings, and π(π₯, π¦) = 2β»βΏ if the first position at which π₯ and π¦ differ is position π, counting from 0. E.g., if π₯ = πΆπ·π·πΆπΆπ· β― and π¦ = πΆπ·π·π·πΆπΆ β― then π(π₯, π¦) = 2β»Β³.
Equivalence under a pseudo-metric
There is an obvious equivalence relation associated with a pseudo-metric space (π, π):
(You can easily verify that this defines an equivalence relation.) Consider, for example, the edit distance ignoring case differences. Here we have π(πππ, π°ππ½π) = 1 instead of 4. π is a pseudo-metric, but not a metric, because
even though πππ β π°π½π. The strings are equivalent: πππ βΌ π°π½π.
This equivalence relation gives us a standard way of creating a metric space (π, π) from a pseudo-metric space (πβ, πβ):
π β πβ/βΌβ. That is, π is the set of equivalence classes under the equivalence relation βΌβ defined by the pseudo-metric πβ.
π([π₯], [π¦]) β πβ(π₯, π¦). This is well-defined because if
\(x'\sim_{0}x\mbox{ and }y'\sim_{0}y\)then π(π₯β², π₯) = π(π¦β², π¦) = 0, and the triangle inequality then guarantees that π(π₯β², π¦β²) = π(π₯, π¦).
Turning it around, we can think of the elements of πβ as concrete representations of more abstract entities, with πβ(π₯, π¦) defined to be identical to some appropriate distance metric between the abstract entities represented by π₯ and π¦.
When we define our pseudo-metric on premises, the corresponding equivalence relation will be such that π and π are equivalent iff ππ³(π΄ | π) = ππ³(π΄ | π) for every query π΄.
Isometries
If (πβ, πβ) and (πβ, πβ) are pseudo-metric spaces, an isometry of πβ into πβ is a function π : πβ β πβ that preserves distances:
for all π₯,π¦ β πβ. We can think of this as saying that πβ is in some sense embedded in πβ, or is a subset of πβ, or πβ is an extension of πβ. That latter interpretation will be important when we talk about completing a pseudo-metric space, which is central to implementing Jaynesβ finite sets policy. For an isometry π, if π(π₯) = π(π¦) then πβ(π₯, π¦) = 0, i.e. π₯ and π¦ are equivalent; and if πβ is a metric then π₯ = π¦. Thus an isometry of a metric space is always one-to-one; distinct elements of the domain always map to distinct elements of the range.
If π is an onto functionβevery element of πβ can be expressed as π(π₯) for some π₯ β πββthen we say that π is an isometry of πβ onto πβ. If, additionally, πβ and πβ are metrics, then π is both one-to-one and onto (a bijection), and we say that the two metric spaces are isometric to each other. In this case we can think of the two metric spaces as being essentially identical except for renaming of elements.
Here are some geometric examples of these concepts:
Let πβ be the real line β with πβ(π₯, π¦) the one-dimensional Euclidean distance |π₯-π¦|, and let πβ be the plane βΒ² with πβ(π₯, π¦) the two-dimensional Euclidean distance β₯π₯-π¦β₯β. If we define π : β β βΒ² by π(π₯) β (π₯, 3), that is π maps the real line to a horizontal line in the plane with π¦-intercept 3, then π is an isometry of β into βΒ².
Likewise, if we define π : β β βΒ² by π(π₯) β (π₯/β2, π₯/β2), that is, π maps the real line to a 45-degree horizontal line in the plane passing through the origin, then π also is an isometry of β into βΒ². Note that we needed the factor of 1/β2 to make the distances match.
Let πβ be the plane βΒ² and πβ the Euclidean distance. Let πβ = πβ and πβ = πβ. If π : βΒ² β βΒ² is a rigid translation such as π(π₯, π¦) = (π₯+3, π¦-7) then π is an isometry of βΒ² onto itself. Likewise, if π is a rigid rotation such as π(π₯, π¦) = (-π¦, π₯) then π is again an isometry of βΒ² onto itself.
Continuity and pseudo-metric spaces
Discussing well-defined and well-behaved limiting processes naturally brings us to the topic of continuity, which also generalizes to arbitrary pseudo-metric spaces. For a function between two pseudo-metric spaces the definition of continuity at a point is the same as it is for functions on the real line, except that again we replace |π₯-π¦| with π(π₯, π¦). Let π : πβ β πβ, where (πβ, πβ) and (πβ, πβ) are pseudo-metric spaces, and let π₯ β πβ; we say that π is continuous at π₯ if
That is, we can make π(π¦) arbitrarily close to π(π₯) by choosing π¦ sufficiently close to (yet distinct from) π₯. Additionally, we simply say π is continuous if it is continuous at all points in its domain.
For examples in the familiar case of real numbers and Euclidean distance, consider the real-valued functions
whose domain is all of β. Both of these functions are continuous at every point in β except for π₯ = 0.
We can make either of these functions be continuous by restricting their domains: if π is defined to be the result of restricting π to just the nonnegative real numbers, then π is a continuous function; and if π is defined to be the result of restricting π to just the positive real numbers, then π also is a continuous function.
Now consider a less familiar example: a function
on the Cantor space (infinite bit strings), using the prefix metric defined earlier. Plugging the definition of the prefix metric into our generalized definition of continuity for metric spaces yields the following:
π is continuous at π₯ iff, for any π β β, there exists some π β β such that the the first π bits of π₯ determine the first π bits of π(π₯).
(More precisely: the first π bits of π(π¦) are the same as the first π bits of π(π₯) whenever the first π bits of π¦ are the same as the first π bits of π₯.)
Or consider a real-valued function
on the Cantor space. Then
π is continuous at π₯ iff, for any π > 0, there exists π β β such that the first π bits of π₯ determine π(π₯) to within an accuracy of π.
Uniform continuity
We say that π is uniformly continuous if, in the definition of continuity discussed above, for any given π we can use the same πΏ for all π₯ β πβ; that is,
Going back to the function π(π₯) = 1/π₯ defined for π₯ > 0, although π is continuous, it is not uniformly continuous: the required πΏ for any fixed π gets arbitrarily small as π₯ gets closer and closer to 0 and π(π₯) gets arbitrarily large. If, however, we restrict the domain further, defining h(π₯) = 1/π₯ for π₯ β₯ 1, then h is uniformly continuous: for any given π > 0, the πΏ > 0 that works for π₯ = 1 works for any π₯ > 1 also.
Equivalence of pseudo-metrics
Sometimes we donβt care too much about the exact value of the pseudo-metric, being interested only in its implications for continuity. Let πβ and πβ be pseudo-metrics on the same set π.
We say that πβ and πβ are topologically equivalent iff the identity function π(π₯) = π₯ is continuous when considered either as a function from (π, πβ) to (π, πβ) or as a function from (π, πβ) to (π, πβ). This means that small distances according to πβ are also small distances according to πβ, and vice versa.
If you know what a topology is, one can also define pseudo-metrics to be topologically equivalent iff they induce the same topology.
We say that πβ and πβ are uniformly equivalent iff the identity function π(π₯) = π₯ is uniformly continuous when considered either as a function from (π, πβ) to (π, πβ) or as a function from (π, πβ) to (π, πβ).
Itβs easy to see that if πβ and πβ are topologically equivalent, then any continuous function from (π, πβ) to another pseudo-metric space (π, π) is also a continuous function from (π, πβ) to (π, π). Likewise, any continuous function from (π, π) to (π, πβ) is also a continuous function from (π, π) to (π, πβ). A similar comment applies for uniform equivalence.
Coming Up
In the next article weβll talk about limiting processes in the general context of pseudo-metric spaces. This includes Cauchy sequences, limits, the notion of completeness, and how to complete a pseudo-metric space, analogously to how we obtain the real numbers by completing the rationals.
the most minor of typos: "as saying that πβ is in some sense embedded in πβ"