Why Not Extend Predicate Logic?
Predicate logic
Probability theory extends propositional logic to allow reasoning about degrees of credibility. Predicate logic extends it in a different direction, replacing simple propositional symbols with a rich language of atomic formulas, adding variables, adding quantification (“for all 𝑣” and “there exists 𝑣”), and allowing reasoning about infinite domains. Predicate logic is, more or less, the language of mathematics.
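If you've taken a calculus class, you may remember that
$$\lim_{n \to \infty} a_n = x$$
(“the limit of 𝑎ₙ as 𝑛 goes to infinity is 𝑥”) is defined to mean
$$\text{for every } \varepsilon > 0 \text{ there exists an } N \text{ such that } |a_n - x| < \varepsilon \text{ for all } n \ge N$$
or, in more formal notation,
$$\forall \varepsilon > 0.\; \exists N.\; \forall n \ge N.\; |a_n - x| < \varepsilon$$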
This is an example of a predicate logic formula.
First-order logic
More specifically, it is a first-order predicate logic (FOPL) formula. “First-order” just means that when we write “∀𝑣” or “∃𝑣” the variable 𝑣 is assumed to range over some fixed domain of discourse 𝐷. There is no quantification over functions or relations on 𝐷; that would be second-order predicate logic.
The restriction to first-order quantification is not as important a limitation as you might think. As Zermelo-Fraenkel (ZF) set theory demonstrates, your domain of discourse may include objects that behave like functions and relations. In ZF,
the objects are sets;
an ordered pair (𝑥, 𝑦) is represented as the set {{𝑥}, {𝑥, 𝑦}};
a binary relation is represented as a set of ordered pairs; and
a function is represented as a binary relation 𝑅 with the special property that if 𝑥 𝑅 𝑦 and 𝑥 𝑅 𝑧 then 𝑦 = 𝑧, guaranteeing that there is a unique 𝑦 associated with 𝑥.
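The encoding above can be tried out directly. What follows is a minimal Python sketch, assuming `frozenset` as a stand-in for ZF sets; the helper names `pair`, `unpair`, and `is_function` are hypothetical, chosen only for this illustration.

```python
def pair(x, y):
    """Kuratowski encoding: the ordered pair (x, y) as the set {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def unpair(p):
    """Recover (x, y) from a set built by pair()."""
    members = list(p)
    common = frozenset.intersection(*members)    # always {x}
    (x,) = common
    rest = frozenset.union(*members) - common    # {y}, or empty when x == y
    y = next(iter(rest)) if rest else x
    return x, y


def is_function(relation):
    """A relation (a set of pairs) is a function if no x is paired with two different y."""
    seen = {}
    for p in relation:
        x, y = unpair(p)
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


# A relation that is a function: n maps to n squared, for n in {0, 1, 2, 3}.
square = frozenset(pair(n, n * n) for n in range(4))

# A relation that is not a function: 1 is related to both 2 and 3.
not_a_function = frozenset({pair(1, 2), pair(1, 3)})

print(is_function(square), is_function(not_a_function))
```

Note that order is recovered purely from set membership: `pair(1, 2)` and `pair(2, 1)` are different sets even though sets themselves are unordered.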
So for most purposes that require predicate logic, FOPL is sufficient.
Propositional vs. predicate logic compared
OK, so FOPL is a powerful and expressive logic. Then why do we extend only the more limited propositional logic to deal with degrees of credibility? Why not do the same with FOPL? The short answer is that 1) it's not clear how to do so, and 2) we don't need to. Let me expand on the latter.
The important difference between propositional logic and FOPL is that FOPL fully supports reasoning about infinite domains. A propositional formula can refer to an infinite domain such as all integers ℤ or all real numbers ℝ, but any single formula can only reference a finite number of distinctions on that domain. For example, a propositional formula can contain a finite number of propositional symbols of the form 𝑡 = 𝑛 for various integers 𝑛, but it cannot express something like ∀𝑛.𝑃(𝑛), where 𝑛 ranges over all infinitely many integers.
As we'll see later, if the domain of discourse 𝐷 is finite and there is a constant symbol for each member of 𝐷, then any FOPL formula can be reduced to a propositional formula via a process of propositionalization. This involves
turning ∀𝑣.𝑃(𝑣) into a conjunction (AND) of formulas, one for each member of 𝐷, and
likewise turning ∃𝑣.𝑃(𝑣) into a disjunction (OR) of formulas.
This is why the fundamental distinction between propositional logic and FOPL is the latter's support for reasoning about infinite domains. For finite domains, FOPL offers notational convenience and improved computational efficiency of deduction, but it extends neither what can be said nor what can be deduced.
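The propositionalization step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: formulas are represented as nested tuples (a representation invented here, not taken from this series), the domain is a non-empty list of constant names, and variable names are assumed not to clash with constant names. The function names `substitute` and `propositionalize` are likewise hypothetical.

```python
from functools import reduce

# Assumed formula shapes:
#   ("atom", name, args)   ("not", f)   ("and", f, g)   ("or", f, g)
#   ("forall", var, body)  ("exists", var, body)


def substitute(formula, var, const):
    """Replace free occurrences of variable `var` with the constant `const`."""
    tag = formula[0]
    if tag == "atom":
        _, name, args = formula
        return ("atom", name, tuple(const if a == var else a for a in args))
    if tag == "not":
        return ("not", substitute(formula[1], var, const))
    if tag in ("and", "or"):
        return (tag, substitute(formula[1], var, const),
                substitute(formula[2], var, const))
    _, bound, body = formula
    if bound == var:          # an inner quantifier rebinds var: leave it alone
        return formula
    return (tag, bound, substitute(body, var, const))


def propositionalize(formula, domain):
    """Eliminate quantifiers over a finite, non-empty domain of constants."""
    tag = formula[0]
    if tag == "atom":
        return formula
    if tag == "not":
        return ("not", propositionalize(formula[1], domain))
    if tag in ("and", "or"):
        return (tag, propositionalize(formula[1], domain),
                propositionalize(formula[2], domain))
    _, var, body = formula
    connective = "and" if tag == "forall" else "or"
    instances = [propositionalize(substitute(body, var, c), domain)
                 for c in domain]
    # Fold the instances into one left-nested conjunction or disjunction.
    return reduce(lambda f, g: (connective, f, g), instances)


# "for all v, P(v)" over the three-element domain {a, b, c}
example = propositionalize(("forall", "v", ("atom", "P", ("v",))), ["a", "b", "c"])
print(example)
```

Each ground atom such as `("atom", "P", ("a",))` then plays the role of a single propositional symbol. The output grows with the size of the domain raised to the quantifier nesting depth, which is one way to see the efficiency advantage FOPL keeps even on finite domains.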
Finite vs. infinite domains
When considering statements about specific things that may occur, or that we may experience, or data we may acquire, we are in a finitary realm where propositional logic (and its extension to probability theory) is adequate:
We are talking about specific, concrete instances of possible phenomena.
We have only a finite number of observations.
There are finite limits to how fine-grained the distinctions we observe can be.
We can experience only a finite portion of the world.
A universally quantified statement of the form “for all 𝑣, 𝑃(𝑣) is true” is something that we can never observe nor experience.
This, then, is the argument for why it suffices to extend propositional logic. On the other hand, one can argue that although a finite set of propositions is adequate for the query 𝐴 in Pr(𝐴 | 𝑋), the premise 𝑋, one's state of information, may need to be an infinite collection of propositions in order to fully capture what is known. Furthermore, the infinite case can often be simpler and more convenient to work with than the (very large) finite case.
For example, suppose that 𝑦 is the sum of a large number of small, independent disturbances:
Then
and computing this exactly for large 𝑛 generally requires summing a large number of terms. But as 𝑛 goes to ∞, the distribution for 𝑦 converges to the standard (continuous) normal distribution. So for large 𝑛 it is much easier and faster to instead compute
where Φ is the standard normal CDF.
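To make this kind of comparison concrete, here is a small Python sketch of one possible instance. The specific model is an assumption made purely for illustration: each disturbance is taken to be +1/√𝑛 or −1/√𝑛 with equal probability, so that 𝑦 has mean 0 and variance 1. The function names `exact_cdf` and `normal_cdf` are hypothetical.

```python
import math


def exact_cdf(c, n):
    """Exact Pr(y <= c) when y = (x_1 + ... + x_n) / sqrt(n), each x_i = +1 or -1
    with probability 1/2 (an assumed model, chosen for illustration).

    If k of the x_i equal +1, then y = (2k - n) / sqrt(n), and there are
    comb(n, k) equally likely ways for that to happen out of 2**n outcomes.
    """
    favorable = 0
    for k in range(n + 1):
        if (2 * k - n) / math.sqrt(n) <= c:
            favorable += math.comb(n, k)
    return favorable / 2 ** n


def normal_cdf(c):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))


for n in (10, 100, 1000):
    print(n, exact_cdf(1.0, n), normal_cdf(1.0))
```

The exact computation sums up to 𝑛 + 1 binomial terms, each involving very large integers, whereas the normal approximation is a single function evaluation whose cost does not depend on 𝑛.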
The above example shows not only why you might want to work with infinite domains, but how to do it: by taking a large-𝑛 limit of a finite analysis parameterized by an integer 𝑛 that determines the size of the (finite) domain being considered. As I have previously written, we'll be following Jaynes' finite sets policy [1]: start by modeling the problem of interest using finite sets of propositions, and observe what happens as the number of propositions increases indefinitely. In a series of future posts I will explore this process in more detail, and formalize both what it means for an infinite sequence of premises to converge and what its limit is.
[1] E. T. Jaynes, 2003, Probability Theory: The Logic of Science, Cambridge University Press, p. 663.

