Russell's
1925 Logic
Section 1: History
Russell's metaphysical stance in [1903] was an extreme Platonism: you name it and he believed in it, and at least one contemporary reviewer spoke of the book's "scholastic realism." By the time Principia Mathematica was written the Platonism was, at the very least, less exuberant. This was partly a result of the shock of the paradoxes and partly a result of Russell's growing adherence to the traditional British empiricism, and by 1910 Russell was clearly trying to limit his ontological commitments to abstracta, taking as his guiding maxim that one should "[w]henever possible, substitute constructions out of known entities for inferences to unknown entities" [1924]. Gödel [1944] dubs Russell's new philosophical stance as "constructivism."
In itself, that is quite a good name for it, but the logical community now understands the word as referring to like those of Brouwer and Bishop, and in two versions of an introductory footnote added to reprintings of [1944], Gödel noted the difference and characterized Russell's program as "a strictly anti-realistic" or "a strictly nominalistic kind of constructivism."
Section 2: Ramified Logic and Semantics
In order to facilitate comparison with the 1925 logic, this section presents a version of Ramified Type Theory ("IRT" -- Edition I, Ramified Types), generally following Church [1976] but formulated with abstracts (and rules of b-conversion) instead of comprehension axioms. The second subsection discusses the status of extensionality principles and the motivation for using abstracts, and the third presents the "substitutional" interpretation.
(A) The logic IRT. Since Ramified Type Theory adds formal complications to Simple Type Theory, I wll start with the latter. (For details on axiomatizing Simple Type Theory, see Church [1972].) We assume entities to be organized into a system of types. There is a type, i, of urelements or individuals, and for every finite sequence of types t1,...,tn a type (t1,...,tn) of propositional functions taking a sequence of arguments of the given types t1,...,tn: n-adic relations (properties in the case n=1) or, if we assume extensionality, n-adic relations-in-extension (classes in the case n=1). We include zero-length sequences, yielding a type () of propositions or, assuming extensionality, truth-values.
A conventional formal language embodying such a conception would be based on a many-sorted quantificational language, with all variables and constants having assigned types. There would be alphabets of free and bound variables for each type, interpreted as ranging over entities of that type, and, optionally and depending on application, constants of some types, interpreted as denoting specific entities of those types. Free variables (and constants) of type () standing alone count as formulas; all other atomic formulas are of the form a[b1...bn], where b1,...,bn are free variables (or constants) of the respective types t1,...,tn, and a is a free variable (or constant) of type (t1,...,tn). Non-atomic formulas are formed in the usual ways, by combining formulas with connectives or by going through a formula, replacing every occurrence in it of some one free variable with occurrences of a bound variable (not already occurring in the formula) of the same type, and prefixing a quantifier with this bound variable as operator variable. Sentences, as usual, are formulas not containing free variables, quasi-formulas things like formulas except for containing unbound occurrences of bound variables in places where they ought to have free variables.
There is an optional added feature for this language which is convenient for some purposes and trivial on some (but not all) assumptions. A formula containing free variables is naturally thought of as expressing a propositional function. In order to ensure that the language contains a term for each propositional function expressed by one of its formulas, we may add abstracts. Where A is a quasi-formula containing free occurrences of no bound variables other than x1,...,xn, of types t1,...,tn respectively, {x1,...,xn: A} is an abstract of type (t1,...,tn), and may occur in any environment in which a free variable or constant of that type may occur (and occurrences of x1,...,xn in an occurrence of {x1,...,xn: A} in a formula are counted as bound in the formula). We can make our machinery uniform by saying that, where A is a formula, {A} is an abstract of type (). If extensionality is assumed, abstracts are eliminable by contextual definition (cf. Quine [1963] for discussion). Otherwise they may not be, and in some applications they give a useful increase in expressive power. Quasi-abstracts are expressions like abstracts but containing ocurrences of bound variables which are neither bound in the contained formula nor bound to the abstraction operator.
The formal language is rich enough that, on its familiar set-theoretic interpretation, completeness is not to be hoped for. A basic axiomatization should, however, include at least standard quantificational logic for each type, and some embodiment of the motivating idea that formulas express propositional functions and sentences propositions. So: start by postulating the usual natural deduction or sequent-calculus rules for the connectives and quantifiers (the quantifier rules for variables and constants of each type), as in Gentzen [1934]. These rules by themselves, however, do not allow us to prove that there is a propositional function expressed by a given formula. As stated so far, the quantifier rules allow free variables and constants (if any) to be substituted for the bound variables of higher types: what is needed is some way of letting formulas be substituted.
There are two general approaches possible here. If extensionality is assumed, the simplest is to postulate axioms of comprehension. Without extensionality,comprehension has to be formulated with a non-truth-functional connective propositional identity, as in Church [1974] and [1984] and (following Church) Anderson [1986] and [1989]. The alternative is to supplement the statement of the quantifier rules, for types other than i, with a notion of substituting a formula for a variable (cf. Church [1974]). This is easier with abstracts as auxiliary notation.
Postulate,
then, rules of b-conversion to handle
abstracts: formulas obtained from one another by b-conversion
are equivalent. (This is an effective
rule: since we are working in a typed system, b-convertibility
is decidable.) Then, in stating the
quantifier rules, allow bound variables to be substituted for abstracts (though
not, of course, quasi-abstracts), as well as free variables and constants, of
their type.
Extensional Simple Type Theory is a natural, though weak, set theory: working mathematicians’ use of capital and lower case letters, and distinguishing locutions like “family of sets,” suggest that their thinking is in practice type-theoretic. The non-extensional variant, however, is inadequate as a theory of propositions and propositional functions: it succumbs to the semantic and epistemic paradoxes. Russell moved to a theory making finer discriminations of type. Church [1976] refers to the types of Ramified Type Theory as r-types.
Following the account in Church [1976], we once again start with a type i for individuals. As in the Simple Type Theory, entities of other types may be distinguished according to the types of arguments they take, but now further distinctions must be made, reflecting the differing complexities of the propositional functions falling within a single (Simple) type. Start with those entities taking only individuals as objects (including abstracta with no arguments: propositions). Each type subdivides into a hierarchy of ramified types (r-types, for short) according to the complexity of their structure or constituency. Thus, instead of simply having types of propositions, properties, dyadic relations, triadic relations, and so on, (), (i), (i,i), (i,i,i), we have, for each such simple type a series of r-types, (i)/1, (i)/2, (i)/3, ... (i,i)/1, (i,i)/2, (i,i)/3, ... and so on. It is convenient, though not faithful to the letter of Russell's exposition, to interpret each of these hierarchies cumulatively: for example, count an entity of type (i,i)/3 as also being of types (i,i)/4, (i,i)/5, and so on. With the types so far listed, we have ramified second-order logic, as described in sections 58 and 59 of Church [1956]. The numeral after the slash in a type designation is said to give the level of the type, or of the entities of that type.
To get ramified third-order logic, we add types for entities taking arguments of at least one of the non-individual types recognized in ramified second-order logic. Once again, the entities taking (sequences of) arguments of given types are subdivided into levels. Thus we will have types ((i)/1)/1, ((i)/1)2, ..., of properties taking as arguments level-one properties of individuals, types (i,(i)/3)/1, (i,(i)/3)/2, ..., of relations between individuals and level-three properties of individuals, and so on. By the cumulativity of levels for second-order types, an (i,(i)/3)/1 can relate an individual to an (i)/1 or an (i)/2; levels of third-order types are also cumulative, so an (i,(i)/3)/1 counts an (i,(i)/3)/2 as well. In third-order logic (and at higher types) it is useful to define a further notion, the badness of a type. Individuals have badness 0; the badness of other types is the sum of the "worst" allowable argument for an entity of the type with the level of the type. Thus the type (i,(i)/3)/1 has badness 4 (=3, the badness of those (i)/3 entities which are not (i)/2's, +1, the level of the relational type), and (i,(i)/3)/2 has badness 5.
Ramified fourth-order logic adds types of entities taking as arguments entities of at least one third-order type, and the types of a full system Ramified Type Theory, such as our IRT, are all those belonging to ramified n-th order logic for some natural number n.
The vocabulary of IRT will contain (free and bound) variables of all these types, and perhaps constants of some; constants and variables will be said to have the badness of their type. Atomic formulas are of the form a[b1,…,bn], where the bI are terms (constants, variables, or abstracts) of types suitable as arguments to for items of a’s type: note that the badness of the predicate of an atomic formula will always be greater than the badness of any of its arguments. The set of well-formed formulas is specified by the usual sort of recursion. Its deductive apparatus will include, as before, the usual rules of quantificational logic for all types, and rules of b-conversion. The level-aspect of r-types is supposed to reflect the constructional complexity of propositions and propositional functions, and this is what motivates the logic’s specification of what abstracts may be substituted for the variables of a given (non-individual) type. One preliminary technical definition will be useful before stating this specification. In implementing his decision to make r-types cumulative as to level (in the sense that, for given r-types t1,…,tn, items of type (t1,…,tn)/j are construed as also belonging to, and as being in the range of variables of, all types (t1,…,tn)/k for k>j), define one type to be directly lower than another if the number to the right of the final slash in its designation is lower, and everything to the left of the final slash the same, as in the designation of the second type. Then we may state the
Fundamental specification: In the application of the
Quantification and b-conversion rules, an abstract {x1,…,xn: A}
is substitutable for a variable of type (t1,…,tn)/j and
badness k (k>j) if and only if
(i) the abstraction variables x1,…,xn are, in order,
of types t1,…,tn or of directly lower types,
(ii) no free variable or constant occurring in A is
of badness >k, and
(iii) every bound variable occurring in A is of
badness strictly <k.
(B) Abstracts and extensionality. Many applications of IRT (including Russell’s intended one) are incompatible with assuming extensionality, but an extensional variant might be more convenient for mathematics. Imposing extensionality axiomatically is slightly more complicated than it is in Simple Type Theory and other standard set theories. In Simple Type Theory one can postulate, for each type t, a single axiom of extensionality (with variables of types F:((t)), X:(t), Y:(t) and x:t):
"X"Y("x(X[x] ´ Y[x]) Æ "F(F[X] ´ F[Y])).
In IRT this is not possible, the problem lying with the quantifiers of the two higher types. There are infinitely many types, (t)/1, (t)/2, etc., of propositional functions taking items of type t as arguments, so infinitely many axioms with the "X"Y quantifiers rewritten in these different types will be needed. Worse, each of these axioms will have to be reduplicated infinitely many times, with the "F rewritten in the different types capable of taking arguments of the type of the "X"Y quantifiers. (In other words, the axiom of extensionality, already schematic in a type-theoretic context, becomes triply so!)
So much for extensional identity at higher types. In Simple Type Theory one can also define identity of individuals by the formula
"X(X[a] ´ X[b]).
As Russell pointed out, however, this will not suffice in a Ramified logic: whatever level is chosen for the "X quantifier, the corresponding formula with a higher-level X will not be derivable from it (this non-derivability is rigorously established in Myhill [1974]). So yet another infinite set of extensionality axioms will have to be added:
"X(X[a] ´ X[b]) Æ "Y(Y[a] ´ Y[b])
(with variables of types X:(i)/1, Y:(i)/n for n>1, and x:i). For metamathematical purposes, the important thing to note is that extensional systems are interpretable in our basic IRT by making appropriate restrictions on all higher-order quantifiers.
Without extensionality, comprehension axioms will not give as flexible a system as the use of abstracts. Church [1976] is a tour de force, demonstrating that it is possible to use Ramified Type Theory to give an interesting analysis of a semantic paradox while using a variant of the logic with only comprehension axioms, but for other purposes the expressive power of abstracts is desirable. For example, the use of abstracts allows, what comprehension axioms do not, a natural identification of the propositional function expressed by a given formula of the formal system.
(C) Semantics with substitutional quantifiers. Since the motivating idea of propositions and propositional functions has them structured in a way that parallels the construction of the formulas of the formal language, the idea of Ramified Type Theory substitutionally was a natural one. An interpretation of this kind was given in Fitch [1938]. The clearest discussions are in Parsons [1971], [1971a], [1974] and [1982].
There are both conceptual and technical difficulties with such a semantics, which we can either ignore or finesse. One conceptual problem has already been mentioned: there can be no guarantee that (empirical) discoveries will not lead us to supplement our language with new predicates expressing universals that are at present unknown. It is also not clear how we should formulate the semantics of higher-order predicate constants, such as those expressing intensional notions like meaning or belief. This problem can be ignored in the context of a comparison with Russell's 1925 logic, for such constants were explicitly excluded from the later logic. Finally, there is the problem of referentially interpreted quantification over individuals: if there are nameless individuals in the domain interpreting variables of type i, it will in general not be possible to interpret quantifications of other types in terms of closed substitution instances. This last technicality was pointed out by Parsons; for simplicity we may limit consideration to interpretations of IRT over models with domains of named individuals.
A substitutional model for IRT (with no predicate constants of types other than (i,...,i)/1), then, is simply a structure in the sense of the model theory of First Order Logic, interpreting the predicate constants, in which all individuals are named. As one would expect, an atomic sentence (the semantics deals only with closed formulas!) is true if and only if the tuple of objects named by its terms are in the extension of its predicate. It will slightly simplify the exposition (and, since the existence of b-normal forms is decidable, etc., it is harmless) to assume that all sentences are in b-normal form: abstracts occur only as arguments to bound predicate variables.
Then truth of sentences in general can be defined by the usual sort of recursion:
--a negation is true iff the negated sentence is not true;
--a conjunction is true iff both of its conjuncts is true;
--similarly for any other connectives you want to include;
--a universal quantification is true iff all of its instances are
true; and
--an existential quantification is true iff at least one of its
instances is true,
where
--an instance of a quantification with quantified variable of type
i is any sentence formed from it by deleting the initial
quantifier and replacing all occurrences of the variable
bound to it with occurrences of some one name, and
--an instance of any other quantification is any sentence formed
from it by deleting the initial quantifier and replacing all
occurrences of the variable bound to it with occurrences of
some one closed abstract and (if necessary) normalizing.
Intuitively
it is clear that this semantics explains the truth value of every complex
sentence by reducing it to a question of the truth values of simpler
ones. This notion of reduction is often
dramatized with the metaphor of a semantic game. To determine the truth value of a sentence,
we give it to two omniscient players, Pro, who will win if the game ends
with a true atomic sentence, and Opp, who will win if the game ends with
a false atomic sentence. At each stage
of the game, if the sentence in play is a conjunction or universal
quantification, Opp gets to choose a conjunct or instance, trying for a false
one. If the sentence in play is a
disjunction or existential quantification, Pro chooses, trying for a true
disjunct or instance. If it is a
negation, they switch sides and play with the negated sentence; rules for other
connectives are readily specified. The
starting sentence is false if Opp wins, true if the Pro can.
The semanticsis only proper, however, if the game always terminates: whatever sentence the players start with, no sequence of moves, each move reducing the sentence in play to one of those its truth value depends on, can go on forever. (Technically speaking: the reduction relation must be well-founded.) To prove this, define an ordinal measure of complexity, or grade, on sentences and their quasi-sub-formulas as follows:
--the grade of an atomic sentence or quasi-formula with a
predicate constant as its predicate is 0
--the grade of an atomic quasi-formula with a predicate variable
as its predicate is w.k, where k is the badness of the variable
--the grade of a negation is the grade of the item negated +1
--the grade of a conjunction (etc) is 1 more than the grade of
the higher grade conjunct (...)
--the grade of a quantification is the grade of the matrix +1.
It is easy to see that the sentences to which the semantics "reduces" the truth value of a given non-atomic sentence will always be of lower grade. The only difficult case is that of a quantification with a quantified variable of non-individual type. Arguments to a predicate variable must have lower badness than the variable itself. So occurrences of the quantified variable to be replaced as arguments in atomic quasi-subformulas don't matter: replacing them with abstracts will not affect the grade of the sentence. There remains the case of a predicate variable occurring as a predicate. Since it is to replaced by a closed abstract, every variable occurring in the abstract, as well as variables standing as (or occurring within abstracts standing as) arguments to it must be of strictly lower badness. When the abstract is b-reduced, therefore, the resulting (quasi-) formula will have a lower grade than the original atomic quasi-formula, and so (induction on sentence construction) the whole sentence will have its grade reduced.
Note that the "substitutional" semantics defined for IRT is an "honest" one, unlike that claimed for Simple Type Theory in Leblanc [1975]. No matter what set of primitive predicates we start with, every theorem of IRT will be true on the semantics: the soundness of the deductive apparatus can be established by the usual sorts of argument. (Completeness, of course, is another matter: as pointed out by Dunn & Belnap [1968], completeness on a substitutional semantics is not to be hoped for without infinitary inference rules!) It is, in other words, consistent with the acceptance of the deductive apparatus of IRT to believe that one's language has enough predicates to express all the ways in which individuals resemble and differ from one another. Leblanc's interpretations, in contrast, are related to non-standard models of Simple Type Theory (cf. Henkin [1950]), and may require the addition of infinitely many new predicates to a given language to verify the theorems of the system. Leblanc, in other words, obtains substitutional semantics only by postulating an unknown language extending the one he starts with: surely a case of theft rather than honest toil! On the other hand, the semantics of Leblanc & Weaver [1973] (and also that presented in Grover [1973] in the same volume) is close to that presented here.
The complexity measure used here is closely related to Gentzen [1934]'s notion of grade, which was employed in proving his Hauptsatz for First Order logic. For semantic purposes we are only interested in the complexity of sentences, but a slightly more devious measure assigns complexity to formulas containing free variables (and has the same limiting complexity w2). With this revised definition of grade, then, a Hauptsatz for IRT can be proven by reconstruing Gentzen's induction on grade as transfinite induction, and otherwise taking over his proof unchanged. The provability of cut-elimination for Ramified Type Theory has been known at least since Lorenzen [1951]. There are treatments in Schütte [1960] (first edition only!) and Toledo [1975].
Section 3: The 1925 Logic
In this section we describe the 1925 logic ("BMT" for "Appendix B, Modified Type Theory). The first subsection describes the motivating ideas and the divergences from IRT. The (independently readable) second gives a formal description of BMT. The third discusses extensionality axioms and rules. The fourth proves the existence of a substitutional semantics.
(A) The Influence of Ramsey and Wittgenstein. Russell's earlier logical work tried to construct a general intensional logic, one in which such notions as the cognitive relations between a mind and the propositions it understands and believes could be expressed. The type structure of IRT is natural and well-motivated, allowing the type theory to defuse the semantic and intenTional (epistemic) paradoxes as well as the set-theoretic ones. Consider, for example Buridan's scenario, in which Socrates (by hypothesis an infallible and instantaneous logician) reads with understanding the following graffito:
(*) Socrates does not believe this proposition
(and let us interpret "proposition" in a Russellian rather than Buridanian way). Socrates, ever questioning, considers the proposition, and sees that if he were to decide it was true and start to believe it, this would render it false: which, since he recognizes it, means he cannot rationally believe it. On the other had, unless and until he comes to believe it, it is true, and this too he recognizes. Assuming, as we have, that his beliefs are formed with perfect rationality, he must both believe it and not believe it.
Most modern treatments of this take the "this proposition" to be a definite description, short for something like "the proposition expressed (in English) by this inscription." This allows at least two distinct resolutions:
(a) we may (following Russell's discussion of, e.g., the Epimenides) point out that when the definite description is given its Russellian contextual definition, a propositional quantification occurs, and Ramified Type Theory yields the result that the proposition is non-paradoxically false-- whatever level (badness) we assign to the quantification in it, the proposition itself will be of the next level up, so, in the sense of the quantifier, there will be no proposition expressed by the inscription, or
(b) we may instead choose to "ramsify, not ramify," and focus attention on the systematic ambiguity of "expressed," appealing to the doctrine of "levels of language" to conclude that no proposition at all is expressed by the inscription, for the relation of expression holding between sentences of any language and propositions can only be expressed by a predicate of a higher language.
Neither of these approaches require that propositional functions be limited in the levels (as opposed to the simple types) of the arguments they take. It is, however, far from obvious that "this proposition" should be analyzed as a definite description or as involving a reference to the inscription. "This," after all, is Russell's favorite example of a logically perfect name, and if Socrates understands the inscription, is acquainted with each of the constituents of the proposition expressed, and is actively considering the proposition, then this is as good a candidate as one could hope for for an example of direct acquiantance with a proposition. Suppose, therefore, that we interpret the "this proposition" as directly referential. There is then no mention of the expression relation, and no quantification over propositions. The only (obvious) remaining way to defuse the paradox is to postulate a type-ambiguity in believe. Since there are many types of propositions, the type structure of IRT forces us to suppose that there is a corresponding hierarchy of belief relations, and the paradox is dispelled by noting that no belief relation can apply to a proposition of which it is a constituent.
In a certain sense (Russell, in the Introduction to the second edition of Principia, uses these very words), however, low-level propositional functions can apply to high-level arguments. Suppose we want to say of a (i,i)/1 relation, R, expressed by a predicate constant, that it is, say, symmetric. We may do so by asserting a purely first order sentence:
"x"y(RxyÆRyx).
Replace the predicate by a predicate variable and we get a formula defining a ((i,i)/1)/1 property of relations. But now suppose we wanted to say of some high level binary relation of individuals-- perhaps an (i,i)/79-- that it was symmetric. Whatever horrendous formula we would have to write to express the relation, the additional material needed to say it was symmetric would be precisely the same as the material used in saying that R was symmetric: the conditional operator and a couple of quantifiers with variables ranging over individuals. Officially, we have a hierarchy of symmetry properties: the ((i,i)/1)/1 first defined, the ((i,i)/79)/1 we (in effect) defined when we said the (i,i)/79 was symmetric, and so on. (And since there is no highest level of binary relations of individuals, no one symmetry property can be ascribed meaningfully to all such relations.) Still, although the theory of types implicit in IRT insists that these are different, there is a clear sense in which they have much in common, in which they are defined in the same way. If we didn't have some other reason for distinguishing between properties of relations taking relations different level as arguments, it would be tempting to identify them....
By about 1920 Russell, under the combined influence of Wittgenstein and of such psychological behaviorists as Watson, had largely abandoned his theory of belief, etc., as relations between minds and propositions. Appendix C, added to the second edition of Principia, sketches a defense of the new view, on which there are no intensional primitives, and the only higher order properties (relations) recognized are those definable by logical means from (ultimately) the first order primitives. (Readers who recall the jacket blurb, "This is the book that has meant the most to me," from W.V. Quine on the back cover of Whitehead & Russell [1962] will see in Russell's ontological development a precedent for Quine's later "flight from intension.") The net effect of the psychological speculation is to suggest a reduction of intentional notions to semantic ones, somewhat in the manner of more recent speculations in cognitive science about "the language of thought." As for the semantic paradoxes, they can be defused by means of the doctrine of "levels of language" (as Gödel [1944] hints via references to Ramsey and Tarski). It seems fair to attribute this line of thought to Russell, as he had already given a very clear statement of the levels of language idea at the end of his preface to Wittgenstein's Tractatus.
One cryptic statement of the new view that Russell gives in the Introduction to the second edition, "Propositional functions occur only through their values," can be interpreted as an endorsement of the substitutional interpretation of higher-order quantifiers as an "intended interpretation." Terms (in particular, variables) for propositional functions occur in argument position to higher order predicates in the new logic as in the old. If the slogan means anything, therefore, it must mean that the truth conditions of sentences containing occurrences of function-terms as arguments must, ultimately, be explicated in terms of sentences in which the function terms occur as predicates: sentences, in other words, in which subformulas expressing propositional values of the functions occur. Ultimately, then, the only genuine basic facts are such as could be expressed in a First Order language with appropriate predicates. As Russell more or less says, the role of the machinery of higher order predication and quantification is simply to abbreviate various infinitary conjunctions and disjunctions of such First Order statements.
(B) BMT: Formalism. We specify the type structure, the vocabulary and formation rules of the formal langage, and the deductive apparatus.
Types. There is a BMT-type i of individuals. For every
non-individual simple type (thus: for the types of propositions and of
propositional functions taking arguments of various simple types) t and every positive integer n, there
is a BMT-type t/n of propositional
functions (or propositions) of level n and simple type t.
Note that the BMT-type of a non-individual in the ontology of the logic
is only partially specified by giving its simple type, but that the single notion
of level does the duty of both level and badness in IRT; individuals will again
be assumed to be of level 0. Where
confusion seems unlikely, "BMT-type" will be abbreviated to
"type." The simple type corresponding
to a BMT-type t/n is t (and i corresponds to i).
Non-logical vocabulary. We may have constants denoting individuals, and certain predicate constants. In giving the substitutional interpretation of IRT we assumed that the only primitive predicates were level 1 predicates of individuals, but IRT allows (and its application as an intensional logic requires) predicates of higher type. The intended interpretation of BMT, however, seems to rule such things out. All predicate constants, therefore, are of type (i,...,i)/1 (with 0 or more occurrences of i).
Logical vocabulary. We have the usual connectives, quantifier symbols, parentheses, brackets, and braces, and many, many variables: for each BMT-type we have alphabets of free and bound variables, and, in addition, for each simple type we have an alphabet of abstract variables (which will only occur bound).
Well-formed
expressions. We define the notions
of term and formula by simultaneous recursion:
i. Free variables (i.e. variables from the first class of alphabets) and constants of a given type are terms of that type;
ii. If F is a term of some type (t1,...,tn)/k and a1,...an are terms of types t1,...,tn respectively, F[a1,...an] is a formula;
iii. Negation and the other truth functional connectives form new formulas out of old as usual;
iv. If A is a formula, not containing any occurrence of the bound BMT-typed variable x, a is a free variable of the same BMT-type as x, and Ax/a is the result of replacing every occurrence of a in A with an occurrence of x, then $x(Ax/a) and "x(Ax/a) are both formulas; and
v. If A is a formula, a1,...an are free variables of BMT-types t1,...,tn respectively, x1,...,xn abstract variables of the corresponding simple types t-1,...,t-n which do not occur within A, and Aa.../x... is the result of simultaneously substitution occurrences of x1,...,xn for all occurrences of a1,...an in A, then {x1,...,xn:Aa.../x...} is a term of the BMT-type (t-1,...,t-n)/k, where k is the maximum of the levels of the free BMT-typed variables occurring in Aa.../x... or 1+(the maximum of the levels of the quantified variables occurring in A), whichever is higher. (Note that this includes the case of propositional abstracts: where A is a formula, {A} is a term of some type ()/k.)
Deductive apparatus: The usual machinery of rules for the connectives and for quantifiers of each type, where the terms substitutable for the bound variables of any given BMT-type are stipulated to be precisely the free variables, constants, and abstracts of that type. For the purposes of this paper, this completes the description of the logic BMT; axioms of extensionality, of infinity, and for identity will be considered non-logical.
(C) Extensionality. The 1925 logic is based on an extensional notion of propositional functions in (at least) one clear sense. Higher-order properties and relations-- properties and relations, that is, holding of propositions and propositional functions-- are countenanced only insofar as they can, in principle, be defined by expressions of an extensional language in which the variables for their arguments occur as (the predicates of) atomic subformulas. No property defined in such a way can distinguish between co-extensive propositional functions. Analogues of set theory's Axiom of Extensionality ought to be valid on Russell's intended interpretation, but, since the propositional functions of a given simple type come in a topless infinite hierarchy of levels, no quantified variable can range over all the potential arguments of a higher-order predicate, so no straightforward statement of extensionality is possible. The formulation of appropriate extensionality principles is thus a delicate matter, and their attribution to Russell somewhat contentious.
In the Introduction to the second edition, Russell (pp. xxxix, xl) somewhat obscurely suggests that certain inferences related to extensionality are legitimate if their premisses are "logical truths," but not if they are only "hypotheses." The idea seems to be something like this. Suppose
"x(F[x] ´ G[x]),
where x is a variable for propositional functions of some type, is a theorem of the logic. In typical cases it will have been proven by proving
(F[a] ´ G[a]),
with a free variable, a. Now, provided nothing in the proof depended on the level of the variable a, the same proof can be imitated with a higher level variable, and the equivalence thus established for propositional functions of (the same simple type but) arbitrary level.
Thus, even though the variable x is of some definite level, the provability of
"x(F[x] ´ G[x])
often indicates that the equivalence holds at higher levels, whereas its mere truth, or the fact that it follows from assumptions, does not.
Although an extensionality rule of this sort seems to be valid on Russell's intended interpretation, the sort of extensionality axiom familiar from set theory,
"F"G("x(F[x] ´ G[x]) Æ "F(F[F] ´ F[G])),
is problematic. Russell states it (p. xxxix), but may not have meant to endorse it at all types: the variable x was restricted, in an earlier section of the introduction, to ranging over individuals. If the variable x is of type i (and F, G, and F consequently of types (i)/i, (i)/j, and ((i))/k for some levels i, j and k), it is valid: since there are no distinctions of level among individuals, the antecedent says in full generality that F and G are coextensive, implying that they cannot be distinguished by any F definable in an extensional language. If x is of any other type, however, it is not valid. Its implausibility for function types can be seen by considering its corollary,
"F"G("x(F[x] ´ G[x]) Æ "y(F[y] ´ G[y])),
where y is of the same simple type as x but of higher level. This says, essentially, that if F and G agree on all functions of some low level, they will also agree on all functions of a higher level, but this would only follow if for every higher level function a coextensive function already existed at the lower level. And this is simply Russell's Axiom of Reducibility, which he was not assuming in the introduction and Appendix B, and which is not valid on the intended interpretation.
(D) Substitutional semantics. Let a substitutional model be as in the previous section: a structure of the standard sort for the first order fragment of the language, with named individuals. Atomic sentences are true in this model under the usual conditions. We must define truth for arbitrary sentences in it. Truth-functionally compounded sentences depend on their truth-functional constituents in the standard way, and universal (existential) quantifications are true if and only if all (at least one) of their instances are true. If the quantified variable is an individual variable, the instances are the results of deleting the initial quantifier and substituting a name of an individual for all remaining instances of the bound variable. If the quantified variable is of any other BMT-type, the instances are the results of deleting the initial quantifier and substituting some (closed) abstract for all the remaining occurrences of the bound variable (and then, if necessary, reducing to b-normal form). This truth definition reduces the truth value of a complex sentence to the truth values of its instances (or truth-functional compounds), but it must be shown that the reducibility relation is well-founded: that the game won't go on forever. We show this by defining an ordinal measure of complexity or grade for formulas of the language, and showing that every reduction of a formula (to an instance, if it is a quantification, or to a truth-functional constituent) leads to a reduction in grade.
The ordinal assignment for formulas of BMT will have to be a bit subtler than that for IRT. We may, as before, assign 0 to atomic formulas having predicate constants of the language as predicates. With IRT, the grade of an atomic formula with a variable as its predicate was determined by the badness of the predicate, but this will not do for BMT: since, in BMT, arguments may have the same level as, or a higher level than, the predicate of an atomic formula, the levels of the arguments have to be taken into consideration as well: consider the atomic formula
F[{A}],
where F is a variable of type ()/1, and A is a formula containing variables of high level. Admissible substituents of F will include
{p:p}
and
{p:(p&p)},
so the atomic formula will have substitution instances which (after b-reduction) include A itself and the conjunction (A&A). The grade of the atomic formula, therefore, will have to reflect the complexity of the argument. It turns out that the appropriate strategy is to make the grade of an atomic formula depend on the maximum of the levels of its terms, regardless of whether the maximum is achieved by the predicate or an argument (or both). Where one or more of the arguments are abstracts, they will be deemed to have the same level as the lowest-level BMT-typed variables substitutable for them: the level of an abstract is therefore that of the highest-level variables occurring free anywhere within it, including occurrences in smaller abstracts occurring in it, or one more than the highest-level quantified variable occurring anywhere within it, including occurrences in smaller abstracts, whichever is higher.
In defining the grades of more complex formulas for IRT, we were able to use a particularly simple scheme for the purposes of the substitutional semantics, where only closed formulas (sentences) were at issue. Because the open formulas substitutable for a variable could contain free variables of the same badness as the first variable itself, however, a more complicated scheme was needed for proof-theoretic purposes. Since the highest level term of an atomic formula of BMT may not be the predicate, the sort of problem that forced us to use a more complex scheme with open formulas in IRT arises in BMT arises even in the semantics of BMT. Again, in IRT we were able to keep the ordinals low (< w2) by the trick of raising the grade by only 1 for an outer quantifier with a variable of a certain badness, once an inner quantifier with a variable of the same or worse badness had been recorded. This, however, was possible only because the arguments of an atomic formula are not as bad as the predicate, so that when a predicate variable is eliminated, the formula that replaces an atomic formula with the given variable as predicate must be built up from atomic formulas of strictly lower grade. In BMT the level of a quantified variable must be recorded in the grade of a formula regardless of how high the grade of the matrix of the quantification is. Thus, for example, let F in our previous example of a formula be of type ()/2, with A containing bound variables of higher level. Then, if we tried to use the trick that kept the grades low in IRT, our grade for
$F(F[{A}])
would be (level of {A})+1. One admissible substituent for a variable of type ()/2, however, is
{p:$Y(Y[{A}])},
where Y is a variable of type ()/1, so that
$Y(Y[{A}])
turns out (after b-reduction) to be an instance of
$F(F[{A}]).
Clearly, the grade of a formula of this form must record the level of the variable quantified by the initial quantifier.
A definition of grade satisfying these desiderata may be defined as follows:
--the grade of an atomic sentence or quasi-formula with a
predicate constant as its predicate is 0
--the grade of an atomic quasi-formula with a predicate variable
as its predicate is w(2k-1), where k is the level of that variable, or of the highest-level quasi-term occurring
as an argument to it, whichever is higher
--the grade of a negation is the grade of the negated formula
(or quasi-formula) +1
--the grade of a conjunction (etc) is the grade of whichever conjunct (...) has the higher grade + 1
--the grade of a quantification with an individual variable is the grade of the matrix +1.
--the grade of a quantification with a higher-order variable of
level k is the grade of the matrix + w2k.
(Note that the desire to record the level of variables of quantification independently of the grade of the matrix forces us to make our "jumps" by exponentiation rather than-- as with IRT-- multiplication. As a result the limit of the grades of BMT formulas is ww instead of w2.)
It remains to verify that a reduction of a formula always reduces grade. The cases of truth-functional compounds and quantifications with individual variables are standard, leaving the case of a higher-order quantification with a variable of level k. The initial effect of deleting the quantifier is to subtract w2k from the grade: in terms of the standard notation for ordinals, to delete a final addend of the form
w2k (or to reduce by one the parameter in a final addend of the form (w2k).n). The grade of the matrix may increase, but not by enough to undo the effect of this subtraction. Note first that the substitution of an abstract for a variable as an argument in an atomic (quasi-)formula cannot increase the grade of that atom, so we only need to worry about substitutions of abstracts for predicates in atomic quasi-formulas. An abstract substitutable for a bound variable of level k may contain bound variables of level no higher than k-1, and free variables of level no higher than k (in the proof-theoretic environment: in the semantics, the only substituents considered are closed abstracts). Concretizing to get rid of the abstract, we get a (quasi-)formula built up from atoms having, as their predicates and arguments, free and bound variables of the abstract and/or argument terms from the original atom. (If some of the arguments of the original atom were themselves abstracts, further concretion may be needed to obtain a b-normal form, yielding additional possibilities for the atoms of the new (quasi-)formula. Since, however, the grade of the original atom already reflected levels of the highest-level variables occurring in it, including occurrences within abstracts, it can be seen that this further complication does not affect the argument.) Thus the highest possible grade for an atomic (quasi-)subformula of the new (quasi-)formula is the same as the grade of the atom we started with. The new (quasi-) formula is built up from these atoms by means of truth-functional connectives and quantifiers with variables of level no higher than k-1. Its grade, therefore, will be less than the grade of the original atom + w(2k-1). Substitution of formulas of this grade for occurrences of the original atom, in turn, can raise the grade of the matrix of the original quantification by no more than w(2k-1), so the reduction has yielded a formula of lower grade.
This notion of grade can also be used to prove the Hauptsatz.
Section 4: Mathematical Induction
(A) Arithmetic in Ramified Logics. The first edition of Principia postulated reducibility. Given infinitely many individuals this allows the definition of cardinal numbers as ((i))s (the equivalence classes of (i)s under the equipollence relation) and the definition of the finite or natural numbers as those belonging to every (((i))) containing 0 and closed under successor. First and Higher Order Peano Arithmetics are interpretable in the system, with the rule of mathematical induction (IND) derived by simple universal quantifier elimination from the definition of natural number. The Axiom of Reducibility, however, is introduced somewhat apologetically, as justified mainly in terms of its consequences. What could Russell have done if he had combined his philosophy of logic, including a certain scepticism about Reducibility, with Brouwer's (or Weyl's) personality and willingness to advocate trimming mathematics to eliminate the philosophically suspect bits? More precisely, how much of this development is possible in a Ramified system: say, IRT plus a weak Axiom of Infinity? For simplicity, assume that identity of individuals is given as a primitive predicate constant of type (i,i)/1.
It turns out that the definitions of cardinal number and the most familiar arithmètic functions (addition, multiplication and-- with a bit more work-- exponentiation) can be taken over with minimal changes, and can be proven to be total and to satisfy their usual recursion equations over the finite numbers. Additional functions can be defined by composition and by bounded primitive recursion. The situation with IND, however, is delicate. The quantification over (((i)))s in the definition of finite has to be restricted to some level or other-- for example, to (((i)/1)/1)/3s. The derivation of IND then applies only to conditions defining propositional functions of this type: in particular, to conditions which, when written out in primitive notation, contain no quantified variables of this type. Most of the induction axioms of First Order Arithmetic are therefore not derivable in IRT+Infinity: quantification over natural numbers is represented by quantification over cardinals restricted to finite ones, so any number-theoretic condition containing (unbounded) quantification over natural numbers translates into one containing forbidden bound variables! It is, however, possible to prove that numbers less than a given natural number are themselves natural, so that IND for number-theoretic conditions containing only bounded numerical quantifiers is available. (Cf. Hazen [1982], Burgess & Hazen [1999])
(B) Appendix B. BMT is a proper supersystem of IRT, so it is not immediately apparent that a foundational system based on it will have be as weak as IRT+Infinity. In the second of the three appendices added to the second edition of Principia, Russell claimed to have derived unrestricted IND in BMT+Infinity. Had the claim been correct, this system would have served as a natural foundation for predicative analysis, as advocated by Weyl and his followers. The claim is, however, wrong: the proof given is irremediably flawed, as Gödel [1944] appears to have been the first to note.
Still, considering the supposed proof suggests that BMT might have advantages over IRT. Russell's lemma *89.12 states that every propositional function true of only finitely many objects is coextensive with one of minimal level. This lemma is correct, and of considerable use in deriving mathematics within ramified systems (cf. Hazen [1992]). A stronger form is provable in BMT than in IRT. For simplicity let us consider propositional functions true of individuals and assume that identity of individuals is expressed by a primitive of type (i,i)/1. For convenience we adopt set-theoretic vocabulary ("subset," "member," ...) for talking about propositional functions and the items of which they are true. Recall also that levels are cumulative, so that (i)/1 "sets" are included among the (i)/2s.
In IRT we may define, for example, an (i)/2 set, X, to be finite if and only if it belongs to every ((i)/2)/1 class which (a) includes all unit subsets of X, and (b) for each proper (i)/2 subset, Y, of X included and each individual, x, in X-Y includes also the union of Y with the unit set of x. (A fairly standard set-theoretic definition of finite set.) The class of (i)/2 subsets of X which are coextensive with (i)/1 sets is a ((i)/2)/1 class, denoted by the abstract
{Y:"x(Y[x]ÆX[x]) & $A"x(Y{x]´A[x])},
where X and Y are of type (i)/2, A of type (i)/1, and x of type i. We now prove that it fulfills conditions (a) and (b).
(a) Let Y be a unit (i)/2 subset of X and x be its member. Then
{y:y=x}
is a (i)/1 set coextensive with Y.
(b) Let Y be an (i)/2 subset of X, let x be a member of X not in Y, and let A be an (i)/1 set coextensive with Y. Then
{y:A[y] v y=x}
is an (i)/1 set coextensive with the (i)/2 set
{y:Y[y] v y=x}.
This proves *89.12: for finite (i)/2, but says nothing about finite (i)/3 (and higher) sets: since they cannot belong to ((i)/2)/1 classes, the definition of finiteness does not apply to them.
BMT, by contrast, proves *89.12 in full generality. The quantified class variable in the definition of finiteness may be taken to be of BMT-type ((i))/2, taking as arguments variables of simple type (i) and any level, and the bound set variables in conditions (a) and (b) as of BMT-type (i)/2. The definition is applicable to sets of type (i)/n for any n, so finite sets of all levels can be shown coextensive with (i)/1s.
Unfortunately for Russell, the result that he really needed, that every subset of a finite set is finite (his *89.17) is still not forthcoming, leaving the question of just how strong BMT+Infinity is. Gödel [1944] leaves the question open:
"So the question whether (or to what extent) the
theory of integers can be obtained on the basis
of the ramified hierarchy must be considered as
unsolved at the present time." (p. 146)
Burgess's negative result (Burgess & Hazen [1999]) for a system based on Ramified Second Order Logic is established by reference to Gödel's Incompleteness Theorem: the system cannot yield the mathematics needed for a proof of its own consistency, where a key step in the obvious consistency proof is the establishment of Gentzen's Hauptsatz for the logic. The fact that the complexity ordering of sentences used in establishing the substitutional semantics and Hauptsatz for BMT is longer than that needed for IRT suggests that they may require stronger assumptions than those for IRT, but (First Order) Peano Arithmetic seems strong enough for the proofs. Given the Hauptsatz for the logic of BMT, a consistency proof for BMT+Infinity is straightforward. By the Hauptsatz, any First Order conclusion derivable from First Order premisses in BMT is derivable from them in First Order Logic: any BMT-provable sequent containing only First Order formulas in the antecedent and succedent is a valid sequent of First Order Logic. To prove the consistency of BMT+Infinity, then, it suffices to give an evidently consistent First Order theory from which the axiom of Infinity is derivable in BMT. Consider the (decidable) theory of dense linear order, formulated with identity and the order relations as primitives. (To turn this example into a concrete example of a substitutional model, take the rational numbers as the domain of individuals, with names given by appending fractions as subscripts to some constant symbol.) General substitution of identicals is not derivable from the First Order axioms for identity: substitution axioms allowing the substitution of terms for identicals as arguments to the predicate constants of the First Order language will not yield substitution as arguments to (i)/1 predicate variables. The system with general substitution of (individual) identicals is, however, interpretable in the system without: simply restrict all higher order quantifiers of appropriate types to propositional functions for which substitution holds. A good axiom of Infinity asserts the existence of a ((i))/1 class of non-empty (i) sets containing a proper superset of each of its members. But the class of sets of the form
{x:a<x & x<b}
for a<b will work. Given the elementary nature of the consistency proof just sketched, it seems very unlikely that full IND (even for conditions in the language of First Order Peano Arithmetic) is derivable in BMT+Infinity, but determining exactly what subsystem of PA is obtainable would require a more refined analysis.
References
Anderson, C.A., [1986] “Some difficulties concerning Russellian
Intensional logic,” Nous 20, pp. 35-43.
--- [1989] “Russellian intensional logic,” in Joseph Almog et al., eds.,
Themes from Kaplan, New York: Oxford University Press.
Burgess, John P., and A.P. Hazen, [1999] “Predicative logic and formal
Arithmetic,” Notre Dame Journal of Formal Logic 39 (1998
cover date), pp. 1-17.
Church, Alonzo, [1956] Introduction to Mathematical Logic, Princeton:
Princeton University Press.
--- [1974] “Russellian Simple Type Theory,” Proceedings and
Addresses of the American Philosophical Association 47, pp. 21-
33.
--- [1976] “Comparison of Russell’s resolution of the
semantical antinomies with that of Tarski,” Journal of Symbolic
Logic 41, pp. 747-760; repr. with correction in R.L. Martin, ed.,
New Essays on Truth and the Liar Paradox, New York: Oxford
University Press.
--- [1984] “Russell’s theory of identity of propositions,” Philosophia
Naturalis 21, pp. 513-522.
Davoren, J.M., and A.P. Hazen, [1991] "Russell, Gödel and Skolem: how
much of arithmetic is predicative?" Journal of Symbolic Logic
56, pp. 1108-1109.
Dunn, J.M., and N.D. Belnap [1968] "The substitution interpretation of the quantifiers," Nous 2, pp. 177-185.
Fitch, Frederick B., [1938] "The consistency of the ramified Principia,"
Journal of Symbolic Logic 3, pp. 140-149.
Gentzen, G., [1934] "Untersuchungen über das logische Schliessen,"
Mathematische Zeitschrift 39, pp. 176-210 and 405-431; Eng.
tr. "Investigations into logical deduction," American Philosophical
Quarterly 1 (1964), pp. 288-306 and 2 (1965), pp.204-218;
Eng. tr. repr. in [1969].
--- [1969] Collected Papers of Gerhart Gentzen (ed. M. Szabo),
Amsterdam:
North-Holland.
Gödel, Kurt, [1944] "Russell's Mathematical Logic," in P.A. Schilpp, ed.,
The Philosophy of Bertrand Russell, Evanston: Northwestern
University.
Grover, Dorothy, [1973] "Propositional quantification and quotation
contexts," in Leblanc [1973].
Hazen, A.P., [1985] “Nominalism and abstract entities,” Analysis 45,
pp. 65-68.
--- [1992] "Interpretability of Robinson's arithmetic in the ramified
second-order theory of dense linear order," Notre Dame Journal
of Formal Logic 33, pp. 101-111.
Henkin, Leon, [1950] "Completeness in the theory of types," Journal
of Symbolic Logic 15, pp. 81-91.
Landini, Gregory, [1996] "The definability of
the set of natural