(2.80)
2.4 Numerical values
We have found so far the most general consistent rules by which our robot can manipulate plausibilities, granted that it must associate them with real numbers, so that its brain can operate by the carrying out of some definite physical process. While we are encouraged by the familiar formal appearance of these rules and their qualitative properties just noted, two evident circumstances show that our job of designing the robot’s brain is not yet finished.
In the first place, while the rules (2.63), (2.64) place some limitations on how plausibilities of different propositions must be related to each other, it would appear that we have not yet found any unique rules, but rather an infinite number of possible rules by which our robot can do plausible reasoning. Corresponding to every different choice of a monotonic function p(x), there seems to be a different set of rules, with different content.
Secondly, nothing given so far tells us what actual numerical values of plausibility should be assigned at the beginning of a problem, so that the robot can get started on its calculations. How is the robot to make its initial encoding of the background information into definite numerical values of plausibilities? For this we must invoke the ‘interface’ desiderata (IIIb), (IIIc) of (1.39), not yet used.
The following analysis answers both of these questions, in a way both interesting and unexpected. Let us ask for the plausibility $p(A_1 + A_2 + A_3|B)$ that at least one of three propositions $\{A_1, A_2, A_3\}$ is true. We can find this by two applications of the extended sum rule (2.66), as follows. The first application gives
$$p(A_1 + A_2 + A_3|B) = p(A_1 + A_2|B) + p(A_3|B) - p(A_1 A_3 + A_2 A_3|B),$$
where we first considered $(A_1 + A_2)$ as a single proposition, and used the logical relation
$$(A_1 + A_2) A_3 = A_1 A_3 + A_2 A_3.$$
(2.81)
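Applying (2.66) again, we obtain seven terms which can be grouped as follows:
$$p(A_1 + A_2 + A_3|B) = p(A_1|B) + p(A_2|B) + p(A_3|B) - p(A_1 A_2|B) - p(A_2 A_3|B) - p(A_3 A_1|B) + p(A_1 A_2 A_3|B).$$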
(2.82)
Now suppose these propositions are mutually exclusive; i.e. the evidence B implies that no two of them can be true simultaneously:
$$p(A_i A_j|B) = p(A_i|B)\,\delta_{ij}.$$
(2.83)
Then the last four terms of (2.82) vanish, and we have
$$p(A_1 + A_2 + A_3|B) = p(A_1|B) + p(A_2|B) + p(A_3|B).$$
(2.84)
Adding more propositions $A_4$, $A_5$, etc., it is easy to show by induction that if we have n mutually exclusive propositions $\{A_1, \ldots, A_n\}$, (2.84) generalizes to
$$p(A_1 + \cdots + A_m|B) = \sum_{i=1}^{m} p(A_i|B), \qquad 1 \le m \le n,$$
(2.85)
a rule which we will be using constantly from now on.
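The seven-term expansion (2.82) and the sum rule (2.85) can be checked numerically. The sketch below is only an illustration under stated assumptions: it models propositions as sets of equally weighted "worlds" (a hypothetical finite model, not part of the text, and one that already presumes equal weights), and the names `WORLDS`, `p`, and `inclusion_exclusion` are invented for this example.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite model (an assumption for illustration): six equally
# weighted "worlds"; a proposition is the set of worlds in which it is true.
WORLDS = frozenset(range(6))

def p(prop):
    """Probability of a proposition as its share of the worlds."""
    return Fraction(len(prop & WORLDS), len(WORLDS))

def inclusion_exclusion(a1, a2, a3):
    """Right-hand side of the seven-term expansion (2.82)."""
    return (p(a1) + p(a2) + p(a3)
            - p(a1 & a2) - p(a2 & a3) - p(a3 & a1)
            + p(a1 & a2 & a3))

# Overlapping propositions: all seven terms are needed.
A1, A2, A3 = frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({1, 3, 4})
lhs_overlapping = p(A1 | A2 | A3)
rhs_overlapping = inclusion_exclusion(A1, A2, A3)

# Mutually exclusive propositions: the plain sum (2.85) is enough.
E = [frozenset({0}), frozenset({1, 2}), frozenset({3})]
assert all(not (x & y) for x, y in combinations(E, 2))
lhs_exclusive = p(frozenset().union(*E))
rhs_exclusive = sum(p(e) for e in E)

print(lhs_overlapping, rhs_overlapping)
print(lhs_exclusive, rhs_exclusive)
```

Exact rational arithmetic via `Fraction` is used so that the two sides can be compared with `==` rather than with a floating-point tolerance.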
In conventional expositions, Eq. (2.85) is usually introduced first as the basic but, as far as one can see, arbitrary axiom of the theory. The present approach shows that this rule is deducible from simple qualitative conditions of consistency. The viewpoint which sees (2.85) as the primitive, fundamental relation is one which we are particularly anxious to avoid (see Comments section at the end of this chapter).
Now suppose that the propositions are not only mutually exclusive but also exhaustive; i.e. the background information B stipulates that one and only one of them must be true. In that case, the sum (2.85) for m = n must be unity:
$$\sum_{i=1}^{n} p(A_i|B) = 1.$$
(2.86)
This alone is not enough to determine the individual numerical values $p(A_i|B)$. Depending on further details of the information B, many different choices might be appropriate, and in general finding the $p(A_i|B)$ by logical analysis of B can be a difficult problem. It is, in fact, an open-ended problem, since there is no end to the variety of complicated information that might be contained in B; and therefore no end to the complicated mathematical problems of translating that information into numerical values of $p(A_i|B)$. As we shall see, this is one of the most important current research problems; every new principle we can discover for translating information B into numerical values of $p(A_i|B)$ will open up a new class of useful applications of this theory.
There is, however, one case in which the answer is particularly simple, requiring only direct application of principles already given. But we are entering now into a very delicate area, a cause of confusion and controversy for over a century. In the early stages of this theory, as in elementary geometry, our intuition runs so far ahead of logical analysis that the point of the logical analysis is often missed. The trouble is that intuition leads us to the same final conclusions far more quickly, but without any correct appreciation of their range of validity. The result has been that the development of this theory has been retarded for some 150 years because various workers have insisted on debating these issues on the basis, not of demonstrative arguments, but of their conflicting intuitions.
At this point, therefore, we must ask the reader to suppress all intuitive feelings you may have, and allow yourself to be guided solely by the following logical analysis. The point we are about to make cannot be developed too carefully; and, unless it is clearly understood, we will be faced with tremendous conceptual difficulties from here on. Consider two different problems. Problem I is the one just formulated: we have a given set of mutually exclusive and exhaustive propositions $\{A_1, \ldots, A_n\}$ and we seek to evaluate $p(A_i|B)_I$. Problem II differs in that the labels of the first two propositions have been interchanged. These labels are, of course, entirely arbitrary; it makes no difference which proposition we choose to call $A_1$ and which $A_2$. In Problem II, therefore, we also have a set of mutually exclusive and exhaustive propositions $\{A'_1, \ldots, A'_n\}$, given by
$$A'_1 \equiv A_2, \qquad A'_2 \equiv A_1, \qquad A'_k \equiv A_k, \quad 3 \le k \le n,$$
(2.87)
and we seek to evaluate the quantities $p(A'_i|B)_{II}$, i = 1, 2, … , n.
In interchanging the labels, we have generated a different but closely related problem. It is clear that, whatever state of knowledge the robot had about $A_1$ in Problem I, it must have the same state of knowledge about $A'_2$ in Problem II, for they are the same proposition, the given information B is the same in both problems, and it is contemplating the same totality of propositions in both problems. Therefore we must have
$$p(A_1|B)_I = p(A'_2|B)_{II},$$
(2.88)
and similarly
$$p(A_2|B)_I = p(A'_1|B)_{II}.$$
(2.89)
We will call these the transformation equations. They describe only how the two problems are related to each other, and therefore they must hold whatever the information B might be; in particular, however plausible or implausible the propositions $A_1$ and $A_2$ might seem to the robot in Problem I.
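The transformation equations can be illustrated with a small sketch. It assumes a hypothetical, deliberately non-uniform assignment for Problem I (the numbers and the function names `relabel_swap_first_two` and `symmetric_under_swap` are invented here, not taken from the text) and shows that relabeling alone only moves values between labels. The last check asks a separate question: whether the two problems receive identical assignments label by label, which can only happen when the first two values are equal.

```python
from fractions import Fraction

def relabel_swap_first_two(p_problem_1):
    """Build Problem II from Problem I by interchanging the first two labels.

    p_problem_1[i] holds p(A_{i+1}|B)_I. In Problem II, A'_1 is A_2 and
    A'_2 is A_1, so each proposition carries its plausibility to its new label.
    """
    p_problem_2 = list(p_problem_1)
    p_problem_2[0], p_problem_2[1] = p_problem_1[1], p_problem_1[0]
    return p_problem_2

# Hypothetical, deliberately non-uniform assignment for Problem I.
p_I = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
p_II = relabel_swap_first_two(p_I)

# Transformation equations (2.88) and (2.89): they hold for any assignment.
assert p_I[0] == p_II[1]
assert p_I[1] == p_II[0]

def symmetric_under_swap(p_problem_1):
    """True if Problems I and II receive identical assignments label by label."""
    return relabel_swap_first_two(p_problem_1) == list(p_problem_1)

print(symmetric_under_swap(p_I))
print(symmetric_under_swap([Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)]))
```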
Now suppose that information B is indifferent between propositions $A_1$ and $A_2$; i.e. if it says something about one, it says the same thing about the other, and so it contains nothing that would give the robot any reason to prefer either one over the other. In this case, Problems I and II are not merely related, but entirely equivalent; i.e. the robot is in exactly the same state of knowledge about the set of propositions $\{A'_1, \ldots, A'_n\}$ in Problem II, including their labeling, as it is about the set $\{A_1, \ldots, A_n\}$ in Problem I.
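Our desideratum of consistency requires that equivalent states of knowledge be represented by equivalent plausibility assignments. In equations,
$$p(A_i|B)_I = p(A'_i|B)_{II}, \qquad i = 1, 2, \ldots, n,$$
which we shall call the symmetry equations. Combining these with the transformation equations (2.88) and (2.89), we obtain
$$p(A_1|B)_I = p(A_2|B)_I.$$
In other words, propositions $A_1$ and $A_2$ must be assigned equal plausibilities in Problem I.
More generally, let $\{A''_1, \ldots, A''_n\}$ be any permutation of $\{A_1, \ldots, A_n\}$, and let Problem III be that of determining the $p(A''_i|B)$. If the permutation is such that $A''_k \equiv A_i$, there are n transformation equations of the form
$$p(A_i|B)_I = p(A''_k|B)_{III},$$
which hold whatever the information B. If B is indifferent between all the propositions $A_i$, the robot is in exactly the same state of knowledge about the set $\{A''_1, \ldots, A''_n\}$ in Problem III as it is about the set $\{A_1, \ldots, A_n\}$ in Problem I, and consistency demands the n symmetry conditions
$$p(A_k|B)_I = p(A''_k|B)_{III}, \qquad k = 1, 2, \ldots, n.$$
From these two sets of relations we obtain n equations of the form
$$p(A_i|B)_I = p(A_k|B)_I.$$
These must hold whatever permutation was used to define Problem III, so all of the $p(A_i|B)_I$ must be equal. Since the $\{A_1, \ldots, A_n\}$ are exhaustive, (2.86) holds, and the only possibility is
$$p(A_i|B)_I = \frac{1}{n}, \qquad 1 \le i \le n.$$
It is the value of $p(x) = p(A_i|B)$, with $x \equiv A_i|B$, that is fixed by the data, not the plausibility $x$ itself; we write $p_i$ for the value assigned to $A_i$ and call these quantities probabilities.
As an example, consider ten balls labeled 1 to 10, of which balls 4, 6, and 7 are black, and let $A_i \equiv$ 'the i-th ball is drawn'. If B is indifferent between the ten possibilities, then
$$p(A_i|B) = \frac{1}{10}, \qquad 1 \le i \le 10.$$
The probability of drawing a black ball is
$$p(\text{black}|B) = p(A_4 + A_6 + A_7|B),$$
and, since these propositions are mutually exclusive, (2.85) gives
$$p(\text{black}|B) = \frac{3}{10}.$$
More generally, if there are N such propositions and A is defined to be true on any specified subset of M of them ($0 \le M \le N$) and false on the rest, then
$$p(A|B) = \frac{M}{N}.$$
(2.99)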
This was the original mathematical definition of probability, as given by James Bernoulli (1713) and used by most writers for the next 150 years. For example, Laplace’s great Théorie Analytique des Probabilités (1812) opens with this sentence:
The Probability for an event is the ratio of the number of cases favorable to it, to the number of all cases possible when nothing leads us to expect that any one of these cases should occur more than any other, which renders them, for us, equally possible.
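The ratio of favorable cases to all equally possible cases can be sketched in a few lines. This is an illustrative sketch only: the helper names `indifference` and `probability` are invented for this example, and the numbers follow the case of ten numbered balls of which balls 4, 6, and 7 are black.

```python
from fractions import Fraction

def indifference(n):
    """Equal probabilities 1/n for n mutually exclusive, exhaustive propositions."""
    return [Fraction(1, n)] * n

def probability(favorable, n):
    """Sum the equal probabilities over the favorable cases, giving M/N.

    `favorable` is a collection of 1-based case labels; duplicates are ignored.
    """
    p = indifference(n)
    return sum(p[i - 1] for i in set(favorable))

# Ten numbered balls, of which balls 4, 6 and 7 are black.
p_black = probability({4, 6, 7}, 10)
print(p_black)  # 3/10
```

The function adds up the individual probabilities rather than dividing M by N directly, so it mirrors the derivation: equal assignments first, then the sum rule for mutually exclusive propositions.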
Exercise 2.3. As soon as we have the numerical values a = P(A|C) and b = P(B|C), the product and sum rules place some limits on the possible numerical values for their conjunction and disjunction. Supposing that a ≤ b, show that the probability for the conjunction cannot exceed that of the least probable proposition: 0 ≤ P(AB|C) ≤ a, and the probability for the disjunction cannot be less than that of the most probable proposition: b ≤ P(A + B|C) ≤ 1. Then show that, if a + b > 1, there is a stronger inequality for the conjunction; and if a + b < 1 there is a stronger one for the disjunction. These necessary general inequalities are helpful in detecting errors in calculations.
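For readers who want to use such inequalities as an error check, here is a minimal sketch. It assumes the standard tightened bounds that follow from the sum rule P(A + B|C) = a + b − P(AB|C) together with 0 ≤ P ≤ 1 (often called the Fréchet bounds); the function names are invented for this example, and the sketch states a result rather than replacing the derivation the exercise asks for.

```python
from fractions import Fraction

def conjunction_bounds(a, b):
    """Lower and upper limits on P(AB|C) given a = P(A|C), b = P(B|C)."""
    return max(0, a + b - 1), min(a, b)

def disjunction_bounds(a, b):
    """Lower and upper limits on P(A+B|C) given a = P(A|C), b = P(B|C)."""
    return max(a, b), min(1, a + b)

def is_consistent(a, b, p_and, p_or):
    """Check claimed values against the sum rule and both pairs of bounds."""
    lo_and, hi_and = conjunction_bounds(a, b)
    lo_or, hi_or = disjunction_bounds(a, b)
    return (p_or == a + b - p_and
            and lo_and <= p_and <= hi_and
            and lo_or <= p_or <= hi_or)

# a + b > 1: the conjunction gets a lower bound stronger than zero.
print(conjunction_bounds(Fraction(7, 10), Fraction(4, 5)))
# a + b < 1: the disjunction gets an upper bound stronger than one.
print(disjunction_bounds(Fraction(1, 5), Fraction(3, 10)))
```

Exact `Fraction` inputs are assumed so that the sum-rule check can use equality; with floating-point inputs a tolerance would be needed instead.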