Moderated by Erik Sandewall.
 

Marc Denecker, Daniele Theseider Dupré, and Kristof Van Belleghem

An Inductive Definition Approach to Ramifications

The article mentioned above has been submitted to the Electronic Transactions on Artificial Intelligence, and the present page contains the review discussion. Click for more explanations and for the webpage of theauthors: Marc Denecker, Daniele Theseider Dupré, and Kristof Van Belleghem.

Overview of interactions

N:o Question Answer(s) Continued discussion
1 7.7  Murray Shanahan
13.7  The authors
17.7  Murray Shanahan
2 11.7  Eugenia Ternovskaia
17.7  The authors
27.7  Eugenia Ternovskaia
5.8  The authors
20.8  Eugenia Ternovskaia
30.8  The authors
11.9  Eugenia Ternovskaia
16.9  Michael Thielscher
3 14.7  Michael Thielscher
18.7  The authors
 
4 8.3  Anonymous Referee 1
   
5 8.3  Anonymous Referee 2
   
 

Q1. Murray Shanahan (7.7):

Marc, Daniele and Kristof,

I haven't read your paper very closely yet. But I'm very interested to understand how your formalism (and indeed any other formalism) works with the circuit example of Figure 1 in your paper.

I can't find anywhere in the paper a precise claim about what your formalisation of this example yields. But presumably, when switch  r  is closed, switch  s  opens,  relay1  is activated, and switch  q  opens, while  relay2  remains not activated and switch  p  remains closed. (By the way it would be nice if the relays were labelled in the diagram.) Is this what your formalisation yields? I guess it is.

To relieve me of the trouble of going through your definitions and proving it myself, can you also tell me what your formalisation yields in the case that switch  s  is left out altogether. This is obviously more problematic, since the constraints are then contradictory. From  r  we get  relay2 , which gives  ¬ p , which gives  ¬ relay1  which gives  ¬ q  which gives  ¬ relay2 .

Do we get inconsistency according to your formalisation? Or do we get multiple possible outcomes? On page 7, you say that "only theories that do not entail multiple changes of the same fluent at the same time are considered as well-defined". So would this example not be well-defined? Yet this is a realisable circuit. Indeed it's a simplification of the circuit you use as an example. I'm not quite sure what would actually happen - I suppose the relays would get stuck. But something would happen. So maybe inconsistency or no-well-definedness is not what we want in such examples. Or maybe the representation just needs to be a bit more elaborate. What do you think?

Murray

A1. Marc Denecker, Daniele Theseider Dupré, and Kristof Van Belleghem (13.7):

Dear Murray,

Thanks for your comments. You are right that we should have made more precise claims about what our formalisation yields in the circuit example.

Below, we try to answer your questions. There is brief answer and an extended answer.

First the brief answer.

In our circuit example, there are two stable states. In case a suitable primitive action of opening or closing switches  r  or  s  happens, then the physical system will switch from one state to the other. Our claim is that that our formalisation models these state transitions correctly. (Despite the fact that the effect rules contain positive and negative cycles, are not stratifiable and simple circumscription and completion both fail to assign the right semantics.)

In your simplified example, there is one stable state; in this state the switch  r  is open. The physical system will start to oscillate when r is closed in this stable state. As we mention in the paper, our approach is not intended to model such non-stable behaviour; actually no approach in which ramifications have "immediate" effects without delay can be expected to model an oscillation process. This is a case where delays in the effect propagation are relevant and should be explicitated. However, as we hoped for and predicted in the paper, our formalisation detects this non-stable behaviour by returning a 3-valued model. As a consequence, the transition function is undefined for the case that a primitive action of closing  r  is performed in the stable state.

Now the extended answer with a more technical discussion. First we make a detailed analysis of your example.

In your example, there is one circuit containing a switch  p  which if closed, activates relay  r1 ; in the other circuit, there are switches  r  and  q , and if they are both closed, relay  r2  is active. An active  r1  CLOSES  q ; an active  r2  OPENS  p . This is how the circuit looks:

  --------||----------
  |		     |
  - p ----------- r1 -
    |             |
  -~r2----- r --- q --
  |		     |
  --------||----------

The state constraints in this example are:

	r1 <-> p
	p  <-> ~r2
	r2 <-> r & q
	q <-> r1

If follows logically that :

   { p & r1 & q & ~r & ~r2}
This corresponds to the unique stable state:
   {p, r1, q, ~r, ~r2}

If  r  gets closed in this stable state, the system is trapped in an oscillation:  r2  is activated,  p  is opened,  r1  is disactivated,  q  is opened,  r2  is disactivated, which closes  p , activates  r1 , closes  q , activates  r2 , etc.. Simple electromechanical bells are constructed in a similar way and show a similar behaviour.

The formalisation in our language is as follows ( c(p stands for  caus(p;  h(p stands for  holds(p). The effect rules can be derived from the above state constraints with Thielscher's method using the influence information that fluents at the right can influence fluents at the left.

r1 <-> p yields:

	c(r1) <- c(p), ~h(p)  			
                        (initiating p causes r1)

	c(~r1) <- c(~p), h(p)   		
                        (initiating ~p causes ~r1)

r2 <-> r &  q  yields:

	c(r2) <-  c(q), ~h(q), c(r), ~ h(r)	
                        (initiating q&r causes r2)

	c(r2) <-  c(q), ~h(q), h(r), ~ c(~r)

	c(r2) <-  h(q), ~c(~q), c(r), ~ h(r)

	c(~r2) <-  c(~q), h(q)			
                        (initiating ~(q&r) causes ~r2)

	c(~r2) <-  c(~r), h(r)

q <-> r1 yields:

	c(q) <- c(r1), ~h(r1) 			...

	c(~q) <- c(~r1), h(r1)

p <-> r2 yields:

	c(p) <-  c(~r2), h(r2)			

	c(~p) <-  c(r2), h(r2)

Assume that in addition, there are primitive actions  close_r ,  open_r :

	c(r) <-  act(close_r)

	c(~r) <- act(open_r)

Consider this definition in the case that a  close_r  action happens in the stable state. The stable state is given by the set of literals:

	{h(p), ~h(r2), h(r1), h(q), ~h(r)} 
The set of actions is represented by the set
	{act(close_r)}
The well-founded model of the above rule set, given that  act  and  h  are interpreted by the above sets, is obtained as follows.

First, we replace all  h  and  act  literals by their truth value in all effect rules. Then we delete all rules with  false  in the body and delete all literals  true  in the body. This yields:

	c(r)  <- 
	c(r2) <- ~c(~q), c(r)
	c(~p) <- c(r2)
	c(~r1)<- c(~p)
	c(~q) <- c(~r1)
	c(~r2)<- c(~q)
The well-founded model of this rule set is
        true :	    c(r)
	undefined : c(r2),  c(~q) c(~r1) c(~p)
	false :     all other literals (c(~r), c(p), ...)
There are many ways to verify this. Any way to construct the well-founded model is ok. For example, the well-founded model is known to be a model of the 3-valued completion semantics. Verify that the (3-)valued completion entails the following equivalences:
	c(r) <-> true
	c(r2) <-> ~c(~q) <-> ~c(~r1) <-> ~c(~p) <-> ~c(r2) 
	c(l) <-> false         for the other fluent literals l
As a consequence, the 3-valued completion has a unique model which must be the well-founded model and which corresponds to the above model.

Another way is to use our formalisation with prooftrees. The prooftree belows is the unique prooftree of c(~r2) and contains the unique prooftrees for the atoms c(~q), c(~r1), c(~p), c(r2), c(r)

        c(~r2) <- c(~q) <- c(~r1) <- 
                  c(~p) <- c(r2) <- c(r)  <- true  
                                          <- ~c(~q) 
(So the node  c(r2 has 2 daughter nodes  c(r and  ¬ c(¬ q)

Our semantics first associates  u  to  c(¬ q.  c(r has a true prooftree, all other atoms in the above list have a weak prooftree with only true and literals with truth value  u  in the leafs. So, the well-founded model is reached after one step already. It is a fixpoint because  ¬ c(¬ q is still undefined.

Note the close correspondence between this prooftree and the actual effect propagation. This correspondence between effect propagation and the constructiveness of inductive definitions was our main motivation for our work.

Our approach is not suitable to model this oscillation. To model oscillation, a formalism with evolving time and effect rules with delays seems necessary. However, in the paper we claim that our formalisation "detects" non-stable behaviour when it produces a 3-valued model for the above state transition and also when it derives contradictory causal literals  caus(f and  caus(¬ f. The transition function is not defined in such cases.

So, the behaviour of our formalisation for this case is exactly what we hoped for and had predicted in the paper.

Finally, consider the original circuit example. The circuit in the paper is slightly more complex:

  --------||----------
  |		     |
  - p ----- s --- r1 -
    |       |     |
  -~r2-----~r --- q --
  |		     |
  --------||----------

The effect rules in the circuit are related to the following state constraints:

	r1 <-> p & s  (if switches p and s are
                       closed then relay r1 is active)

	r2 <-> q & r  (if switches q and r are 
                       closed then relay r2 is active)

	r  <-> ~s     (switches r and s are 
                       mechanically connected this way)

	p  <-> ~r2    (p is closed iff r2 is not active)

	q  <-> r1     (q is closed iff r1 is active)
It can be seen easily that this theory simplifies to
	p & ~r1 & (s <-> r1 <-> q <-> ~r) 
Hence, the system can occur in two stable states
	{p, ~r1, s, r1, q, ~r}
and
	{p, ~r1, ~s, ~r1, ~q, r}. 
It can be verified also that any applicable primitive action of opening or closing r or s will yield a transition from one stable state to the other stable state.

To formalise this, we could add primitive actions  close_r ,  open_r ,  close_s  and  open_s  with the following effects:

	caus(r) <- act(close_r)
	caus(~r) <- act(open_r)
	caus(s) <- act(close_s)
	caus(~s) <- act(open_s)
The other effect rules in the paper can be derived from the first set of state constraints using the influence information that fluents at the right can influence fluents at the left and r can influence s.

Our formalisation yields the correct state transitions starting from the stable states and performing any of the primitive actions. We have a proof of this which we will add in a next version of the paper.

Thanks again.

Marc, Kristof, Daniele

C1-1. Murray Shanahan (17.7):

Marc, Daniele and Kristof,

Many thanks for your answer to my question about your paper. I think it's an excellent paper.

I've just completed a short paper on the same topic, which you may be interested in. I've submitted it to the AIJ. Basically it shows how the event calculus from my book "Solving the Frame Problem" can be used to tackle the ramification problem, including tricky examples like Thielscher's circuit. The paper is available via the following URL.

   http://www.dcs.qmw.ac.uk/~mps/ramifications.ps.Z

All the best, Murray


Q2. Eugenia Ternovskaia (11.7):

Dear Marc, Daniele and Kristof,

It was interesting to read your paper, especially because I am interested in inductive definability myself. I wrote a paper "Inductive Definability and the Situation Calculus" presented at Dynamics'97 last October. It was published in a volume of LNCS later. You can get it from http://www.cs.utoronto.ca/~eugenia/papers/indef.ps

Among other things, I describe an inductive solution to the frame and ramification problems. I treat causal rules specifying direct and indirect effects of actions as rules of an inductive definition. Causal (inductive) rules specifying direct effects of actions may contain both cycles and negations. Causal rules describing indirect effects may not contain cycles. This condition can be relaxed, but in this case the correct form of successor state axioms will not be obtained. I would be grateful to you if you could read this paper and provide some comments.

Here are some comments on your paper. First of all, it's a good paper.

The main contribution, I think, is that you further elaborated the generalized inductive definition principle and applied it to the ramification problem. I consider Marc's result about the equivalence between this principle and the well-founded semantics significant, and it was good to see how it works in the context of the ramification problem. It was nice to see examples for the syntactic subclasses of the generalized inductive definition.

I think that introducing the third truth value helps you to explain the subtle details of the iterative process which was quite difficult for me since I was working in classical first order logic with fixpoints.

From my experience, it is a long way from understanding how causal (inductive) rules work and capturing it in a classical logic-based formalism such as the situation calculus. In your paper you say that your approach can be embedded in the situation calculus. Perhaps the main question to you is how the third truth value would be represented in the two-valued setting?

The following will help you to make a few little improvements, I hope.


p. 28, after Th.1.

"Unless nondeterminism is explicitly introduced, our theory leaves no room for ambiguity."

I do not think this is an advantage of your theory. I think the truth value "u" should be obtained in two cases.

First, it may be a result of a "bad axiomatization". Second, it may appear due to lack of complete information about the current state of the world. (It seems that you apply the Closed World Assumption in your definition of a state as a set of (positive) fluent literals, if I understood correctly.)

I believe your definitions might be easily changed to represent the second case as well. For example, a state could be associated with a set of "known" positive and negative literals.


It also seems strange that you cannot derive anything about "holds", although I understand your motivation. Would it be difficult to introduce this feature?


p. 10 "..., where  top  denotes the inconsistent state."

This is a confusing notation. Usually  bot  is used to denote false or contradiction (as well as the bottom of a lattice) and  top  is used to denote true (or the top of a lattice), as you perhaps know. I noticed that you already use  bot  later in the text, but maybe another solution is possible. For example, in automata theory  emptyset  is used to denote a "junk" state without outgoing transitions.


p. 10 Footnote 8: I did not understand the second sentence at first reading. "Initial state", "presence and absence" are confusing.


p. 23, footnote 17: A more or less meaningful "real-world" example could be something like this: an object cannot be at location  p  and  q  at the same time (but must be somewhere), and actions  move-to-p  and  move-to-q  are always applicable.


p. 28, top. "completely orthogonal to successor state calculation" is too strong, see comments for p.38-39.


p.38-39 "This approach contains several contributions: - A complete uncoupling of ramifications from state constraints..."

I do not think they can be uncoupled completely. Even in your theory, ramifications impose some restriction on your function Trans, I believe. Trans maps states and actions to states. In this sense reachable states are restricted and state constraints are implied. Can you clarify?

The other direction, from state constraints to ramification, as described by Thielscher, is a nice idea. As far as I know, no one claims that this is the only way to obtain causal rules specifying indirect effects. It is a useful method to obtain some of these rules, even though it does not always work.

Also, in your summary you say "the standard view on causal laws as rules serving to restore the integrity of state constraints is too restrictive." From talking to different people, I never had the impression that this is the only and standard view on the role of causal rules in the community.

Overall, even if you think your statement is absolutely right in its current form, it's strange to see it as a contribution #1. I think you have done more interesting things. Again, it's a good paper.


I hope my comments will be helpful.

Regards,

Eugenia

A2. Marc Denecker, Daniele Theseider Dupré, and Kristof Van Belleghem (17.7):

Dear Eugenia,

Your comments are very useful. We didn't have a close reading of your paper yet, but a first look showed that there is strong correspondence between our approaches. We have had very similar intuitions. Thanks for the careful reading and the corrections.

  Perhaps the main question to you is how the third truth value would be represented in the two-valued setting?

We think there is no need to represent the third truth value in a 2-valued setting. Observe that the transition function is a partial mapping from 2-valued states (and sets of actions) into 2-valued states. IF an expert designs his effect theory such that for all states which satisfy the state constraints and sets of actions which satisfy the action preconditions, the transition function is correctly defined, (i.e. the model of the rule set is 2-valued and no contradictory fluent literals are caused), and IF this effect theory is represented correctly in situation calculus, the transition of that state and those actions will be defined and will be 2-valued. So, in this case, a third truth-value is not necessary.

In your approach, you impose a syntactical restriction (no cycles) which guarantees that the effects are unique and well-defined. Our approach does not impose a syntactical restriction, but there is still the methodological constraint that the expert should design his effect theory such that it leads to well-defined transitions (at least for the above mentioned states and sets of actions). In case this methodological rule is followed, it is not necessary to introduce a third truth value in situation calculus or event calculus. This methodological rule requires from the expert that he establishes the 2-valuedness of the well-founded models of his effect theory at least in all states satisfying state constraints and actions sets satisfying action preconditions.

This is a point that deserved more attention.

  "Unless nondeterminism is explicitly introduced, our theory leaves no room for ambiguity." I do not think this is an advantage of your theory. I think the truth value "u" should be obtained in two cases. First, it may be a result of a "bad axiomatization". Second, it may appear due to lack of complete information about the current state of the world. (It seems that you apply the Closed World Assumption in your definition of a state as a set of (positive) fluent literals, if I understood correctly.) I believe your definitions might be easily changed to represent the second case as well. For example, a state could be associated with a set of "known" positive and negative literals.

You propose to use the truth value "u" for two different purposes: as a pointer to ill-defined cause literals and as a pointer to unknown atoms. This may be problematic. When a cause literal is 3-valued, how does the expert know whether to interpret it in one way or the other?

In our view, "u" is reservated for pointing to ill-defined causal literals. Uncertainty caused by nondeterministic actions must be modeled explicitly through nondeterministic effect rules and is modeled by having different transition states (see the section about this topic). We do not model uncertainty on the input state; we do not need to do so. We expect that in any approach which integrates our effect theories in a full temporal language (e.g. EC or SC), uncertainty on the state at a certain time point is represented by having different models with different states at that moment. In each single model, the state is well-defined and can be represented as a 2-valued set on which our transition function can be applied. In our experience, this is a better way to model uncertainty than through belief sets. We can't see in what way we would be applying the Closed World Assumption.

  It also seems strange that you cannot derive anything about "holds", although I understand your motivation. Would it be difficult to introduce this feature?

This is not really true. The result of the transition function defines the holds predicate in the next state. One can verify that the next state is derived from the previous state and the causal literals as described by the normal situation calculus persistence axiom.

  p. 23, footnote 17: A more or less meaningful "real-world" example could be something like this: an object cannot be at location p and q at the same time (but must be somewhere), and actions  move_to_p  and  move_to_q  are always applicable.

We agree that the footnote may not be correct, but it's not clear to us what you mean by that example. Another example of a theory with negative loops is given in the Murray Shanahan's variant of the circuit example. Though we consider the semantics of the rule set as ill-defined, the rules are certainly meaningful.

  p.38-39 "This approach contains several contributions: - A complete uncoupling of ramifications from state constraints..." I do not think they can be uncoupled completely. Even in your theory, ramifications impose some restriction on your function Trans, I believe. Trans maps states and actions to states. In this sense reachable states are restricted and state constraints are implied. Can you clarify?

This remark is related to your other remark:

  Also, in your summary you say "the standard view on causal laws as rules serving to restore the integrity of state constraints is too restrictive." From talking to different people, I never had the impression that this is the only and standard view on the role of causal rules in the community.

First, let us clarify what we mean by uncoupling: we define the semantics of the effect rule theory without taking state constraints into consideration. In many other approaches, this is not the case: in fact in most approaches causal rules directly imply a state constraint, they are a state constraint augmented with some directionality-of-propagation information; an exception to this is Michael Thielscher's approach, but there state constraints are used in the calculation of the possible successor states, in that effect propagations may stop as soon as a state satisfying the constraints is reached. In our approach the successor state calculation is in no way dependent on the constraints. It is always carried out to its end and yields a unique successor state. However, it is true that the end result of the calculation is checked against the state constraints. These constraints may cause rejection of the found successor state, but will in that case not calculate another (valid) successor state.

Observe moreover that ramifications need not imply state constraints even if they restrict Trans. Of course a restriction of Trans rules out certain transitions, but that does not need to mean that any states are ruled out: a state is ruled out only if all transitions leading to it are invalid. The "counter" example in our paper gives an example where this is not the case: whenever an output changes to true, the counter must be augmented. But a resulting state in which the counter were not updated (but all other effects applied) need not be invalid in itself. This state might well have existed at an earlier time in the history of the network. It can only not be reached from the currently given state. So ramifications need not imply any state constraints.

It is absolutely true that in many cases, there is a relation between state constraints and ramifications; in that case the method of Michael Thielscher is applicable and works fine in our approach as well (we applied it in the circuit example; see our answer to Murray Shanahans question). In this case, the set of effect rules should satisfy the following correctness criterion: that any state satisfying the state constraints and any set of actions satisfying the action preconditions and occurring in this state lead to a transition state in which the state constraints are satisfied. This is not automatically satisfied but should be proven in our approach.

Also we discussed with a number of people about the issue of the relationship between state constraints and effect rules; had anybody contradicted us on this point, we would not have put it the way we did. But we have no problem in weakening the point. Still, it seems true that in most approaches, state constraints are underlying or are directly involved in describing the semantics of causal rules. Maybe this is not a deliberate, conscious choice, but in the existing formalisations it is apparently a fact.

Marc, Daniele, Kristof

C2-1. Eugenia Ternovskaia (27.7):

Dear Marc, Daniele, Kristof,

Thank you for your answers to my questions. These answers clarify an important difference between our works.

Using generalized inductive definitions to detect ill-defined literals is the advantage of your work.

However, after 2-valuedness is shown, all reasoning involving multiple models have to be performed on the metamathematical level. In contrast, first order logic augmented with inductive definitions allows us to work with all models at once therefore allowing for incomplete information. Perhaps your approach could be developed in this direction, but you do not address this problem in the paper. Notice I do not talk about non-determinism here. I completely agree with your point of view about non-deterministic actions.

Therefore it seems that we mostly developed different parts of the problem.

  In your approach, you impose a syntactical restriction (no cycles) which guarantees that the effects are unique and well-defined.

The syntactic requirement of acyclicity is by no means inherent to my approach. It simplifies the presentation in the LNCS paper. Otherwise I would have to consider the cases where the correct syntactic form of successor state axioms is not obtained, thus making the paper even longer.

To verify my method, I worked out the Gear wheels example. The approach works just fine, moreover, it produces successor state axioms in the correct form. Starting from your set of rules, I obtain the following successor state axiom:
    turn_1(do(as)) <-> (a = start_1 v a = start_2 v turn_1(s) ^ a =/ stop_1 ^ a =/ stop_2  
And similarly for  turn_2(do(as)) . To avoid the cyclic regression problem, regression has to be done with respect to the causal rules for direct effects rather than with respect to successor state axioms. This is the way I do it in the paper Causality via Inductive Definitions. See
  http://www.cs.utoronto.ca/~eugenia/publications.html.

  We can't see in what way we would be applying the Closed World Assumption.

I use this term in a sloppy way here to say that a state is represented by a set of positive fluent literal, and the literals not in the set are false in this state.

  It also seems strange that you cannot derive anything about "holds", although I understand your motivation. Would it be difficult to introduce this feature?

  This is not really true. The result of the transition function defines the holds predicate in the next state. One can verify that the next state is derived from the previous state and the causal literals as described by the normal situation calculus persistence axiom.

Perhaps I should have chosen a different wording. I used "derive" which I understand as "derive in a formal proof system". I did not mean "makes true in a particular model".

  Observe moreover that ramifications need not imply state constraints even if they restrict Trans. Of course a restriction of Trans rules out certain transitions, but that does not need to mean that any states are ruled out: a state is ruled out only if all transitions leading to it are invalid. The ``counter'' example in our paper gives an example where this is not the case: whenever an output changes to true, the counter must be augmented. But a resulting state in which the counter were not updated (but all other effects applied) need not be invalid in itself. This state might well have existed at an earlier time in the history of the network. It can only not be reached from the currently given state. So ramifications need not imply any state constraints.

I do not understand neither this comment nor the "Counter" example. The rules imply a state constraint relating any two consecutive states. This constraint is perfectly valid.

Finally, a question about Definition 1. You say "and body B a nonempty set of positive and negative literals of D". Do you mean "and body B, a nonempty set of symbols from D, including   {t}   and   {f}  "?

Regards,

Eugenia

C2-2. Marc Denecker, Daniele Theseider Dupré, and Kristof Van Belleghem (5.8):

Dear Eugenia,

  Thank you for your answers to my questions. These answers clarify an important difference between our works. Using generalized inductive definitions to detect ill-defined literals is the advantage of your work. However, after 2-valuedness is shown, all reasoning involving multiple models have to be performed on the metamathematical level. In contrast, first order logic augmented with inductive definitions allows us to work with all models at once therefore allowing for incomplete information. Perhaps your approach could be developed in this direction, but you do not address this problem in the paper. Notice I do not talk about non-determinism here. I completely agree with your point of view about non-deterministic actions. Therefore it seems that we mostly developed different parts of the problem.

To an important extent your work is indeed complementary to ours. Indeed, we must stress the following: our formalism is not a stand-alone temporal reasoning logic. It does not deal with descriptions of initial states, action preconditions, narratives, state constraints. It does not even make an ontological choice on the nature of time. To use our effect language for temporal reasoning, it must be extended with language constructs for describing these other aspects of temporal knowledge. Or it must be embedded in an existing approach which provides this expressivity. Your work provides such an embedding. We don't.

Instead, we focus entirely on ramification and (instantaneous) effect propagation. You will admit that there is merit in our approach: it shows that the ramification problem can be investigated independently of all these other issues - at least if one views causal rules as we do, namely as descriptions of effect propagations independent of say, state constraints-. In other words, our work can be used embedded in a number of other approaches ( A , situation calculus, event calculus, temporal logics..).

On the other hand, we pay a price for singling out the ramification problem; namely we do not show in our paper the interplay between causal rules and the other types of information. For instance, uncertainty on a certain state of affairs may be due to uncertainty on the initial state or on the narrative (the sequence of events that leads to the state). Obviously, in our paper, we cannot show how to model these sorts of uncertainty; however, our language and semantics is compatible with and is embeddable in many temporal logics in which such uncertainty can be described. Such a logic should satisfy only two conditions: in the logic there must be a notion of momentaneous state of the world which can be represented as a set of true fluents, and uncertainty on the state of the world at a particular moment or situation must be modeled by having different models of the theory in which the state at the time or situation is different (but always two-valued). This is precisely how uncertainty is modeled in your integrated approach.

We may add that in the past we have also investigated this combination of inductive definitions and first order logic with uncertainty in a global temporal logic and in logic in general. We have deliberately kept the issue of representing uncertainty on the world (except in our study of nondeterministic effects) out of the current paper to keep the different issues separated. However, uncertainty is addressed in the technical report referred to as [38] in the paper, since there an embedding in a general theory of time is discussed. (The report, like several of the papers mentioned below, can be obtained from
  http://www.cs.kuleuven.ac.be/~kristof/publications.html

As a matter of fact, the idea of integrating inductive definitions with first order logic is the main idea behind Open Logic Programming, used by Marc and Kristof in most of their previous related work (e.g. the papers by Van Belleghem, Denecker and De Schreye in ICLP'94, ICTL'94 and ICLP'97, by Denecker in LPNMR93 and LPNMR95) and is related to ideas about Abductive Logic Programming (in the paper by Console, Theseider Dupre' and Torasso in JLC'91).

The idea in Open Logic Programming is that some predicates are considered "defined", and therefore a "closure" semantics, i.e. a Logic Programming (or Inductive Definition)-like semantics is applied to them, while the interpretation of other predicates is left open, possibly being constrained by partial knowledge given in the form of integrity constraints. So, OLP is to be seen as a integration of classical logic with inductive definitions and is suitable for representing uncertainty. Representing and reasoning on uncertainty in this logic is not on the metamathemathical level but on the object level. The embedding of our ramification approach in a linear time structure [38] is actually firmly based on Open Logic Programming.

  In your approach, you impose a syntactical restriction (no cycles) which guarantees that the effects are unique and well-defined.

  The syntactic requirement of acyclicity is by no means inherent to my approach. It simplifies the presentation in the LNCS paper. Otherwise I would have to consider the cases where the correct syntactic form of successor state axioms is not obtained, thus making the paper even longer.

Sorry, that remark was by no means intended as a critique of your approach. We were only trying to explain the use of the third truth value. It helps us to avoid imposing a syntactic restriction, by detecting "bad" definitions in the semantics. Bad definitions need to be dealt with somehow, either by restricting the syntax, which we chose not to do, or by detecting them in the semantics, as we have done. We did not mean to say "your approach does not work with cycles", but only that in any approach some restriction or a way around bad definitions needs to be provided (and that introducing the third truth value is the way we chose to do it).

  We can't see in what way we would be applying the Closed World Assumption.

  I use this term in a sloppy way here to say that a state is represented by a set of positive fluent literal, and the literals not in the set are false in this state.

Classical logic has a 2-valued model semantics. Yet it seems to us that even with a sloppy use of the term "Closed World Assumption", one could not say that classical logic "applies the Closed World Assumption". Neither can this be said about our two-valued state semantics.

We do not allow any state to contain undefined fluents just because by "state" we mean a real state, not knowledge of an agent on a state. We do not at all intend that there should be complete knowledge on states. If there is partial knowledge on a state, there will be more than one state matching this partial knowledge. It's again an issue of reasoning about different models, and for each of them our transition function gives a resulting state (it gives one as long as there are no bad definitions, and not more than one as long as there is no explicit nondeterminism).

  It also seems strange that you cannot derive anything about "holds", although I understand your motivation. Would it be difficult to introduce this feature?

  This is not really true. The result of the transition function defines the holds predicate in the next state. One can verify that the next state is derived from the previous state and the causal literals as described by the normal situation calculus persistence axiom.

  Perhaps I should have chosen a different wording. I used "derive" which I understand as "derive in a formal proof system". I did not mean "makes true in a particular model".

Is your remark that we have not formalised a proof system but only the model theory ? Or are you referring to the fact that our language can be used only to describe state transitions and nothing else? With respect to the second interpretation, we have motivated this choice (see above).

With respect to the first interpretation, of course we have not defined any formal proof system in the paper, just a declarative and constructive model semantics. To derive anything about  holds  in a certain state, given a previous state, one can calculate all the  caus  literals in the state transition using the fixpoint operator. Calculating the new  holds  literals from the previous  holds  and the  caus  literals is then straightforward:  holds(ls') = (holds(ls) ^ ¬ caus(-l)) v caus(l. Is your remark that we have not formalised a proof system but only the model theory ?

  Observe moreover that ramifications need not imply state constraints even if they restrict Trans. Of course a restriction of Trans rules out certain transitions, but that does not need to mean that any states are ruled out: a state is ruled out only if all transitions leading to it are invalid. The ``counter'' example in our paper gives an example where this is not the case: whenever an output changes to true, the counter must be augmented. But a resulting state in which the counter were not updated (but all other effects applied) need not be invalid in itself. This state might well have existed at an earlier time in the history of the network. It can only not be reached from the currently given state. So ramifications need not imply any state constraints.

  I do not understand neither this comment nor the "Counter" example. The rules imply a state constraint relating any two consecutive states. This constraint is perfectly valid.

It seems there is just a misunderstanding about terminology here. We use the term "state constraint" to mean a relation between fluents within any one state, not between different states (see also the definition in the paper). As such the term corresponds to for example Reiter's and Shanahan's notions of state constraint and Thielscher's and Lifschitz's notions of domain constraint (to name just a few). We were not aware of a different use of these terms in the literature.

It is evident that any derived effect rule necessarily implies some relation between fluents in two consecutive states, e.g. for any state  s  and any successor state  s'  of  s  after any action, the rule "initiating  F  causes  G  if  H " implies that
    holds(Hs) ^ ¬ holds(Fs) ^ holds(Fs') ·-> holds(Gs'  
for any  F ,  G  and  H . But that is not a state constraint.

Does this clarify the examples and discussions given before ?

  Finally, a question about Definition 1. You say "and body  B  a nonempty set of positive and negative literals of  D ". Do you mean "and body  B , a nonempty set of symbols from  D , including   {t}   and   {f}  "?

No,  D  already includes   {t}   and   {f}   (see just above definition 1). With positive and negative literals from  D  we mean symbols of  D  or their negations. As such the definition allows for  t  or  f  to occur alongside other literals in a rule body. Observe that  ¬ t  or  ¬ f  can not occur since  t  and  f  can never belong to  Defined(D (see just below definition 1).

Thanks for your comments,

Kristof, Marc and Daniele

C2-3. Eugenia Ternovskaia (20.8):

Dear Marc, Daniele and Kristof,

Thank you for answering my questions and clarifying the motivation for your work.

A few words regarding our discussion about proof system versus model-theoretical approach. I was asking, without criticizing, about performing reasoning in your system on the object level, i.e., embedding your approach in a proof system. From your answer I understood that you consider both OLP and some logical representations as candidates for formalizing and performing reasoning. This answer is satisfactory, and it would be interesting to learn more about OLP from your papers.

  I do not understand neither this comment nor the "Counter" example. The rules imply a state constraint relating any two consecutive states. This constraint is perfectly valid.
  It seems there is just a misunderstanding about terminology here. We use the term "state constraint" to mean a relation between fluents within any one state, not between different states (see also the definition in the paper). As such the term corresponds to for example Reiter's and Shanahan's notions of state constraint and Thielscher's and Lifschitz's notions of domain constraint (to name just a few). We were not aware of a different use of these terms in the literature.

This is true, I have not seen this wider notion of a state constraint in the literature either. However, Chitta Baral and Ray Reiter used this term in a conversation recently. I do not see a reason why this wider notion could not be used as well.

Even if we choose not to extend the notion of a state constraint, it's hard for me to see in what way your observation would imply any interesting theoretical consequences. It seems obvious that a causal rule imply universal statements about situations (states) involving as many states as there are those mentioned in the causal rule. Your "counter" example only shows that there are indirect effects of actions depending on two consecutive states, not just one. It's hard to disagree. All causal rules of this king will imply universal statements about two consecutive states, not about just one, of course.

Regarding Definition 1, I do not remember whether you define what a literal is. Maybe I just missed it. When I was reading the definition, I did not assume that   {t}   and   {f}   can be literals. In the definition, you say "and body B a nonempty set of positive and negative literals of D". I understood that you take positive and negative literals of D, leaving   {t}   and   {f}   out because they are not literals. This contradicted to the discussion below. Of course I figured out what you mean, and then wrote to you thinking that a little rewording would help the reader.

Regards,

Eugenia

C2-4. Marc Denecker, Daniele Theseider Dupré, and Kristof Van Belleghem (30.8):

Dear Eugenia,

Here are some thought about your comments.

  The rules imply a state constraint relating any two consecutive states. This constraint is perfectly valid.
  It seems there is just a misunderstanding about terminology here. We use the term "state constraint" to mean a relation between fluents within any one state, not between different states (see also the definition in the paper). As such the term corresponds to for example Reiter's and Shanahan's notions of state constraint and Thielscher's and Lifschitz's notions of domain constraint (to name just a few). We were not aware of a different use of these terms in the literature.

 This is true, I have not seen this wider notion of a state constraint in the literature either. However, Chitta Baral and Ray Reiter used this term in a conversation recently. I do not see a reason why this wider notion could not be used as well. Even if we choose not to extend the notion of a state constraint, it's hard for me to see in what way your observation would imply any interesting theoretical consequences. It seems obvious that a causal rule imply universal statements about situations (states) involving as many states as there are those mentioned in the causal rule. Your "counter" example only shows that there are indirect effects of actions depending on two consecutive states, not just one. It's hard to disagree. All causal rules of this king will imply universal statements about two consecutive states, not about just one, of course.

Recall that our opinion of the current state of the art was that effect rules were much perceived as tightly coupled to state constraints (in the narrow sense; i.e. statements relating properties at the same time point). In any case, if one looks at current work on ramification, then usually effect rules imply state constraints (like in Lin's approach), or their semantics is determined by state constraints (like in Thielschers approach).

The role of our counterexample was simply to show that there may be no such relation.

What are interesting theoretical consequences? We do not point to some interesting new and fundamental relationship; actually we cut one: in general an effect rule may be unrelated to state constraints (but this was never your viewpoint anyway, right?). But there is a gain: the effect propagation can be described mathematically independently of state constraints and also of time topology, action preconditions, initial states, etc..

But yes, also to us
  "it seems obvious that a causal rule imply universal statements about situations (states) involving as many states as there are those mentioned in the causal rule."
An interesting question generated by your remark is whether Thielscher's approach to derive causal rules from state constraints (about one state) and dependency information can be generalised to "constraints between states" (plus dependency information as far as needed).

 Regarding Definition 1, I do not remember whether you define what a literal is. Maybe I just missed it. When I was reading the definition, I did not assume that   {t}   and   {f}   can be literals. In the definition, you say "and body B a nonempty set of positive and negative literals of D". I understood that you take positive and negative literals of D, leaving   {t}   and   {f}   out because they are are not literals. This contradicted to the discussion below. Of course I figured out what you mean, and then wrote to you thinking that a little rewording would help the reader.

We will take care to clarify this in the text. Thanks.

Thanks for pursuing this discussion, Eugenia.

Marc, Daniele, Kristof

C2-5. Eugenia Ternovskaia (11.9):

Dear Marc, Daniele, and Kristof,

Thank you for an interesting discussion regarding your paper. It clarified some subtle points of your work and helped to understand your motivation.

  An interesting question generated by your remark is whether Thielschers approach to derive causal rules from state constraints (about one state) and dependency information can be generalised to "constraints between states" (plus dependency information as far as needed).

This is probably a question to Michael. I found his idea about generating some of causal rules interesting, and I am also curious whether it is possible to generalize it.

Best regards,

Eugenia

C2-6. Michael Thielscher (16.9):

Dear Eugenia, Marc, Daniele, Christoph,

I'd like to second Eugenia's view on the "Counter" counter-example, which implies that the intended behavior can be obtained after all with my approach. The natural "inter-state constraint" in this example would be
    ¬ Holds(outs) ^ Holds(outdo(as)) ^ Holds(count(n), s  
     ·-> Holds(count(n+1), do(as))   
(in conjunction with  Holds(count(m), s) ^ Holds(count(n), s) ·-> m = n )

Then if you consider the causal relationship
    out causes count(n+1) if count(n  
( in conjunction with  count(n) causes ¬ count(m) if m =/ n )

the only consistent successor state would show the expected increment of the counter by one. (A side remark: Inter-state constraints like the above can be expressed in a novel Situation Calculus-style formulation of the Fluent Calculus; if you're interested, have a look at
  http://pikas.inf.tu-dresden.de/~mit/publications/conferences/JELIA98.ps)
You have raised an interesting question in this context, namely, whether and how causal relationships can be extracted from such inter-state constraints. The current procedure is not directly applicable because it is based on formulas without situation or time argument. My feeling is that a generalization must be possible--there definitely is a formal correspondence between the above state constraint and the causal relationship--but working out the details will require some brain work.

Cheers.
Michael


Q3. Michael Thielscher (14.7):

Dear Marc, Daniele and Kristof,

I have a question concerning the notion of negative cycles. If I understand correctly, they arise whenever the absence of some causation is the cause of some effect. Now, on page 23 you consider a construction like
    caus(p) <- ¬ caus(q  
    caus(q) <- ¬ caus(p  
ambiguous or nonsensical. But what if you specified, say, the effect of tossing a coin by "if heads isn't caused, then tail is" and "if tail isn't caused, then heads is." The expected model would be that either of heads or tail is caused, which is what the stable models semantics proposes. Of course it is questionable whether the above is a good specification of tossing a coin, but I would argue that this is due to the notion of "absence of causation as cause" being rather odd from an intuitive point of view in general. I therefore wonder if this notion is really needed? The suitcase scenario, for example, can be handled with Lin's approach (and others) without this notion.

Also I would like to make a remark on your discussion on zero vs. non-zero delays (Section 6). This is a very interesting question, and I have recently discovered a related problem in connection with my method of using causal relationships: If effects with zero delay are not distinguished from effects with real delay, then some desirable conclusions may not follow. Since your approach has much in common with mine, I wonder if the distinction between so-called steady and stabilizing state constraints, which solves the problem in my approach, is needed here as well? (C.f. my paper entitled "Steady vs. Stabilizing State Constraints" at this year's CommonSense workshop; see

  http://www.ida.liu.se/ext/etai/nj/fcs-98/listing.html
Michael

A3. Marc Denecker, Daniele Theseider Dupré, and Kristof Van Belleghem (18.7):

Michael,

Here is an answer to your comments.

First of all, we agree with you that it is unnatural to have just the absence of an effect as a cause for anything. in fact as you can see in the paper, our high-level effect rules (which syntactically look a lot like Lin's) contain only the presence of an effect (on a complex formula) as possible trigger for further changes. But at a low level, we need to introduce negative cause literals to define initiation of complex formulas.

For example, when is the formula  sw1 ^ sw2  initiated? Is it initated when  sw1  is true in the initial state and is made false and  sw2  is made true simultaneously? No: it is initiated when both are made true, or when one of them is made true and the other is already true and is not made false. To represent this, a negative causal literal is needed. Observe however that any syntactically correct high-level rule corresponds to a set of low-level rules in which each rule contains at least one positive cause literal. There is always one actual effect at the basis of any propagation, but the propagation can be prevented by other effects - hence the negative cause literals.

One could argue that one should define that  sw1 ^ sw2  is initiated by comparing two successive states (and so there would be no need for resorting to low-level rules), which is what you do in your approach. However, our approach is constructive in that the next state is computed using the previous one and the computed causal literals. This definition of the next state is of course only mathematically sound if we can compute the caused literals without knowing the next state already.

In Lin's approach (the IJCAI95 paper), absence of causation is not present because the formalisation is entirely based on the values of fluents, not on their changes. One might say that all (present or absent) causations are implicit because it is the resulting state that is considered rather than the transition, as apparent in the rule
    up(L1s) ^ up(L2s) ·-> Caused(opentrues  
Wrt our approach, things are somewhat mixed since no concurrent actions are considered in that paper, but it's true that the approach would also work for the concurrent flipping of the two switches. The fluent-triggered causal statements are similar to our complex initiation formulae. One difference is of course that Lin's rules necessarily imply the corresponding state constraint. Related to this is the fact that a number of "useless" (even if correct) instances of "Caused" can be derived for any situation where  L1 ^ L2  is true but not initiated. "Useless" here just means that it does not give rise to any changes, but just maintains a reason for  open  being true. For cases where the causal rules have a corresponding state constraint, it is probably a matter of taste if our formalisation or Lin's one is preferable. In our version
     initiating L1open ^ L2open causes open   
it is the change of  L1open ^ L2open  that causes a change in  open . In Lin's view, it is the state of  L1open ^ L2open  that causes the state of  open .

This different view allows our approach to be more general: in cases where causal rules do not correspond to state constraints, such as example 1 in our paper (with the electronic counter), the relation is inherently dependent on the value change: it's the change of "out" from true to false that makes "count" change. This is not a case where absence of initiation is needed, since the dependency is on the single fluent "out" but how would this be modeled in Lin's causal rules?

We had another example (in our NAIC'97 paper and in Kristof's PhD thesis): an alarm system that detects if somehow people enter a building. While the system is active, anyone entering the building triggers the alarm: if  in  (someone in the building) becomes true when  active  is already true,  ring  becomes true. Under one interpretation of that example, the alarm does not ring if it is deactivated at the same time where someone enters. Then we would have the rule:
    caus(ring) <- init(in)commaholds(active)comma¬ init(¬ active  
Could the same be achieved in Lin's causal rules? A rule
    in(s) ^ active(s) ·-> Caused(ringtrues  
is not correct precisely because the state constraint
    in ^ active ·-> ring   
is not true: we suppose in fact that the bell does not ring if the system is activated where someone is already in.


Regarding your interpretation of
    caus(p) <- ¬ caus(q  
    caus(q) <- ¬ caus(p  
as a representation of  caus(p) xor caus(q.

As indicated above, we agree that having only "not caus" literal as the body of a causal rule is counterintuitive, precisely because such rules are not constructive. However negative cycles can also occur if there are positive caus literals present (just add  caus(r to both rules).

We wanted the constructive principle of inductive definition to model the propagation of effects. It is hard to interpret the above two rules as a well-defined inductive definition. As you say, it could be interpreted as a cause for  p  or for  q , but it seems (as you also say) to be a very peculiar way of modeling this. We prefer to model that there is either a cause for  p  or for  q  explicitly by using a nondeterministic effect rule (something like  toss causes headsxortails ). This may be a matter of taste, but we do not want unintended nondeterministic effects, therefore, we force the expert to make nondeterministic effects explicit.

It is true that in stable semantics the above clauses do represent the exclusive disjunction  p xor q . But we do not regard stable semantics as the right sort of semantics for modeling definitions. For more discussion on this, see the paper by Marc which appeared at NM-98. In Kristof's PhD thesis a version of our approach is presented in a narrative-based (i.e. Event-Calculus-like) temporal structure, also including actions with alternative effects like tossing a coin. Only a more restricted form of nondeterminism is in the ETAI paper, but the idea in the semantics is the same.


As regards your "Steady vs stabilizing state constraints". There, you say that mixing the two types of ramifications is responsible for the unintended conclusion in the example. From our point of view, the problem in the specific example would be that the causal rules are not correct: in our formalisation we would say that  up(lhs) causes stain  if $not up(rhs)$ is and remains true, i.e.  ¬ caus(up(rhs)) . Again, absence of causation solves the problem, but it again appears in combination with a positive causation, due to an interaction of causes. So on the basis of this example we can not give a definite answer.

Nevertheless, the issue of delays is intriguing, and your distinction between zero and non-zero delays is an interesting contribution, even if there could be a more general view on it. First, it could be possible that even what you consider zero delays (steady constraints and ramifications) are an abstraction wrt reality. Maybe in quantum physics it is not true that an object (a sub-atomic particle, in particular) cannot be in two locations at the same time? (is any expert of quantum physics reading ETAI?)

More importantly, you argue that the important distinction in qualitative reasoning is between zero and non-zero delays, but still small enough to be considered virtually instantaneous by common sense, so to be distinguished by actually "delayed" effects. That makes sense, even if one could also view a continuum between zero delays (or delays that are too small for classical physics), non-zero but "commonsense" instantaneous delays, and delayed effects. Knowledge about different delays could be still qualitative or imprecise, but still allow to conclude that a delay is smaller than another and then it is not correct to apply the (non-zero-delay) causal rules in any order.

Marc, Daniele and Kristof


Q4. Anonymous Referee 1 (8.3):

The following are my answers to the standard ETAI refereeing questions:

  1. Yes

  2. Yes, mostly. Unfortunately, the authors did not include proofs of their results. This is disappointing because the proofs would support some of the claims made in the summary. Even a short outline of the proofs would improve the paper.

  3. No, to the best of my knowledge.

  4. The article is reasonably organized. I found some parts of sections 2 and 6 too long.

    The authors define a supporting set of fluent literals on page 23. This definition implies that in general, a supporting set is not necessarily consistent. However, just before the definition, on page 22, they say "a supporting set ... is a consitent set of fluent literals which entails F."

    On the same page, the authors define  Caus(L which is never used in the paper. Instead, they use  cause(L which, I think, refers to the same thing.

    I found the double notation for the negation of fluent literals (using an overbar and the  ¬   sign) unnecesssary.

  5. I think the article can be accepted with the condition that the changes specified above are made.


Q5. Anonymous Referee 2 (8.3):

The results of the article, as specified in the summary, are new and interesting, and the body of the article justifies these results. The article is clearly written and well-organized.

Comparison with related work in Sec. 9 would be more complete if the authors added a few words about four papers that are not included in the bibliography:

[1] H. Turner, "Splitting a default theory", in Proc. AAAI-96.

[2] H. Turner, "Representing actions in logic programs and default theories: a situation calculus approach", JLP, Vol. 31, 1997.

[3] N. McCain and H. Turner, "Causal theories of action and change", in Proc. AAAI-97.

[4] E. Giunchiglia and V. Lifschitz, "An action language based on causal explanation", Proc. AAAI-98.

This direction of work is relevant, in particular, because [1] and [4] contain formalizations of the spilling water example similar to the one proposed by the authors in Sec. 5.3.


 

Background: Review Protocol Pages and the ETAI

This Review Protocol Page (RPP) is a part of the webpage structure for the Electronic Transactions on Artificial Intelligence, or ETAI. The ETAI is an electronic journal that uses the Internet medium not merely for distributing the articles, but also for a novel, two-stage review procedure. The first review phase is open and allows the peer community to ask questions to the author and to create a discussion about the contribution. The second phase - called refereeing in the ETAI - is like conventional journal refereeing except that the major part of the required feedback is supposed to have occurred already in the first, review phase.

The referees make a recommendation whether the article is to be accepted or declined, as usual. The article and the discussion remain on-line regardless of whether the article was accepted or not. Additional questions and discussion after the acceptance decision are welcomed.

The Review Protocol Page is used as a working structure for the entire reviewing process. During the first (review) phase it accumulates the successive debate contributions. If the referees make specific comments about the article in the refereeing phase, then those comments are posted on the RPP as well, but without indicating the identity of the referee. (In many cases the referees may return simply an " accept" or " decline" recommendation, namely if sufficient feedback has been obtained already in the review phase).