Relating Theories of Actions and Reactive Control
Chitta Baral and Son Cao Tran
Moderated by Erik Sandewall.
The article mentioned above has been submitted to the Electronic Transactions on Artificial Intelligence, and the present page contains the review discussion. Further explanations and the webpage of the authors, Chitta Baral and Son Cao Tran, are available separately.
Overview of interactions
Q1. Erik Sandewall (3.10):
Dear Chitta and Son Cao, Your article addresses a very interesting topic for our group. I have some specific questions about the contents of the article, and also some comments about how it relates to other current approaches to the same or similar problems. I'll write them as two separate contributions. With respect to the article itself, I have the following questions. 1. For maintenance control modules, which are treated in section 3.3, you write:
Possibly one can reduce this case to the one you consider by identifying the maintenance goal with a subset of the acceptable states, namely those acceptable states that are at a safe distance from a transition to the unacceptable ones. Is this the way you intend maintenance goals to be defined, and if so, do you claim that the reduction is always valid? (Concurrency and the use of combined achievement and maintenance modules may introduce complications, I believe.) 2. A second question concerns to what extent this work relates to current, logic-based theories of actions and change. In section 9.1 you write:
3. Here is a standing question that I now ask regularly to authors in our area. Given that your paper proclaims a number of theorems, as do many of our publications, (a) do your results rely on any theorem previously published in this literature, (b) do you foresee any of your theorems later being used in a forthcoming publication by yourself or someone else? If the answers to both questions are "no", what do you think is the purpose of including theorems in our research articles?
4. There seems to be a close connection between your approach and work on ramification. The relation to ramification is particularly transparent by comparison with my KR-96 paper, [c-kr-96-99]. That article uses two types of state transitions: a result function that maps an action and a state to a set of possible result states, and a binary successor relation that characterizes further transitions. The "closure" of an action, in that case, is the set of all states that can be reached via one member of the set of action result states, followed by an arbitrary number of steps using the successor relation, and that do not in turn have any successor. For ramification, the successor relation characterizes spontaneous and "caused" transitions in the world; in your work it would instead characterize the results of exogenous "actions". These two cases seem to be very close to each other. The KR article is in fact formulated entirely in terms of states and state transitions, and analyses ramification methods without introducing any logic language at all. Its way of looking at actions is therefore quite close to what you use in your paper. (The main results in my article are assessments that characterize the range of sound applicability for a number of previously published approaches to ramification.) Relating back to my question 2, this adds to the impression that ramification and exogenous events are related phenomena. 5. I put quotes around "actions" because I cannot get used to the idea of calling natural events `actions'. Although I understand the background of this terminology, which comes from the situation calculus, I do think it makes communication with neighboring disciplines a lot more difficult than it ought to be. Best regards, Erik
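The "closure" construction described in point 4 can be sketched in a few lines. This is an illustrative reading only, not code from either paper; `result` (the action result function) and `successor` (the successor relation, returned as a set of states) are hypothetical stand-ins for the KR-96 notions.

```python
def closure(action, state, result, successor):
    """States reachable from `state` via one possible result of `action`,
    followed by any number of successor-relation steps, keeping only the
    terminal states (those with no successor)."""
    frontier = set(result(action, state))  # possible direct result states
    seen = set()
    while frontier:
        s = frontier.pop()
        if s in seen:
            continue
        seen.add(s)
        frontier |= set(successor(s))  # follow further transitions
    # closure = reachable states that have no successor of their own
    return {s for s in seen if not successor(s)}
```

Under the ramification reading, `successor` encodes spontaneous or "caused" transitions; under the paper's reading, it would encode the effects of exogenous actions, with the same closure computation applying in both cases.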
A1. Chitta Baral and Son Cao Tran (23.11):
Dear Erik: We really appreciate your comments. Following is a point-by-point response. In addition, we plan to revise our paper to take into account some of your suggestions and our responses to them. Best regards.
1. We agree that our current formulation in terms of maintaining goals is weaker than the formulation where the control never allows the agent to get into an unacceptable state, regardless of exogenous actions. The latter approach is used (among other approaches) in policy determination algorithms for Markov Decision Process based approaches [DKKN95,KBSD97]. (As we point out in the paper, the algorithms in [DKKN95,KBSD97] consider the control module as a whole and do not have the sufficiency condition for correctness of individual goals.) We feel that the latter approach is too strong and often there may not be any control module satisfying the stronger criteria. Nevertheless, studying a middle ground between the two is important.
Following your suggestions, one middle ground can be based on using a maximum number (say m) ... Let ... We can then use this parameter m in defining satisfaction of maintenance goals, by modifying Defn 3.6 in the paper, to have ... We do not foresee any problem with concurrency with this approach. But the definitions of achievement and maintenance will no longer be uniform, and as a result the definition for mixed control modules will get a little complicated. We will try to pursue this in greater detail (or at least mention this) in the revised version of our paper. 2. As you mention, our paper can be formulated in terms of states, state spaces and transition functions without talking about action theory. We felt that by doing that, the connection between an action specification and the states and transition function it defines, in a succinct and (often) elaboration tolerant way, would be pushed into the background; we wanted to stress that states and transition functions are often not directly given but are rather specified by an action description. For that reason, we bring in action theories. We will point this out in our revised version.
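The m-bounded reading of maintenance sketched above could be checked, for a deterministic control transition function, roughly as follows. The names (`policy`, `acceptable`) and the exact check are illustrative assumptions, not the paper's Defn 3.6: we simply ask whether, given a window of non-interference, the agent's own transitions bring it to an acceptable state within at most m steps.

```python
def maintains_within(policy, acceptable, start, m):
    """Sketch: from `start`, following only the agent's own control
    transitions (no exogenous interference), is an acceptable state
    reached within at most m steps?  `policy` maps a state to the
    agent's next state."""
    s = start
    for _ in range(m + 1):  # check the start state and up to m successors
        if s in acceptable:
            return True
        s = policy(s)
    return False
```

A maintenance goal would then be satisfied (in this bounded sense) if the check succeeds for every state the system can be pushed into by exogenous actions, with m playing the role of the tolerated recovery window.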
3. ...
4. We appreciate your observation. We had not considered this connection. One difference with respect to your KR 96 approach is that in your approach the first step of the ``closure'' is obtained by applying an action and the subsequent steps are due to a successor relation, while in our approach the subsequent steps are due to exogenous actions. Regardless of this difference, we will point out the connection in our revised version. 5. Calling natural events actions may sound odd at times. The reasons we chose to call everything actions are several. (i) The exogenous actions are not just natural actions, but are actions that are beyond the control of the agent. They could be the actions of another agent in a multi-agent world. Thus they are a superset of what one would normally think of as ``natural actions''. (ii) By just saying actions, we avoid talking separately about a theory of events. Otherwise every time we talk about the effect of actions we would have to talk about the effect of events, and there will be a lot of duplication.
Q2. Anonymous Referee 1 (18.5.1999):
1. Are the results of the article, as specified in the summary, of significant interest for researchers in our area? Yes. 2. Does the full text substantiate the claims made in the summary? Yes. 3. Is the text intelligible and concise? Yes. (Additional comments concerning minor corrections are being given to the authors). 4. Do you know of any previous publication of the same result? No. 5. Are there any other considerations that may affect the decision? No.
Q3. Anonymous Referee 2 (18.5.1999):
This paper concerns an important topic, namely the connection between action theories and control. The paper is correct and well-written and I think that the paper should be accepted. However, I have one objection to make. If, instead of "action theory", we use the term "open-loop specification" and, instead of "reactive control", the term "control law", the results of the paper concern the topic of verification and synthesis of control laws w.r.t. some specification. In this sense the results are not at all original, since these research problems have generated an abundance of work in Control Theory (more precisely, in work on "Discrete Event Dynamic Systems", DEDS). The authors appear to be completely unaware of this work and on a number of points they re-invent the wheel (e.g. the criterion for "correctness of maintenance control modules" corresponds exactly to a particular flavour of stability criteria, see [3]). There is not one single reference to the DEDS (or Hybrid Systems) literature in the bibliography. In other words, I observe that a large body of work in a similar problem area is neglected, in particular since it is based on similar definitions and concepts (that is, verification and synthesis of control laws from specifications). It would have been an even better paper if a comparison to this work had been included. On the other hand, the usage of these definitions and concepts is done in an original way by the authors. The main issues addressed by this paper have not (to the best of my knowledge) been studied by the DEDS or Hybrid Systems communities, for example modeling and control with sensing actions and conditional plans. It is important to incorporate the achievements in Reasoning about Action and Change with control, but I think that it would be unfortunate if achievements in Control Theory are neglected while doing so.
Examples from the literature of control law synthesis and verification for discrete and hybrid systems can be found in [1, 2, 5], and various definitions of stability in [3, 4].
REFERENCES
[1] Heymann, Lin and Meyer, Controller Synthesis for a class of hybrid systems subject to configuration-based safety constraints, LNCS 1201, 1997.
[2] Kumar and Garg, Modeling and Control of Logical Discrete Event Systems, Kluwer Academic Publishers, 1995.
[3] Ozveren, Willsky and Antsaklis, Stability and Stabilizability of Discrete Event Dynamic Systems, Journal of the ACM, 1991.
[4] Passino and Burgess, Stability analysis of discrete event systems, John Wiley, 1998.
[5] Zhang and Mackworth, Synthesis of hybrid constraint-based controllers, LNCS 999, 1995.
A3. The Authors (12.6.1999):
The comments of the anonymous reviewers were very useful. We would like to especially thank the second reviewer for pointing us towards the discrete event dynamic system (DEDS) literature and the similarity between our notion of maintenance and the notion of stability there. We have now modified our paper by discussing this similarity and adding a few references to the literature in DEDS. The changes follow below. Chitta and Son 1. Added to the first paragraph of Section 3.3 (Our notion of maintenance is weaker than the notion of stability in discrete event dynamic systems \cite{Ozve91,Ram87a,Ram87b}. We discuss this difference later in Section~\ref{other}.) 2. Modified Section 3.4 The modified first three paragraphs are: In the previous section we described our formulation of maintaining a goal, and correctness of control modules with respect to our definition. But in some cases our definition can be considered weak. For example, in one of the stronger formulations the agent is not allowed to get to an unacceptable state, regardless of what exogenous actions happen. This is used in policy determination algorithms for MDP (Markov decision process) based approaches \cite{dean96,kab97}. (We compare our work to these works in more detail in Section~\ref{sec-suff1}.) Another formulation that is stronger than our notion, and used in the discrete event dynamic system literature \cite{Ozve91,Ram87a,Ram87b}, is the notion of `stability', where an acceptable state must be reached within a finite number of transitions and then be visited infinitely often. The difference here is that it requires that an acceptable state be reached in a finite number of transitions, regardless of the presence of a window of non-interference. Thus modules that satisfy the maintainability criteria may not guarantee stability. We feel that the particular stronger formulations mentioned above may be too strong and often there may not be any control module satisfying such a criterion.
Moreover, in domains where the robot's or agent's actions are deterministic (such as in the case of softbots and database updates), our approach of not combining the agent's actions with the environment's actions, and the use of the maintenance criteria, seem to be more appropriate. Nevertheless, we now briefly discuss some possible in-between formulations. 3. Modified Section 10.2. The modified section is as follows: \subsection{Reactive planning, Program synthesis and Discrete event dynamic systems} Recently there have been some proposals \cite{godef91,kab97,dean96} about automatic construction of control rules inspired by research in program verification and synthesis and operations research. In \cite{kab97}, an algorithm is given to generate control rules for goals expressed in a temporal logic. Two major contributions of this work are that it considers deadlines, and it allows general temporal goals that can specify cyclic behaviors. In \cite{dean96}, transitions between states are represented as Markov processes, goals are specified using reward functions, and policy iteration algorithms from Operations Research are used to construct control rules. Our approach differs from the approach in \cite{kab97,dean96}, the approaches mentioned in the program verification and synthesis literature (for example, \cite{emerson90,emerson82,pneuli89}), and the approaches in the discrete event dynamic systems literature (for example, \cite{Ozve91,Ram87a,Ram87b}), in that we separate agent actions from exogenous actions and do not combine them, and our notion of maintenance is weaker than analogous notions there. Also we insist on developing (sufficiency) conditions for the correctness of {\em individual} control rules. The formulations in \cite{kab97,dean96} and the ones mentioned in \cite{emerson90} only deal with the correctness of a control module as a whole.
It is important to consider correctness of individual control rules, because often we {\em learn} (or we are told) a particular control rule in isolation, and we need to satisfy ourselves that it is correct by itself, regardless of the rest of the control rules. Also our methodology allows us to upgrade a control module by simply adding additional correct rules for new states that need to be taken care of. In the case of \cite{kab97,dean96}, simply adding new rules may not always be enough to upgrade a module, and extra care is needed to avoid getting into the kind of cycles present in the module Goto\_elevator\_3 from Section~\ref{sec-suf1}. Finally, we consider sensing actions, which are not considered in \cite{kab97,dean96,Ozve91}, and we use AI methodologies such as `planning', which is not considered in the program reasoning approaches described in \cite{emerson90,emerson82,pneuli89}. 4. New references added
\bibitem[Ozv91]{Ozve91}
\bibitem[RW87a]{Ram87a}
\bibitem[RW87b]{Ram87b}
5. Acknowledgment modified We are grateful to Erik Sandewall and Marcus Bjareland for useful discussions related to the subject of this paper. We are also grateful to the anonymous reviewer who pointed us to the literature in discrete event dynamic systems and the similarity between our notion of maintenance and the notion of stability there. This research was partially supported by the grants NSF CAREER IRI-9501577, NASA NCC5-97 and NSF DUE-9750858.
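The contrast drawn in the revision between maintainability and DEDS stability can be illustrated with a toy check. For a finite, deterministic transition system, the "visited infinitely often" requirement reduces to: no reachable cycle lies entirely within unacceptable states. This is a minimal sketch under that assumption, with illustrative names (`step` is a deterministic combined agent/environment transition), not code from the paper or the DEDS literature.

```python
def is_stable(states, step, acceptable):
    """Sketch: in a finite deterministic transition system, acceptable
    states are visited infinitely often on every run iff no cycle lies
    entirely within the unacceptable states."""
    for s0 in states:
        seen = set()
        s = s0
        while s not in acceptable:
            if s in seen:
                return False  # trapped in a cycle of unacceptable states
            seen.add(s)
            s = step(s)
    return True
```

A module that is maintainable in the authors' sense (recovery guaranteed only given a window of non-interference) can still fail this check, since here the exogenous transitions are folded into `step` and no interference-free window is assumed.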