CSCE 475/896

Topic Summary Assignment 6: 

Learning in Multiagent Systems

Questions and Answers

October 31, 2003

 

Important Note:  Important questions are denoted with an ‘8’ symbol.

 

Important Note:  For most students, the response to the stupid question neglected one key component:  how would drivers learn to adapt to the new arrangement?  Many of you mentioned missed exits, the use of GPS, and so on.  But few of you mentioned what a driver would do to adapt to this new situation.  What about reinforcement learning?  A driver could learn from his or her mistakes, for example.  A driver could learn to observe other drivers and slow down at the right time.  Humans are usually very resilient:  humans learn to adapt; humans do not need GPS or a cell phone to figure out where the exits are.  Humans make mistakes and learn to recover from them.  Intelligent agents do too.

 

Q1:  What are some of the cost ratios for agents, in terms of % of computation spent on learning, thinking, perception, etc.?

 

A1:  This depends.  Learning is often regarded as thinking.  And perception is sometimes a part of learning.  Sometimes an agent learns to perceive better; sometimes an agent perceives to learn better.  So, sometimes these are intertwined.  Usually, we want to make sure that learning does not interfere with the actual execution of tasks. 

 

Q2:  Do situations ever arise in which an agent fails to learn because of the way the system distributes learning?  If so, do these agents eventually get phased out based on a sort of evolutionary chain?

 

A2:  It is possible, if I understand the question correctly.  Is it possible for a system to distribute learning so much that an agent fails to learn?  Philosophically, yes.  But in that case, what is the purpose of building a MAS like that?  Philosophically, agents that do not learn get phased out.  Realistically, we could actually build a MAS in which agents that are no longer useful are removed, making the system a highly dynamic one.

 

Q3:  What happens when bad or incorrect information gets into the system?  If the system is based on agents learning from each other and on reinforcement, won’t this bad information persist?

 

A3:  First of all, go back to the goal and definition of machine learning:  the idea is to improve performance.  If bad or incorrect information is detrimental to the performance of the MAS, then will it be learned?  A system with reinforcement will eventually learn not to use such information.  So, bad information will be thrown away or updated, or tagged so that it does not factor into decision making, or tagged so that it factors into decision making in a reverse manner.  See also Q4 and Q5.
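As a rough illustration of how reinforcement can push bad information out of use, the sketch below keeps one reliability score per information source and nudges it toward the observed reward each time the information is used.  The function names, the 0.5 starting score, and the learning rate of 0.2 are assumptions made for this example; they do not come from any particular MAS.

```python
def update_reliability(score, reward, learning_rate=0.2):
    """Move a source's reliability score a step toward the observed reward.

    reward is 1 when using the information helped and 0 when it hurt.
    """
    return score + learning_rate * (reward - score)


def simulate(outcomes, start=0.5, learning_rate=0.2):
    """Apply a sequence of rewards to a single source and return its score."""
    score = start
    for reward in outcomes:
        score = update_reliability(score, reward, learning_rate)
    return score


# Two hypothetical sources: one whose information keeps hurting performance,
# and one whose information keeps helping.
bad_source = simulate([0] * 10)
good_source = simulate([1] * 10)
```

An agent could then ignore any source whose score falls below a threshold of its choosing, which corresponds to tagging the information so that it no longer factors into decision making.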

 

Q4:  What would happen if a particular agent (always) deliberately gave false feedback to other agents?  How, if at all, would the other agents eventually identify this deceitful agent?

 

A4:  Yes, the other agents could identify this deceitful agent.  Now, if the agent lies consistently to everybody, then of course the other agents will have a hard time catching it.  If the agent is not consistent in its lies, then the other agents could exchange information and discover the inconsistency.  What would motivate the other agents to exchange information?  First, these agents could exchange information simply as part of their interaction and collaboration.  Second, these agents may see a drop in their performance caused by poor information, realize that some information may not be true, and work together to pinpoint the culprit.

 

If false feedback leads to poor performance, then the agents will learn not to use it.

 

If false feedback leads to good performance, then the agents will not mind being deceived.

 

See also Q3 and Q5.
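The information exchange described in A4 can be sketched as follows:  the receiving agents pool the reports they were given and look for a sender that told different agents different things about the same fact.  The tuple layout, the function name, and the sample data are all hypothetical.

```python
from collections import defaultdict


def find_inconsistent_senders(reports):
    """Return senders that reported conflicting values for the same fact.

    reports is a list of (sender, receiver, fact, value) tuples that the
    receiving agents have pooled together.
    """
    values_seen = defaultdict(set)
    for sender, _receiver, fact, value in reports:
        values_seen[(sender, fact)].add(value)
    suspects = {sender for (sender, _fact), values in values_seen.items()
                if len(values) > 1}
    return sorted(suspects)


pooled_reports = [
    ("B", "A", "road_closed", True),
    ("B", "C", "road_closed", False),  # B contradicts what it told A
    ("D", "A", "road_closed", True),
    ("D", "C", "road_closed", True),
]
suspects = find_inconsistent_senders(pooled_reports)
```

Note that this check only catches an agent whose lies are inconsistent.  An agent that tells everybody the same lie passes it, which is why the drop in performance remains the other signal to watch.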

 

Q5:  In addition to the above question, should we design our agents to trust feedback from incompetent agents?  What about feedback from uncooperative agents?  Should we force our agents to verify questionable feedback with other agents?

 

A5:  First of all, do the agents know there are incompetent agents among them?  If yes, which is natural, then we should not design the agents to simply trust the feedback.  Each piece of feedback should be analyzed and cautiously adopted.  Sooner or later, agents will find out which agents are incompetent and which are not.  This is the same as finding out who is an expert in some area and who is not.  This is quite common.  It is not a matter of trust.  It is a matter of caution, of determining which agents are capable of what.

 

Second, what about trusting feedback from uncooperative agents?  Suppose we have two agents, A and B.  Does A know that B is uncooperative?  If it does, then why would A want to trust the feedback from B?  If B is pretending to be cooperative to fool A, then B is once again a deceitful agent.  See Q4 for this.

 

Third, yes, we should design our agents to always verify when needed.  In many MASs where the environment is uncertain, noisy, and dynamic, an agent sometimes collates several pieces of information from different agents to identify the correct information.  This is not necessarily because the agents are incompetent or deceitful; it is simply that the environment does not allow the agents to be absolutely certain and competent in everything that they do.  As a result, agents should always verify.  Or, if they do not verify, the agents should at least be able to learn from their mistakes, such as trusting information blindly.  The system will thus be more robust.
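One simple way to collate several pieces of information is a majority vote.  The sketch below is only one possible approach, chosen for illustration; the function name and the sample readings are assumptions.

```python
from collections import Counter


def collate(reports):
    """Return (value, support) for a list of reports about the same fact.

    value is the most common report; support is the fraction of reports
    that agree with it.
    """
    if not reports:
        raise ValueError("no reports to collate")
    value, count = Counter(reports).most_common(1)[0]
    return value, count / len(reports)


# Hypothetical answers from four agents about which exit to take.
readings = ["exit_12", "exit_12", "exit_14", "exit_12"]
best, support = collate(readings)
```

The support value gives the agent a rough sense of how far to rely on the result.  When it is low, the agent might ask more agents before acting.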

 

See Q3 and Q4 also.

 

Q6:  What is the difference [between] learning from examples and learning by analogy?  In other words, what is the difference between instance-based learning and case-based learning?

 

A6:  Learning from examples works like this.  Suppose there are many examples, or instances.  Each instance has a set of attributes and an outcome or classification.  You examine these instances and learn which sets of attributes lead to which outcome or classification.  That is learning from examples.  Many machine learning algorithms are based on this idea.
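A minimal sketch of learning from examples is a nearest-neighbor classifier:  it stores the instances and classifies a new situation by the stored instance whose attributes match it best.  The attribute names, labels, and distance measure below are assumptions made for this illustration.

```python
def mismatch_count(a, b):
    """Count the attribute positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(instances, query):
    """Return the label of the stored instance closest to the query.

    instances is a list of (attributes, label) pairs.  If several instances
    are equally close, the earliest one in the list is used.
    """
    _attributes, label = min(
        instances, key=lambda inst: mismatch_count(inst[0], query)
    )
    return label


# Hypothetical driving examples: (weather, traffic) -> what the driver did.
examples = [
    (("rain", "heavy"), "slow_down"),
    (("rain", "light"), "slow_down"),
    (("sun", "heavy"), "slow_down"),
    (("sun", "light"), "keep_speed"),
]
seen_before = classify(examples, ("sun", "light"))
unseen = classify(examples, ("fog", "heavy"))
```

The classifier has no knowledge beyond the stored instances, so it can only be as good as the examples it has collected.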

 

Learning by analogy is different.  Suppose you have a solution to a problem:  how to divide an apple between two children.  Now, you have a new problem:  how to resolve a land dispute between two people.  You may apply the solution to your previous problem to this new problem.  If it works, you learn it.  That is learning by analogy.  This is akin to case-based learning.

 

In instance-based learning, the power of the knowledge or learning capability lies with the many instances.  In case-based learning, the power of the knowledge or learning capability lies with each case.

 

8 Q7:  What is the difference between learning and reasoning?  Is it true that reasoning is individual learning and a subset of learning?

 

A7:  Reasoning is deciding what to do.  Learning is finding out what causes a failure or success, identifying the actions or situations that lead to that failure or success, and recognizing the actions that will improve the performance of the system.  I would say that learning is part of reasoning, but agents may reason without learning.

 

8 Q8:  In decentralized learning where the agents hope to achieve a common learning goal, we know that communication and the sharing of learning are important.  However, if we decentralize learning, how does one ensure that the agents are learning different things?  Do we design agents so that they can adapt to learn uniquely?  What else is used to make sure an agent is not learning the exact same thing as another agent?

 

A8:  If we decentralize, why would we want the agents to learn different things?  First of all, we need to keep in mind why we want to build a MAS in the first place.  Is it because of distributed resources, distributed data sources, distributed control, etc.?  If that is the case, then it is very likely that the agents will each have different experiences of their own in this environment.  As a result, it is very likely that the agents will learn different things.

 

We may design agents so that they can adapt to learn uniquely.  This can be done by using a unique learning mechanism, or by looking at a unique set of data.

 

Usually, we do not want to make sure that an agent is not learning the exact same thing as another agent.  If we do, we could do something like this:  whenever an agent learns something, it announces it to all other agents.
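The announcement idea can be sketched as follows.  This is only an illustration under simple assumptions:  every announcement reaches every agent reliably and immediately, and the class and method names are made up for this example.

```python
class Agent:
    def __init__(self, name):
        self.name = name
        self.learned = set()        # items this agent learned itself
        self.known_elsewhere = {}   # item -> name of the agent that announced it

    def hear(self, item, source):
        """Record that another agent has already learned this item."""
        self.known_elsewhere.setdefault(item, source)

    def try_learn(self, item, peers):
        """Learn the item unless this agent or a peer already has it.

        On success, announce the item to all other agents.
        """
        if item in self.learned or item in self.known_elsewhere:
            return False
        self.learned.add(item)
        for peer in peers:
            if peer is not self:
                peer.hear(item, self.name)
        return True


a, b, c = Agent("a"), Agent("b"), Agent("c")
agents = [a, b, c]
first = a.try_learn("merge_left_early", agents)      # a learns and announces
duplicate = b.try_learn("merge_left_early", agents)  # b skips it
different = b.try_learn("slow_before_exit", agents)  # b learns something else
```

Here each agent ends up with a distinct set of learned items, at the cost of one message to every other agent for each thing learned.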