CSCE 496/896

Topic Summary Assignment 4: 

Search Algorithms for Agents

Questions and Answers

October 3, 2002 

 

The minimax procedure is actually a search algorithm for one player.  But, since the outcome depends on both that player’s moves and the opponent’s moves, the player has to consider what the opponent will do.  So, at the MAX level, the player tries to maximize its value by picking the best move.  At the MIN level, the player assumes that the opponent will pick the move that is worst for the player, so the value backed up from a MIN level is the minimum of its children’s values.  As emphasized in class, you search through the space assuming that your opponent will sabotage your path, will try to make things difficult, and so on.  And you try to get as much as possible out of the search under that assumption.
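
The MAX/MIN alternation described above can be sketched as follows.  This is only a minimal illustration: the nested-list tree encoding, the name `minimax`, and the leaf values in `toy_tree` are all assumptions made up for the example, not something from class.

```python
from typing import List, Union

# Hypothetical encoding: a leaf is an int (its evaluation for MAX),
# an internal node is a list of child subtrees.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool = True) -> int:
    """Return the value backed up to `node`, from MAX's point of view."""
    if isinstance(node, int):
        return node  # leaf: use its evaluation directly
    # Levels alternate: below a MAX level is a MIN level, and vice versa.
    child_values = [minimax(child, not maximizing) for child in node]
    # MAX picks its best move; MIN is assumed to pick MAX's worst.
    return max(child_values) if maximizing else min(child_values)


# Toy two-level tree (made-up values): MAX moves first, then MIN replies.
toy_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(toy_tree))
```

Note that MAX does not pick the branch containing 14: it assumes MIN would never allow that leaf to be reached.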

 

Several students misunderstand what a snooze button does.  Those students need to find out what it does.

 

Q1:  Does choosing a search strategy depend on the agents being used as well as the problem specifics?  Are the search strategies interchangeable?

 

A1:  For the first part, the answer is yes.  In a multiagent system where agents can communicate to share knowledge, the search strategy could be a cooperative one.  In a multiagent system where agents can sense their environments with high certainty, the agents can make better decisions in picking the next step in their searches.  The choice is also problem specific.  As we have discussed in class, there are three general types of problems: path-finding, constraint satisfaction, and 2-player games.  And of course, heuristics change.  Some heuristics (for estimating the distance to the goal state) work better in some situations than in others.  In a problem with many goal states, you would probably want heuristics that are bold.  In a problem with only a few goal states, you would probably want heuristics that are cautious.  In a problem where backtracking is very costly, you would want heuristics that are even more cautious.

 

For the second part, some search strategies are interchangeable, some are not.  For example, a depth-first search can replace a breadth-first search, or even a best-first search, but the efficiency may suffer.
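
One way to see this interchangeability is a single search loop in which only the frontier discipline changes.  The sketch below is an illustration under assumptions: the function `search`, the tiny graph `GRAPH`, and the heuristic table `H` are all made up for this example.

```python
import heapq
from collections import deque


def search(graph, start, goal, strategy="bfs", h=None):
    """Return (path, nodes_expanded).  strategy: 'bfs', 'dfs', or 'best'."""
    counter = 0  # tie-breaker so the heap never has to compare paths
    if strategy == "best":
        frontier = [(h(start), counter, [start])]
    else:
        frontier = deque([[start]])
    visited = set()
    expanded = 0
    while frontier:
        if strategy == "bfs":
            path = frontier.popleft()            # oldest path first (queue)
        elif strategy == "dfs":
            path = frontier.pop()                # newest path first (stack)
        else:
            path = heapq.heappop(frontier)[2]    # lowest heuristic value first
        node = path[-1]
        if node == goal:
            return path, expanded
        if node in visited:
            continue
        visited.add(node)
        expanded += 1
        for nxt in graph.get(node, []):
            if nxt not in visited:
                if strategy == "best":
                    counter += 1
                    heapq.heappush(frontier, (h(nxt), counter, path + [nxt]))
                else:
                    frontier.append(path + [nxt])
    return None, expanded


# Made-up example graph and heuristic values (estimated distance to G).
GRAPH = {"S": ["A", "B"], "A": ["C", "D"], "B": ["G"]}
H = {"S": 2, "A": 2, "B": 1, "C": 3, "D": 3, "G": 0}

for name in ("bfs", "dfs", "best"):
    path, expanded = search(GRAPH, "S", "G", name, H.get)
    print(name, path, expanded)
```

All three strategies reach the goal here, but they expand different numbers of nodes on the way, which is the efficiency difference mentioned above.  On a different graph the ranking could easily be reversed.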

 

Q2:  … we claim that a searching agent is not well suited to solve every type of problem.  Further we claim that it is important to know the problem environment.  We also state that one must have well-defined environment and operations to perform many of these searches.  Are these claims valid?

 

A2:  In general, these claims are valid.  That is why an agent can help speed up a search.  An agent can gather information from its environment, or from its interactions with other agents, to further prune the search space.  For example, agent A1 considers whether to bring a spicy dish or a vegetarian dish to agent A2’s potluck party.  Agent A1 can prune the search space by asking what agent A2 prefers, or by finding out what the guests like.  There, the agents make use of the problem environment. 

 

This is similar to my example of the Prisoner’s Dilemma in class: the prisoner cannot decide what to do since there isn’t a dominant strategy, but the prisoner can find out what his/her partner is likely to do and can find a strategy that is more likely to succeed.

 

Note that sometimes, you may not need to have a well-defined environment or a set of well-defined operations to perform some searches.  There are high-level searches such as case-based search that thrive on vagueness. 

Q3:  … Real-time bidirectional search (RTBS) is sensitive to the topography of the problem space.  In that context, the problem space is described using adjectives such as “large,” “shallow,” and “deep.”  What exactly do these terms mean here?

 

A3:  A large problem space is one with many branches per node.  A deep problem space is one with a large number of levels between the initial node and the goal node; a shallow one has few such levels.  The numbers of nodes and branches greatly influence the strategy of a real-time search.  In a shallow problem space, you can afford to be reckless, since backtracking does not cost too much.  In a large problem space, you have to be pickier, or otherwise you have to consider too many branches.
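
To make these adjectives concrete, here is a small sketch that counts the nodes in a uniform tree.  It assumes every node has the same number of branches, which real problem spaces rarely do; the function name `tree_size` and the example numbers are made up for illustration.

```python
def tree_size(branching: int, depth: int) -> int:
    """Count nodes in a uniform tree with levels 0..depth."""
    # Level 0 has 1 node, level 1 has `branching` nodes, and so on.
    return sum(branching ** level for level in range(depth + 1))


large_but_shallow = tree_size(branching=10, depth=3)  # many branches, few levels
deep_but_narrow = tree_size(branching=2, depth=10)    # few branches, many levels
print(large_but_shallow, deep_but_narrow)
```

Both the number of branches and the number of levels drive the node count, so either one can make a real-time search expensive.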

 

Q4:  In two-player games, the minimax procedure uses an evaluation metric to measure the value of each node.  This function should be problem-specific.  Is it necessary (or is it a better design) to make this function dynamic so that the values of the nodes change when the environment changes?  If this is the case, it is going to affect the pruning.  Are there any algorithms dealing with this?  If the two-player games are expanded to include three or more players (like in RoboCup), what are the major considerations?

 

A4:  This is a difficult question to answer.  A search problem usually (not always) requires well-defined operations and a well-defined problem.  Thus, you can have a set of domain-specific, problem-specific heuristics to compute the evaluation metric.  But the environment may change, and does change in some problems.  What do you do then?  If the agents are intelligent, they can learn from the failures that start to appear (due to the change in environment) and learn to adjust their heuristics, which in turn adapts the evaluation metric.  This is actually what happens in reinforcement learning, which can be viewed as a basic search algorithm that adapts.

 

What about the minimax procedure then?  If you have a dynamic evaluation function, then the pruning is going to be affected, because pruning relies on the node values staying fixed during the search.  I haven’t encountered any minimax algorithms that do this.
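
For reference, here is a minimal sketch of the usual pruning with a static evaluation, namely alpha-beta pruning on top of minimax.  The tree encoding, the name `alphabeta`, the `stats` counter, and the leaf values are assumptions made up for the example.  The point to notice is that a branch is cut off based on values already seen, so if those values could change during the search, the cut might no longer be justified.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, stats=None):
    """Minimax value of `node` with alpha-beta cutoffs (leaves are ints)."""
    if isinstance(node, int):
        if stats is not None:
            stats["leaves"] += 1  # count how many leaves were evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, stats))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # MIN above would never let play reach this node
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, stats))
        beta = min(beta, value)
        if alpha >= beta:
            break  # MAX above already has a better option elsewhere
    return value


# Toy tree with made-up leaf values: MAX moves first, then MIN replies.
toy_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
stats = {"leaves": 0}
print(alphabeta(toy_tree, stats=stats), stats["leaves"])
```

In this toy tree, the second branch is abandoned after its first leaf, because that leaf is already worse for MAX than the value secured in the first branch.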

 

If the two-player games are expanded to include three or more players, what are the major considerations?  First of all, are there still only two teams?  If yes, then we can apply minimax relatively easily.  Team Max will select the best move out of all its members, and Team Min will select the move that is worst for Team Max.  If you have more than 2 opponents, then the minimax procedure does not work.  The search space quickly becomes intractable.  You may break the problem down into playing two games: A1 plays against A2, and A1 plays against A3.  In that case, you can have two minimax trees.  However, A1’s actions/moves are constrained.

 

Q5:  Would computer AI, based on gaming trees, as described above be considered agents?  Has a full gaming tree for a game of chess actually been produced?

 

A5:  No, if they do not have the characteristics of agents.  For example, chess-playing software (like Deep Blue) is not an agent.  Why?  The environment is fully defined, discrete, and finite, so the software does not have to adapt to unexpected events.

 

For the second question, the answer is no.  There are too many branches and too many levels to list them all.  In the future, with better storage and computing power, it may become possible to produce the entire gaming tree for a game of chess.
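
A back-of-the-envelope sketch gives a feel for the scale.  The two constants below (about 35 legal moves per position and about 80 plies per game) are commonly quoted ballpark figures that I am assuming for illustration; they are not from the discussion above, and other sources use somewhat different numbers.

```python
import math


def log10_tree_size(branching: float, plies: int) -> float:
    """Return log10 of branching**plies, a rough count of leaf positions."""
    return plies * math.log10(branching)


# Assumed ballpark figures, for illustration only.
ASSUMED_BRANCHING = 35
ASSUMED_PLIES = 80

exponent = log10_tree_size(ASSUMED_BRANCHING, ASSUMED_PLIES)
print(f"roughly 10^{exponent:.1f} leaf positions")
```

Under these assumptions the exponent comes out above 120, far beyond what current storage can hold.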

 

Q6:  In a moving target search, if the problem solver is slower, can it still catch the target by predicting where it is going to be?

 

A6:  Yes.  A baseball thrown by a professional pitcher travels at about 90 miles per hour.  But a batter can still hit the ball while swinging the bat at a speed much less than 90 miles per hour.  Several issues are involved: Is the problem solver more flexible (can it move in more directions than the target; e.g., in our Final Project, the hounds can move in all 8 directions, but the fox can only move in 4)?  Does the problem solver know where the target is going?  Can the problem solver predict where the target is going?  If I have many problem solvers, can I form a search team that is much slower than the target and still catch the target (for example, in a manhunt, a team of investigators usually moves very slowly compared to the fugitive, but through coordination they can catch up with the fugitive, given time)?
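
The prediction idea can be sketched on a grid.  The code below is only an illustration under strong assumptions: the target’s future route is known exactly, the target moves one cell per time step, the problem solver can move in all 8 directions, and its speed is an average number of cells per time step.  The function name `earliest_intercept` and the example coordinates are made up.

```python
def earliest_intercept(pursuer, target_route, pursuer_speed):
    """Return (time, cell) of the first predicted cell the pursuer can reach in time.

    target_route[t] is the predicted target position at time step t.
    Returns None if no cell on the route can be reached in time.
    """
    for t, cell in enumerate(target_route):
        # With 8-direction moves, distance is the larger of the x and y gaps.
        distance = max(abs(cell[0] - pursuer[0]), abs(cell[1] - pursuer[1]))
        if distance <= pursuer_speed * t:
            return t, cell
    return None


# Made-up scenario: the target runs east along y = 0, one cell per step.
route = [(t, 0) for t in range(21)]

# A half-speed pursuer that starts ahead of the target can cut it off.
print(earliest_intercept((10, 4), route, 0.5))

# The same pursuer starting behind the target never catches up.
print(earliest_intercept((-5, 0), route, 0.5))
```

The slower problem solver succeeds only when it goes to where the target will be, not to where the target is now, and only when its starting position makes that possible.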

 

Q7:  In decoupled RTBS is it possible that the two problem solvers will miss each other?

 

A7:  Yes.  If you do not design it correctly, the two problem solvers may miss each other and drop into an infinite loop of “running around in a circle.”  This actually happens in real life.  That is why decoupled RTBS is usually applied in cases where the problem solvers have good information about where each other may be going.

 

Q8:  Is the cost of communication in CSP a factor?

 

A8:  This depends on whom you ask.  Traditionally, no, it is not a factor.  But to me, it is, from the viewpoint of problem solving and application.  More than that, time is also a factor, including processing time.