CSCE 475/875
Handout 19: Contract Day Analysis
November 19, 2007
1. Tables of Results
| Teams | R1 Cost | R2 Cost | R3 Cost | R4 Cost | Total Cost | Cash | Final Utility |
|---|---|---|---|---|---|---|---|
| Team Win | $160 | $40 | $200 | $150 | $550 | $4203 | $3653 |
| Boole’s Fools | $105 | $120 | $200 | $90 | $515 | $4085 | $3570 |
| Rocky & Bullwinkle | $165 | $40 | $145 | $240 | $590 | $3864 | $3274 |
| The Flood | $105 | $180 | $190 | $0 | $475 | $3710 | $3235 |
| Mongii Agent Systems | $150 | $160 | $190 | $30 | $530 | $3631 | $3101 |
| Nick | $180 | $140 | $145 | $30 | $495 | $3222 | $2727 |
| Total | $865 | $680 | $1070 | $540 | | | |

(R1 to R4 Cost: costs for performing own subtasks in each round.)
Table 1: Costs, cash, and utility for each team
From the above table, we see that Team Win won the game day with a final utility of $3653. Boole’s Fools finished second, close behind at $3570. Rocky & Bullwinkle and The Flood finished 3rd and 4th, separated by only $39. Mongii Agent Systems and Nick were 5th and 6th, respectively.
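The ranking can be re-derived from the figures in Table 1. The following is a minimal Python sketch, assuming that final utility is simply cash minus total subtask cost (this is consistent with every row of the table); the variable names are illustrative only.

```python
# Figures copied from Table 1: (total subtask cost, cash) per team.
table1 = {
    "Team Win": (550, 4203),
    "Boole's Fools": (515, 4085),
    "Rocky & Bullwinkle": (590, 3864),
    "The Flood": (475, 3710),
    "Mongii Agent Systems": (530, 3631),
    "Nick": (495, 3222),
}

# Assumption: final utility = cash - total cost.
final_utility = {team: cash - cost for team, (cost, cash) in table1.items()}

# Rank teams from highest to lowest final utility.
ranking = sorted(final_utility, key=final_utility.get, reverse=True)

for place, team in enumerate(ranking, start=1):
    print(f"{place}. {team}: ${final_utility[team]}")

# Gap between 3rd and 4th place.
gap_3rd_4th = final_utility[ranking[2]] - final_utility[ranking[3]]
print(f"Gap between 3rd and 4th: ${gap_3rd_4th}")
```

Under that assumption the sketch reproduces the Final Utility column and the $39 gap between Rocky & Bullwinkle and The Flood.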
Note also that the costs of the subtasks were designed to be the same for Round 1 and Round 3, and both rounds were under-constrained. The differences were: (1) Round 1 was low-cost and Round 3 was high-cost; and (2) Round 1’s task utility was smaller than Round 3’s task utility. These differences did not result in significantly more subtasks being used and sold in Round 3 (as shown in the following table). How, then, could Round 3 have so much higher costs ($1070 vs. $865)? These are costs of performing subtasks and do not take fees into account. It is possible that mistakes made in the tracking caused this interesting result.
| Teams | R1 | R2 | R3 | R4 | Total |
|---|---|---|---|---|---|
| Team Win | 8 | 11 | 8 | 1 | 28 |
| Boole’s Fools | 6 | 6 | 9 | 3 | 24 |
| Rocky & Bullwinkle | 8 | 2 | 7 | 8 | 25 |
| The Flood | 6 | 9 | 8 | 0 | 23 |
| Mongii Agent Systems | 8 | 11 | 8 | 1 | 28 |
| Nick | 9 | 7 | 7 | 1 | 24 |
| Total | 45 | 46 | 47 | 14 | 152 |

(R1 to R4: number of subtasks used/sold in each round.)
Table 2: Subtasks used and sold for each team
The differences between Round 2 and Round 4 were: (1) the utility of the tasks: Round 4 had an $800 task for every team; (2) the subtasks in Round 4 were more costly ($30 vs. $20); and (3) Round 2 was low-cost and Round 4 was high-cost. These three factors did have an impact on the teams’ behavior: in Round 4, significantly fewer subtasks were sold and used (14 vs. 46).
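One rough way to compare the rounds is to divide each round's total subtask cost (Table 1) by the number of subtasks used/sold in that round (Table 2). The sketch below does this in Python. It assumes the two totals can be meaningfully divided, which is only approximate because the counts mix used and sold subtasks; the names are illustrative.

```python
# Round totals copied from Table 1 (costs) and Table 2 (subtasks used/sold).
round_cost = {"R1": 865, "R2": 680, "R3": 1070, "R4": 540}
round_subtasks = {"R1": 45, "R2": 46, "R3": 47, "R4": 14}

# Assumption: cost / count is a usable proxy for average cost per subtask.
avg_cost = {r: round_cost[r] / round_subtasks[r] for r in round_cost}

for r, value in avg_cost.items():
    print(f"{r}: ${value:.2f} per subtask used/sold")
```

Under this rough measure, Round 3 comes out at about $22.77 per subtask against about $19.22 for Round 1, and Round 4 at about $38.57 against about $14.78 for Round 2.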
The following table shows the number of bids from one team to another, based on the paper copies of “Bid” collected from all the Game Day packages. The entry in the ith row and jth column is the number of bids from the team in the ith row to the team in the jth column.
| From \ To | Nick | TeamWin | TheFlood | R&B | BsFs | MAS | Total |
|---|---|---|---|---|---|---|---|
| Nick | NA | 4 | 1 | | 2 | | 7 |
| TeamWin | 1 | NA | 1 | 3 | 2 | | 7 |
| TheFlood | 2 | 1 | NA | 1 | 2 | | 6 |
| R&B | 2 | 2 | 2 | NA | 4 | 1 | 11 |
| BsFs | 2 | 3 | | | NA | | 5 |
| MAS | 3 | | 1 | 2 | 2 | NA | 8 |
| Total | 10 | 10 | 5 | 6 | 12 | 1 | 44 |
Table 3: The number of bids proposed and received for each team
As can be seen from the above, Rocky & Bullwinkle (R&B) made the most bids (11), and Mongii Agent Systems (MAS) made the second most (8). Boole’s Fools (BsFs) made the fewest bids, with only 5. This shows that Rocky & Bullwinkle was the most aggressive team in trying to secure subtasks to accomplish their tasks, and Boole’s Fools was the least aggressive team.
Further, from the above table, Boole’s Fools received the most bids (12), while Nick and Team Win tied for the second most (10 each). Mongii Agent Systems received the fewest bids, with 1. Indeed, Mongii Agent Systems made only one announcement for the entire game day. This shows that Boole’s Fools, Nick, and Team Win were three teams motivated to sell their subtasks and, in a way, motivated to be “cooperative”. Mongii Agent Systems, however, was a team that was not interested in accomplishing their tasks, and instead relied on selling their own subtasks to gain utility. This strategy assumed that the would-be utility gained from accomplishing tasks was smaller than the utility gained from selling subtasks, which was not so in this system.
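The "bids made" and "bids received" figures are the row and column sums of the bid matrix in Table 3. A minimal Python sketch of that bookkeeping follows. It assumes blank cells in the table mean zero bids, and it stores the NA diagonal as 0.

```python
teams = ["Nick", "TeamWin", "TheFlood", "R&B", "BsFs", "MAS"]

# bids[i][j] = number of bids from teams[i] to teams[j] (Table 3).
# Assumption: blank cells are zero; the NA diagonal is stored as 0.
bids = [
    [0, 4, 1, 0, 2, 0],  # Nick
    [1, 0, 1, 3, 2, 0],  # TeamWin
    [2, 1, 0, 1, 2, 0],  # TheFlood
    [2, 2, 2, 0, 4, 1],  # R&B
    [2, 3, 0, 0, 0, 0],  # BsFs
    [3, 0, 1, 2, 2, 0],  # MAS
]

# Row sums: bids each team made. Column sums: bids each team received.
made = {t: sum(row) for t, row in zip(teams, bids)}
received = {t: sum(row[j] for row in bids) for j, t in enumerate(teams)}

print("Most bids made:", max(made, key=made.get))
print("Fewest bids made:", min(made, key=made.get))
print("Most bids received:", max(received, key=received.get))
print("Fewest bids received:", min(received, key=received.get))
```

With the Table 3 entries, the row sums give 11 bids for R&B and 5 for BsFs, and the column sums give 12 bids received by BsFs and 1 by MAS, for 44 bids overall.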
The following table shows the discrepancies between the self-recorded activities and the activities that I counted from the paper copies. The percentage was computed by taking the sum of the absolute difference in the number of announcements and the absolute difference in the number of bids, and dividing that sum by the total self-reported number of announcements and bids.
| Teams | R1 Ann | R1 Bid | R2 Ann | R2 Bid | R3 Ann | R3 Bid | R4 Ann | R4 Bid | Total Ann | Total Bid | Hardcopies Ann | Hardcopies Bid | Diff Ann | Diff Bid | % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nick | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 4 | 4 | 7 | 0 | 3 | 37.5% |
| Win | 0 | 0 | 3 | 1 | 2 | 3 | 1 | 2 | 6 | 6 | 12 | 7 | 6 | 1 | 58.3% |
| Flood | 2 | 0 | 1 | 2 | 1 | 2 | 2 | 2 | 6 | 6 | 5 | 6 | -1 | 0 | 8.3% |
| R&B | 0 | 4 | 3 | 3 | 1 | 3 | 1 | 3 | 5 | 13 | 9 | 11 | 4 | -2 | 33.3% |
| BsFs | 1 | 1 | 1 | 2 | 1 | 1 | 2 | 1 | 5 | 5 | 5 | 5 | 0 | 0 | 0% |
| MAS | 0 | 3 | 0 | 3 | 1 | 1 | 1 | 0 | 2 | 7 | 1 | 8 | -1 | 1 | 22.2% |
| Total | 4 | 9 | 9 | 12 | 7 | 11 | 8 | 9 | 28 | 41 | 36 | 44 | 8 | 3 | 15.9% |
Table 4: The number of self-reported announcements and bids for each team, compared to the numbers obtained from reviewing the hardcopies, and the discrepancy percentage for each team.
As can be seen from the above, only one team managed to track its bids and announcements accurately: Boole’s Fools (with 0%). The second most accurate team was The Flood, with 8.3%. Team Win was a poor agent in terms of tracking: they were at least 58.3% off. Overall, the discrepancy percentage was 15.9%: the multiagent system did not do a good job of tracking its own activities.
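The discrepancy percentage can be reproduced from the Total and Hardcopies columns of Table 4. The Python sketch below applies the formula as described in the text. The function name `discrepancy_pct` is illustrative, and the reading of the Total row in the note that follows is my own inference from the numbers.

```python
# (self-reported Ann, self-reported Bid, hardcopy Ann, hardcopy Bid), from Table 4.
table4 = {
    "Nick": (4, 4, 4, 7),
    "Win": (6, 6, 12, 7),
    "Flood": (6, 6, 5, 6),
    "R&B": (5, 13, 9, 11),
    "BsFs": (5, 5, 5, 5),
    "MAS": (2, 7, 1, 8),
}

def discrepancy_pct(rep_ann, rep_bid, hard_ann, hard_bid):
    """Sum of absolute differences divided by the self-reported total, in percent."""
    diff = abs(hard_ann - rep_ann) + abs(hard_bid - rep_bid)
    return 100 * diff / (rep_ann + rep_bid)

per_team = {team: round(discrepancy_pct(*row), 1) for team, row in table4.items()}

# Applying the same formula to the column totals (28, 41, 36, 44).
column_totals = tuple(sum(col) for col in zip(*table4.values()))
overall_from_totals = round(discrepancy_pct(*column_totals), 1)

# Alternative: add up each team's absolute differences before dividing.
abs_diff_sum = sum(abs(ha - ra) + abs(hb - rb) for ra, rb, ha, hb in table4.values())
overall_from_teams = round(100 * abs_diff_sum / (column_totals[0] + column_totals[1]), 1)

print(per_team)
print(overall_from_totals, overall_from_teams)
```

The per-team values match the % column. The 15.9% in the Total row appears to come from applying the formula to the column totals, where over-counts and under-counts from different teams partly cancel. Summing each team's absolute differences instead gives 19 out of 69, or about 27.5%.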
The following table shows the number of tasks accomplished and the
utilities gained by each team for each round.
| Tasks Solved | R1 # | R1 Util | R2 # | R2 Util | R3 # | R3 Util | R4 # | R4 Util | Total # | Total Util |
|---|---|---|---|---|---|---|---|---|---|---|
| Nick | 2 | $700 | 1 | $400 | 2 | $800 | 1 | $800 | 6 | $2700 |
| Win | 3 | $900 | 0 | $0 | 3 | $1200 | 1 | $800 | 7 | $2900 |
| Flood | 2 | $700 | 2 | $700 | 3 | $1200 | 0 | $0 | 7 | $2600 |
| R&B | 3 | $900 | 2 | $600 | 2 | $800 | 1 | $800 | 8 | $3100 |
| BsFs | 2 | $700 | 3 | $900 | 3 | $1200 | 1 | $800 | 9 | $3600 |
| MAS | 2 | $700 | 0 | $0 | 3 | $1200 | 1 | $800 | 6 | $2700 |
| Total | 14 | $4600 | 8 | $2600 | 16 | $6400 | 5 | $4000 | 43 | $17600 |
Table 5: The number of tasks accomplished and the utilities gained by each team for each round
Boole’s Fools accomplished the most tasks (9), yielding the largest utility gain ($3600). Team Win, the winner of the game day, accomplished only 7 tasks, yielding $2900. I believe this was an error in the worksheet. Because this discrepancy makes the table inaccurate, I am not able to draw any conclusions from this table. In general, few tasks were solved in Round 4, which is consistent with our observations above. Also, since Rounds 2 and 4 were over-constrained, the numbers of tasks solved in these rounds were smaller than those in Rounds 1 and 3.
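The round-level comparison can be checked directly from the Total row of Table 5. The short Python sketch below groups the rounds by constraint type, using the labels given in the text (Rounds 1 and 3 under-constrained, Rounds 2 and 4 over-constrained). It inherits whatever worksheet errors the table contains, so it is only indicative.

```python
# Total row of Table 5, plus the constraint label for each round from the text.
rounds = {
    "R1": {"tasks": 14, "utility": 4600, "constraint": "under"},
    "R2": {"tasks": 8, "utility": 2600, "constraint": "over"},
    "R3": {"tasks": 16, "utility": 6400, "constraint": "under"},
    "R4": {"tasks": 5, "utility": 4000, "constraint": "over"},
}

# Tasks solved, grouped by constraint type.
tasks_by_constraint = {"under": 0, "over": 0}
for info in rounds.values():
    tasks_by_constraint[info["constraint"]] += info["tasks"]

# Average utility per solved task in each round.
avg_utility = {r: info["utility"] / info["tasks"] for r, info in rounds.items()}

print(tasks_by_constraint)
print({r: round(v, 2) for r, v in avg_utility.items()})
```

On these figures, 30 tasks were solved in the under-constrained rounds against 13 in the over-constrained rounds. The $800 average per task in Round 4 is consistent with teams solving only the $800 task.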
2. General Observations
Here are some general observations:
1. Different Strategies: Based on Table 3, we see that there were at least three general types of strategies: (a) agents that focused on selling subtasks to gain utility (Mongii Agent Systems, with 8 bids proposed and 1 bid received), (b) agents that focused on accomplishing tasks to gain utility (Boole’s Fools, with 5 bids proposed and 12 bids received), and (c) agents that balanced selling subtasks and accomplishing tasks to gain utility (Nick, Team Win, The Flood, and Rocky & Bullwinkle). Mongii Agent Systems did not perform well on this Game Day, indicating that the strategy of focusing on selling subtasks was flawed: it did not take into account the relatively significant utility gains from accomplishing tasks. Overall, the second and third strategies were more likely to lead to better performance in this environment.
2. Environmental Factors: As indicated earlier in the discussion of the Tables of Results in Section 1, there were several environmental factors involved: (1) the utility of each task, (2) the cost of each subtask, (3) the cost of announcements, bids, and infractions, and (4) the number of subtasks available in the system. Round 4 was the most stringent setup, with the highest average cost per subtask, the highest average cost per announcement/bid, and the smallest number of subtasks available in the system (tied with Round 2). Further, each team had three tasks to accomplish: one with $800 utility, and two with $200 utility each. As a result, 5 out of 6 agents went for the $800-utility task and ignored the other two tasks. The high costs and constraints discouraged the agents from attempting to accomplish the low-utility tasks.
3. Individually Rational (IR), Utility-Maximizing (UM), and Game-Playing (GP): Individual rationality says that an agent will prefer option A over option B as long as A is better than B, even if just by $1. Utility maximization says that an agent will try to maximize its utility gain as long as it sees a chance to improve its utility. Game playing says that an agent will try to prevent other agents from gaining more utility than itself, treating each agent as a player in the game. Here is a quick categorization of the six agents of this game day:
| Rank | Team | Style |
|---|---|---|
| 1 | Team Win | Utility-maximizing, risk-taking |
| 2 | Boole’s Fools | Game-playing, mixed with utility-maximizing, over-thinking |
| 3 | Rocky & Bullwinkle | Utility-maximizing, over-thinking |
| 4 | The Flood | Utility-maximizing, risk-averse |
| 5 | Mongii Agent Systems | Not utility-maximizing, not game-playing, not individually rational, extremely risk-averse |
| 6 | Nick | Not utility-maximizing, not game-playing |
All teams claimed to be utility-maximizing in their pre-game or mid-game strategies. However, some teams were so conservative that, in essence, they were neither utility-maximizing nor game-playing. The winner practiced utility-maximizing and risk-taking. Their risk-taking behavior allowed them to explore the search space for solutions quite well.
4. No general observations can be drawn from Tables 4 and 5 because I believe that the numbers provided by Team Win were not accurate.
3. Team-Specific Observations
· Nick: This team seemed to be risk-neutral, utility-maximizing, and game-playing. However, the game-playing was not as intensive as Boole’s Fools’, and the utility-maximizing was too structured and not as risk-taking as Team Win’s. Their mid-game strategies did not consider what transpired on Day 1. For Day 2, their mid-game strategy was not to bid at all, due to increased fees and the small profit made from bids on Day 1. This was the only team that made exactly one announcement and placed exactly one bid in each round. That means this team did not attempt to complete as many tasks as possible, and did not attempt to sell as many subtasks as possible. This overall strategy puzzled me: although not announcing and not bidding could save costs, it was obvious that the team would not be able to gain any additional utility without selling subtasks and without accomplishing additional tasks. This team also did not observe other agents’ activities and adapt to them. As a result, this team, in effect, was neither utility-maximizing nor game-playing.
· Team Win: Though this team won the game day, it was the worst team in terms of tracking. Their pre-game strategy was to complete as many tasks as possible, but they ended up with a balanced strategy of selling subtasks and accomplishing tasks. Their pre-game strategy also outlined a schedule for announcements that was not really practical, as it was too static and did not factor in other agents’ actions. Their bidding strategy was also not sufficiently detailed. Their mid-game strategy was more adaptive to other agents’ actions, and this turned out to be a big improvement. It seemed that this team did not plan out their strategies before the game day and relied on in-game decision making. I believe that their failure to track their own activities was partly due to having to make in-game decisions and not having enough time to track. Their mid-game strategy also saw them switching to a balanced strategy of selling subtasks and accomplishing tasks. It was not easy to determine whether this team pursued individually rational, utility-maximizing, or game-playing strategies, as the reports on the pre-game and mid-game strategies were too brief. However, they did post a lot of announcements, exploring the search space for good solutions, and that was a sign of risk-taking utility-maximizing.
· The Flood: This team switched from a pre-game focus on completing tasks to gain utility to a mid-game focus on a balanced treatment. This team tracked their activities quite well. Their strategies were a bit too conservative or risk-averse. The cautiousness seemed to hamstring their activities and prevent them from gaining utility. In their observations, this team pointed out that some teams were not individually rational, as they did not sell “subtask 3” to them. I have two comments on this. First, it is possible that those teams were game-playing or utility-maximizing. Second, it was to the team’s benefit to accomplish tasks; thus, if the team itself was individually rational, it should have been motivated to keep making announcements with increased maximum prices as long as it could receive bids that would help it accomplish a task with a gain of even just $1. However, that was not done. Thus, in a way, the team itself was not individually rational either. The main strategy of this team was risk-averse utility-maximizing.
· Rocky & Bullwinkle: This team tracked the second day’s activities quite well but did not do as well on the first day. Their pre-game and mid-game strategies were mainly utility-maximizing, trying to improve their utilities. However, their strategies did not adapt to other agents very well. Their mid-game strategies also did not take much advantage of what transpired on the first day. This team did not price their bids very well, especially on the first day. They were also concerned that, since they won Game Day 1, the other agents would try to “hurt” them to prevent them from winning Game Day 2. This might be “overthinking”, since all other agents aimed to do well first and foremost before thinking about “hurting” this team.
· Boole’s Fools: This team did an excellent job of tracking and recording their activities. Their pre-game and mid-game strategies were heavily game-playing, mixed with utility-maximizing tactics. It is possible that their game-playing strategies hindered their winning the game day, as “overthinking” might have occurred. This team also did a good job of adapting their strategies during the game and between rounds. This team had a clear set of strategies in terms of executing what they wanted to do.
· Mongii Agent Systems: This team tracked their activities quite well. However, they did not observe other agents’ activities at all, and as a result their end-of-game conclusions about other teams were not accurate. This team set out not to make any announcements on Day 1: basically, they wanted to improve their utility only through selling their subtasks. They were also too conservative at that. In their mid-game strategy, though they pointed out that they were too conservative on Day 1, they still chose to stay as conservative on Day 2 due to the perceived increased costs of announcements and bids. However, they did not take into account that the increase in costs (+$15) was actually insignificant compared to the would-be utility gains if they had accomplished additional tasks. Thus, their strategies, though utility-maximizing in principle, were not so in practice. This was the only team that posted only one announcement, and this strategy truly puzzled me, as it was so conservative that it was irrational.
4. Lessons Learned
· Strategies that do not take into account other agents’ behaviors or do not provide flexibility will not work well.
· When deciding what actions to take, considering only the costs of the actions without their rewards is unwise; such an agent loses the overall big picture of its utility and that leads to poor performance.
· As a MAS designer, if you want the agents to accomplish low-utility tasks, then care must be taken to motivate the agents to do so. Otherwise, the agents will not do them (see Round 4).
· Utility-maximizing was the better strategy, and a willingness to take risks helped a team get more out of it.
· Game-playing might be overkill, especially when other agents did not game-play or only game-played a little.
· “Overthinking” requires assumptions about other agents in the environment, assumptions that might or might not be true. Thus, overthinking might not be worthwhile.
· Being too risk-averse renders any utility-maximization or game-playing strategies ineffective or non-existent.
5. Game Days League
| Teams | Auction Day (10/29-10/31) | Contract Day (11/14-11/16) | Cooperation Day | Total |
|---|---|---|---|---|
| Team Win | 2 | 1 | | 3 |
| Rocky & Bullwinkle | 1 | 3 | | 4 |
| Boole’s Fools | 4 | 2 | | 6 |
| The Flood | 3 | 4 | | 7 |
| Mongii Agent Systems | 6 | 5 | | 11 |
| Nick | 5 | 6 | | 11 |
The top four teams are still in the running for winning the league. The best that Mongii Agent Systems and Nick could place is 2nd in the league after Cooperation Day.