码迷,mamicode.com
首页 > 其他好文 > 详细

COMP6203-2019/20 Intelligent Agents

时间:2019-10-04 19:08:57      阅读:59      评论:0      收藏:0      [点我收藏+]

标签:similar   elf   dev   with   phi   cto   document   weight   org   


COMP6203-2019/20 Intelligent Agents Coursework
Specification
Deliverable Deadline Feedback Marking Scheme Weight
Negotiation
Agent
Dec 10,
4pm
Jan 7 The score that your agent achieves in the class
tournament will determine 20% of the total module
mark. The score will be an average of the performance
based on multiple criteria and negotiation
scenarios.
Feb 4 Top scoring reports will describe in detail the challenge
that the agent faces, and the design of the
strategy implemented. They will present qualitative
and quantitative analysis of the agents performance,
and will show evidence that related literature
has been read, understood and applied.
20%
Table 1: Deliverables and Deadlines
Contents
1 Introduction 2
2 Negotiation Setup 2
3 Submitting the Agent 2
4 The Tournament 3
5 The Report 4
6 Individual Contribution and Team Coordinator 5
7 Plagiarism 5
8 Late Submissions 5
9 The International Automated Negotiating Agents Competition 7
10 A Brief Negotiation Tutorial 7
1 Introduction
This assignment is about building your own negotiation agent. Negotiation is a form of interaction in
which two (or more) parties, with conflicting interests and a desire to cooperate, try to reach a mutually
acceptable agreement. Negotiation between parties can in many ways be modeled as a game, and game
theory is useful to analyze the behavior of negotiating parties. Negotiation is, however, also different from
many board games such as chess and reversi. One of the most important differences is that negotiation
as we will study it in this practical assignment never is a zero-sum game. That is, a typical negotiation
does not have a winner who takes all and a loser who gets nothing. In order to start a negotiation, it is
only reasonable for both parties to believe that there is a win-win situation where both parties can gain
by obtaining a deal through negotiation. Another difference is that the domain of negotiation (what the
negotiation is about) may be quite different from one negotiation to the other. Finally, in negotiation there
is a lot of uncertainty: a negotiation agent typically does not know the negotiation strategy used by its

代写COMP6203课程作业、代做Intelligent Agents作业、Java语言作业代写、Java编程设计作业调试
opponent, or even know what outcome the other agent prefers. In fact, as we will see, an agent may not
even know his own preferences entirely.
For this assignment you will build your own negotiating software agent to participate in the agent
negotiation competition, which uses the GENIUS framework. The implementation is in Java and you will
work in a team of up to 4 people. In addition, you will need to submit a group report describing the agent
and performing an analysis of the agents. See Table 1 for the hand in dates.
2 Negotiation Setup
The negotiation will run using the GENIUS framework. It will be a bilateral negotiation between 2 agents,
using the Stacked Alternating Offers Protocol. Negotiations are conducted using various multi-issue
preference domains, which are described by additive weighted utility functions. In the negotiation tournament
setup that will be used for the coursework there is preference uncertainty, i.e. the agents have
uncertainty about their own preferences. Specifically, the agents will have a ranking of a limited set of outcomes,
and no access to their entire utility function. However, the agent can elicit additional information
from a virtual user at a fixed cost. Also, the agents will not have any knowledge of the preferences of its
opponent. Negotiation is time based (as opposed to round based) and each negotiation lasts 180 seconds
(3 minutes). Negotiations will have imperfect information about their own preferences, which means that
they only have a partial order of the preferences. More details are available in Section 10 below.
3 Submitting the Agent
Please follow the next instructions carefully or your agent will not work and so the group will receive
no marks for agent performance. All your code should be placed in a package called groupn, where n
is the group number that will be allocated to you.1 The main agent class (containing an implementation of
the methods chooseAction, receiveMessage, etc.) should be named MyAgent. For example, for
group 5, the resulting fully qualified name of the agent object becomes group5.MyAgent.
2
.
You then need to produce a JAR file containing all the relevant binaries (.class files) to run the agent
as well as the corresponding source code (.java files). You should not include any files or classes that
are already part of the genius framework (such as genius-xxx.jar). The name of the JAR file does
not matter as long as it has the jar extension.
Most IDEs such as Eclipse support the generation of JAR files. However, you can also do this on the
command line by using the program jar.exe. If this executable is not found, this is typically located
(on a PC) inside the Program Folder, e.g. C:\Program Files\Java\jdk1.8.0 60\bin\jar. The
exact directory will vary depending on the version of java that you have installed.
1
If you are not familiar with the concept of a java package, check out various online tutorials such as https://www.
w3schools.in/java-tutorial/packages/.
2Note that the dot here indicates that object MyAgent is in package group5, and the term ”fully qualified name” refers to the
complete path including all the packages.
2
For example, for group 1, if your source files are in a folder called src and the binaries in a folder
called bin, then you can create the jar file by the following steps:
1. Create a new group1.jar file and add the source code files by using the create (c) option as
follows:
jar cvf group1.jar -C src .
2. Next, add the binary files using the update (u) option:
jar uvf group1.jar -C bin .
3. Finally, check that all the necessary files are there:
jar tvf group1.jar
When executing these commands, the output should look something like this:
In particular, make sure that both the java and class files are included and they both within the group1
directory (if your group is 1).
Finally, the JAR file needs to be submitted through the handin system by the deadline. Only one agent
per group should be submitted. You can submit multiple times. The most recent valid submission will
be the one used in the final competition, irrespective of who uploads the agent.
Submission Check List. When submitting your agent make sure that:
• The agent object has the appropriate name.
• The agent object is contained in the appropriate package.
• The JAR file contains both the sourcecode and binaries necessary to run your agent.
• You have NOT included any jar files or classes from the GENIUS platform.
4 The Tournament
The submitted agent will participate in a tournament with other agents from the module. Given the number
of participants it may not be possible for each agent to play against all other agents. In that case, each
agent will be randomly grouped with other agents to form a tournament. Within each tournament, each
agent will play against each other on several domains in the configuration as mentioned above. Your agent
should be generic enough to play on different domain sizes and work with different degrees of preference
uncertainty. In this year’s tournament only domains with discrete issues will be used during the competition
(see the GENIUS manual for what these terms mean). Domain sizes will vary from very small (less than
10 different offers possible) to very large (up to 25,000 possible offers).
3
Each agent will participate in thousands of negotiations against agents designed by other teams in the
same cohort. The agent’s mark is proportional to its average performance. This performance is based on
two factors: (1) the individual utility achieved, and (2) the Euclidean distance of the agreement from the
Nash bargaining solution. The latter is given by:
where Ui(·) is the utility for agent i for an offer (and the reserve price in case of a disagreement), o∗ is
the agreed offer (or a disagreement), oNBS is the Nash bargaining solution (see Section 10.2), and m is the
number of agents (normally 2).
In addition, the agents will be tested on domains of different sizes. For each of the two categories and
for each domain, the mark will be scaled such that the best agent (i.e. the highest individual utility for the
first category, and the lowest distance for the second category) will receive the maximum score and the
average score across the cohort will be around 65%. The score will be scaled separately for each category
and domain. Therefore, one group might receive the highest score for the individual utility category,
whereas another group might have the highest score for getting the lowest distance. Both categories have
equal weight.
Agents may be disqualified and receive zero marks if they violate the spirit of fair play. In particular,
the following behaviors are strictly prohibited: designing an agent in such a way that it benefits
some specific other agent, starting new Threads, or hacking the API in any way.
5 The Report
Each group needs to submit a report of up to 2500 words excluding abstract, tables, figures, references
and appendix (this is an upper limit and should not be a goal) and no more than 6 double column pages in
total including references and appendices. Any report exceeding the word limit will be penalised and those
exceeding the page limit may not be marked. The report should be written in the style of a research paper
and describe at a minimum:
1. The overall design of the agent including various aspects of the strategy (e.g. the concession strategy,
how your agent deals with preference uncertainty, the acceptance strategy, etc), using pseudocode
where applicable.
2. The reasons behind the specific design of the strategy that it employed, and how it compares to and/or
varies from existing approaches. It should refer to relevant literature as appropriate.
3. An extensive evaluation of the agents performance based on your chosen metrics, which compares
your agent against some benchmarks (e.g. other agents).
4. A critical evaluation of the agent and how this can be improved.
5. A statement regarding the contribution of individual team members.
You are encouraged to evaluate different variants of the agent strategy and use this to motivate the final
design decisions, and/or how your submitted agent could be improved. You could also evaluate specific
aspects of the strategy, e.g. the utility estimation approach. It is expected that the report will contain several
graphs and/or tables showing the results of the analysis. It should also cite relevant literature especially for
motivating the design choices but also comparing and contrasting the strategy to existing literature. Note
that there is no need to describe the competition itself in detail. A brief overview may be included or if
specific details of the competition are needed to motivate the strategy.
The report should contain the author names, student numbers, group number, word count, and
be formatted as a double column academic paper. You will find a template for this on the course website.
Please see Table 2 for an indicative marking scheme.
4
6 Individual Contribution and Team Coordinator
To recognize that the contributions of each individual in the team may vary, we ask each team to specify the
individual contribution of each group member in a brief statement and to include this as part of the report.
The individual contribution can comprise up to 25% of the overall mark. The statement should contain the
relative percentage contribution for the each individual on the agent development and a separate one for
the report and evaluation. For example:
AGENT: Alice 30%, Bob 20%, Charlie 30%, Dominic 20%.
REPORT AND EVALUATION: Alice 20%, Bob 30%, Charlie 20%, Dominic 30%.
In addition, this needs to be supported by a short explanation, justifying any differences. The percentages
along with the supporting statement should be placed in a section entitled ‘Individual Contribution’
and does not count towards the word count (although the entire document should be within the page limit).
These individual contributions need to be agreed by all team members. The module leader retains the
option to deviate from the stated individual contributions (e.g. if no justification is included).
In deciding on individual contributions, note that contributions are not limited to programming or report
writing. Reading the literature, attending the labs, running experiments, managing the team and organising
meetings, also all count as contributions towards the coursework. Also, some teams may decide to develop
several versions of the agent, and so even when the code is not used in the the final version, this can still
be considered as a contribution. It is advisable that a record of effort is kept in case of disputes, e.g. in the
form of commits and meeting minutes.
Each team should appoint a team coordinator who should be the main point of contact with the lecturer
and submits the final report. Also, although the individual contributions need to be agreed by all, the team
coordinator should be responsible for writing up the individual contribution section.
7 Plagiarism
Both the agent sourcecode and the report need to be the student’s own work unless mentioned otherwise.
For the report, also tables and figures etc should be your own (unless they are from an existing paper and
source is cited appropriately). Be careful with sharing your material (such as tables and figures) with other
groups since enabling others to plagiarise is equally a breach of academic integrity. Hence, if multiple
reports are found with identical text/figures/tables/etc, ALL will be subject to plagiarism penalties and
reported to the Academic Integrity Offer.
For the agent, you can use the code from the ExampleAgent provided, but anything else needs to be
clearly acknowledged in the report and when submitting the agent. Note that you are not allowed to
directly use code of agents programmed by others, including agents who have previously participated in
the international competition. However, you are allowed to use ideas and strategies reported in academic
papers, as long as you implement these strategies yourselves and you acknowledge the papers in your
report. In case of doubt, feel free to ask! Any violations, deliberate or otherwise, will be reported to the
Academic Integrity Officer with no exception.
More information about academic integrity can be found here.
8 Late Submissions
Late submissions will be penalised according to the standard rules. If agents are submitted late, they may
not participate in the full tournament. Note that, if the submission is late due to an invalid submission, this
is still subject to late penalties. It is your responsibility that you submit on time and that the submitted
agent validates correctly.
5
Category Sample Feedback Indicative
Marks
Structure and
Writing (max.
5 points)
The report does not follow a proper layout and structure and contains
many spelling and/or grammar mistakes
0
The report conforms with the requirements, and contains few/no
spelling and grammar mistakes
5
Description
and Understanding
(max. 30
points)
The agent description is inadequate/missing 0
The agent is explained but some parts are incomplete, not sufficiently
formal (e.g. using mostly words with few equations or algorithms), and
with very little motivation
10
The agent is explained and mostly complete, containing some formal
algorithms, and motivation for some of the choices, showing a good
level of understanding
20
There is an excellent agent description which is complete and clear,
yet concisely written and using formal notation and algorithms. The
strategy is well motivated and demonstrates an excellent understanding.
30
Challenge,
sophistication,
originality
(max. 15
points)
The strategy is very basic and shows no originality 0
The strategy is largely based on a single paper with no/ few new elements
5
The strategy is based on existing literature and has been adapted to show
some innovation
10
The strategy has many novel and sophisticated elements, mixing ideas
from several papers
15
Literature
(max. 10
points)
The report contains no references to the literature 0
There are some references to the literature but it is not clear how these
references are used to inform the strategy of the agent
5
There is clear evidence that the literature was read and understood, and
used to motivate and support the development of the agent strategy
10
Evaluation
and Analysis
(max. 40
points)
There is no evaluation of the agent performance 0
There is some evaluation of the agent performance, showing tables and
graphs, but little/no analysis of the results
10
There is an adequate evaluation and some discussion of the results but
little/no critical evaluation or ways to improve the agents
20
There is a very good evaluation and the discussion shows a good understanding
of the performance of the agent, and some ways to improve
the strategy
30
The evaluation is extensive considering various metrics and benchmarks,
and there is an excellent critical discussion of the performance
of the agent, and ways to overcome the deficiencies and improving the
agent
40
Table 2: Indicative Report Marking Scheme. Indicative marks are given out of a 100 points. The report is
worth 20% of the overall mark.
6
9 The International Automated Negotiating Agents Competition
An international agent negotiation competition (ANAC) is being held on a yearly basis, and the coursework
is based on this competition. The University of Southampton is actively involved in shaping this competition.
Last year was the 10th installment, which is typically held at a high profile conference. In the last
few years, it was held at the International Joint Conference on Artificial Intelligence (IJCAI), one of the
very top AI conferences. Groups are encouraged to submit their agent to next year’s competition, which in
2020 will be held at IJCAI in Japan. Some previous competitions can be found here:
• ANAC 2019
• ANAC 2018
10 A Brief Negotiation Tutorial
This section provides a more detailed description of the negotiation process, the challenges, and points to
some of the existing literature on this topic.
10.1 Overview
A negotiation is defined by its negotiation domain, which tells you what issues are negotiable and what
value-range each issue can take. Negotiation can involve a range of issues, from quite personal ones such
as deciding on a holiday destination to business deals such as trading orange juice in international trade. For
example, the party domain, which is one of the pre-defined domains, is about organizing a party together
with some friends and negotiating what you want to spend the available money on. The domain specifies a
preference profile for each agent, that captures each agent’s individual preferences with regards to the party
domain. The result will be a formal preference function that maps each possible outcome of a negotiation
to a utility in the range of 0 to 1.
Given the domain, the next task involves thinking about a strategy used to perform the negotiation itself.
In negotiation you need at least two strategies: an offering strategy (what to bid when), and an acceptance
strategy (when to accept or reject offers, and when to stop negotating - walk away). The most important
part of this assignment concerns the negotiation phase itself, i.e. the exchange of offers between you (or
your software agent) and an opponent.
A negotiation instance is also called a negotiation session. In a session two agents negotiate with each
other to settle a conflict and negotiate a deal. Each negotiation session is limited by a fixed amount of time.
At the end of a session, a score is determined for both agents based on the utility of the deal for each agent
if there is a deal, otherwise the score equals the reservation value (which may be zero but is not always the
case). In a sense, there is no winner since each agent will obtain a score based on the outcome and its own
utility function. A failed negotiation, in the sense that no deal is reached, thus is a missed opportunity for
both parties. In the tournament that will be played, each agent will negotiate with many other agents and
the scores of each session are recorded and averaged to obtain an overall score for the agent. A ranking
will be compiled using these overall scores.
10.2 Understanding a Negotiation Outcome
A negotiation outcome can be assessed based on an agent’s individual benefit, or it can be assessed at a
joint level, e.g. to see whether a better outcome could have been achieved which would benefit both agents,
and whether the outcome is fair and how this is defined.
In terms of the benefit of the outcome to the individual, this is described by a so-called utility function.
As explained, negotiations involve multiple issues. In GENIUS and in this assignment, we assume utility
functions linearly additive, i.e. they are a weighted function of the utility for individual issues, where the
weight indicates the importance of an issue (e.g. price vs travel time when buying flight tickets). More
formally, let o be an offer, where oj
is the proposed value for issue j (e.g. the price). Moreover, let ui, j(oj)
be the utility of agent i for that value. Then the overall utility of the offer, Ui(o), is given by:
7
Offer Utility Agent 1 Utility Agent 2
o1 = h0.5,0.5i U1(o1) = 0.7 ∗ 0.5+0.3 ∗ 0.5 = 0.5 U2(o1) = 0.3 ∗ (1−0.5) +0.7 ∗ (1−0.5) = 0.5
o2 = h1,0i U1(o2) = 0.7 ∗ 1+0.3 ∗ 0 = 0.7 U2(o2) = 0.3 ∗ (1−0.7) +0.7 ∗ (1−0.3) = 0.7
o3 = h0,1i U1(o3) = 0.7 ∗ 0+0.3 ∗ 1 = 0.3 U2(o3) = 0.3 ∗ (1−0.3) +0.7 ∗ (1−0.7) = 0.3
o4 = h0,0i U1(o3) = 0.7 ∗ 0+0.3 ∗ 0 = 0 U2(o3) = 0.3 ∗ 1+0.7 ∗ 1 = 1
o5 = h1,1i U1(o3) = 0.7 ∗ 1+0.3 ∗ 1 = 1 U2(o3) = 0.3 ∗ 0+0.7 ∗ 0 = 0
Table 3: Example of additive utilities for individual agents with 2 negotiation issues.
where n is the number of issues under negotiation, and wi, j
is the weight of issue j for agent i. Crucially,
both the utility function for each issue, and the weights can be different for different agents, which enables
mutually beneficial outcomes.
For example, consider a setting with 2 agents and 2 issues, where o1,o2 ∈ 0,0.1,0.2,...,1.0 and we
have that u1, j = oj and u2, j = 1−oj for agents 1 and 2 respectively. That is, agent 1 prefers a higher value
for each issue (e.g. the agent is a seller and the value is price), and agent 2 prefers a lower value (e.g.
the agent is a buyer). Also, the agents have different weights: w1 = h0.7,0.3i and w2 = h0.3,0.7i. That
is, agent 1 finds the first issue more important and agent 2 the second issue. Consider 3 different offers:
o1 = h0.5,0.5i (both agents get something in the middle), o2 = h1,0i (agent 1 is happy about the value of
issue 1 but not of issue 2, and vice versa for agent 2) and o3 = h0,1i (the opposite to the previous). Now
consider the utility for each offer in Table 3. Note that there are many other possible offers/negotiation
outcomes (in this case, with 2 issues and 11 values per issue, there are 112 possible outcomes). Take time
to make sure you understand the table and how these values are calculated. As an exercise, try and derive
the utility for some other offers not in the table.
All of these 3 offers are ‘fair’ in the sense that, for each offer, both agents get the same utility. However,
some are clearly better than others. In particular, offer o2, where agent 1 gets the best value for his most
preferred issue, and agent 2 gets the best value for her most preferred issue, has a utility of 0.7 for both
agents, compared to the lower utility in other cases. This is called a win-win outcome, since both agents
benefit.
It is important to note that, as seen in the example, these win-win outcomes exist even when the agents
have diametrically opposing preferences for individual issues. Indeed, in such cases with diametrically
opposed preferences, it is only possible to obtain win win situations by having multiple issues. If negotiation
is only about a single issue, e.g. price, it is typically not possible to get a win-win outcome. Such
negotiations are also referred to as competitive: the only way for one agent to gain, is for the other to lose.
A multi-issue negotiation where win-win outcomes are available, as in the example, are called integrative.
Often, in practice, additional issues are introduced with the specific aim to make the negotiations less competitive
and more integrative. For example, a second-hand car dealer may offer a discount on the warranty,
and so the warranty gets introduced as an additional issue alongside the price of the car.
We now analyse the properties of an outcome more formally in terms of the joint utility. A desirable
property is for an outcome to be so-called Pareto optimal (also often referred to as Pareto efficient). An
outcome is Pareto optimal when there does not exist another outcome where both agents can do better. Said
differently, an outcome is Pareto optimal if, for any agent to be better off, the other agent has to be strictly
worse off. In the example, o1 is not Pareto optimal since there exists another offer (e.g. o2) where both
agents are better off. Here, o2 is Pareto optimal, but so are o4 and o5 (see Table 3). This is because there
is no other outcome where both agents are better off. By connecting all the Pareto optimal solutions we
obtain the so-called Pareto frontier. This is visualised in Figure 1, which shows the offers from the table
in terms of both agent’s utilities. Take time to understand these concepts before moving on.
Clearly, aiming to achieve an outcome on the Pareto frontier seems to be the sensible option (although
this in itself is already challenging since there is a lot of uncertainty, as explained further below). However,
a Pareto optimal solution is not unique since there can be many different outcomes on the frontier (as an
8
Figure 1: The outcome space and the Pareto frontier for the 2-issue negotiation domain example.
exercise, try and determine how many outcomes in the example lie on the frontier other than the ones in
the table). Some of these favour one agent over another. Also, in many of the negotiation domains there
can be dozens of issues and hundreds of offers, and the Pareto frontier can look very complex. An example
of a slightly more complex frontier is seen in Figure 2. So what agreement should we aim for? One
possibility is to aim for a ‘fair’ outcome, i.e. one that is good for both parties, since such an outcome is
likely to get accepted by both parties. There are many ways of defining this, but a well known concept
is the Nash bargaining solution or simply Nash solution (note that this notion is NOT related to the Nash
equilibrium other than the fact that they are both introduced by the late Nobel laureate John Nash). A Nash
solution is an outcome that satisfies certain bargaining axioms and may be viewed as ‘fair’ outcome which
is reasonable to accept for both parties. An outcome is said to be a Nash solution whenever it maximises
the product of the utilities minus the disagreement utility (i.e. reserve value). More formally, let Ui(d)
denote the utility of a disagreement, then the aim is to find the outcome o which maximises (for 2 agents):
(U1(o)−U1(d))·(U2(o)−U2(d))
This solution has some interesting properties, e.g. it is invariant to how the utility functions are scaled
(invariant to so-called affine transformations), it is Pareto optimal, and also it satisfied the symmetry property.
The latter means that, if the two agents are symmetric (e.g. diametrically opposed and having the
same reservation values), as in our example, then the agents should get the same utility. In the example
with 2 issues, o2 is the Nash bargaining solution. Note that this solution is typically unique.
Other types of solution concepts include: maximising social welfare, which is the sum of utilities
(instead of the product); the egalitarian solution, which maximises the utility of the agent with the lowest
utility; the Kalai-Smorodinski bargaining solution, which chooses the point on the Pareto frontier which is
closest to linear line which goes through the U
is the maximum utility agent i could
achieve individually (see e.g. [8] for more details).
10.3 Deadlines and Discount Factors
Besides considering the utility of the outcomes of a negotiation, time pressure often plays an important
role. Often, negotiations have firm deadlines. For example, if the date of the party has already been fixed,
then the agents should come to an agreement on how to spend the money in organizing the party well in
9
Figure 2: A more complex Pareto frontier example from the Laptop domain.
advance of that day. In the assignment there is a deadline for reaching an agreement. Both agents have
exactly the same deadline and this is common knowledge (both agents ‘know’ when the deadline is). If
they fail to reach an agreement each participant gets their disagreement utility.
Deadlines ensure that negotiations do not last forever, but often result in a deal being clinched at the
last minute. Hence, in addition to a deadline, negotiations can have a different type of time pressure in the
form of a discount factor. Discount factors model the fact that the desirability of the good being traded
may decline with time or other types of urgency (e.g. lost opportunities). This happens when the good is
perishable, for example fruits or other types of perishable food.
The formal modelling of discount factors in the context of automated negotiation and the GENIUS
platform is as follows. Let δ ∈ [0,1] be the discount factor of a domain, and o be a certain outcome. Let t
in [0,1] be the current normalized time, as defined by the timeline. We compute the discounted utility U
(1)
At the deadline t = 1, the original utility is multiplied by the discount factor. If δ = 1 or not specified at all,
the utility is not affected by time, and such a domain is considered to be undiscounted, while if δ is very
small there is high pressure on the parties to reach an agreement quickly. Note that in GENIUS, discount
factors are part of the preference profiles and therefore can differ between parties.
10.4 Opponent Modelling
Although achieving a Pareto efficient and fair outcome may be desirable, the challenge is that a negotiation
agent has no knowledge of the opponent’s preferences, nor does it have knowledge of the negotiation
strategy. Even the discount factor may be different and unknown. To address this, there is a wide range
of literature on opponent modelling. These papers propose algorithms which try to infer the preferences
and negotiation strategies of the opponent. In terms of additive utility preferences, this means inferring the
weights as well as inferring the utility for individual values for each issue.
There are a wide range approaches, from simple heuristic to machine learning. A simple approach is
to assume that the opponent will start with what is the best offer for them, and will slowly concede. This
will give some indication of which issues the opponent finds more important. A related approach is by
10
using frequency analysis, which has been successfully used by the Hardheaded agent [17]. The assumption
here is that values for issues who appear more frequently in the offers received by the opponent are more
likely to have a high utility for the opponent. Similarly, if certain values for certain issues persist compared
to other issues where there are more frequent changes, this might mean that the weight for that particular
issue is higher.
Examples of agents using more sophisticated approaches include IAMHaggler [23, 24, 22], which uses
Gaussian process regression technique to predict the opponent’s behavior and OMAC Agent [4, 3, 5, 2],
which models the opponent using wavelet decomposition and cubic smoothing spline. In [13], a guessing
heuristic is introduced to infer the opponent preferences.
10.5 Preference Uncertainty
In addition to having no initial knowledge of the opponent, another challenge introduced in the competition
is uncertainty about the agent’s own preferences. To understand why preference uncertainty occurs in
practice, it is important to understand where the agent preferences come from in the first place. By defi-
nition, autonomous agents act on behalf of people or organisations. Hence, the preferences need to reflect
what people and/or organisations want. The process of obtaining these preferences is called preference
elicitation and is generally costly, both in terms of time, and hiring experts to do so. Preferences can be
very complex (a good classic book that describes this nicely is by Raiffa [19]). Typically, people find it
very difficult to give exact utility values for certain outcomes, or understanding what the weights might
mean. It is typically much easier for people to compare offers in a pairwise manner and to say which of the
two they would prefer. By doing many pairwise comparisons one can get closer to understanding the true
utility function, but in complex negotiations there are simply too many pairs to compare. As a result, often
what you end up with is a partial ordering of preferences.
In the competition, preference uncertainty means that the agent only has ordinal information about
different offers, and only of a subset of the full offer space. Therefore, it may know that it prefers A over
B, but not by how much (i.e. it knows the order but not the utility) and it may not know anything about C.
There are two main ways in which to deal with such situation. One way is to derive an estimated (additive)
utility function from the information given, and then use this utility function during the negotiation. Some
built in functions are provided to help with this (see the GENIUS manual for more detail) but you are
encouraged to come up with your own approach or use an existing approaches from the literature (e.g. that
are used for opponent modelling). For example, a more sophisticated approach to derive the preferences
has been developed in [21] using linear programming. Another way is to negotiate with the incomplete set
of ordinal preferences directly, without the ‘translation’ step.
10.6 Negotiation Strategies and Techniques
As already mentioned, a negotiation strategy needs to determine what to bid (the offering strategy), and
whether to accept if an offer is received (the acceptance strategy). These are typically related. Assuming the
agent knows or has estimated its own utility, a common approach is to split the strategy into two separate
components: (1) determining the utility threshold level at which to make and accept offers, often referred
to as the concession strategy, and (2) finding the ‘best’ offer at or above the chosen utility threshold.
Regarding the concession strategy, this can be a time-based schedule or adaptive which responds to the
opponent. Examples of time-based concession strategies are Boulware (concede slow to begin with, and
then concede faster as you get closer to the deadline), Conceder (concede fast from the beginning), Linear
(concede linearly), Hardheaded (don’t concede until the very end). See e.g. [6]. A well known adaptive
strategy is tit-for-tat, which concedes a similar amount to what the opponent is perceived to concede ([7, 1]).
Many approaches use one of the time-based strategies but then change the parameters of this strategy in
response to what the apponent is doing, e.g. changing the target utility (which is the utility threshold at the
deadline). A notable ANAC winner agent is Agent K [15, 16], which calculates its target utility based on the
average and variance of previous bids and employs a sophisticated acceptance strategy. Furthermore, the
CUHK Agent [9, 10] adaptively adjusts its acceptance threshold based on domain and opponent analysis.
Another crucial factor is finding the best offer given the current utility level, i.e. deciding on a specific
value for each issue. An early approach is by Faratin et al. [7] which chooses an offer that is closest in
11
terms of similarity to the opponent’s previous offer, but at the same time is above the utility threshold of
the proposing agent. A similar approach is taken by [20], who propose the Orthogonal strategy which
minimises the Euclidean distance between the most recent offer made by the opponent, and offers at the
agent’s utility threshold level. They show that, for certain types of domains, if concession is sufficiently
slow, and both agents use this approach, the agreed offer is guaranteed to be Pareto optimal. Another
common approach is to first estimate the opponent model (as discussed before), and then choosing amongst
all the offers above the utility thresholds the one which has the highest utility for the opponent. This
maximises the chances that the offer is above the opponent acceptance threshold, and therefore will be
accepted.
There is a vast range of other negotiation agents in the literature that have been developed over the
past decades, e.g. Zeng and Sycara [25], who introduce a generic agent called Bazaar; Karp et al. [14],
who take a game-theoretic view and propose a negotiation strategy based on game trees; Jonker et al. [13],
who propose a a concession oriented strategy called ABMP; and Lin et al. [18], who propose an agent
negotiator called QOAgent; The Fawkes, which combines the best bidding, learning, and accepting strategy
components; Meta-Agent [11, 12], which, for any given negotiation domain, dynamically selects the most
successful ANAC agent to produce an offer, and many more.
In designing your own agent, you should search for and read relevant papers, including those from past
ANAC competitions. See also the Student WIKI where a list of papers is maintained.

因为专业,所以值得信赖。如有需要,请加QQ:99515681 或邮箱:99515681@qq.com

微信:codehelp

COMP6203-2019/20 Intelligent Agents

标签:similar   elf   dev   with   phi   cto   document   weight   org   

原文地址:https://www.cnblogs.com/python34/p/11622599.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!