CSCE 235
Light Bulb 3
February 10, 2003
1. Suppose you are a manager at a company whose employees have different educational backgrounds: Computer Science (CS), Computer Engineering (CoE), Mathematics (M), and Business Administration (BA). You want to form groups of employees with mixed educational backgrounds to improve interdisciplinary cooperation. How many different mixtures are possible? Viewed as a set problem, the backgrounds form a four-element set S = { CS, CoE, M, BA }. The number of members in the powerset of a set, P(S), is $2^n$, where n is the number of elements in S, so in our example P(S) has $2^4 = 16$ members. However, one of those members is the empty set! So there are only 15 possible non-empty mixtures. (If a group must combine at least two backgrounds, the four single-background subsets are excluded as well, leaving 11.)
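The count can be checked by enumerating the powerset directly. The following is a minimal sketch in Python; the helper name `powerset` and the variable names are illustrative assumptions, not part of the original example.

```python
from itertools import chain, combinations

def powerset(elements):
    """Return every subset of `elements` as a tuple, from size 0 up to size n."""
    items = list(elements)
    return list(chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    ))

backgrounds = ["CS", "CoE", "M", "BA"]
subsets = powerset(backgrounds)

non_empty = [s for s in subsets if s]          # drop the empty set
mixed = [s for s in subsets if len(s) >= 2]    # stricter reading: at least two backgrounds

print(len(subsets), len(non_empty), len(mixed))  # prints: 16 15 11
```

The last figure uses a stricter reading of "mixed" in which a group must combine at least two backgrounds; under the looser reading only the empty set is dropped.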
2. Suppose that there are several pieces of information: A = “It is raining outside.” B = “It is not raining outside.” C = “It is cloudy outside.” D = “It is not cloudy outside.” Is it possible to believe in A and also in B? Is it possible to believe in C and also in D? Is it possible to believe in A or B? Is it possible to believe in A and C? To answer such questions, there is a well-known theory called the Dempster-Shafer theory that allows belief values to be assigned to the members of the powerset of the set of all pieces of information or facts. In this case, for example, consider the set S = {A, B, C, D}. The powerset is then { {}, {A}, {B}, {C}, {D}, {A, B}, {A, C}, {A, D}, {B, C}, {B, D}, {C, D}, {A, B, C}, {A, B, D}, {A, C, D}, {B, C, D}, {A, B, C, D} }. In the Dempster-Shafer belief theory, it is possible to hold a belief in {A, B} even though A and B conflict, because it is acceptable to believe to a certain degree that it is raining outside while also believing to a certain degree that it is not. The powerset lets us enumerate all possible combinations, which makes it a very useful mechanism.
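To make the idea of attaching belief values to members of the powerset concrete, here is a small sketch. It assumes the usual definition of belief as the total mass assigned to subsets of a hypothesis. The mass values are invented for illustration, and the names `mass` and `belief` are assumptions rather than anything from the example above.

```python
from fractions import Fraction

# Hypothetical mass assignment over subsets of S = {A, B, C, D}.
# The numbers are made up for illustration; they must sum to 1.
mass = {
    frozenset({"A"}): Fraction(4, 10),
    frozenset({"A", "B"}): Fraction(3, 10),
    frozenset({"C"}): Fraction(1, 10),
    frozenset({"A", "B", "C", "D"}): Fraction(2, 10),
}

def belief(hypothesis, mass):
    """Sum the mass of every focal set that is contained in `hypothesis`."""
    target = frozenset(hypothesis)
    return sum((m for focal, m in mass.items() if focal <= target), Fraction(0))

assert sum(mass.values()) == 1

print(belief({"A"}, mass))        # counts only the mass on {A}
print(belief({"A", "B"}, mass))   # counts the mass on {A} and on {A, B}
```

Note that mass can sit on {A, B} itself without being split between A and B, which is how the sketch represents partial belief in two conflicting statements at once.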
3. Suppose you are a statistician working for ESPN and your tasks include finding interesting statistics to report every night on SportsCenter. Some of the interesting statistics that you can compute are set-related. For example, list all the NBA players who have more than 5,000 assists, 10,000 rebounds, and 20,000 points. List all the coaches who have more than 1,000 victories and more than 500 losses. List all the L.A. Lakers players who are currently averaging more than 25.0 points per game. List all the NBA players who are currently averaging more than 25.0 points per game but are playing for an Eastern Conference team that currently has a losing record.
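Each of these queries is a set defined by a condition, and combined conditions correspond to set intersections. The sketch below uses made-up records (the `Player` class, the player names, the team names, and every number are hypothetical placeholders, not real statistics) to show the pattern.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Player:
    name: str
    team: str
    assists: int
    rebounds: int
    points: int
    ppg: float   # current points per game

# Made-up records for illustration only; these are not real statistics.
players = {
    Player("Player A", "Team X", 6000, 11000, 25000, 26.1),
    Player("Player B", "Team X", 7000, 4000, 21000, 18.4),
    Player("Player C", "Team Y", 5200, 10500, 19000, 27.3),
}

# Set-builder style: { p | p satisfies all three career thresholds }
all_around = {p.name for p in players
              if p.assists > 5000 and p.rebounds > 10000 and p.points > 20000}

# Two simpler sets whose intersection answers a combined question.
scorers = {p.name for p in players if p.ppg > 25.0}
team_x = {p.name for p in players if p.team == "Team X"}

print(all_around)          # players meeting all three thresholds
print(scorers & team_x)    # high scorers who play for Team X
```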
4. Suppose you are a data analyst for a website that sells books. One of the website's strategies is to automatically recommend books to its users based on the books that they buy. Your responsibilities include finding out what percentage of the people who buy suspense novels also buy history books. If the percentage is high, you tell the company to put an automated recommendation in place; otherwise you do not. How do you do it? You may get the list of all buyers of suspense novels and the list of all buyers of history books through the web, and denote them as SU and HI, respectively. The set of buyers of both types of books is simply $SU \cap HI$. The set of all buyers is $SU \cup HI$. So the percentage is $|SU \cap HI| / |SU \cup HI|$. (To measure the share among suspense buyers only, divide by $|SU|$ instead.) Some of these set-related analyses are used in the areas of data mining and knowledge discovery, such as association rule mining of patterns with high support and/or confidence.
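The computation can be sketched with Python's built-in set operations. The buyer IDs below are hypothetical, and two ratios are shown because the denominator depends on how the question is read: the share of all buyers who bought both kinds of book, or the share of suspense buyers who also bought history (the latter corresponds to the confidence of a rule "suspense implies history" in association rule mining).

```python
# Hypothetical buyer IDs; in practice these would come from order records.
SU = {"u1", "u2", "u3", "u4"}   # bought a suspense novel
HI = {"u3", "u4", "u5"}         # bought a history book

both = SU & HI        # SU intersect HI: bought both kinds of book
everyone = SU | HI    # SU union HI: bought at least one kind

share_of_all = len(both) / len(everyone)    # |SU ∩ HI| / |SU ∪ HI|
share_of_suspense = len(both) / len(SU)     # |SU ∩ HI| / |SU|

print(f"{share_of_all:.0%} of all buyers bought both")
print(f"{share_of_suspense:.0%} of suspense buyers also bought history")
```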
5. Suppose D = { The dates on which more than 25% of students do not show up in class } and F = { The dates that are Fridays }. What can you say about the intersection of the two sets, $D \cap F$? The intersection is then the set of all Fridays on which more than 25% of students do not show up in class.
Suppose $|D \cap F| / |D|$ is very high (greater than 0.5). What does it mean? Well, we could say that students prefer not to show up on Fridays more often than on the other days.
Suppose we also have AF = { The Fridays before an away football game on Saturday } and ABF = { The Fridays before an away basketball game on Saturday }. It would be interesting to see whether $|D \cap AF| / |AF|$ is high; and it would be interesting to see whether $|D \cap ABF| / |ABF|$ is low for this university’s students.
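The attendance example can be sketched the same way, with F built from a calendar by a weekday test. The dates in D are invented, and the ratio computed (the fraction of high-absence dates that fall on a Friday) is one plausible reading of the measure discussed above, assumed here for illustration.

```python
from datetime import date, timedelta

# Two hypothetical weeks of class days, Monday 2003-02-03 through Friday 2003-02-14.
start = date(2003, 2, 3)
class_days = [start + timedelta(days=i) for i in range(14)
              if (start + timedelta(days=i)).weekday() < 5]   # Monday to Friday only

F = {d for d in class_days if d.weekday() == 4}   # weekday() == 4 means Friday

# Made-up dates on which more than 25% of students were absent.
D = {date(2003, 2, 7), date(2003, 2, 12), date(2003, 2, 14)}

friday_absences = D & F                  # D intersect F
ratio = len(friday_absences) / len(D)    # assumed measure: |D ∩ F| / |D|

print(sorted(friday_absences))
print(ratio > 0.5)
```

The same pattern applies to AF and ABF: intersect each with D and divide by the size of whichever set serves as the baseline.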