CSCE 235

Light Bulb 3

February 10, 2003 

 

1.      Suppose you are a manager for a company.  The company has employees with different educational backgrounds: Computer Science (CS), Computer Engineering (CoE), Mathematics (M), and Business Administration (BA).  You want to form groups of employees of mixed educational backgrounds to improve interdisciplinary cooperation.  How many possible groups of different mixtures are there?   If we view this as a set problem, the set S = { CS, CoE, M, BA } has four elements.  The number of members in the powerset of S, P(S), is 2^n, where n is the number of elements in S, so in our example P(S) has 2^4 = 16 members.  However, one of those members is the empty set!  So, there are only 15 possible groups of different mixtures.
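The count in item 1 can be checked by enumerating the powerset directly.  The following is a minimal Python sketch; the helper name powerset is an illustrative choice, not something defined in this handout.

```python
from itertools import chain, combinations

def powerset(elements):
    """Return every subset of `elements` as a tuple, from size 0 up to size n."""
    items = list(elements)
    return list(
        chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    )

S = ["CS", "CoE", "M", "BA"]
subsets = powerset(S)                 # all 2^4 subsets, including the empty one
groups = [s for s in subsets if s]    # drop the empty set

print(len(subsets))  # 16
print(len(groups))   # 15
```

Each tuple in groups is one possible mixture of backgrounds, such as ("CS", "M").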

 

2.   Suppose that there are several pieces of information:  A = “It is raining outside.”  B = “It is not raining outside.”  C = “It is cloudy outside.”  D = “It is not cloudy outside.”  Is it possible to believe in A and also in B?  Is it possible to believe in C and also in D?  Is it possible to believe in A or B?  Is it possible to believe in A and C?  A well-known theory, the Dempster-Shafer theory, addresses such questions by allowing belief values to be assigned to the members of the powerset of the set of all pieces of information or facts.  So, in this case, consider the set S = {A, B, C, D}.  The powerset is then { {}, {A}, {B}, {C}, {D}, {A, B}, {A, C}, {A, D}, {B, C}, {B, D}, {C, D}, {A, B, C}, {A, B, D}, {A, C, D}, {B, C, D}, {A, B, C, D} }.  In the Dempster-Shafer belief theory, it is possible to hold a belief in {A, B} even though A and B conflict, because it is acceptable to believe to a certain degree that it is raining outside while also believing to a certain degree that it is not raining outside.  The powerset allows us to enumerate all possible combinations, which is a very useful mechanism.
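As a rough illustration of assigning belief values to members of the powerset, the Python sketch below attaches made-up weights (often called masses) to a few subsets of S and sums them.  The numbers and the helper name belief are assumptions for illustration only; the sketch uses the usual definition that the belief in a set X is the total mass of the subsets contained in X.

```python
# Hypothetical masses on a few members of the powerset of S = {A, B, C, D}.
# The values are invented for illustration; they only need to sum to 1.
mass = {
    frozenset({"A"}): 0.4,
    frozenset({"A", "B"}): 0.3,
    frozenset({"C"}): 0.2,
    frozenset({"A", "B", "C", "D"}): 0.1,
}

def belief(subset, mass):
    """Total mass of every listed set that is contained in `subset`."""
    target = frozenset(subset)
    return sum(m for focal, m in mass.items() if focal <= target)

bel_ab = belief({"A", "B"}, mass)              # mass of {A} plus mass of {A, B}
bel_all = belief({"A", "B", "C", "D"}, mass)   # every listed set is contained in S
print(round(bel_ab, 2), round(bel_all, 2))
```

Here the belief in {A, B} collects the weight given to {A} and to {A, B}, so a nonzero belief in the conflicting pair is perfectly representable.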

 

3.   Suppose you are a statistician working for ESPN and your tasks include finding interesting statistics to report every night on SportsCenter.  Some of the interesting statistics that you can compute are set-related.  For example, list all the NBA players that have more than 5,000 assists, 10,000 rebounds, and 20,000 points.  List all the coaches that have more than 1,000 victories and more than 500 losses.  List all the L.A. Lakers players that are currently averaging more than 25.0 points per game.  List all the NBA players that are currently averaging more than 25.0 points per game but are playing for an Eastern Conference team that currently has a losing record.
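Queries like the first one in item 3 amount to building one set per condition and intersecting them.  The Python sketch below uses placeholder players and invented career totals, not real NBA statistics.

```python
# Hypothetical career totals: name -> (assists, rebounds, points).
career = {
    "Player A": (6200, 11000, 24000),
    "Player B": (5100, 4000, 21000),
    "Player C": (3000, 12500, 26000),
}

# One set per condition.
many_assists = {p for p, (a, r, pts) in career.items() if a > 5000}
many_rebounds = {p for p, (a, r, pts) in career.items() if r > 10000}
many_points = {p for p, (a, r, pts) in career.items() if pts > 20000}

# Players meeting all three conditions are in the intersection.
all_three = many_assists & many_rebounds & many_points
print(all_three)  # {'Player A'}
```

The last query in item 3 has the same shape: intersect the set of high scorers with the set of players on Eastern Conference teams that have losing records.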

 

4.      Suppose you are a data analyst for a website that sells books.  One of the strategies that the website uses is to automatically recommend books to its users based on the books that they buy.  Your responsibilities include finding out what percentage of people who buy suspense novels also buy history books.  If the percentage is high, then you tell the company to set up an automated recommendation; otherwise you do not.  How do you do it?  You may get the list of all buyers of suspense novels and the list of all buyers of history books from the website, and denote them as SU and HI, respectively.  The set of buyers of both types of books is simply SU ∩ HI.  The set of all buyers is SU ∪ HI.  So the percentage is |SU ∩ HI| / |SU ∪ HI|.  Some of these set-related analyses are used in the areas of data mining and knowledge discovery, such as association rule mining of patterns with high support and/or confidence.
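The calculation in item 4 can be sketched in Python with two small sets of buyers.  The buyer names are invented.  The sketch computes two ratios: the share of buyers of both types among all buyers, and, as an alternative reading of the question, the share among suspense buyers only, which is closer to what association rule mining calls confidence.

```python
# Hypothetical buyer lists.
SU = {"ann", "bob", "cat", "dan"}   # bought suspense novels
HI = {"cat", "dan", "eve"}          # bought history books

both = SU & HI      # buyers of both types
either = SU | HI    # all buyers

share_of_all = len(both) / len(either)     # relative to all buyers
share_of_suspense = len(both) / len(SU)    # relative to suspense buyers only

print(f"{share_of_all:.0%} of all buyers bought both")             # 40%
print(f"{share_of_suspense:.0%} of suspense buyers bought history")  # 50%
```

Which denominator to use depends on the question being asked; the two ratios can differ considerably when one list is much longer than the other.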

 

5.   Suppose D = { The dates on which more than 25% of students do not show up in class } and  F = { The dates that are Fridays }.  What can you say about the intersection of the two sets, D ∩ F?  The intersection is the set of all Fridays on which more than 25% of students do not show up in class.

 

      Suppose |D ∩ F| / |D| is very high (greater than 0.5).  What does it mean?  Well, we could say that students prefer not to show up on Fridays more often than on the other days.

 

      Suppose we also have AF = { The Fridays before an away football game on Saturday } and ABF = { The Fridays before an away basketball game on Saturday }.  It would be interesting to see whether  is high; and it would be interesting to see whether  is low for this university’s students.
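The intersection of D and F in item 5 can be sketched in Python with the standard datetime module.  The five-week window and the low-attendance dates below are invented for illustration.

```python
from datetime import date, timedelta

# Hypothetical five-week window of calendar days.
start = date(2003, 1, 13)
days = [start + timedelta(days=i) for i in range(35)]

# F: the Fridays in the window (weekday() counts Monday as 0, so Friday is 4).
F = {d for d in days if d.weekday() == 4}

# D: hypothetical dates on which more than 25% of students were absent.
D = {date(2003, 1, 17), date(2003, 1, 22), date(2003, 2, 7), date(2003, 2, 14)}

# Fridays with low attendance.
skipped_fridays = D & F
for d in sorted(skipped_fridays):
    print(d.isoformat())
```

Sets such as AF and ABF could be built the same way from a game schedule and intersected with D.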