Mathematics 505D
Data Analysis and Probability
Summer 2011
Projects
Coin Tossing Project I
The tradition of tossing a coin to make decisions and resolve disagreements goes
back to the ancient Romans, who believed that a chance occurrence such as the
outcome of a coin toss was an expression of divine will.
It is generally believed that the process of tossing a coin is completely fair and
completely random. But what do we really mean when we say that something is
"fair" and "random"? If we toss a coin 100 times and get 58 heads and 42 tails,
does that mean that the coin is not fair? If we toss a coin 100 times and get
eight of the same result in a row, does that mean that the coin tosses are not
random? In this project, we'll use virtual coin tosses to explore some of the
interesting things that can occur if we toss a coin many times.
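To put a number on the 58-heads example above, the following is a minimal Python sketch that computes the exact chance of seeing at least a given number of heads, assuming independent tosses of a coin with a fixed heads probability. The function name prob_at_least is illustrative only and is not part of the project materials.

```python
from math import comb


def prob_at_least(k, n=100, p=0.5):
    """Exact probability of k or more heads in n independent tosses,
    where each toss comes up heads with probability p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    # The example from the text: 58 heads in 100 tosses of a fair coin.
    print(f"P(at least 58 heads in 100 fair tosses) = {prob_at_least(58):.4f}")
```

Whether the resulting probability counts as "surprising" is a judgment call, which is part of what the questions below explore.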
Investigating the Results of Coin Tosses
1. Suppose you and your friend have dinner, and you decide to toss a coin in
order to decide who is going to pay. In order to make things as fair as
possible, you mutually decide to toss the coin five times. If you get more
heads than tails, you have to pay for dinner; if you get more tails than heads,
your friend will pay for dinner. The coin comes up heads four times and tails
once. Does this mean that the coin is unfair? Explain. What if the coin came up
heads on all five tosses?
2. Suppose you toss a coin 100 times. It is actually not very likely that you
will get exactly 50 heads and exactly 50 tails (though it is more likely than
any other single outcome). Let h be the number of heads that come up. What is
the largest value of h that you would not find surprising? What is the smallest
value that you would not find surprising? Explain.
3. You have been given a spreadsheet that contains 500 rows of 100 coin tosses
each. A "1" represents "Heads," and a "0" represents "Tails." (Each coin toss
has been generated by selecting a random real number between 0 and 1, and
returning 1 if the random number is greater than or equal to 1/2 and 0 if the
number is less than 1/2. You can generate a new set of 50,000 coin tosses by
pressing CTRL-SHIFT-F9.) Think of each row of the spreadsheet as a "sequence"
of 100 coin tosses. Modify the spreadsheet so that it shows the number of heads
and tails obtained in each sequence, the number of times the result of the toss
"switches" from heads to tails or vice versa in each sequence, and the length
of the longest run of consecutive heads and the length of the longest run of
consecutive tails in each sequence.[1] Also include the average number of
heads, tails, and switches per sequence over all 500 sequences, as well as the
average length of the longest run of heads and the average length of the
longest run of tails.
4. In 500 sequences of 100 tosses each, what is the greatest number of heads
obtained in one sequence? What is the greatest number of tails obtained in one
sequence?
5. Let h be the number of heads obtained in a sequence of 100 tosses. Look at
500 such sequences, and use Excel to make a graph showing the distribution of
the values of h. Re-generate the tosses several times so that you can see
several different distributions. What do you notice about the shapes of the
distributions? About what percentage of the time is the number of heads between
45 and 55? Between 40 and 60?
6. What is the average number of switches per sequence over all 500 sequences?
If you were to run 5000 more sequences, what do you think the average number of
switches per sequence would be? Why does this make sense?
7. What is the longest run of heads obtained over 500 sequences of 100 tosses
each? (Assume that when one sequence ends, any run that is in progress also
ends; a run of heads doesn't carry over into the next sequence.) Explain why
this longest run is virtually certain to be at least 10 heads long.
8. In the sheet "Coin2," 500 sequences of 100 tosses each are given, but this
time, the "coin" is biased so that heads comes up 55% of the time. Suppose you
hadn't been given information about the coin used to generate these tosses.
Would you have been able to tell from looking at one sequence of 100 tosses
that the coin was unfair? Would you have been able to tell from three
sequences? How about ten sequences? With the biased coin, what seems to be the
average number of switches per sequence? How does this compare to the average
number of switches per sequence for the fair coin? Why does this make sense
probabilistically?
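For readers who want to cross-check their spreadsheet outside Excel, the quantities requested in item 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the fair-coin rule copies the description in item 3 (return 1 when a random real is at least 1/2), and the threshold used for the biased coin is an assumption, since the handout does not say how the "Coin2" sheet was generated.

```python
import random


def toss_sequence(n=100, p_heads=0.5, rng=random):
    """Return n tosses as 1 (heads) or 0 (tails).

    For p_heads = 0.5 this matches the rule in item 3: draw a random real
    in [0, 1) and return 1 when it is >= 1/2. The threshold 1 - p_heads for
    a biased coin is an assumption, not the spreadsheet's actual formula.
    """
    return [1 if rng.random() >= 1 - p_heads else 0 for _ in range(n)]


def count_switches(seq):
    """Number of adjacent pairs whose results differ."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def longest_run(seq, value):
    """Length of the longest run of consecutive tosses equal to value."""
    best = current = 0
    for x in seq:
        current = current + 1 if x == value else 0
        best = max(best, current)
    return best


def summarize(sequences):
    """Average the per-sequence statistics over all sequences."""
    count = len(sequences)
    heads = [sum(s) for s in sequences]
    return {
        "avg_heads": sum(heads) / count,
        "avg_tails": sum(len(s) - h for s, h in zip(sequences, heads)) / count,
        "avg_switches": sum(count_switches(s) for s in sequences) / count,
        "avg_longest_heads": sum(longest_run(s, 1) for s in sequences) / count,
        "avg_longest_tails": sum(longest_run(s, 0) for s in sequences) / count,
    }


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed so the output is reproducible
    fair = [toss_sequence(rng=rng) for _ in range(500)]
    biased = [toss_sequence(p_heads=0.55, rng=rng) for _ in range(500)]
    for name, data in (("fair", fair), ("biased", biased)):
        print(name, summarize(data))
```

The seeded random.Random generator keeps a run reproducible; changing the seed plays the same role as pressing CTRL-SHIFT-F9 in the spreadsheet.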
Detecting Fake Coin Tosses
You have been given a second spreadsheet file, titled "Head Count". This
spreadsheet contains a procedure that generates the number of heads that come
up in a sequence of 100 tosses of a coin (but does not show the results of the
tosses themselves). The spreadsheet does this for four different "coins," some
of which are unfair or possibly even fake. You can adjust the number of
sequences for each coin by copying the contents of the cells and pasting them
to additional cells in each column. Decide which "coin(s)" in this spreadsheet
are real, fair coins. For each "coin" that does not appear to qualify, decide
whether the numbers of heads seem to come from a biased coin, or not from a
coin at all.
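One possible starting point, offered as a sketch and not as the intended solution, is to compare the spread of the head counts with what a real coin would produce. For n independent tosses with heads probability p, the number of heads has mean np and variance np(1 - p). The function name compare_to_binomial and the sample values below are hypothetical and do not come from the "Head Count" spreadsheet.

```python
from statistics import mean, variance


def compare_to_binomial(head_counts, n=100):
    """Compare observed head counts with a binomial model.

    head_counts: heads per sequence of n tosses (at least two values).
    Returns the sample mean, the implied heads probability, the sample
    variance, and the variance n*p*(1-p) a real coin with that
    probability would be expected to show.
    """
    m = mean(head_counts)
    p_hat = m / n
    return {
        "mean": m,
        "p_hat": p_hat,
        "observed_variance": variance(head_counts),
        "binomial_variance": n * p_hat * (1 - p_hat),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    counts = [52, 47, 50, 55, 49, 44, 51, 53]
    print(compare_to_binomial(counts))
```

A mean far from 50 points toward a biased coin, while an observed variance far from the binomial value suggests the counts may not come from coin tosses at all. How far is "far" depends on how many sequences are used, so it helps to generate more sequences before deciding.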
[1] It will take an impossibly long time to count these manually, and you'll
have to start over if you generate new tosses! So you should try to find a
formula that can calculate each of these.