Mathematics 363

Mathematics 505D

Data Analysis and Probability

Summer 2011

Projects

Coin Tossing Project I

The tradition of tossing a coin to make decisions and resolve disagreements goes back to the ancient Romans, who believed that a chance occurrence such as the outcome of a coin toss was an expression of divine will. It is generally believed that the process of tossing a coin is completely fair and completely random. But what do we really mean when we say that something is “fair” and “random”? If we toss a coin 100 times and get 58 heads and 42 tails, does that mean that the coin is not fair? If we toss a coin 100 times and get eight of the same result in a row, does that mean that the coin tosses are not random? In this project, we’ll use virtual coin tosses to explore some of the interesting things that can occur if we toss a coin many times.
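The two questions above can be given a numerical baseline. What follows is a minimal Python sketch, assuming a fair coin and independent tosses; the function names are illustrative and are not part of the project spreadsheets.

```python
from math import comb


def prob_at_least(h, n=100, p=0.5):
    """P(at least h heads in n independent tosses with P(heads) = p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(h, n + 1))


def prob_run_of_same(n, k):
    """P(some run of k or more identical results in n fair tosses), for k >= 2."""
    # state[j] = P(no run of length k so far, and the current run has length j + 1)
    state = [1.0] + [0.0] * (k - 2)  # after the first toss the run has length 1
    for _ in range(n - 1):
        new = [0.0] * (k - 1)
        new[0] = 0.5 * sum(state)        # next toss differs: run restarts at length 1
        for j in range(1, k - 1):
            new[j] = 0.5 * state[j - 1]  # next toss matches: run grows by one
        state = new
    return 1.0 - sum(state)


print(f"P(58 or more heads in 100 tosses)       = {prob_at_least(58):.4f}")
print(f"P(a run of 8 or more alike, 100 tosses) = {prob_run_of_same(100, 8):.4f}")
```

Neither number settles on its own whether a coin is fair or a sequence is random; each only indicates how often such an outcome arises by chance under the stated assumptions.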

Investigating the Results of Coin Tosses

1. Suppose you and your friend have dinner, and you decide to toss a coin in order to decide who is going to pay. In order to make things as fair as possible, you mutually decide to toss the coin five times. If you get more heads than tails, you have to pay for dinner; if you get more tails than heads, your friend will pay for dinner. The coin comes up heads four times and tails once. Does this mean that the coin is unfair? Explain. What if the coin came up heads on all five tosses?

2. Suppose you toss a coin 100 times. It is actually not very likely that you will get exactly 50 heads and exactly 50 tails (though it is more likely than any other single outcome). Let h be the number of heads that come up. What is the largest value of h that you would not find surprising? What is the smallest value that you would not find surprising? Explain.

3. You have been given a spreadsheet that contains 500 rows of 100 coin tosses each. A “1” represents “Heads,” and a “0” represents “Tails.” (Each coin toss has been generated by selecting a random real number between 0 and 1, and returning 1 if the random number is greater than or equal to 1/2 and 0 if the number is less than 1/2. You can generate a new set of 50,000 coin tosses by pressing CTRL-SHIFT-F9.) Think of each row of the spreadsheet as a “sequence” of 100 coin tosses. Modify the spreadsheet so that it shows the number of heads and tails obtained in each sequence, the number of times the result of the toss “switches” from heads to tails or vice versa in each sequence, and the length of the longest run of consecutive heads and the length of the longest run of consecutive tails in each sequence.[1] Also include the average number of heads, tails, and switches per sequence over all 500 sequences, as well as the average length of the longest run of heads and the average length of the longest run of tails.

4. In 500 sequences of 100 tosses each, what is the greatest number of heads obtained in one sequence? What is the greatest number of tails obtained in one sequence?

5. Let h be the number of heads obtained in a sequence of 100 tosses. Look at 500 such sequences, and use Excel to make a graph showing the distribution of the values of h. Re-generate the tosses several times so that you can see several different distributions. What do you notice about the shapes of the distributions? About what percentage of the time is the number of heads between 45 and 55? Between 40 and 60?

6. What is the average number of switches per sequence over all 500 sequences? If you were to run 5000 more sequences, what do you think the average number of switches per sequence would be? Why does this make sense?

7. What is the longest run of heads obtained over 500 sequences of 100 tosses each? (Assume that when one sequence ends, any run that is in progress also ends; a run of heads doesn’t carry over into the next sequence.) Explain why this longest run is virtually certain to be at least 10 heads long.

8. In the sheet “Coin2,” 500 sequences of 100 tosses each are given, but this time, the “coin” is biased so that heads comes up 55% of the time. Suppose you hadn’t been given information about the coin used to generate these tosses. Would you have been able to tell from looking at one sequence of 100 tosses that the coin was unfair? Would you have been able to tell from three sequences? How about ten sequences? With the biased coin, what seems to be the average number of switches per sequence? How does this compare to the average number of switches per sequence for the fair coin? Why does this make sense probabilistically?
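The spreadsheet averages in items 6 through 8 can be compared against values computed directly from probability. The Python sketch below assumes independent tosses with a fixed probability of heads; the function names are hypothetical, and the code is meant as a cross-check on the spreadsheet rather than a replacement for the explanations the items ask for.

```python
from math import sqrt


def expected_switches(n=100, p=0.5):
    """Expected number of switches in n tosses with P(heads) = p."""
    # Each of the n - 1 adjacent pairs differs with probability 2 * p * (1 - p).
    return (n - 1) * 2 * p * (1 - p)


def prob_head_run(n, k, p=0.5):
    """P(at least one run of k or more heads in n tosses)."""
    # state[j] = P(no run of k heads so far, and the tosses end in exactly j heads)
    state = [1.0] + [0.0] * (k - 1)
    for _ in range(n):
        new = [0.0] * k
        new[0] = (1 - p) * sum(state)  # a tail resets the run
        for j in range(1, k):
            new[j] = p * state[j - 1]  # a head extends the run
        state = new
    return 1.0 - sum(state)


def pooled_z(num_sequences, n=100, p_true=0.55):
    """Expected excess of heads over a fair coin, in fair-coin standard deviations."""
    tosses = num_sequences * n
    return (p_true - 0.5) * tosses / sqrt(tosses * 0.25)


for p in (0.5, 0.55):
    print(f"p = {p}: expected switches per sequence = {expected_switches(100, p):.3f}")

p10 = prob_head_run(100, 10)
print(f"P(run of 10+ heads in one sequence)       = {p10:.4f}")
print(f"P(no such run in 500 independent sequences) = {(1 - p10) ** 500:.2e}")

for k in (1, 3, 10):
    print(f"{k} sequence(s): expected excess = {pooled_z(k):.2f} standard deviations")
```

The last loop suggests why a 55% coin is hard to spot in a single sequence: under these assumptions the expected excess of heads grows only like the square root of the number of sequences pooled.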

Detecting Fake Coin Tosses

You have been given a second spreadsheet file, titled “Head Count.” This spreadsheet contains a procedure that generates the number of heads that come up in a sequence of 100 tosses of a coin (but does not show the results of the tosses themselves). The spreadsheet does this for four different “coins,” some of which are unfair or possibly even fake. You can adjust the number of sequences for each coin by copying the contents of the cells and pasting them to additional cells in each column. Decide which “coin(s)” in this spreadsheet are real, fair coins. For each “coin” that does not appear to qualify, decide whether the numbers of heads seem to come from a biased coin, or not from a coin at all.
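One possible diagnostic, offered here only as a sketch, is to compare how much the head counts vary with how much a real coin's counts would vary. For 100 independent tosses with heads probability p, the variance of the head count is 100p(1 - p), so a column whose average is off suggests bias, while a column whose spread is far too small or too large suggests the numbers did not come from coin tosses. The procedures used in the “Head Count” file are not known here; the function names and the made-up "too regular" column below are assumptions for illustration.

```python
import random
from statistics import mean, pvariance


def dispersion_ratio(head_counts, n=100):
    """Observed variance of the head counts divided by the binomial variance
    n * p * (1 - p), with p estimated from the data (assumes 0 < p < 1)."""
    p_hat = mean(head_counts) / n
    return pvariance(head_counts) / (n * p_hat * (1 - p_hat))


def simulate_head_counts(num_sequences, n=100, p=0.5, seed=0):
    """Head counts from simulated tosses: heads when a random number is >= 1 - p."""
    rng = random.Random(seed)
    return [sum(rng.random() >= 1 - p for _ in range(n))
            for _ in range(num_sequences)]


fair = simulate_head_counts(500)
biased = simulate_head_counts(500, p=0.55, seed=1)
too_regular = [49, 50, 51] * 100  # invented column that hugs 50 too closely

for name, counts in (("fair", fair), ("biased", biased), ("too regular", too_regular)):
    print(f"{name:12s} mean = {mean(counts):6.2f}   "
          f"dispersion ratio = {dispersion_ratio(counts):.3f}")
```

A ratio near 1 is consistent with real tosses of some coin, fair or biased; the mean then indicates which. This is a single check and does not by itself prove that a column is genuine.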

[1] It will take an impossibly long time to count these manually, and you’ll have to start over if you generate new tosses! So you should try to find a formula that can calculate each of these.
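The quantities named in this footnote can also be computed outside the spreadsheet, which may help in checking a spreadsheet formula. The following Python sketch mirrors the toss rule described in item 3 (heads when a random number between 0 and 1 is at least 1/2); the function names and the seed are arbitrary choices, not part of the project files.

```python
import random


def toss_sequence(n=100, rng=random):
    """n virtual tosses: 1 (heads) if a random number in [0, 1) is >= 1/2, else 0."""
    return [1 if rng.random() >= 0.5 else 0 for _ in range(n)]


def count_switches(seq):
    """Number of positions where a toss differs from the toss before it."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def longest_run(seq, value):
    """Length of the longest run of consecutive tosses equal to value."""
    best = current = 0
    for x in seq:
        current = current + 1 if x == value else 0
        best = max(best, current)
    return best


def summarize(seq):
    """Per-sequence counts requested in item 3."""
    heads = sum(seq)
    return {
        "heads": heads,
        "tails": len(seq) - heads,
        "switches": count_switches(seq),
        "longest_heads": longest_run(seq, 1),
        "longest_tails": longest_run(seq, 0),
    }


rng = random.Random(2011)  # arbitrary seed so the run is repeatable
rows = [summarize(toss_sequence(100, rng)) for _ in range(500)]
for key in rows[0]:
    print(f"average {key}: {sum(r[key] for r in rows) / len(rows):.2f}")
```

Changing the seed plays the same role as regenerating the tosses in the spreadsheet.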