Category Archives: Asked & Answered

Asked & Answered 9.0

[Chart: Coronavirus Chart - Buncombe County - June 2020]

We have been in isolation mode for the coronavirus for so long that maintaining distance in public has pretty much become an automated, fear-induced behavior.  This is sad.  Like everyone else, I want to get out and go to restaurants and live life normally, but the case numbers here have not been encouraging.  I have been keeping a chart (at right) which shows we are averaging 7 new positive tests a day.

Is there some “magic number” of new cases per day that would make me feel comfortable increasing my degree of exposure to the world?  There must be some number, otherwise it wouldn’t be worth my while tracking this data.  So, how do I translate the rate of positive test results in my area to my personal risk level of going out and about?

Having found no clear answer to this on the internet, I thought I would play Dr. Fauci and guesstimate it myself, given the data available to me, plus some assumptions.  Here goes.

• • •

At its most basic level, the formula may be expressed as:

chance of infection per contact = (chance the other person is infected) × (chance of transmission upon contact)

Let’s start with the chance that the contacted person is infected.  Here I define contact as passing through a person’s airspace for some length of time.  I will ignore transmission via objects for now, as I use gloves, hand-washing and object-cleaning protocols to minimize that risk factor — for me.

My chance of contacting an infected person on a given outing depends on the number of contacts I make and the percentage of infected people among them.  This is a good place to introduce my framework and assumptions, starting with the chart below:

[Chart: Populations and sub-populations with respect to Coronavirus Infection]

My first assumption is that I only come in contact with the local population (P), defined as those who live in my county.  This population consists of two groups, those infected with the virus (I) and those not infected (N).  The infected group is further divided into three sub-groups: asymptomatic (IA); symptomatic but untested (IS); and those who have tested positive (IT) and are considered contagious.  Those who test negative and those who have recovered from the virus are included in group N.

For our purposes, I assume that infected people are evenly distributed around the county and may travel anywhere within its borders.  Whether I meet an infected person depends on their propensity to travel, which may differ among the various sub-groups.  If we define the baseline (i.e., non-infected group) propensity to travel as fN = 1.0, then we can assign (by guesswork) travel factors fA , fS  and fT to the infected sub-groups, where each f-factor is between 0 and 1.  For instance, we might suppose that those who have tested positive and are still contagious would have a low propensity to travel — so fT might be 0.1, say.

This framework leads to Equation 1, the chance CI that a random contact is infected:

(1)      CI = ( fA IA + fS IS + fT IT ) / P

Now, we know P, and county health officials provide a weekly update on the number of positive tests, but that’s about all we know.  We have to deduce the sizes of the sub-groups IA, IS and IT and the travel factors fA , fS  and fT using official estimates and semi-informed guesswork.  Let’s consider each of these in turn.

First, we estimate the number of people in the IT (tested positive) group as IT = d Q , where d is the duration of a typical infection (also the quarantine time following a positive test) and Q is the rate of new positive test results per day.  We assume that the positive test rate is at steady-state, i.e., Q is constant.  For simplicity, we also assume that those who test positive are tested on the first day of their infections.  So, if Q = 10 new positive test results per day and d = 21 days, then the number of people currently in the IT sub-group is 210.

But there are plenty of stories about symptomatic people who do not bother to get tested and who only slightly modify their behaviors (the IS sub-group).  I have found few clues as to the size of this group, and none of the testing statistics are relevant.  The best I can do is take a wild guess at the fraction of infected people who develop symptoms but go untested.  Based on human nature (see hurricanes), I would bet that at least a third of symptomatic people “ride it out” without ever seeking care or getting tested, at least in this county.

So the size of the symptomatic-but-untested group is IS = kS ( IS + IT ), where kS is the fraction of those symptomatic and infected who go untested.  This may be rearranged as IS = IT kS / (1 – kS ).

Finally, the infected-but-asymptomatic group IA .  There has been much debate about how many infected people are asymptomatic and how infectious they are.  The CDC estimates that 35 percent of infected people are asymptomatic — I have seen figures as high as 80%. For our purposes, we will assume that asymptomatic people also never get tested, and that their propensity to travel is the same as the non-infected group.

The relevant formula here is IA = ( IS + IT ) kA / (1 – kA ), where kA is the fraction of infected people who are asymptomatic.
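Both rearrangements follow the same pattern.  Here is the algebra spelled out as a sketch in LaTeX notation, with underscores standing in for the subscripts in the text.  The second line assumes, per the definition of kA, that the asymptomatic group is a fraction kA of all infected people (IA + IS + IT):

```latex
\begin{aligned}
I_S &= k_S\,(I_S + I_T)
  \;\Rightarrow\; I_S\,(1 - k_S) = k_S\,I_T
  \;\Rightarrow\; I_S = \frac{k_S\,I_T}{1 - k_S} \\
I_A &= k_A\,(I_A + I_S + I_T)
  \;\Rightarrow\; I_A\,(1 - k_A) = k_A\,(I_S + I_T)
  \;\Rightarrow\; I_A = \frac{k_A\,(I_S + I_T)}{1 - k_A}
\end{aligned}
```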

Given all the above, we are now ready to plug some numbers, guesses and estimates into our equations:

  • Local population (P) = 262,000
  • Local positive test rate (Q) = 7 positive results/day, where we live, as of now
  • Average duration of infection (d) = 21 days, more or less
  • Fraction of symptomatic infected people who go untested (kS ) = 0.33 (?)
  • Fraction of infected people who are asymptomatic (kA ) = 0.35 (per CDC)
  • Relative propensity of asymptomatic-infected people to travel (fA ) = 1.0
  • Relative propensity of symptomatic-infected people to travel (fS ) = 0.7 (?)
  • Relative propensity of tested-positive people to travel (fT ) = 0.1 (?)

This works out to 147 tested-positive people (IT ), 72 symptomatic-infected people (IS ) and 118 asymptomatic-infected people (IA ) in my county at the moment.  So the chance CI that an individual contact is infected, given my assumptions, is…

One of every 1427 contacts.

This is equivalent to selecting one American at random and finding out that she lives in Boise, Idaho.

Extending this scenario, if I were to contact 10 people during an outing here, the chance that at least one of those contacts is infected would be about 1 in 140.  This is roughly the odds of drawing a straight or better in a five-card poker hand.

What about large gatherings, say, a restaurant with 50 patrons?  According to my model, the odds in my county that at least one of them is infected would be about 15 to 1.  So, this gives you a sense of how the risk increases with the number of contacts one makes.
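For readers who would rather check the arithmetic than take my word for it, here is a minimal Python sketch of Equation 1 using the inputs listed above.  The variable names are my own shorthand, and the helper for multiple contacts assumes each contact is an independent draw from the county population, which is a simplification.  It should land close to the sub-group sizes and the one-contact and ten-contact figures quoted above.

```python
# Inputs copied from the bullet list above; the names are illustrative.
P = 262_000      # local population
Q = 7            # new positive tests per day
d = 21           # days an infection (and quarantine) lasts
k_S = 0.33       # fraction of symptomatic infected who go untested (a guess)
k_A = 0.35       # fraction of infected who are asymptomatic (per CDC)
f_A, f_S, f_T = 1.0, 0.7, 0.1   # relative propensity to travel (guesses)

I_T = d * Q                          # tested positive, at steady state
I_S = I_T * k_S / (1 - k_S)          # symptomatic but untested
I_A = (I_S + I_T) * k_A / (1 - k_A)  # asymptomatic

# Equation 1: chance that one random contact is infected
C_I = (f_A * I_A + f_S * I_S + f_T * I_T) / P

def at_least_one_infected(n, c=C_I):
    """Chance that at least one of n independent random contacts is infected."""
    return 1 - (1 - c) ** n

print(f"I_T = {I_T:.0f}, I_S = {I_S:.0f}, I_A = {I_A:.0f}")
print(f"one contact: 1 in {1 / C_I:.0f}")
print(f"10 contacts: 1 in {1 / at_least_one_infected(10):.0f}")
```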

• • •

[Chart: Transmission factor vs time for coronavirus (an example)]

But contact is not the same as transmission.  The likelihood of transmission depends on the degree of exposure (i.e., how much virus is being released) and the time of exposure, as well as the overall effectiveness of masks, distancing and the like.  We can quantify this via the transmission factor τ = 1 – (1 – x)^t, where x is the transmission risk per minute of exposure and t is the length of exposure in minutes.  (The chart at right shows τ vs time for x = 0.4.)  τ = 1 represents 100% certainty of virus transmission during a given contact.

As with propensity to travel, the different infected subgroups may vary in their ability to infect others.  For example, asymptomatic victims are thought to shed less virus than those who are coughing and sneezing.  On the other hand, knowing that one is interacting with a positive-test victim may lead both parties to take more precautions.  So, the x factors could look something like this in practice:

  • Transmission risk per minute, asymptomatic-infected people (xA ) = 0.1 (?)
  • Transmission risk per minute, untested symptomatic-infected people (xS ) = 0.3 (?)
  • Transmission risk per minute, tested-positive people (xT ) = 0.2 (?)

These guesswork values imply, for instance, that 5 minutes of contact with a symptomatic untested victim is 83% certain to lead to transmission in our current personal-protection environment.  Is this true?  I have no idea.  But again, by building a model and plugging in numbers that sound sort of reasonable, one can at least get an order-of-magnitude sense of the risk.
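The transmission factor is easy to tabulate.  This small Python sketch takes τ as 1 minus (1 – x) raised to the power t and uses the guesswork x values listed above; the function and dictionary names are mine, not part of the model.

```python
def transmission_factor(x, t):
    """tau = 1 - (1 - x)**t: chance of transmission after t minutes,
    given a per-minute transmission risk x."""
    return 1 - (1 - x) ** t

# Per-minute risks from the list above (all of them guesses)
x = {"asymptomatic": 0.1, "symptomatic untested": 0.3, "tested positive": 0.2}

for group, risk in x.items():
    print(f"{group}: tau after 5 minutes = {transmission_factor(risk, 5):.2f}")
```

The middle line is the 83% case mentioned above.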

So, adding transmission risk to Equation 1 produces the model in Equation 2, where CX is the chance of transmission from a random contact:

(2)      CX = (τA  fA IA + τS  fS IS + τT fT IT ) / P

And, the chance of transmission C from an outing with n random contacts (assuming the same exposure time for each) is then…

(3)      C (n) = 1 - (1 - CX )^n
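Equations 2 and 3 can be put together in a few lines of Python.  This is a rough sketch, not the calculator itself: the names are my own, every contact is assumed to be independent with the same exposure time, and the grid of n and t values is only illustrative, so its output is not guaranteed to match any published chart cell for cell.

```python
P, Q, d = 262_000, 7, 21
k_S, k_A = 0.33, 0.35
f = {"A": 1.0, "S": 0.7, "T": 0.1}   # travel factors (guesses)
x = {"A": 0.1, "S": 0.3, "T": 0.2}   # per-minute transmission risks (guesses)

# Sub-group sizes: tested positive, symptomatic untested, asymptomatic
I = {"T": d * Q}
I["S"] = I["T"] * k_S / (1 - k_S)
I["A"] = (I["S"] + I["T"]) * k_A / (1 - k_A)

def chance_per_contact(t):
    """Equation 2: chance of transmission from one random contact of t minutes."""
    return sum((1 - (1 - x[g]) ** t) * f[g] * I[g] for g in "AST") / P

def chance_per_outing(n, t):
    """Equation 3: chance of transmission over n contacts of t minutes each."""
    return 1 - (1 - chance_per_contact(t)) ** n

for t in (1, 5, 15):
    for n in (1, 10, 50):
        print(f"t={t:>2} min, n={n:>2}: 1 in {1 / chance_per_outing(n, t):,.0f}")
```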

Here then is a chart summarizing my estimated odds of being infected from an outing in my county at the current time, given n random contacts and exposure time t per contact (plus all the other assumptions above):

[Chart: Local odds of becoming infected with Coronavirus, as a function of exposure time per contact and number of contacts]

These are admittedly rough estimates, but the salient point here is that these odds are not a million-to-one, but nor are they three-to-one.  If they were either, one’s rational response would be much clearer.  As it is, these figures call for deliberation, which we have not been given much opportunity to do; health officials are naturally reluctant to express our risks this way.  Epidemiologists are not paid to make back-of-the-envelope calculations, which is why I had to do it.

• • •

Now, if you’re not happy with my figures, you can play Dr. Fauci yourself.  I have created a risk calculator that will let you enter figures for your own location and estimate your odds of becoming infected.  And you can change all the parameters of the model as you see fit.  The odds are updated instantly when you change one of the entries.

If you play around with the calculator a bit, you will see how the odds depend heavily on the assumed fraction of infected people who are asymptomatic (kA).  The more people who are out and about, unaware that they are infected but still able to infect others, the greater your risk of becoming infected during an outing.*
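To see that sensitivity without opening the calculator, here is a short Python sketch that recomputes Equation 1 for several values of kA.  The 0.35 and 0.80 values are the two figures cited earlier; the values in between are just illustrative stepping stones, and the function name is my own.

```python
def chance_contact_infected(k_A, P=262_000, Q=7, d=21, k_S=0.33,
                            f_A=1.0, f_S=0.7, f_T=0.1):
    """Equation 1 as a function of the asymptomatic fraction k_A."""
    I_T = d * Q
    I_S = I_T * k_S / (1 - k_S)
    I_A = (I_S + I_T) * k_A / (1 - k_A)
    return (f_A * I_A + f_S * I_S + f_T * I_T) / P

for k_A in (0.35, 0.50, 0.65, 0.80):
    print(f"k_A = {k_A:.2f}: 1 in {1 / chance_contact_infected(k_A):,.0f} contacts")
```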

Now it’s time for the disclaimers.  I remind readers that this is a static model, a snapshot of a person’s contact risk given the current local rate of new cases and enough assumptions to fill an F-150 pickup.  It is not a dynamic model — it does not predict trends and it does not take the weekly rise and fall in local rates into account.  It is obvious from my model, however, that if one wants to lessen her risk of infection, she should make fewer outings, limit her number of contacts and the time she spends with them, and adopt measures to minimize the risk of transmission when exposed to those contacts.

But what if the odds are 1000:1?  Well, it will be Person #1000 who takes that chance and does his part to keep the pandemic alive.

Which is what responsible health officials have been telling us all along.


I ran some numbers for the Tulsa, Oklahoma, Trump rally scheduled for June 20, 2020.  Tulsa County has 620,000 residents and its daily new case rate is 120 per day and rising.  After adjusting some of my figures, given that attendees will be temperature-checked and the more symptomatic may be turned away, I still come up with odds of transmission of about 44:1.  I assume that each attendee will share an airspace for 60 minutes with his 8 nearest neighbors.  If 19,000 people attend, then about 430 cases, and perhaps 5 deaths, may stem from the rally. “A very small percentage,” Trump said.  He doesn’t care.


* Robert Redfield, CDC Director:  “Of those of us that get symptomatic, it appears that we’re shedding significant virus in our oropharyngeal compartment, probably up to 48 hours before we show symptoms.  This helps explain how rapidly this virus continues to spread across the country because we have asymptomatic transmitters.”

Asked and Answered 8.0

You’ve been there.  You’re driving on the expressway at a reasonable pace, a bit faster than some drivers, slower than others.  You move into the left lane to pass a slightly slower car.  Just about the time you draw even, you glance up at your rear-view mirror and see a set of fierce headlights bearing down and closing in on you by the second.  You think, where did this guy come from?  Barely ten seconds later he’s riding your tail, making sure you know in no uncertain terms that you’re in his way, so get your ass moving already!

At this point, you make one of two choices, depending on the kind of driver you are and your mood that day.  You either finish your pass without changing speed, no matter how much the guy behind you tries to intimidate you, or, you decide this person is bad news and you speed up to get out of his way and let him go roaring by — which he will.

Doesn’t it seem like aggressive drivers are everywhere?  It makes you wonder what kind of life experiences create such angry, impatient, bullying people.  And can there really be that many of them out there?

To explore this situation, we’re going to do a little thought experiment here at the stay-at-home office of Asked and Answered.  Here’s the setup.  You and 999 other drivers are going to take a 60-mile trip on the same stretch of interstate highway.  The posted speed limit is 60 mph, but most of you drive some other speed.  In our scenario (see diagram), 30% of drivers drive at 60 mph; 30% drive at 65 mph; 30% drive at 70 mph; and the remaining 10%, the most aggressive ones, drive at 80 mph.  We will assume that the four types of drivers enter the on-ramp in random order and at a steady rate of 4 cars per minute.

Now, decide which type of driver fits you best (denoted by silver, blue, green or red) and then answer the question: what will your driving experience be like?  How many vehicles will you encounter and of what type?

To answer this, I originally thought that I would have to write a computer simulation of the problem and keep track of hundreds of cars as they navigated 60 miles of interstate. But then I stumbled upon time-space diagrams in traffic engineering texts.  Simply put, such diagrams capture how a vehicle (or any number of them) covers a stretch of road based on the vehicle’s speed profile.  This concept is the key to the ignition, if you will.

The chart at right is a simplified example of a time-space diagram.  Travel time is on the horizontal axis and total distance traveled in that time is on the vertical axis.  Each of the colored lines (refer to color scheme above) represents a trip made by one of the drivers in our scenario.  In this example, one driver of each speed-type drove the 60-mile trip, with the drivers starting out 5 minutes apart from each other.  The silver car started first, followed by green, blue and finally red.

We will assume each driver maintained constant speed — otherwise these would be curves instead of lines.  Now, consider the trip-line of the red car, the 80-mph driver (Red) who started at the 15-minute mark and finished his trip at the 60-minute mark.  Whenever two trip-lines cross, it means one car passes another.  Here, we see that Red encountered and then passed Blue around Mile 25.  Red then caught up with Silver at Mile 60, the end of the trip for both drivers.  And on this trip, Red and Green never saw each other.
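Because every trip-line is straight, the crossings can be worked out with a little algebra instead of a ruler.  The Python sketch below assumes the start order and 5-minute spacing described above (silver, green, blue, red) and the four speeds from the scenario; the function name and layout are mine.  Under these exact assumptions Red passes Blue a bit short of Mile 30, in the neighborhood of the eyeballed figure.

```python
from itertools import combinations

TRIP_MILES = 60
# (start time in minutes, speed in mph), in the start order given in the text
cars = {"Silver": (0, 60), "Green": (5, 70), "Blue": (10, 65), "Red": (15, 80)}

def crossing_mile(car_a, car_b, trip=TRIP_MILES):
    """Mile marker where two constant-speed trip-lines cross, or None."""
    (start_a, speed_a), (start_b, speed_b) = car_a, car_b
    if speed_a == speed_b:
        return None                      # parallel lines never cross
    a0, b0 = start_a / 60, start_b / 60  # start times in hours
    # Solve speed_a * (t - a0) == speed_b * (t - b0) for t
    t = (speed_a * a0 - speed_b * b0) / (speed_a - speed_b)
    mile = speed_a * (t - a0)
    # The crossing only counts if both cars are on the road at that moment
    if t < max(a0, b0) or mile > trip + 1e-9:
        return None
    return mile

for a, b in combinations(cars, 2):
    mile = crossing_mile(cars[a], cars[b])
    where = "never meet" if mile is None else f"meet at mile {mile:.1f}"
    print(f"{a} and {b}: {where}")
```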

One more example before we turn to the original question.  The chart at right shows the trip-lines for 40 cars on a 60-mile trip.  Four of every five drivers (silver) drive 60 mph; the fifth (pink) drives 90 mph.  From the line crossings, we see that the typical pink car passes 10 silver cars, while a silver car encounters 2 or 3 pink ones.

What we learn from this example is that a driver’s perception of the other types of drivers on the road is distorted by the relative speeds of the drivers.  Based on his encounters with other cars during his trip, a Silver driver could easily conclude that most drivers are Pinks.  Conversely, a Pink driver might think that over 90% of drivers are Silvers.

This brings us to the formula I derived for the expected number of times a given driver will encounter drivers of other types during a constant-speed trip.  The formula is:


EIJ = the expected number of encounters Driver I will have with drivers of type J
fJ   = the fraction of drivers of type J in the general population
C   = the average number of cars per hour (of all types) passing a given point
D   = the distance of the trip in miles
SI    = the (constant) speed in miles/hour of a driver of type I
SJ   = the (constant) speed in miles/hour of a driver of type J
SΔ  = the absolute difference in the speeds of Drivers I and J
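One form consistent with the symbols defined above is shown here in LaTeX notation.  Treat it as a reconstruction under stated assumptions rather than a definitive statement: type-J cars enter at a rate of fJ C per hour, and Driver I crosses paths with exactly those whose start times fall inside the gap between the two trip durations, D/SI and D/SJ.

```latex
E_{IJ} \;=\; f_J \, C \,\left| \frac{D}{S_I} - \frac{D}{S_J} \right|
       \;=\; \frac{f_J \, C \, D \, S_\Delta}{S_I \, S_J}
```

As a sanity check, in the 40-car example this form says a pink driver passes 0.8 / 0.2 = 4 times as many silver cars as a silver driver meets pink ones, whatever the traffic volume, which is in line with the 10 versus 2 or 3 encounters read off the chart.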


Now let’s return to the question posed at the start.  We defined 4 types of drivers based on their speeds: 30% (Silver) drive 60 mph, the next 30% (Blue) do 65 mph, 30% (Green) do 70 mph, and 10% (Red) do 80 mph.  In the figure below, the actual driver population is shown in the middle, along with each driver’s perception of the population based on the sample of cars — including her own — that she encountered during the trip.

A few things to note here.  First, based on her encounters on the road, every driver thinks that her own driver-type is decisively in the minority.  This is because she will neither pass nor be passed by anyone driving at the same speed.  While she may follow another car of similar speed for many miles, her own vehicle and the one she sees right in front of her would be the extent of her experience with like drivers.

A second observation is that every type of driver over-estimates the proportion of both speedsters and slowpokes (unless you happen to be one of those types).  In our scenario, both Silver and Blue drivers believe that more than 25% of drivers are Reds, when the actual figure is 10%.  Similarly, both Green and Red drivers believe that Silvers comprise 45% or more of the driving population.  As is evident from the formula, the greater the speed difference, the more encounters one is likely to have with a given type of driver.  Remember this the next time you complain about all the crazies on the road.
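Here is a Python sketch of how those perceived shares can be computed.  It rests on two assumptions of mine: that the expected number of encounters takes the form E = fJ × C × D × SΔ / (SI × SJ), which is consistent with the symbol definitions given earlier, and that each driver counts exactly one car of her own type (her own).  The traffic volume is the 4 cars per minute from the setup.

```python
CARS_PER_HOUR = 4 * 60   # 4 cars per minute entering the highway
TRIP_MILES = 60
# driver type -> (share of all drivers, speed in mph), from the scenario
drivers = {"Silver": (0.30, 60), "Blue": (0.30, 65),
           "Green": (0.30, 70), "Red": (0.10, 80)}

def expected_encounters(me, other):
    """Assumed formula: E_IJ = f_J * C * D * |S_I - S_J| / (S_I * S_J)."""
    _, s_i = drivers[me]
    f_j, s_j = drivers[other]
    return f_j * CARS_PER_HOUR * TRIP_MILES * abs(s_i - s_j) / (s_i * s_j)

def perceived_shares(me):
    """Share of each type among the cars a driver sees, counting her own."""
    seen = {other: expected_encounters(me, other) for other in drivers}
    seen[me] += 1   # her own car is part of the sample
    total = sum(seen.values())
    return {other: count / total for other, count in seen.items()}

for me in drivers:
    shares = perceived_shares(me)
    print(me, {k: f"{v:.0%}" for k, v in shares.items()})
```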

I would be remiss here if I didn’t mention something about the real-world distribution of driving speeds.  The most recent data I could find is from a 2015 federal survey of traffic speeds on various classes of roads, compared to the speed limits on those roads.  The data for limited-access highways suggest that 30% of drivers do not speed (Silver), 25% exceed the speed limit by up to 5 mph (Blue), 25% exceed it by no more than 10 mph (Green), and 20% drive more than 10 mph over the limit (Red).

So the hypothetical driver pool that I presented in my original question is not much of a departure from reality.  The main difference is that, in real life, there is a continuum of driving speeds, and most drivers do not maintain a fixed speed for a whole trip.  However, I think my general observations still hold:  relative speed skews a driver’s perspective of the driver pool.  The greater the speed difference, the more prevalent that type of driver appears to be to you.

Thanks for reading.  I trust all your questions have now been answered.  Except of course, the most important one:  what makes aggressive drivers be that way?


Asked & Answered 7.0

You are the first-year coach of the Texas Lady Longhaulers of the WNBA.  This morning finds you dispirited after an embarrassing 88-60 loss to the Nashville Wynettes last night.  You need a more productive starting lineup for tomorrow’s game.  Where to begin?

Point guard is your most important position and your roster choices are Maya Thomas and Tamika DeShields.  They are good but very different players.  Maya is a steady shooter who has a 50% chance of sinking each shot.  Tamika, on the other hand, is streaky: if she sinks a shot, she makes her next one 80% of the time; but if she misses, she tends to miss again, 80% of the time.

So which point guard would you start, Steady Maya or Streaky Tamika?

I thought I would make this edition of Asked and Answered more interactive than usual. Readers, before I begin the discussion, you are invited to weigh in with your own answer. On the form below, check the box for either Steady, Streaky, no difference, or it depends.  There are no penalties for incorrect answers so just go ahead and check the box that your first-year-coaching gut tells you is the best one.  Then click the Vote button to record your answer and view the totals so far.

[Poll: The Better Player: Steady or Streaky?]

First a little table-setting.  Maya and Tamika are fictional stand-ins.  Top players in the WNBA make 15-20 field-goal (two-point) attempts per game and sink 45-50% of those.  Maya’s performance is realistic, but whether Tamika’s streakiness is seen in actual players is a question for sports statisticians to answer.

Maya’s expected performance is easy to calculate.  Assuming she makes 20 attempts and sinks 50% of them, you may expect 10 field goals from her in a typical game, give or take.  The probability P(k) of Maya scoring exactly k field goals in n shots is given by

P(k) = \frac{n!}{k! (n-k)!} \;p^{k} q^{n-k}

where p is the probability of making any given shot and q is the probability of missing it.  [This is the well-known formula for the binomial probability distribution.  Exclamation points denote the factorial operation — they do not express my surprise.]  So the chance that Maya will score 10 +/- 1 field goals is about 50 percent:

Chance of 9 scores     0.160
Chance of 10 scores    0.176
Chance of 11 scores    0.160
Total                  0.497
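The same figures can be reproduced with a few lines of Python.  This is a minimal sketch using the standard library's math.comb for the n-choose-k term; the function name is my own.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent tries."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.5   # Maya: 20 attempts, 50% chance per shot
for k in (9, 10, 11):
    print(f"Chance of {k} scores: {binomial_pmf(k, n, p):.3f}")
print(f"Total: {sum(binomial_pmf(k, n, p) for k in (9, 10, 11)):.3f}")
```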

Predicting what to expect from Tamika is more complicated.  As I learned while studying this problem, her performance is an example of a Markov chain.  This is best explained by the diagram below.  Tamika starts the game in the Initial box.  After taking her first shot, she moves either to the Scored box (blue) or to the Missed box (violet).  The number next to each path shows the chance that she will follow that path when she shoots.

Every time Tamika shoots, she moves along a path.  Some paths return to the same box.  For instance, when Tamika is in the Scored box (i.e., she sank her last shot), she has an 80% chance (0.8) of circling back to the Scored box with her next shot.  Otherwise she moves over to the Missed box.  And so on.

No paths lead to the Initial box because, when taking a shot, Tamika only scores or misses. In my intro, I failed to specify how Tamika typically performs on her first shot.  For now, assume that Tamika starts the game cold as if she has just missed.  This implies that her chance of moving from Initial to Missed is 80% and from Initial to Scored is 20%.

The number of times Tamika visits the Scored box gives us the number of field goals you may expect her to score during the game.  But how do you calculate that?  Luckily, thanks to folks like David L. Deever, professor emeritus of Otterbein University in Ohio, there are such things as Markov chain calculators.  You enter the path information into a table, and the calculator returns the probability that a given box is occupied on the nth step.

Here are the results Dr. Deever’s calculator produced for Tamika.  The Scores column shows the expected number of times Tamika has scored after taking n shots:

The table shows that, after 20 shots, Tamika’s expected number of scores is 9.25, which is 0.75 less than the 10 scores you can expect from Maya.  Tamika never fully recovers from her cold start, and she will trail Maya by 0.75 scores (on average) forever.

So, if you had decided to start Steady Maya, your answer would be correct.

But perhaps you assumed Tamika has a 50/50 chance of scoring/missing on her first shot, after which the 80/20 rule would apply.  If that were the case, then Tamika would not fall behind Maya at all.  Each player would score 10 times in 20 shots, on average.  So if you guessed there would be no difference, your answer would also be correct.
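If you do not have a Markov chain calculator handy, both cases can be checked with a short loop.  The Python sketch below is my own direct recursion, not Dr. Deever's calculator: it tracks the chance that each shot scores and adds those chances up.  The parameter names are illustrative.

```python
def expected_scores(shots, first_shot=0.2, stay=0.8):
    """Expected field goals for a streaky shooter.

    first_shot: chance of sinking the first shot (0.2 = cold start)
    stay: chance the next shot repeats the last result (score or miss)
    """
    p_score = first_shot
    total = 0.0
    for _ in range(shots):
        total += p_score
        # Next shot scores after a score (stay) or after a miss (1 - stay)
        p_score = stay * p_score + (1 - stay) * (1 - p_score)
    return total

print(f"cold start, 20 shots:  {expected_scores(20):.2f}")
print(f"50/50 start, 20 shots: {expected_scores(20, first_shot=0.5):.2f}")
```

Run it with more shots and the cold-start gap settles at 0.75 scores, which is the "forever" part.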

• • • 

Tomorrow has turned into today, and your team is in the final minutes of its next game.  Since Maya had a slight edge over Tamika, you decided to start her today.  Unfortunately, your Lady Longhaulers have allowed too many turnovers, and they have fallen behind by 10 points.  You figure you will get five more possessions before the final buzzer.  To have any chance to win, your team will need to score on every one of those possessions.

You call a time-out.  Do you stay with Steady Maya or do you send in Streaky Tamika?

This gets interesting.  You need a player to sink five shots in a row.  The chance that Maya can do this is 0.5 (the probability of her sinking any one shot) to the fifth power, or 3.1%.  The chance that Tamika can do it, coming in cold, is 0.2 (her first-shot success rate) times 0.8 (her repeat success rate) to the fourth power, or 8.2%.  Tamika is more than twice as likely as Maya to tie the score, though her chances are still slim.

So if your answer was to send in Streaky Tamika, you would be correct.  And this means that if your original answer was “it depends” then you would also be correct.  Now, all Tamika has to do is sink her first shot.  And the next.  And the next…
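For the skeptics on the bench, here is the time-out arithmetic as a tiny Python sketch.  The function name and parameters are mine, and it assumes Tamika comes in cold, as in the discussion above.

```python
def chance_of_run(shots, first_shot, repeat):
    """Chance of sinking `shots` in a row: the first at `first_shot`,
    each later one at `repeat` (the chance of scoring right after a score)."""
    return first_shot * repeat ** (shots - 1)

maya = chance_of_run(5, first_shot=0.5, repeat=0.5)
tamika = chance_of_run(5, first_shot=0.2, repeat=0.8)
print(f"Maya:   {maya:.1%}")
print(f"Tamika: {tamika:.1%}")
print(f"Tamika is {tamika / maya:.1f} times as likely")
```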

With that, your time-out is over.  May you enjoy the final minutes of your coaching career.

• • • 

David L. Deever, the author of journal articles as well as the Markov calculator that I used,  taught his last mathematics class at Otterbein University in 2003, ending a 37-year career.  His Facebook page (which has not been updated since 2013) reveals Dr. Deever to be a kind person, concerned citizen and a liberal in good standing.  I thank Dr. Deever for his contributions and I wish him good health.
