What does the UDRC use logistic regression for?
By: Kelsey A. Martinez, PhD, Researcher
July 8, 2020
Photo by Clem Onojeghuo
Here at the UDRC, we use logistic regression quite often in our research. You may wonder exactly what it is and why we end up using it so frequently.
Logistic regression is a way to examine an event or circumstance where there are only two possible outcomes. Let’s break down what I mean by ‘two possible outcomes.’ Say we are interested in looking at the probability that I walk to Harmon’s City Creek on my lunch break on any given workday. From a logistic regression standpoint, there are two possible outcomes here: either I will walk to Harmon’s or I will not. For explanatory purposes, let’s say that if I walk to Harmon’s on my lunch break it is a ‘positive’ outcome (1), and that if I don’t walk to Harmon’s, it’s a ‘null’ outcome (0).
Beyond the outcome itself, a variety of factors will influence whether or not I walk to Harmon’s during the workday. You might imagine that things such as weather, air quality, whether I brought leftovers from home for lunch, and the number of daily work tasks I have to complete all factor into whether or not I go to Harmon’s. Using logistic regression, we can look at how all of these factors influence the probability of me walking to Harmon’s, using a set of days as our observations.
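To make the idea of "a set of days, or observations" concrete, here is a minimal sketch of how such data might be laid out. The field names and every value below are made up for illustration; they are not real UDRC data.

```python
# Hypothetical observations: one record per workday (all values are made up).
days = [
    {"raining": 1, "brought_lunch": 0, "busy_day": 0, "walked": 0},
    {"raining": 0, "brought_lunch": 0, "busy_day": 0, "walked": 1},
    {"raining": 0, "brought_lunch": 1, "busy_day": 1, "walked": 0},
    {"raining": 0, "brought_lunch": 0, "busy_day": 1, "walked": 1},
]

# Split each record into predictors (the factors) and the binary outcome.
predictor_names = ["raining", "brought_lunch", "busy_day"]
X = [[day[name] for name in predictor_names] for day in days]
y = [day["walked"] for day in days]

# Every outcome must be 0 or 1 for logistic regression to apply.
assert set(y) <= {0, 1}
```

Each row of `X` describes one day, and the matching entry of `y` records whether the walk happened (1) or not (0). A real analysis would need far more than four days.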
This analytical approach is therefore useful in social science or economic contexts, where many outcomes of interest are binary. We can analyze how different social variables, such as age, race, gender, and ethnicity, play into a particular binary outcome for a group of people. In other words, logistic regression is a good way to understand how belonging to a particular demographic group relates to a social outcome.
If you’re interested in how we interpret the results of logistic regression (and like a little bit of math), keep reading below.
Getting into the math of logistic regression, just a little bit
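Logistic regression analyses produce an odds ratio for each variable examined. (Technically, the model estimates a coefficient for each variable on the log-odds scale, and exponentiating that coefficient gives the odds ratio.) Odds ratios tell us how strongly each variable shifts the odds of a positive outcome occurring. Under the hood, logistic regression uses a logistic function to predict the probability of a binary, or 0/1, outcome.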
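For readers who like to see the pieces, here is a small sketch of the logistic function and how it turns a weighted sum of variables into a probability. The intercept and coefficients are hypothetical numbers chosen only for illustration, not results from a fitted model.

```python
import math

def logistic(log_odds):
    """Map any real-valued log-odds to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def predict_probability(intercept, coefficients, values):
    """Probability of a positive (1) outcome for one observation."""
    log_odds = intercept + sum(c * v for c, v in zip(coefficients, values))
    return logistic(log_odds)

# Hypothetical values, chosen only for illustration.
intercept = 0.0
coefficients = [-1.5, 0.8]  # e.g. "raining" and "forgot lunch"

# Exponentiating a coefficient gives that variable's odds ratio.
odds_ratios = [math.exp(c) for c in coefficients]

p_ordinary_day = predict_probability(intercept, coefficients, [0, 0])
p_rainy_day = predict_probability(intercept, coefficients, [1, 0])
```

With these made-up numbers, the negative coefficient yields an odds ratio below 1 and pulls the predicted probability down on a rainy day, while the positive coefficient yields an odds ratio above 1.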
If we go back to the Harmon’s example, you can probably guess that the different decision factors I listed might weigh differently in determining whether I walk to Harmon’s that day. If it happens to be raining, the probability of me going out for a walk decreases quite a bit. However, whether or not I wore comfortable walking shoes to the office that day has less of an impact on the outcome than weather might, especially if I forgot to bring a lunch.
Odds ratios are interpreted relative to 1. If a variable has an odds ratio greater than 1, it has a positive influence on the outcome, or sways the outcome towards 1 instead of 0. Conversely, if the odds ratio is less than 1, it has a negative impact on the outcome, or sways it towards 0. An odds ratio of exactly 1 means the variable makes no difference to the odds.
Going back to the Harmon’s example, the daily variable of remembering to bring a lunch from home would have an odds ratio less than 1, because it decreases the probability of me going to Harmon’s on my break. On the other hand, sunny, 75-degree weather would have an odds ratio greater than 1, because that sort of weather would increase the probability of a positive outcome: me walking to Harmon’s.
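To see what an odds ratio does to a probability, here is a short worked sketch. The baseline probability and both odds ratios are invented for the Harmon’s example; only the arithmetic (probability to odds, multiply by the odds ratio, odds back to probability) is standard.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Return the probability after multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

baseline = 0.5           # hypothetical: a coin-flip chance on an ordinary day
brought_lunch_or = 0.25  # hypothetical odds ratio below 1
sunny_or = 3.0           # hypothetical odds ratio above 1

p_lunch = apply_odds_ratio(baseline, brought_lunch_or)  # falls below the baseline
p_sunny = apply_odds_ratio(baseline, sunny_or)          # rises above the baseline
```

With these assumed numbers, bringing lunch drops the chance of a walk from 50% to about 20%, and sunny weather raises it to 75%. Note that an odds ratio multiplies the odds, not the probability itself, which is why an odds ratio of 3 does not triple the probability.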
I hope this fun example has increased your understanding of logistic regression and its importance in the social sciences. Signing off for my lunch break...