Thank you for having me here. So this is work that I did with my master's student Reza and my postdoc Gema who's now a professor at UBC Okanagan. So we wanted to look at bias when people are evaluating code contribution in open source software, and if there is bias, what the population of the people contributing in open source software is. And so I think I'm going to just go. So open source software often thinks of itself as a meritocracy and oftentimes it is but - and they think that, you know, the quality of the contribution is the key here and it doesn't matter who is contributing, where they're contributing from, and that's - that's that's a popular -. Our past research has kind of shown that there are caveats to this. They've found that gender has a role to play when open source contributions are being evaluated. Research has also shown that contributions from different countries may have different likelihoods of their contributions getting accepted. But one thing that we didn't quite see is race, and - and there is some evidence that developers do understand the race and ethnicity of other members in their open source projects even if they have not met them, even if it's completely remote, right. Today company or industrial research - industrial development happens remotely, but open source software has been happening remotely for decades now, right, and even in that case they kind of are aware of the ethnicity of the members in their team. And in that survey they also found that about 30 percent of them have faced some kind of negative experience due to their diversity. And so what we wanted to see was, we wanted to see if the ethnicity of the person making the contribution has any impact on whether their contribution is going to be accepted. And what we think is that by knowing that ethnicity - by looking at a name - something in my brain gets activated and some bias that I might have towards that race or ethnicity might kick in, and I might view that contribution as something not great, right. In this case just looking at my name and saying, oh this looks like a south asian contributor, so this is going to be a good one, or something like that, and then accepting the contribution. So what we wanted to see was, collect quantitative evidence of whether this exists or not. So we took about 46,000 projects from GitHub, all of which had at least 10 stars, and they were non-trivial in some sense - they were not like student projects. We got about 2.5 million pull requests from these, and we took the names of the people who made these contributions, and using a tool called NamePrism we kind of extracted the race and ethnicity from the name. So we got the race and ethnicity for about 493,000 developers. So one might - so I mean, the tool kind of gives an output and says that this name sounds hispanic with a 97 percent probability or this percent this name sounds white with a 51 percent probability. Now you may think, hey there's going to be problems here, and there are going to be problems here, you know a very - john doe might be a name and you wouldn't know which race or ethnicity that might be. You might think that it's actually a white person even if it may not be. But we found that whenever the tool makes mistakes about someone's race or ethnicity humans make the same mistake about the race or ethnicity of that person as well. So the tool is about as good as humans in determining what the race or ethnicity of a person is from the name. So we took that and the first striking result that we got was that less than 10 percent of the contributions that we were able to identify came from a non-white developer, and that includes Asians, Hispanic, and Black developers put together, right. We found one Alaskan or Native American developer in the whole data set, which is in itself - we could have stopped the study here and said, you know what, this is - this is a bad enough, like, less than 10 percent of the contributions are coming from non-white developers, or perceived non-white developers. We wanted to see, given this smaller population, is there an even further impact on whether their contributions are being accepted or not. And so what we did is, we collected a whole set of other metrics other than just their race or ethnicity, like, you know, their experience they have - how long have they been working in that particular project, how many files have they changed, a lot of other variables. And we built this regression model to find out whether their contribution would be accepted or not. Can we predict whether someone's contribution be accepted - accepted or not, and find the likelihood of their contribution being accepted? What we did find is that there is a relationship between someone's race or ethnicity from their names and whether their contributions are going to be accepted. So what we found is that Hispanic developers have about six percent lower odds of getting their pull request accepted. Keep in mind that this is controlling for their experience and various other metrics as well And API developers which is Asian or Pacific Islander have about 10 percent lower odds of getting their pull request accepted. So there is a - there is strong evidence that they have that non-white people have their contributions accepted at a lower rate. We also wanted to see whether this was true when taking the ethnicity of the person integrating the code as well is taken into account. And we found that non-white developers are actually more likely to get their contributions accepted when the integrator is also of the same ethnicity, right, and to give some results, when it's Hispanic - a Hispanic developer is going to have a 75 percent higher odds of getting their pull request accepted when the integrator is also estimated as Hispanic. And when it comes to Asian Pacific Islander it's about 36 percent higher. This is in comparison to when the integrator is a white developer. And the most stark result is when it's a Black developer and here it's not nine percent it's actually nine times higher odds so that's 900 percent, right, so this is very, very considerable amount of - considerable result here, right. So we know from these results that a - that representation is disproportional to the population of people and that unconscious - unconscious bias may exist. Now it may not be that someone is saying, oh, this person is Asian and therefore I'm going to reject their pull request, or this person is Hispanic I'm going to reject their pull request. But there might be other factors that someone might associate with them, right, the English is not great in their comment or I don't understand, you know, the variable names that they've used, or, you know, or that this person is not as experienced as I thought they were, which is not something that we saw in the first slide - we saw that it - it - only the contribution matters and not so much the other factors, right. So what now? Do we just do, like, an author blinded evaluation in GitHub, just remove the name so that you don't know who it is? I don't think so - I think we can actually use the author names and actively support a diverse group of contributors - contributing in their projects. So know that this person is from a different place or ethnicity, know that they might be a new user, help them get that contribution accepted, don't just ignore it, don't just reject, it even if you're going to reject it please give them constructive feedback so that the next time around they can get their pull request accepted. So with that I'll share some of the papers that we have. This - these papers have more details on the projects that we did and all the specific experimental settings, all our data and scripts are available if you want to take this and run this on your own repositories or compare it you can. The model scripts are also available in the papers and if you want to reach Gema and my email addresses are there and my Twitter handle is also here so thank you.