Gender Bias in Open Source

I was trolling for something to write about today when I ran across this article click-baitingly titled “Women are better at coding than men – if they hide their gender.”

The article reports on an interesting, recently released study of Gender Bias in Open Source which looks at “acceptance rates of contributions from men versus women” on GitHub – an online community where users share, collaborate on, and review code in a variety of programming languages.

The study found that “women’s contributions tend to be accepted more often than men’s. However, when a woman’s gender is identifiable, they are rejected more often. Our results suggest that although women on GitHub may be more competent overall, bias against them exists nonetheless.”

This is troubling.

Interestingly, the article I found the study through takes these findings as a sign that women are better at coding than men – even adding the titillating header “the future really is female.”

Of course, that’s not an entirely accurate reading of the study. (To be fair, I imagine that the article’s title and quaintly 1950s header image were not selected by the author.)

As the study’s authors themselves explain, there are many reasons why their analysis may have found women, on average, to be better coders. A key explanation may be what is known as survivorship bias: “as women continue their formal and informal education in computer science, the less competent ones may change fields or otherwise drop out. Then, only more competent women remain by the time they begin to contribute to open source. In contrast, less competent men may continue.”

That is, there’s no secret coding gene that makes women better programmers – rather, it is much harder for a woman to survive in the coding world, and therefore those who do are the best.

This explanation resonates with research done in other fields, and is underscored by a 2013 survey finding that only 11% of open source developers are female.

With that ratio, it would be rather surprising if the average woman did resemble the average man.
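The selection effect is easy to see in a toy simulation. This is only a sketch with made-up numbers, not anything taken from the study: two groups draw “skill” from the exact same distribution, but one group faces a harsher dropout filter. The function name, the thresholds, and the group sizes are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def surviving_skill(n, dropout_threshold):
    """Draw n skill scores from one shared distribution; keep those above the threshold."""
    scores = [random.gauss(0.0, 1.0) for _ in range(n)]
    return [s for s in scores if s > dropout_threshold]


# Hypothetical thresholds: group B faces a much harsher filter than group A
group_a = surviving_skill(100000, dropout_threshold=-1.0)
group_b = surviving_skill(100000, dropout_threshold=0.5)

mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)

print("group A: %d survivors, mean skill %.2f" % (len(group_a), mean_a))
print("group B: %d survivors, mean skill %.2f" % (len(group_b), mean_b))
```

Both groups start out identical. The smaller group of survivors ends up with the higher average only because more of its members were filtered out along the way.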

The ironic thing is that attention-grabbing headlines declaring women better coders – while seemingly feminist in nature – have the unfortunate effect of obfuscating the real barriers to gender parity.

Women aren’t better coders; the women who are allowed to survive as coders are by necessity only the best. They are held to higher standards and constantly forced to the sidelines. In order to simply do the work they love, they are forced – in the words of one study referencing StackOverflow – to participate in a “relatively ‘unhealthy’ community.”

It’s hardly a wonder that women tend to “disengage sooner.”


XML to CSV

With my Ph.D. program starting this fall, I expect I’ll be doing a lot more programming. I used to program a lot as an undergraduate, but, well, that was a long time ago.

I’ve been teaching myself Python, so I was excited when I learned a colleague was looking for a way to convert an .xml file to a .csv file. There was just one specific variable they were looking to export into .csv format, so the code is specific to that.

Since I’ll probably be coding a lot more, I figured I’d post this bit of code here.

_____

import csv
import os
from xml.etree import ElementTree

# namespaced tag holding the variable we're interested in
PCELL_TAG = "{http://www.dmg.org/PMML-4_1}PCell"

infile = input("Name of xml file:  ")  # ask user for file to convert

# create name of output file, same as input file replacing .xml with .csv
out = os.path.splitext(infile)[0] + ".csv"

# parse input file
tree = ElementTree.parse(infile)

# identify data to export to .csv, one [beta, source] pair per row
# header row: variable we're interested in, name of file being converted
out_data = [["beta", "source"]]

for node in tree.iter():  # iterate through .xml file
    if node.tag == PCELL_TAG:  # look for the tag holding the variable we're interested in
        beta = node.attrib.get("beta")  # grab data from variable we're interested in
        out_data.append([beta, infile])  # add data and name of converted file to output

# write .csv file; csv.writer puts the commas and line breaks in the correct places
with open(out, "w", newline="") as out_file:
    csv_writer = csv.writer(out_file)
    csv_writer.writerows(out_data)

print("wrote %s" % out)
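To sanity-check the logic without a real file on disk, here’s a small Python 3 sketch that runs the same extraction on an in-memory string. The sample XML, the file name sample.xml, and the helper names extract_rows and rows_to_csv are all made up for illustration. Only the PCell tag and the beta attribute come from the script above.

```python
import csv
import io
from xml.etree import ElementTree

# Same namespaced tag the script looks for
PCELL_TAG = "{http://www.dmg.org/PMML-4_1}PCell"

# Made-up sample input, for illustration only
SAMPLE_XML = """<PMML xmlns="http://www.dmg.org/PMML-4_1" version="4.1">
  <GeneralRegressionModel>
    <ParamMatrix>
      <PCell parameterName="p0" beta="0.5"/>
      <PCell parameterName="p1" beta="-1.25"/>
    </ParamMatrix>
  </GeneralRegressionModel>
</PMML>"""


def extract_rows(xml_text, source):
    """Return a header row plus one [beta, source] row per PCell element."""
    root = ElementTree.fromstring(xml_text)
    rows = [["beta", "source"]]
    for node in root.iter(PCELL_TAG):
        rows.append([node.attrib.get("beta"), source])
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text using the csv module."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(extract_rows(SAMPLE_XML, "sample.xml"))
print(csv_text)
```

Splitting the extraction from the writing makes each half easy to test on its own, and letting csv.writer handle the rows avoids placing commas and line breaks by hand.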
