With my Ph.D. program starting this fall, I expect I’ll be doing a lot more programming. I used to program a lot as an undergraduate, but, well, that was a long time ago.
I’ve been teaching myself Python, so I was excited when I learned a colleague was looking for a way to convert an .xml file to a .csv file. They only needed to export one variable (the beta attribute of each PCell element in a PMML file), so the code is specific to that.
Since I’ll probably be coding a lot more, I figured I’d post this bit of code here.
_____
# Written for Python 2 (uses raw_input and the print statement)
import csv
import os
from xml.etree import ElementTree

infile = raw_input("Name of xml file: ")  # ask user for file to convert

# create name of output file: same as input file, replacing .xml with .csv
# (splitext drops only the final extension, so dots elsewhere in the path are kept)
out = os.path.splitext(infile)[0] + ".csv"

# parse input file
with open(infile, 'rt') as f:
    tree = ElementTree.parse(f)

# identify data to export to .csv
out_data = []
out_data.append('beta')  # header column: variable we're interested in
out_data.append('source')  # header column: name of file being converted
for node in tree.iter():  # iterate through .xml file
    if node.tag == "{http://www.dmg.org/PMML-4_1}PCell":  # look for the tag holding the variable we're interested in
        beta = node.attrib.get('beta')  # grab data from variable we're interested in
        out_data.append(beta)  # add data to output
        out_data.append(infile)  # add name of converted file to output

# write .csv file
with open(out, "wb") as out_file:  # binary mode, as Python 2's csv module expects
    csv_writer = csv.writer(out_file)
    # out_data is a flat list (beta, source, beta, source, ...), so write two items per row;
    # the csv writer adds the commas and line breaks
    for i in range(0, len(out_data), 2):
        csv_writer.writerow(out_data[i:i + 2])

print "wrote %s" % out
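
For anyone on Python 3, the same conversion can be sketched as a small reusable function. This is a rough sketch under a few assumptions, not a drop-in replacement for the script above: the function name pcell_betas_to_csv, the PCELL_TAG constant, and the optional csv_path parameter are made up for illustration. It relies on the same PMML 4.1 namespace and the beta attribute on each PCell element.

```python
import csv
import os
from xml.etree import ElementTree

# Namespace-qualified tag the script above looks for (PMML 4.1).
PCELL_TAG = "{http://www.dmg.org/PMML-4_1}PCell"


def pcell_betas_to_csv(xml_path, csv_path=None):
    """Write one (beta, source) row per PCell element and return the CSV path.

    Hypothetical helper: the name and parameters are assumptions for this sketch.
    """
    if csv_path is None:
        # Swap only the final extension for .csv
        csv_path = os.path.splitext(xml_path)[0] + ".csv"

    tree = ElementTree.parse(xml_path)

    # newline="" lets the csv module control line endings in Python 3
    with open(csv_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(["beta", "source"])  # header row
        # Passing a tag to iter() visits only the matching elements
        for node in tree.iter(PCELL_TAG):
            # A missing beta attribute comes back as None and is written as an empty field
            writer.writerow([node.get("beta"), xml_path])

    return csv_path
```

Calling pcell_betas_to_csv on a file named, say, model.xml would write model.csv next to it and return that path. Building each row as a list and handing it to writerow keeps the comma and line-break handling inside the csv module.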