Working With XML and JSON Data in Python (90/100 Days of Python)

Martin Mirakyan
4 min read · Apr 1, 2023

Day 90 of the “100 Days of Python” blog post series covering working with XML and JSON data

XML and JSON are two popular data formats used for data exchange and storage. They are widely used in web development, data processing, and data analysis. Python provides built-in modules to handle XML and JSON data. In this tutorial, we will discuss how to work with XML and JSON data in Python.

Working with XML Data in Python

Python's standard library includes an XML parsing module, xml.etree.ElementTree (usually imported as ET), which lets you work with XML data easily. Let’s see how to parse and manipulate XML data in Python.

Imagine having the following XML data:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <person>
        <name>John Smith</name>
        <age>35</age>
        <address>
            <street>123 Main St</street>
            <city>New York</city>
            <state>NY</state>
            <zip>10001</zip>
        </address>
    </person>
    <person>
        <name>Jane Doe</name>
        <age>27</age>
        <address>
            <street>456 Elm St</street>
            <city>Los Angeles</city>
            <state>CA</state>
            <zip>90001</zip>
        </address>
    </person>
</root>

Parsing XML Data

The first step is to parse the XML data into an ElementTree object. You can parse XML from a file with ET.parse() or from a string with ET.fromstring(). The example below reads a file:

import xml.etree.ElementTree as ET


tree = ET.parse('data.xml') # parse an XML file
root = tree.getroot() # get the root element
print(root.tag) # get the tag name of the root element

In this example, we imported the ElementTree module and parsed an XML file called data.xml. The getroot() method returns the root element of the XML document.
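Parsing from a string works much the same way. The sketch below assumes a shortened copy of the sample document held in a variable called xml_text (a name chosen here for illustration). Note that ET.fromstring() returns the root element directly, so no getroot() call is needed:

```python
import xml.etree.ElementTree as ET

# A shortened copy of the sample document, kept in a string for illustration.
xml_text = """<root>
    <person>
        <name>John Smith</name>
        <age>35</age>
    </person>
    <person>
        <name>Jane Doe</name>
        <age>27</age>
    </person>
</root>"""

# fromstring() returns the root Element, so there is no getroot() step.
string_root = ET.fromstring(xml_text)
print(string_root.tag)   # tag name of the root element
print(len(string_root))  # number of direct children
```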

Accessing XML Data

Once you have the root element, you can access the XML data using ElementTree’s methods:

for child in root:                         # loop over the direct children of the root
    print('attrib:', child.attrib)         # print the attribute dictionary (empty for this data)

for child in root:                         # get the text of an element
    print(child.text)                      # None or whitespace for elements that only contain children

for child in root.findall('person'):       # find elements by tag name
    print(child.find('name').text)         # find the name and print its value

# find people with age > 30
print('---------------------')
for child in root.findall('person'):
    if int(child.find('age').text) > 30:
        print(child.find('name').text)

In this example, we accessed the attributes and text of the elements in the XML document. We also demonstrated how to find elements by tag name and how to filter them by the value of a child element.
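When an element might be missing, find(...).text raises an AttributeError because find() returns None. A hedged alternative is findtext(), which accepts a default value and also understands simple paths such as address/city. The sketch below uses a shortened copy of the sample data; the names people_root, people, and older are chosen for illustration:

```python
import xml.etree.ElementTree as ET

sample = """<root>
    <person>
        <name>John Smith</name>
        <age>35</age>
        <address><city>New York</city></address>
    </person>
    <person>
        <name>Jane Doe</name>
        <age>27</age>
        <address><city>Los Angeles</city></address>
    </person>
</root>"""
people_root = ET.fromstring(sample)

# Turn each <person> element into a plain dictionary.
people = []
for person in people_root.findall('person'):
    people.append({
        'name': person.findtext('name', default=''),
        'age': int(person.findtext('age', default='0')),
        'city': person.findtext('address/city', default=''),
    })

# Filter on the converted values instead of the raw XML text.
older = [p['name'] for p in people if p['age'] > 30]
print(people)
print(older)
```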

Modifying XML Data

You can also modify XML data using ElementTree’s methods:

# add a new person
new_person = ET.Element('person')
person_details = ET.Element('name')
person_details.text = 'Anna'
new_person.append(person_details)
root.append(new_person)

# modify an element
for child in root.findall(".//*[name='John Smith']"):
    child.attrib = {'modified': 'Modified Person Text'}

# remove an element (remove() only works on direct children of root)
for child in root.findall(".//*[name='Jane Doe']"):
    root.remove(child)

In this example, we added a new element, modified an existing element, and removed an element from the XML document.
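As a variation on the steps above, ET.SubElement() creates a child and attaches it to its parent in one call, and Element.set() updates a single attribute without replacing the whole attribute dictionary. This is a minimal sketch on a tiny stand-in document; the variable names and the attribute value are illustrative:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring('<root><person><name>John Smith</name></person></root>')

# SubElement() creates the element and appends it to the parent in one call.
anna = ET.SubElement(doc, 'person')
ET.SubElement(anna, 'name').text = 'Anna'

# set() adds or updates one attribute and keeps any existing ones.
for person in doc.findall("person[name='John Smith']"):
    person.set('modified', 'true')

names = [p.findtext('name') for p in doc.findall('person')]
print(names)
print(doc.find('person').attrib)
```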

Writing XML Data

You can also write the modified XML data back to a file or a string using ElementTree’s methods:

# write the modified XML data to a file
tree.write('modified_data.xml')

# write the modified XML data to a string
xml_string = ET.tostring(root, encoding='unicode')  # without encoding='unicode', tostring() returns bytes
print(xml_string)

In this example, we wrote the modified XML data to a file called modified_data.xml and to a string.
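By default, write() omits the XML declaration and does not re-indent elements. The sketch below shows one way to request both, assuming a small stand-in document. ET.indent() only exists in Python 3.9 and later, so it is guarded here, and an in-memory io.BytesIO buffer stands in for a real file:

```python
import io
import xml.etree.ElementTree as ET

out_root = ET.fromstring('<root><person><name>Anna</name></person></root>')

# ET.indent() was added in Python 3.9; skip it on older versions.
if hasattr(ET, 'indent'):
    ET.indent(out_root)

# encoding='unicode' makes tostring() return str instead of bytes.
as_text = ET.tostring(out_root, encoding='unicode')
print(as_text)

# write() can emit an XML declaration; the buffer stands in for a file.
buffer = io.BytesIO()
ET.ElementTree(out_root).write(buffer, encoding='utf-8', xml_declaration=True)
as_bytes = buffer.getvalue()
print(as_bytes.decode('utf-8'))
```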

So, the final result might look something like the following:

<root>
    <person modified="Modified Person Text">
        <name>John Smith</name>
        <age>35</age>
        <address>
            <street>123 Main St</street>
            <city>New York</city>
            <state>NY</state>
            <zip>10001</zip>
        </address>
    </person>
    <person>
        <name>Anna</name>
    </person>
</root>

Working with JSON Data in Python

Python also provides built-in support for working with JSON data. Let’s see how to parse and manipulate JSON data in Python.

Parsing JSON Data

Python's built-in json module allows you to parse JSON data easily:

import json

# parse a JSON string
json_string = '{"name": "John", "age": 30, "city": "New York"}'
data = json.loads(json_string) # returns a dictionary

In this example, we imported the JSON module and parsed a JSON string using the loads() method. The data variable contains a dictionary object that represents the parsed JSON data.
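JSON documents are often nested, and loads() maps each JSON type to a Python counterpart: arrays become lists, objects become dictionaries, true/false become True/False, and null becomes None. The sketch below uses made-up records that mirror the XML sample, and also shows that malformed input raises json.JSONDecodeError:

```python
import json

# Illustrative records mirroring the XML sample used earlier.
people_json = '''
[
  {"name": "John Smith", "age": 35, "address": {"city": "New York"},
   "active": true, "nickname": null},
  {"name": "Jane Doe", "age": 27, "address": {"city": "Los Angeles"},
   "active": false, "nickname": null}
]
'''

records = json.loads(people_json)  # a JSON array becomes a list
print(type(records).__name__)
print(records[0]['address']['city'])               # nested objects become nested dicts
print(records[0]['active'], records[0]['nickname'])

# Malformed JSON raises json.JSONDecodeError.
try:
    json.loads("{'name': 'John'}")  # single quotes are not valid JSON
except json.JSONDecodeError as error:
    print('invalid JSON:', error.msg)
```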

Accessing JSON Data

Once you have the JSON data as a Python object, you can access it using Python’s dictionary syntax:

print(data['name'])              # access a value by key

for key, value in data.items():  # loop through all key-value pairs
    print(key, value)            # print key and value

In this example, we accessed the values in the dictionary using their keys. We also demonstrated how to loop through all key-value pairs in the dictionary.
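Indexing a key that is not present raises a KeyError, which is easy to hit with JSON from external sources. A small sketch of two common safeguards, get() with a default and the in operator, using the same sample data under an illustrative variable name:

```python
import json

profile = json.loads('{"name": "John", "age": 30, "city": "New York"}')

# Indexing a missing key raises KeyError; get() returns a default instead.
email = profile.get('email', 'unknown')
print(email)

# 'in' checks whether a key is present before indexing.
has_city = 'city' in profile
print(has_city)

# Keys keep the order in which they appeared in the JSON text.
keys = list(profile)
print(keys)
```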

Modifying JSON Data

You can also modify JSON data using Python’s dictionary syntax:

data['email'] = 'john@example.com'  # add a new key-value pair
data['age'] = 31                    # modify a value
del data['city']                    # remove a key-value pair

In this example, we added a new key-value pair, modified an existing value, and removed a key-value pair from the dictionary.
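The same changes can be expressed with dictionary methods. In this sketch (variable names are illustrative), update() applies several changes at once, and pop() removes a key while returning its value, with a default that avoids a KeyError when the key is absent:

```python
import json

record = json.loads('{"name": "John", "age": 30, "city": "New York"}')

# update() adds or overwrites several keys at once.
record.update({'email': 'john@example.com', 'age': 31})

# pop() removes a key and returns its value; the default avoids KeyError.
removed_city = record.pop('city', None)
missing = record.pop('phone', None)

print(record)
print(removed_city, missing)
```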

Writing JSON Data

You can also write the modified JSON data back to a file with json.dump() or to a string with json.dumps():

# write the modified JSON data to a file
with open('modified_data.json', 'w') as f:
    json.dump(data, f)

# write the modified JSON data to a string
json_string = json.dumps(data)
print(json_string)

In this example, we wrote the modified JSON data to a file called modified_data.json using json.dump() and to a string using json.dumps().
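Both functions accept formatting options. The sketch below, using an illustrative payload, compares the default compact output with a more readable form: indent pretty-prints, sort_keys orders the keys alphabetically, and ensure_ascii=False keeps non-ASCII characters as they are instead of escaping them:

```python
import json

# Illustrative data containing a non-ASCII character.
payload = {'name': 'José', 'age': 31, 'email': 'jose@example.com'}

# Default output: one line, non-ASCII characters escaped.
compact = json.dumps(payload)
print(compact)

# Readable output: indented, keys sorted, non-ASCII characters preserved.
pretty = json.dumps(payload, indent=2, sort_keys=True, ensure_ascii=False)
print(pretty)

# Formatting does not change the data: parsing it back gives the same dict.
restored = json.loads(pretty)
print(restored == payload)
```

When writing non-ASCII output to a file with ensure_ascii=False, it is usually safer to open the file with an explicit encoding such as encoding='utf-8'.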

What’s next?

Written by Martin Mirakyan

Software Engineer | Machine Learning | Founder of Profound Academy (https://profound.academy)