Working With XML and JSON Data in Python (90/100 Days of Python)
XML and JSON are two popular data formats used for data exchange and storage. They are widely used in web development, data processing, and data analysis. Python provides built-in modules to handle XML and JSON data. In this tutorial, we will discuss how to work with XML and JSON data in Python.
Working with XML Data in Python
Python’s standard library includes the xml.etree.ElementTree module, which lets you parse, query, and modify XML data. Let’s see how to parse and manipulate XML data in Python.
Imagine having the following XML data:
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <person>
        <name>John Smith</name>
        <age>35</age>
        <address>
            <street>123 Main St</street>
            <city>New York</city>
            <state>NY</state>
            <zip>10001</zip>
        </address>
    </person>
    <person>
        <name>Jane Doe</name>
        <age>27</age>
        <address>
            <street>456 Elm St</street>
            <city>Los Angeles</city>
            <state>CA</state>
            <zip>90001</zip>
        </address>
    </person>
</root>
Parsing XML Data
The first step is to parse the XML data into an ElementTree object. You can parse XML data from a file or a string:
import xml.etree.ElementTree as ET
tree = ET.parse('data.xml') # parse an XML file
root = tree.getroot() # get the root element
print(root.tag) # get the tag name of the root element
In this example, we imported the ElementTree module and parsed an XML file called data.xml. The getroot() method returns the root element of the XML document.
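When the XML is already in memory as a string rather than in a file, ET.fromstring() can be used instead. The sketch below assumes a shortened copy of the sample data held in a variable named xml_text (a name chosen only for illustration):

```python
import xml.etree.ElementTree as ET

# A shortened copy of the sample document, kept in a string for illustration
xml_text = """<root>
    <person>
        <name>John Smith</name>
        <age>35</age>
    </person>
</root>"""

root = ET.fromstring(xml_text)  # returns the root Element, not an ElementTree
print(root.tag)                 # tag name of the root element
print(len(root))                # number of direct children of the root
```

Note that fromstring() hands back the root element directly, so there is no getroot() call in this case.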
Accessing XML Data
Once you have the root element, you can access the XML data using ElementTree’s methods:
for child in root:  # loop over the direct children of the root
    print('attrib:', child.attrib)  # attribute dictionary (empty here, no attributes are set)

for child in root:
    print(child.text)  # text directly inside the element (only whitespace here), or None

for child in root.findall('person'):  # find direct children by tag name
    print(child.find('name').text)  # text of each person's <name> element

# find people with age > 30
print('---------------------')
for child in root.findall('person'):
    if int(child.find('age').text) > 30:  # element text is a string, so convert it first
        print(child.find('name').text)
In this example, we accessed the attributes and text of the elements in the XML document. We also showed how to find elements by tag name and how to filter them by the text of a child element.
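Nested elements such as the city inside each address can be reached with a path. The following is a small sketch that assumes a trimmed copy of the sample data; the variable names are illustrative only:

```python
import xml.etree.ElementTree as ET

xml_text = """<root>
    <person>
        <name>John Smith</name>
        <age>35</age>
        <address><city>New York</city><state>NY</state></address>
    </person>
    <person>
        <name>Jane Doe</name>
        <age>27</age>
        <address><city>Los Angeles</city><state>CA</state></address>
    </person>
</root>"""
root = ET.fromstring(xml_text)

# findtext() takes a path and returns the element's text directly;
# the default value is returned when no matching element exists
cities = {}
for person in root.findall("person"):
    name = person.findtext("name")
    cities[name] = person.findtext("address/city", default="unknown")
print(cities)

# iter() walks the whole subtree, so nested tags are found at any depth
states = [state.text for state in root.iter("state")]
print(states)
```

Using findtext() with a default avoids the AttributeError that find(...).text raises when the element is missing.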
Modifying XML Data
You can also modify XML data using ElementTree’s methods:
# add a new person
new_person = ET.Element('person')
person_details = ET.Element('name')
person_details.text = 'Anna'
new_person.append(person_details)
root.append(new_person)
# modify an element: set an attribute on the person named John Smith
for child in root.findall("person[name='John Smith']"):
    child.set('modified', 'Modified Person Text')
# remove an element: findall() returns a list, so removing inside the loop is safe
for child in root.findall("person[name='Jane Doe']"):
    root.remove(child)  # remove() only works on direct children of root
In this example, we added a new element, modified an existing element, and removed an element from the XML document.
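The same kinds of changes can be written more compactly with ET.SubElement() and Element.set(). This is only a sketch on a one-person document; the age of 41 and the attribute value "yes" are made-up illustration values:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><person><name>John Smith</name><age>35</age></person></root>"
)

# SubElement() creates the child and attaches it to the parent in one call
anna = ET.SubElement(root, "person")
ET.SubElement(anna, "name").text = "Anna"
ET.SubElement(anna, "age").text = "41"  # element text must be a string

# set() adds or updates a single attribute without replacing the others
john = root.find("person[name='John Smith']")
john.set("modified", "yes")

# element text is a string, so convert before doing arithmetic
age = john.find("age")
age.text = str(int(age.text) + 1)

names = [p.findtext("name") for p in root.findall("person")]
print(names)
print(john.attrib, age.text)
```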
Writing XML Data
You can also write the modified XML data back to a file or a string using ElementTree’s methods:
# write the modified XML data to a file
tree.write('modified_data.xml')
# write the modified XML data to a string
xml_string = ET.tostring(root, encoding='unicode')  # str; the default encoding returns bytes
print(xml_string)
In this example, we wrote the modified XML data to a file called modified_data.xml and to a string.
So, the final result might look something like the following:
<root>
    <person modified="Modified Person Text">
        <name>John Smith</name>
        <age>35</age>
        <address>
            <street>123 Main St</street>
            <city>New York</city>
            <state>NY</state>
            <zip>10001</zip>
        </address>
    </person>
    <person>
        <name>Anna</name>
    </person>
</root>
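ElementTree does not indent newly added elements on its own. One way to get evenly indented output is ET.indent(), which is available from Python 3.9 onward. The sketch below uses an in-memory buffer in place of a real file so that it runs without touching the disk:

```python
import io
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><person><name>Anna</name></person></root>")

# indent() rewrites the whitespace in place (requires Python 3.9 or newer)
ET.indent(root, space="    ")
xml_text = ET.tostring(root, encoding="unicode")  # str instead of bytes
print(xml_text)

# write() can also emit the XML declaration; a BytesIO buffer stands in for a file here
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
print(buffer.getvalue().decode("utf-8"))
```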
Working with JSON Data in Python
Python also provides built-in support for working with JSON data. Let’s see how to parse and manipulate JSON data in Python.
Parsing JSON Data
The standard library’s json module allows you to parse JSON data easily:
import json
# parse a JSON string
json_string = '{"name": "John", "age": 30, "city": "New York"}'
data = json.loads(json_string) # returns a dictionary
In this example, we imported the json module and parsed a JSON string using the loads() function. The data variable contains a dictionary that represents the parsed JSON data.
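JSON values other than objects are converted to Python types as well, and malformed input raises json.JSONDecodeError. Here is a short sketch with a made-up JSON string to show both:

```python
import json

# Illustrative input covering the main JSON value types
json_string = '{"name": "John", "tags": ["a", "b"], "active": true, "manager": null, "score": 9.5}'
data = json.loads(json_string)

# Map each key to the name of the Python type it was converted to
types = {key: type(value).__name__ for key, value in data.items()}
print(types)

# Invalid JSON raises JSONDecodeError (a subclass of ValueError)
error = None
try:
    json.loads("{'name': 'John'}")  # single quotes are not valid JSON
except json.JSONDecodeError as exc:
    error = exc
    print("Invalid JSON:", exc.msg)
```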
Accessing JSON Data
Once you have the JSON data as a Python object, you can access it using Python’s dictionary syntax:
print(data['name']) # access a value by key
for key, value in data.items(): # loop through all key-value pairs
    print(key, value) # print key and value
In this example, we accessed the values in the dictionary using their keys. We also demonstrated how to loop through all key-value pairs in the dictionary.
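Real JSON documents are often nested. The sketch below assumes a slightly richer, made-up record to show chained indexing and dict.get() for keys that may be absent:

```python
import json

# Hypothetical record with a nested object and a list
data = json.loads(
    '{"name": "John", "address": {"city": "New York"}, "phones": ["555-0100", "555-0101"]}'
)

city = data["address"]["city"]              # nested object: chain the keys
first_phone = data["phones"][0]             # JSON arrays become Python lists
email = data.get("email", "not provided")   # get() avoids a KeyError for missing keys

print(city, first_phone, email)
```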
Modifying JSON Data
You can also modify JSON data using Python’s dictionary syntax:
data['email'] = 'john@example.com' # add a new key-value pair
data['age'] = 31 # modify a value
del data['city'] # remove a key-value pair
In this example, we added a new key-value pair, modified an existing value, and removed a key-value pair from the dictionary.
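Because the parsed data is an ordinary dictionary, other dictionary and list methods work too. A brief sketch, with a made-up skills list added for illustration:

```python
import json

data = json.loads('{"name": "John", "age": 30, "skills": ["python"]}')

data.update({"age": 31, "email": "john@example.com"})  # change and add several keys at once
data["skills"].append("xml")                           # nested lists are modified in place
removed = data.pop("city", None)                       # pop() with a default never raises KeyError

print(data)
print(removed)
```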
Writing JSON Data
You can also write the modified JSON data back to a file with json.dump() or to a string with json.dumps():
# write the modified JSON data to a file
with open('modified_data.json', 'w') as f:
    json.dump(data, f)
# write the modified JSON data to a string
json_string = json.dumps(data)
print(json_string)
In this example, we wrote the modified JSON data to a file called modified_data.json using json.dump() and to a string using json.dumps().
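Both functions accept formatting options. The sketch below shows three commonly used ones (indent, sort_keys, and ensure_ascii) on a made-up record, then parses the result back to confirm that nothing was lost:

```python
import json

# Hypothetical record with a non-ASCII character in the name
data = {"name": "Zoë", "age": 31, "email": "zoe@example.com"}

pretty = json.dumps(
    data,
    indent=2,            # one key per line, indented by two spaces
    sort_keys=True,      # stable key order, useful for diffs
    ensure_ascii=False,  # keep non-ASCII characters instead of \u escapes
)
print(pretty)

# Round trip: parsing the string gives back an equal dictionary
restored = json.loads(pretty)
print(restored == data)
```

When writing such output to a file with json.dump(), opening the file with encoding='utf-8' keeps the non-ASCII characters intact regardless of the platform default.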