
How to Solve Python JSONDecodeError: extra data


If you pass the contents of a JSON file to json.loads() and the file holds multiple records that are not contained in an array, Python raises JSONDecodeError: Extra data, which older code may report as ValueError: extra data. The json.loads() function cannot decode more than one top-level record at once.

You can solve this error by reformatting your JSON file to contain an array or by reading the JSON file line by line, for example:

data = [json.loads(line) for line in open('extra.json','r')]

This tutorial will go through the error in detail and how to solve it with code examples.


JSONDecodeError: extra data

In Python, JSONDecodeError occurs when there is an issue with the formatting of the JSON data. In this specific case, the JSON file contains multiple JSON documents one after another. The json.loads() function parses exactly one JSON document, so anything that follows the first document is reported as extra data.

ValueError: extra data

Python developers have also encountered the error as a ValueError: extra data. This is the same problem under a different name: JSONDecodeError is a subclass of ValueError, and Python versions before 3.5 raised a plain ValueError. In Python, a value is a piece of information stored within a particular object. We encounter a ValueError when a built-in operation or function receives an argument that is the right type but an inappropriate value. Here the argument is the right type, a string of JSON, but it contains multiple JSON documents that are not inside an array, which json.loads() cannot accept.
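The relationship between the two exception names can be checked directly. The following is a minimal sketch, assuming Python 3.5 or later; the two-record string is a made-up stand-in for the file contents.

```python
import json

# Two JSON objects separated by a newline (illustrative data, not the sample file).
text = '{"a": 1}\n{"b": 2}'

try:
    json.loads(text)
except ValueError as err:  # also catches json.JSONDecodeError
    caught = err

print(type(caught).__name__)
print(issubclass(json.JSONDecodeError, ValueError))
# The exception records where the unexpected data starts.
print(caught.msg, caught.lineno, caught.colno, caught.pos)
```

Catching ValueError therefore keeps working on both old and new Python versions, while catching json.JSONDecodeError gives access to the msg, lineno, colno and pos attributes.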

Example

Let’s look at an example where we want to read JSON data into a program using json.loads(). First, let’s look at the JSON data, which contains information about five different pizzas.

{"pizza":"margherita", "price":7.99, "Details":"Contains cheese. Suitable for vegetarians"}
{"pizza":"pepperoni", "price":9.99, "Details":"Contains meat. Not suitable for vegetarians"}
{"pizza":"marinara", "price":6.99, "Details":"Dairy free. Suitable for vegetarians."}
{"pizza":"four cheese", "price":10.99, "Details":"Contains cheese. Suitable for vegetarians"}
{"pizza":"hawaiian", "price":9.99, "Details":"Contains meat. Not suitable for vegetarians"}         

Next, we will attempt to load the data into a Python object using json.loads():

import json

# Read the whole file and try to parse it as a single JSON document
with open('sample.json', 'r') as fi:
    pizzaJson = json.loads(fi.read())
print(pizzaJson)

Let’s run the code to see the result:

JSONDecodeError: Extra data: line 2 column 1 (char 92)

Our code raises the JSONDecodeError because the file contains more than one top-level JSON record. The json.loads() function parses the first record successfully and then finds more data starting at line 2, column 1, which it cannot treat as part of the same document.
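To see why the parser complains, it can help to look at what is left over after the first record. The sketch below uses json.JSONDecoder.raw_decode(), which returns the parsed object together with the index where parsing stopped. The two-record string is a shortened, hypothetical version of the sample data.

```python
import json

# Shortened stand-in for the first two lines of the sample file.
text = '{"pizza": "margherita"}\n{"pizza": "pepperoni"}'

decoder = json.JSONDecoder()
# raw_decode parses one document and reports where it ended.
first, end = decoder.raw_decode(text)

print(first)
# Everything after the first document is what json.loads() calls "extra data".
print(repr(text[end:]))
```

The first record is valid on its own. The error comes only from the text that remains after it.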

Solution #1: Reformat the JSON file

We can reformat the JSON file so that the records are in an array stored under the key pizzas, which makes the whole file a single JSON object. Let’s look at the revised JSON file:

{"pizzas":
   [
      {"pizza":"margherita", "price":7.99, "Details":"Contains cheese. Suitable for vegetarians"},
      {"pizza":"pepperoni", "price":9.99, "Details":"Contains meat. Not suitable for vegetarians"},
      {"pizza":"marinara", "price":6.99, "Details":"Dairy free. Suitable for vegetarians."},
      {"pizza":"four cheese", "price":10.99, "Details":"Contains cheese. Suitable for vegetarians"},
      {"pizza":"hawaiian", "price":9.99, "Details":"Contains meat. Not suitable for vegetarians"}
   ]
}

The loading code from the example does not need to change. We only add a call to print the type of the result.

import json

# The file is now one JSON object, so a single json.loads() call succeeds
with open('sample.json', 'r') as fi:
    pizzaJson = json.loads(fi.read())
print(pizzaJson)
print(type(pizzaJson))

Let’s run the code to see the result:

{'pizzas': [{'pizza': 'margherita', 'price': 7.99, 'Details': 'Contains cheese. Suitable for vegetarians'}, {'pizza': 'pepperoni', 'price': 9.99, 'Details': 'Contains meat. Not suitable for vegetarians'}, {'pizza': 'marinara', 'price': 6.99, 'Details': 'Dairy free. Suitable for vegetarians.'}, {'pizza': 'four cheese', 'price': 10.99, 'Details': 'Contains cheese. Suitable for vegetarians'}, {'pizza': 'hawaiian', 'price': 9.99, 'Details': 'Contains meat. Not suitable for vegetarians'}]}
<class 'dict'>

We successfully loaded the JSON data into a Python dictionary. If we want to access the individual records we can use the key pizzas with the pizzaJson dictionary.

records = pizzaJson['pizzas']

for pizza in records:
    print(pizza)

Let’s run the code to see the result:

{'pizza': 'margherita', 'price': 7.99, 'Details': 'Contains cheese. Suitable for vegetarians'}
{'pizza': 'pepperoni', 'price': 9.99, 'Details': 'Contains meat. Not suitable for vegetarians'}
{'pizza': 'marinara', 'price': 6.99, 'Details': 'Dairy free. Suitable for vegetarians.'}
{'pizza': 'four cheese', 'price': 10.99, 'Details': 'Contains cheese. Suitable for vegetarians'}
{'pizza': 'hawaiian', 'price': 9.99, 'Details': 'Contains meat. Not suitable for vegetarians'}
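Reformatting a large file by hand is tedious, so the conversion can also be scripted. The following is a rough sketch, not part of the original example: the helper name wrap_records, the output file name and the two shortened records are all hypothetical, and it assumes each record sits on its own line.

```python
import json
import tempfile
from pathlib import Path

def wrap_records(src, dst, key="pizzas"):
    """Read one JSON object per line from src and write {key: [...]} to dst."""
    with open(src, "r") as f:
        # Skip blank lines so a trailing newline does not break parsing.
        records = [json.loads(line) for line in f if line.strip()]
    with open(dst, "w") as f:
        json.dump({key: records}, f, indent=2)
    return len(records)

# Demo on a throwaway directory so the sketch runs anywhere.
workdir = Path(tempfile.mkdtemp())
src = workdir / "sample.json"
dst = workdir / "sample_wrapped.json"
src.write_text(
    '{"pizza":"margherita", "price":7.99}\n'
    '{"pizza":"pepperoni", "price":9.99}\n'
)

count = wrap_records(src, dst)

# The wrapped file is a single JSON object, so json.loads() accepts it.
with open(dst, "r") as f:
    wrapped = json.loads(f.read())
print(count, wrapped["pizzas"][0]["pizza"])
```

Writing to a separate output file leaves the original data untouched in case the conversion needs to be repeated.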

Solution #2: Use List Comprehension with json.loads()

The second way we can solve this error is to read the JSON file line by line and pass the JSON string on each line to the json.loads() method. This works when every record sits on its own line. The JSON file is in the original format:

{"pizza":"margherita", "price":7.99, "Details":"Contains cheese. Suitable for vegetarians"}
{"pizza":"pepperoni", "price":9.99, "Details":"Contains meat. Not suitable for vegetarians"}
{"pizza":"marinara", "price":6.99, "Details":"Dairy free. Suitable for vegetarians."}
{"pizza":"four cheese", "price":10.99, "Details":"Contains cheese. Suitable for vegetarians"}
{"pizza":"hawaiian", "price":9.99, "Details":"Contains meat. Not suitable for vegetarians"}         

We can write the command in one line of code using list comprehension. Let’s look at the revised code:

import json

# Parse each line as its own JSON document; the with block closes the file
with open('sample.json', 'r') as fi:
    pizzaJson = [json.loads(line) for line in fi]

print(pizzaJson)
print(type(pizzaJson))

Let’s run the code to get the result:

[{'pizza': 'margherita', 'price': 7.99, 'Details': 'Contains cheese. Suitable for vegetarians'}, {'pizza': 'pepperoni', 'price': 9.99, 'Details': 'Contains meat. Not suitable for vegetarians'}, {'pizza': 'marinara', 'price': 6.99, 'Details': 'Dairy free. Suitable for vegetarians.'}, {'pizza': 'four cheese', 'price': 10.99, 'Details': 'Contains cheese. Suitable for vegetarians'}, {'pizza': 'hawaiian', 'price': 9.99, 'Details': 'Contains meat. Not suitable for vegetarians'}]
<class 'list'>

We successfully loaded the JSON strings into a list.
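The one-line version assumes every line holds exactly one valid record. A blank line in the file would make json.loads() fail with an "Expecting value" error. The sketch below is one possible, slightly more defensive variant. The helper name load_json_lines and the sample lines are hypothetical, and it takes any iterable of lines so it can be tried without a file.

```python
import json

def load_json_lines(lines):
    """Parse an iterable of text lines, one JSON document per line."""
    records = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of failing on them
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as err:
            # Report which line of the input is malformed.
            raise ValueError(f"line {number}: {err.msg}") from err
    return records

# Shortened stand-in for the sample file, including a blank line.
sample = [
    '{"pizza":"margherita", "price":7.99}\n',
    '\n',
    '{"pizza":"pepperoni", "price":9.99}   \n',
]
pizzas = load_json_lines(sample)
print(len(pizzas))
```

With a real file, the same function can be called as load_json_lines(fi) inside a with open(...) block, since a file object iterates over its lines.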

Summary

Congratulations on reading to the end of this tutorial! The JSONDecodeError: Extra data occurs when the JSON file contains multiple JSON documents that are not wrapped in a single top-level value. If you have multiple records, they should be in an array, for example as the value of an appropriate key in a dictionary. Otherwise, if each record is on its own line, you can read the file line by line using a for loop or list comprehension and pass each line to json.loads().


Have fun and happy researching!

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.