I was going through Django's source code & found some fascinating details about the loaddata command, which loads initial data into your application using fixtures.
These details are not covered directly in Django's documentation, so I thought of writing this article to explain my findings.
We will start with the very basics & go into more detail gradually. If you have never used it & don't know what a fixture is in Django, let me quote the Django docs here,
A fixture is a collection of data that Django knows how to import into a database
So in simple words, if you have some static data that you want to load into your database, you can write it in a format that Django understands (JSON, XML, or YAML) & save it to a file. You can then call the loaddata command with the filename & it will load the data into the database. That file is known as a fixture.
Where to store these fixture files?
By default, Django looks in the fixtures directory of every installed app, but you can configure additional directories & Django will look in those as well.
You can set FIXTURE_DIRS in settings.py to a list of directories where you want Django to look for fixture files. Read more here
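As a minimal sketch of what that setting could look like: the directory name shared_fixtures is made up for illustration, and BASE_DIR is a stand-in here (a real settings.py usually derives it from the location of the settings file).

```python
from pathlib import Path

# Stand-in for the BASE_DIR that a generated settings.py normally defines.
BASE_DIR = Path.cwd()

# Extra directories Django should search for fixtures, in addition to
# the fixtures directory of each installed app.
# "shared_fixtures" is a hypothetical directory name.
FIXTURE_DIRS = [
    str(BASE_DIR / "shared_fixtures"),
]
```

With a setting like this, a file placed in shared_fixtures can be loaded by name just like a fixture that lives inside an app.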
Our setup.
Before we dive into the details, let's get a basic setup & some assumptions in place. Let's assume we have already initialized a Django project & created an app called testapp. For this example, I will use the SQLite database. I created a simple model to work with, because the data in the fixture files is always loaded as objects of a model. Our model looks like this,
from django.db import models


class Person(models.Model):
    class Meta:
        db_table = 'persons'

    name = models.CharField(max_length=255)
The model has only one field called name, which we will use to store the name of the Person.
After you have created the model, don't forget to run the migration commands like this
python manage.py makemigrations && python manage.py migrate
JSON Format.
The most common way to write a fixture is in JSON format. You can create a folder named fixtures inside your app & name the fixture file whatever you want. Let's create a fixtures directory inside testapp & in it a file named persons.json, which will look like this,
[
    {
        "model": "testapp.person",
        "pk": 1,
        "fields": {
            "name": "John"
        }
    },
    {
        "model": "testapp.person",
        "pk": 2,
        "fields": {
            "name": "Paul"
        }
    }
]
In the JSON file you provide a list of objects, & each object becomes a single row in the database. These objects are mostly self-explanatory. The first key is model, which tells Django which model to use while loading this data into the database. This is the reason we created a model initially. You have to provide the full model name in the form app_label.model_name.
The second key is pk, which is the primary key of the row. You will normally provide pk for every row you want to have in your database.
The third key, fields, takes a JSON object of key-value pairs where each key is a field name in your model & each value is what you want to store in the database.
Now to load this data into our database we just have to fire the command,
python manage.py loaddata persons.json
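If you would rather generate such a file than type it by hand, here is a small sketch using only Python's json module. The build_fixture helper is made up for this example; it simply produces the model / pk / fields structure described above.

```python
import json


def build_fixture(model_label, names):
    """Return fixture entries: one dict per row with model, pk and fields."""
    return [
        {"model": model_label, "pk": pk, "fields": {"name": name}}
        for pk, name in enumerate(names, start=1)
    ]


entries = build_fixture("testapp.person", ["John", "Paul"])

# Serialize to the text you would save as testapp/fixtures/persons.json
fixture_text = json.dumps(entries, indent=4)
print(fixture_text)
```

The printed text has the same shape as the persons.json file shown earlier.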
YAML Format.
The YAML file format is also pretty straightforward, like JSON. You just have to create a YAML file; in our case it will be persons.yaml. The only catch with using YAML for fixtures is that you need to have the PyYAML package installed in your environment, because Django depends on it to read YAML files.
You can install this package using pip like this,
pip install pyyaml
Our YAML file looks something like this,
- model: testapp.person
  pk: 1
  fields: {name: John}
- model: testapp.person
  pk: 2
  fields: {name: Paul}
It is very much the same as JSON, only you have to follow YAML conventions to write the fixture data.
We use the same command to load this YAML file as well; the only thing that changes is the file name,
python manage.py loaddata persons.yaml
Django will figure out that this is a YAML file based on the file extension.
XML Format.
Now, this is interesting as you can find examples of JSON & YAML format in the official documentation but at the time of writing this article, there is no example of the XML format available there. Not many people use XML format these days but in case you have to use it then follow this example.
<?xml version="1.0" encoding="utf-8"?>
<django-objects version="1.0">
    <object model="testapp.person" pk="1">
        <field name="name" type="CharField">John</field>
    </object>
    <object model="testapp.person" pk="2">
        <field name="name" type="CharField">Paul</field>
    </object>
</django-objects>
This is the valid format for writing fixtures in XML. If you don't follow it, Django will not be able to turn the data into model objects. I won't explain the individual attributes again, since they mean the same as in the JSON example.
The only extra effort is adding the type attribute to each field.
Use the same command to load the data, just don't forget the .xml extension in the file name.
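Since writing XML by hand is error-prone, here is a rough sketch that builds the same structure with Python's standard xml.etree.ElementTree module. The build_xml_fixture helper is hypothetical, & it only covers a model with a single CharField called name, like ours.

```python
import xml.etree.ElementTree as ET


def build_xml_fixture(model_label, names):
    """Build a <django-objects> document with one <object> per name."""
    root = ET.Element("django-objects", {"version": "1.0"})
    for pk, name in enumerate(names, start=1):
        obj = ET.SubElement(root, "object", {"model": model_label, "pk": str(pk)})
        field = ET.SubElement(obj, "field", {"name": "name", "type": "CharField"})
        field.text = name
    # Return the document as a string, without the XML declaration line.
    return ET.tostring(root, encoding="unicode")


xml_text = build_xml_fixture("testapp.person", ["John", "Paul"])
print(xml_text)
```

The output is a single line of XML; you would add the XML declaration & save it as persons.xml inside the fixtures directory.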
What if you don't provide the pk field in fixtures?
We have seen that the fixture files above specify the pk field for each entry. It is not a hard requirement: if you do not provide the pk field, a primary key will be generated automatically for you. The problem is that each time you run the command, the data is inserted again & you end up with duplicate entries in the database.
So if you are ok with Django creating duplicate entries then you can surely skip the pk field.
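To see why the pk matters, here is a rough analogy using plain sqlite3 rather than Django itself. It assumes that loading an entry with a pk behaves like "insert or overwrite the row with that key", while an entry without a pk is always inserted as a new row. The load_rows helper & the table layout are made up for this illustration & are not Django's implementation.

```python
import sqlite3


def load_rows(conn, rows):
    """Mimic one loaddata run: overwrite by pk if given, else insert a new row."""
    for row in rows:
        if "pk" in row:
            conn.execute(
                "INSERT OR REPLACE INTO persons (id, name) VALUES (?, ?)",
                (row["pk"], row["name"]),
            )
        else:
            conn.execute("INSERT INTO persons (name) VALUES (?)", (row["name"],))
    conn.commit()


def count_after_two_loads(rows):
    """Load the same rows twice into a fresh database & count the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE persons (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
    )
    load_rows(conn, rows)
    load_rows(conn, rows)
    (count,) = conn.execute("SELECT COUNT(*) FROM persons").fetchone()
    conn.close()
    return count


with_pk = count_after_two_loads([{"pk": 1, "name": "John"}, {"pk": 2, "name": "Paul"}])
without_pk = count_after_two_loads([{"name": "John"}, {"name": "Paul"}])
print(with_pk, without_pk)  # 2 4
```

With explicit keys the second run overwrites the same two rows; without them the table grows on every run.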
You can compress fixture files.
If you have a fixture file that contains many entries & the file is large, you can compress it to reduce its size. Django can still read such files directly, without you decompressing them manually. This is a detail that is easy to miss in the documentation & that I came to know only after I looked into the source code. Django supports the following compression formats,
- gz
- zip
- bz2
- lzma
- xz
You can compress your fixture file in that format & Django will be able to read those files. While compressing the fixture files you have to keep these things in mind.
- The compressed file should not contain multiple files; one fixture file should be compressed into one single compressed file
- You need to follow this file name convention when generating a compressed fixture file: file-name.file-ext.compression-ext, for example persons.json.gz
As you can see, we keep the .json extension in the compressed file's name. This way Django knows that after decompression it will get a JSON file & acts accordingly. If you are using XML files then you name it like this: persons.xml.gz. When loading a compressed fixture file you do not need any extra option, because Django detects the compression format from the file extension: python manage.py loaddata persons.json.gz. The --format option is not meant for this; it specifies the serialization format (such as json) of a fixture read from standard input.
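Here is a small sketch of producing such a file with Python's standard gzip module, following the naming convention above. The compress_fixture helper is made up for this example, & the demo writes a tiny throwaway fixture to a temporary directory.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def compress_fixture(fixture_path):
    """Gzip a fixture next to the original, keeping the serialization extension."""
    source = Path(fixture_path)
    # persons.json -> persons.json.gz
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


# Demo on a throwaway fixture file.
original_text = '[{"model": "testapp.person", "pk": 1, "fields": {"name": "John"}}]'
with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "persons.json"
    fixture.write_text(original_text)
    compressed = compress_fixture(fixture)
    compressed_name = compressed.name
    # Read the compressed file back to confirm nothing was lost.
    with gzip.open(compressed, "rt") as fh:
        round_trip = fh.read()

print(compressed_name)
```

The resulting file holds exactly one fixture & keeps .json in its name, which is what the two rules above ask for.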
There is actually a way to generate compressed fixtures data directly if you have that in your database already. You can read about it here. I will talk about it in some other article.
How to load all the fixture files in one go?
In your project, you could have more than a couple of fixture files, & it can be tedious to type all the file names one by one. Unfortunately, no command or option for this exists out of the box in Django. This is the only reason I was looking into the source code to find the answer. There is one way I could find on StackOverflow to load all the fixtures in one go. You will have to use a shell glob pattern (not a regex) for that, simply use this
python manage.py loaddata **/fixtures/*.json
The shell expands **/fixtures/*.json into the list of matching file paths before the command runs, so loaddata receives them as separate arguments.
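If your shell does not expand the pattern the way you expect, the same idea can be sketched in Python. The find_fixtures & load_all_fixtures helpers are hypothetical names, & the sketch assumes every app keeps its JSON fixtures in a fixtures directory one level below the project root. call_command is Django's function for running management commands from code; it is imported inside the function so the file-finding part works without Django installed.

```python
import tempfile
from pathlib import Path


def find_fixtures(project_root, pattern="*/fixtures/*.json"):
    """Return the sorted paths of all fixture files matching the pattern."""
    return sorted(str(path) for path in Path(project_root).glob(pattern))


def load_all_fixtures(project_root):
    """Pass every fixture found to loaddata in a single call."""
    # Imported here so find_fixtures can be used without Django.
    from django.core.management import call_command

    paths = find_fixtures(project_root)
    if paths:
        call_command("loaddata", *paths)
    return paths


# Demo of the file-finding part on a throwaway project layout.
with tempfile.TemporaryDirectory() as tmp:
    fixtures_dir = Path(tmp) / "testapp" / "fixtures"
    fixtures_dir.mkdir(parents=True)
    (fixtures_dir / "persons.json").write_text("[]")
    (fixtures_dir / "notes.txt").write_text("not a fixture")
    found = [Path(p).name for p in find_fixtures(tmp)]

print(found)  # ['persons.json']
```

load_all_fixtures would have to run inside a configured Django project, for example from a custom management command.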
This is it, hope you have learned something new about fixtures. Feel free to drop your feedback or questions in the comments...