Python-Specific File Types

Okay, so by now you have probably figured out that we are all pretty big fans of Python, so it shouldn’t be a big surprise that we pass along, save, and archive data in Python-specific files. There are two types that we commonly use: *.npy and *.npz.

Note: these types of files are meant to be written and read with NumPy. If you want to exchange data between languages, look into using a standard text format like JSON.
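As a rough sketch of the JSON route mentioned in the note (the array values and the "energies" key are made up for illustration), the main extra step is converting the array to plain lists first:

```python
import json

import numpy as np

# Made-up values standing in for real data
energies = np.array([0.5, 1.5, 2.5])

# JSON has no array type, so convert the array to a plain list first
payload = {"energies": energies.tolist()}
text = json.dumps(payload)

# Any language with a JSON parser can read `text`; here we read it back in Python
recovered = np.array(json.loads(text)["energies"])
print(np.allclose(energies, recovered))
```

Unlike a *.npy file, the JSON text does not record the dtype, so whoever reads it has to decide how to rebuild the array.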

.npy Files

A *.npy is a file extension specifically for storing a single NumPy array. This file type saves all the information required to reconstruct the same NumPy array on any computer, i.e. it records the dtype and shape of the array.
This comes in handy particularly when you are passing data around between group members or have worked up a large set of data and want to save it for later.

How to Use:

  • To create one of these files, use the np.save() command. Read about it in the NumPy documentation for np.save
  • To read one of these files, use the np.load() command. Read more in the NumPy documentation for np.load
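As a minimal sketch of that save/load round trip (the file name and array contents are made up for illustration, and a temporary directory is used so nothing is left behind):

```python
import os
import tempfile

import numpy as np

# A small made-up array; its dtype and shape are stored in the file
original = np.arange(12, dtype=np.float32).reshape(3, 4)

# Write into a temporary directory so the sketch leaves nothing in your working folder
tmp_dir = tempfile.mkdtemp()
npy_path = os.path.join(tmp_dir, "example_array.npy")

np.save(npy_path, original)   # write the array to disk
restored = np.load(npy_path)  # read it back

print(restored.dtype, restored.shape)
print(np.array_equal(original, restored))
```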

.npz Files

A *.npz is a file extension for storing multiple NumPy arrays or variable values. With these files, you can store multiple arrays that don’t necessarily have to have the same shape. This comes in handy when you need to send/save data, parameters, results, etc. all together. These files are essentially “zipped” collections of .npy files, where each .npy inside the archive is named after the key you saved that array under.

How to Use:

  • To create one of these files, use the np.savez() command. Read about it in the NumPy documentation for np.savez
  • To read one of these files, use the same np.load() command as you would for a *.npy. Read more in the NumPy documentation for np.load
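Here is a small sketch of those two calls with differently shaped arrays (the names positions, matrix, and example_bundle.npz are made up for illustration). It also opens the file with the standard zipfile module to show the one-.npy-per-key layout:

```python
import os
import tempfile
import zipfile

import numpy as np

# Two made-up arrays with different shapes
positions = np.linspace(-1.0, 1.0, 5)  # shape (5,)
matrix = np.eye(3)                     # shape (3, 3)

tmp_dir = tempfile.mkdtemp()
npz_path = os.path.join(tmp_dir, "example_bundle.npz")

# Each keyword argument becomes a key in the file
np.savez(npz_path, positions=positions, matrix=matrix)

# An .npz is a zip archive holding one .npy file per key
with zipfile.ZipFile(npz_path) as archive:
    member_names = sorted(archive.namelist())
print(member_names)

# np.load returns a dict-like object; index it by key to get each array
with np.load(npz_path) as bundle:
    loaded_positions = bundle["positions"]
    loaded_matrix = bundle["matrix"]
print(loaded_positions.shape, loaded_matrix.shape)
```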

An Example

If I wanted to store the results of a DVR calculation along with the parameters to make it easily reproducible, I would do something like:

import numpy as np

grid = ... # type: np.ndarray
wfns = ... # type: np.ndarray
energies = ... # type: np.ndarray
params = ... # type: dict

# each keyword becomes a key in the file; the dict is stored as a pickled object array
np.savez("DVRSpectrum_harmonic.npz", grid=grid, energies=energies, wavefunctions=wfns, params=params)

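Alternatively, if I received the file DVRSpectrum_harmonic.npz and wanted to get the data back out, I’d load it in like

data = np.load("DVRSpectrum_harmonic.npz", allow_pickle=True)

(allow_pickle=True is needed here because params is a dict, which NumPy stores as a pickled object; only use it on files you trust.) Then, were I not sure what was in the file, I could check the keys similarly to a dict by

>>> print(list(data.keys()))
['grid', 'energies', 'wavefunctions', 'params']

and so if I just wanted the DVR parameters I’d access them like

>>> params_dict = data["params"].item()  # .item() unwraps the 0-d object array into the dict
>>> params_dict
{...}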
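For a version of this example that runs end to end, here is a sketch with placeholder data. The array values, the contents of params, and the file name DVRSpectrum_example.npz are all made up and do not come from a real DVR calculation:

```python
import os
import tempfile

import numpy as np

# Placeholder data standing in for real DVR output (all values are made up)
grid = np.linspace(-5.0, 5.0, 11)
energies = np.array([0.5, 1.5, 2.5])
wfns = np.zeros((11, 3))
params = {"potential": "harmonic", "num_points": 11}

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "DVRSpectrum_example.npz")
np.savez(path, grid=grid, energies=energies, wavefunctions=wfns, params=params)

# The dict is pickled into a 0-d object array, so reading it needs allow_pickle=True
with np.load(path, allow_pickle=True) as data:
    keys = data.files                    # list of the keys stored in the file
    params_dict = data["params"].item()  # unwrap the 0-d array back into a dict
    loaded_grid = data["grid"]

print(keys)
print(params_dict)
```

Plain numeric arrays such as grid load fine without allow_pickle; it is only the dict that needs it.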


