Data & Input/Output

It’s probably unsurprising to say that we run on data. We’re scientists; data is what we do. A big part of this is data presentation, but we also need to develop tricks and strategies for managing the raw data we’re working with, both loading it into our code and writing out our results.

When you’re prototyping a new method or just getting started, there’s a decent chance you’re copying and pasting. As you start to do serious work, though, copy-paste becomes both cumbersome and error-prone, and the errors are the bigger issue.

We all learned how to work with data the hard way. Our hope is that we can give you some tips and tricks so that you hit fewer pitfalls than we did.

Here’s the roadmap:

Now that you have the basics down, let’s dig into some particular cases:

Or ignore us and listen to MolSSI instead (spoiler alert: they agree that data management is key):

