Either one or everyone: Two paths for accountability in data science 👨🏼‍⚖️

Professionals need to hold each other accountable. Especially data scientists.

In data science, this can be difficult. Non-data scientists might struggle to judge your work, and chances are that there are only a few other data scientists in your group, department, or company.

If there is nobody who can judge your work, what keeps you from cheating, slacking, or lying?

There are two paths you can take. A hard one and a scary one.

First, the hard path: Make your body of work a part of your character. Remove the distance between yourself at work and yourself at home. A liar remains a liar, whether the lie comes through deceptive graphs, selective reporting, or made-up stories.

“The only statistics you can trust are those you falsified yourself.” Attributed to Sir Winston Churchill

Let’s work in such a way that we ourselves, at least, dare to trust our own work.

This path gets harder when nobody is looking and the stakes are high. Data science requires more character than people think.

Now, the scary path: Holding yourself accountable is hard, but it is comfy. Attack your own ego by enabling your world to follow your work. Write it down. Give a talk about it. Publish your code. Open source your code.

Take accountability to the limit: Create evidence of what you did or didn’t do. Do you feel good about the paper trail you left?

You had nobody watching you. But now, everybody could be watching you.

Watch your step.