More than half of the digital data firms generate is collected, processed and stored for single-use purposes. Often, it is never reused. This could be the multiple near-identical images you hold on Google Photos or iCloud, a business’s outdated spreadsheets that will never be used again, or data from internet of things sensors that serves no purpose.
This “dark data” is anchored to the real world by the energy it requires. Even data that is stored and never used again takes up space on servers—typically huge banks of computers in warehouses. Those computers and those warehouses all use lots of electricity.
This is a significant energy cost that is hidden in most organizations. Maintaining an effective organizational memory is a challenge, but at what cost to the environment?
In the drive towards net zero, many organizations are trying to reduce their carbon footprints. Guidance has generally centered on reducing traditional sources of carbon production, through mechanisms such as carbon offsetting via third parties (planting trees to make up for emissions from using petrol, for instance).
A digital carbon footprint
While most climate change activists are focused on limiting emissions from the automotive, aviation and energy industries, the carbon footprint of processing digital data is already comparable to that of these sectors and is still growing. In 2020, digitization was purported to generate 4% of global greenhouse gas emissions. Production of digital data is increasing fast: this year the world is expected to generate 97 zettabytes (that is, 97 trillion gigabytes) of data. By 2025, it could almost double to 181 zettabytes. It is therefore surprising that little policy attention has been paid to reducing the digital carbon footprint of organizations.
When we talk to people about our work, we find they often assume that digital data, and indeed the process of digitization, is carbon neutral. But that is not necessarily the case: we are in control of its carbon footprint, for better or worse. To help reduce this footprint, we have introduced the idea of “digital decarbonization”. By this, we don’t mean using phones, computers, sensors and other digital technologies to reduce an organization’s carbon footprint. Rather, we are referring to reducing the carbon footprint of digital data itself. It is key to recognize that digitization is not itself an environmental issue, but that its environmental impact can be huge, depending on how we use digital processes in daily workplace activities.
To illustrate the magnitude of the dark data situation, data centers (responsible for 2.5% of all human-induced carbon dioxide) have a greater carbon footprint than the aviation industry (2.1%). To put this into context, we have created a tool that can help calculate the carbon cost of data for an organization.
Using our calculations, a typical data-driven business with 100 employees, in a sector such as insurance, retail or banking, might generate 2,983 gigabytes of dark data a day. If it were to keep that data for a year, the data would have a similar carbon footprint to flying six times from London to New York. Currently, companies produce 1,300,000,000 gigabytes of dark data a day. That is the equivalent of 3,023,255 flights from London to New York.
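For readers who want to see how the flight comparison scales, the short Python sketch below works only from the four figures quoted above. It is not the calculation tool mentioned in the text, and the function and variable names are illustrative assumptions. It derives the daily volume of dark data that corresponds to one London to New York flight and shows that the two quoted comparisons imply slightly different ratios, presumably because of rounding or assumptions in the tool that are not spelled out here.

```python
# Figures quoted in the text; everything else here is an illustrative assumption.
SMALL_FIRM_GB_PER_DAY = 2_983        # dark data from a 100-employee business
SMALL_FIRM_FLIGHTS = 6               # flight equivalent for a year of storage
GLOBAL_GB_PER_DAY = 1_300_000_000    # dark data produced by companies per day
GLOBAL_FLIGHTS = 3_023_255           # flight equivalent quoted for that volume


def gb_per_flight(gb_per_day: float, flights: float) -> float:
    """Daily dark data (GB) that corresponds to one flight equivalent."""
    return gb_per_day / flights


def flights_for(gb_per_day: float, ratio_gb_per_flight: float) -> float:
    """Hypothetical helper: convert a daily volume into flight equivalents."""
    return gb_per_day / ratio_gb_per_flight


small_ratio = gb_per_flight(SMALL_FIRM_GB_PER_DAY, SMALL_FIRM_FLIGHTS)
global_ratio = gb_per_flight(GLOBAL_GB_PER_DAY, GLOBAL_FLIGHTS)

print(f"Small-firm example: about {small_ratio:.0f} GB/day per flight")   # about 497
print(f"Global example:     about {global_ratio:.0f} GB/day per flight")  # about 430

# Applying the global ratio to the 100-employee business gives just under
# seven flights, close to the six quoted in the text.
print(f"{flights_for(SMALL_FIRM_GB_PER_DAY, global_ratio):.1f} flights")
```

The gap between the two ratios is small relative to the uncertainty in any such estimate, so the sketch is best read as an order-of-magnitude check, not a precise conversion.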
The rapid growth of dark data raises significant questions about the efficiency of current digital practices. In a study recently published in the Journal of Business Strategy, we identified ways to help organizations reuse digital data and highlighted pathways for organizations to follow when collecting, processing and storing new digital data. We hope this can reduce dark data production and contribute to the digital decarbonization movement, which we will all need to engage with if net zero is to be realized.