The advent of cloud computing has provided people around the world with unprecedented access to computational power and enabled rapid growth in technologies such as machine learning, whose computational demands carry a high energy cost and a commensurate increase in carbon footprint. As a result, recent scholarship has called for better estimates of the impact of AI on greenhouse gas emissions. However, data scientists today do not have easy or reliable access to such measurements, which precludes consideration of how to reduce the costs (computational, electricity, environmental) associated with machine learning workloads. We argue that having cloud providers present information about software carbon intensity to users is a fundamental stepping stone towards minimizing emissions.
In this paper, we provide a framework for measuring software carbon intensity, and propose to measure operational carbon emissions by using location-based and time-specific marginal emissions data per energy unit. We provide measurements of operational software carbon intensity for a set of modern models covering natural language processing (NLP) and computer vision applications, including four sizes of DenseNet models trained on MNIST, pretraining and finetuning of BERT-small, pretraining of a 6.1 billion parameter language model, and five sizes of Vision Transformer. We confirm previous results that the geographic region of the data center plays a significant role in the carbon intensity of a given cloud instance. We also present new results showing that the time of day has a meaningful impact on operational software carbon intensity. We then evaluate a suite of approaches for reducing emissions in the cloud: using cloud instances in different geographic regions, using cloud instances at different times of day, and dynamically pausing cloud instances when the marginal carbon intensity is above a certain threshold. We find that choosing an appropriate region can have the largest impact, but that the other methods can reduce emissions as well. Finally, we conclude with recommendations for how machine learning practitioners can use software carbon intensity information to reduce environmental impact.
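As a rough illustration of the accounting and of the pausing strategy described above, the following minimal sketch multiplies the energy drawn in each time interval by the marginal emissions intensity for that interval and sums the results. It is not the implementation used for the reported measurements: the names `Interval`, `operational_emissions`, and `pause_and_resume` are hypothetical, and the intensity values are made-up numbers chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One time slice of a job (hypothetical structure, for illustration)."""
    energy_kwh: float           # energy the job draws during this interval
    intensity_g_per_kwh: float  # marginal grid intensity for this region and time


def operational_emissions(intervals):
    """Sum energy times marginal intensity over all intervals (grams CO2eq)."""
    return sum(i.energy_kwh * i.intensity_g_per_kwh for i in intervals)


def pause_and_resume(energy_per_step_kwh, steps_needed, intensities, threshold):
    """Run one step per interval, but only when intensity is at or below threshold.

    Returns (emissions in grams CO2eq, number of intervals elapsed).
    Raises ValueError if the job cannot finish within the given intensities.
    """
    done = 0
    emitted = 0.0
    for elapsed, grams_per_kwh in enumerate(intensities, start=1):
        if grams_per_kwh <= threshold:
            emitted += energy_per_step_kwh * grams_per_kwh
            done += 1
            if done == steps_needed:
                return emitted, elapsed
        # Otherwise the job is paused for this interval and emits nothing here.
    raise ValueError("job did not finish within the available intensity data")


# Made-up marginal intensities (g CO2eq per kWh) for five consecutive intervals.
intensities = [500.0, 300.0, 600.0, 200.0, 400.0]

# Baseline: start immediately and run three consecutive intervals at 2 kWh each.
baseline = operational_emissions(
    [Interval(2.0, g) for g in intensities[:3]]
)

# Pausing: skip any interval whose intensity exceeds 450 g CO2eq per kWh.
paused, elapsed = pause_and_resume(2.0, 3, intensities, threshold=450.0)
```

In this toy case pausing lowers emissions at the cost of a longer wall-clock duration, which is the trade-off the threshold controls. The sketch ignores any energy drawn while the instance is paused, which is itself an assumption.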