climate-news-db

A dataset of newspaper articles about climate change.

Data Science South is founded on two dreams - a multi-objective optimization problem:

  1. to contribute to data science education in New Zealand,
  2. to use data to solve climate change problems.

The climate-news-db is focused on the second objective.

What is the climate-news-db?

climate-news-db is a database of climate change newspaper articles.

The goal is to curate a dataset for researchers to analyse how climate change is being covered in the media.

The project started in Berlin in 2019 as a collaboration between Adam Green and Marius Drozdzewski, and had grown to nearly 6,000 articles by June 2021.

How does it work?

The climate-news-db is composed of two services, both deployed on PythonAnywhere:

  1. a Flask front end with a SQLite database, Jinja templating, Bootstrap & Chart.js,
  2. a Python backend that scrapes HTML with requests & Beautiful Soup and stores the articles as JSON.

Articles are collected using a daily Google search for climate change on 16 newspaper websites.
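
To make the scrape-and-store step concrete, here is a minimal sketch of turning one article page into a JSON file. It is not the project's code. To stay dependency-free it uses the standard library's urllib and html.parser as stand-ins for requests and Beautiful Soup, and the field names (url, article_id, headline, body), the ID scheme and the function names are all assumptions made for illustration.

```python
import json
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import Request, urlopen


class ArticleParser(HTMLParser):
    """Collect the <title> text and the text of every <p> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._tag = None   # "title" or "p" while we are inside one
        self._buffer = []  # text fragments seen inside the current tag

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # e.g. an </a> nested inside a <p>
        text = " ".join("".join(self._buffer).split())  # normalise whitespace
        if tag == "title":
            self.title = text
        elif text:
            self.paragraphs.append(text)
        self._tag = None


def fetch_html(url):
    """Download a page (stand-in for requests.get(url).text)."""
    request = Request(url, headers={"User-Agent": "climate-news-db-sketch"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_article(html, url):
    """Turn raw HTML into a JSON-serialisable dict (field names are assumed)."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    # Hypothetical ID scheme: the last non-empty segment of the URL path.
    article_id = urlparse(url).path.rstrip("/").split("/")[-1]
    return {
        "url": url,
        "article_id": article_id,
        "headline": parser.title,
        "body": "\n".join(parser.paragraphs),
    }


def save_article(article, folder):
    """Write one article to <folder>/<article_id>.json and return the path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{article['article_id']}.json"
    path.write_text(json.dumps(article, indent=2), encoding="utf-8")
    return path
```

For a single URL this would be used as save_article(parse_article(fetch_html(url), url), "articles-json"). With Beautiful Soup, the parsing class would shrink to a few calls such as BeautifulSoup(html, "html.parser").find_all("p"). Real newspaper sites usually need per-site rules to separate article text from navigation and adverts.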

Can I download the data?

Yes - you can download the dataset at climate-news-db.com/download.

The dataset is composed of:

  1. climate-news-db-dataset.csv - containing metadata for each article,
  2. articles/{article_id}.txt - one text file per article (article text doesn’t play well with CSV).
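
As a rough guide to working with the download, the sketch below joins each metadata row to its text file using only the Python standard library. It assumes the CSV has a column named article_id that matches the file names in articles/. That column name is a guess based on the path pattern above, so check the CSV header first. The load_dataset function is a hypothetical helper, not part of the project.

```python
import csv
from pathlib import Path


def load_dataset(root):
    """Return one dict per article: the CSV metadata plus a "text" key.

    root is the folder holding climate-news-db-dataset.csv and articles/.
    The "article_id" column name is an assumption, not a documented schema.
    """
    root = Path(root)
    csv_path = root / "climate-news-db-dataset.csv"
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    for row in rows:
        text_path = root / "articles" / f"{row['article_id']}.txt"
        # Keep rows whose text file is missing, but mark the text as None.
        if text_path.exists():
            row["text"] = text_path.read_text(encoding="utf-8")
        else:
            row["text"] = None
    return rows
```

Calling load_dataset("path/to/download") gives a list of dicts that can be filtered directly or passed to a data frame library.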

If you would like the articles as JSON or a copy of our SQLite database, or if you have any suggestions for improvements, send us an email at hello@datasciencesouth.com.

Where can I find it?

climate-news-db.com - code here.

Thanks for reading!