Climate Analysis Project: Data Visualization with the GHCN-M Dataset

This project explores climate data analysis using the Global Historical Climatology Network Monthly (GHCN-M) dataset. It is divided into three sessions covering data acquisition and processing, visualization and analysis, and interactive dashboard creation.

Session 1: Data Acquisition and Processing

Objectives:

  • Introduction to climate data sources and the GHCN-M dataset
  • Setting up the project environment with pandas, numpy, and matplotlib
  • Data cleaning and preprocessing techniques
  • Basic data exploration and statistical analysis

Code Overview:

# Sample code from climate_analysis.py
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load and preprocess GHCN-M data
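def load_temperature_data(filepath):
    """Load temperature data from a CSV file with 'date' and 'temperature' columns."""
    df = pd.read_csv(filepath)
    # Drop rows with missing temperatures; copy so the column assignment below
    # works on an independent frame
    df = df.dropna(subset=['temperature']).copy()
    # Parse date strings into datetime values so years can be extracted later
    df['date'] = pd.to_datetime(df['date'])
    return df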

# Calculate global average temperatures
def calculate_global_avg(df):
    """Calculate global average temperatures by year (unweighted mean over all records)."""
    return df.groupby(df['date'].dt.year)['temperature'].mean()
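
As a quick check of the two functions above, the sketch below runs the same cleaning and grouping steps on a small in-memory CSV. The sample rows and the `date`/`temperature` column layout are assumptions made for illustration; the published GHCN-M files use a different, fixed-width layout and would need to be converted to this shape first.

```python
import io

import pandas as pd

# Hypothetical sample rows; one temperature is missing on purpose
SAMPLE_CSV = """date,temperature
2000-01-15,10.0
2000-07-15,20.0
2001-01-15,
2001-07-15,22.0
"""

def load_temperature_data(source):
    """Load a CSV with 'date' and 'temperature' columns and drop missing readings."""
    df = pd.read_csv(source)
    df = df.dropna(subset=["temperature"]).copy()
    df["date"] = pd.to_datetime(df["date"])
    return df

def calculate_global_avg(df):
    """Unweighted mean of all readings, grouped by calendar year."""
    return df.groupby(df["date"].dt.year)["temperature"].mean()

df = load_temperature_data(io.StringIO(SAMPLE_CSV))
annual_avg = calculate_global_avg(df)
print(annual_avg)
```

The 2001 average reflects only the July reading because the row with the missing value is dropped, which is worth keeping in mind when years have uneven coverage.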
          

Session 2: Data Visualization and Analysis

Objectives:

  • Creating advanced visualizations with matplotlib
  • Analyzing global temperature trends over time
  • Regional temperature comparisons
  • Seasonal temperature analysis
  • Identifying extreme temperature events
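
One simple way to approach the last objective is to flag readings that sit far from the mean in standard-deviation terms. The sketch below is an illustration only: the `find_extreme_events` helper, the 2.0 threshold, and the sample values are assumptions, not part of the project code.

```python
import pandas as pd

def find_extreme_events(df, threshold=2.0):
    """Return rows whose temperature lies more than `threshold` standard deviations from the mean."""
    mean = df["temperature"].mean()
    std = df["temperature"].std()
    z = (df["temperature"] - mean) / std
    # Keep the z-score alongside the original columns for later inspection
    return df.assign(z_score=z)[z.abs() > threshold]

# Hypothetical readings: nine ordinary values and one obvious outlier
sample = pd.DataFrame({
    "date": pd.date_range("2000-01-01", periods=10, freq="D"),
    "temperature": [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 25.0],
})
extremes = find_extreme_events(sample)
print(extremes)
```

For seasonal data, the same calculation would typically be applied per calendar month so that winter and summer readings are each compared against their own baseline.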

Code Overview:

# Sample code from visualization.py
import os

import matplotlib.pyplot as plt
import seaborn as sns

# Make sure the output directory exists before any figure is saved
os.makedirs('plots', exist_ok=True)

def plot_global_temperature_trend(annual_avg):
    """Plot global temperature trend over time."""
    plt.figure(figsize=(12, 6))
    plt.plot(annual_avg.index, annual_avg.values, 'r-')
    plt.title('Global Average Temperature Trend (1900-2023)')
    plt.xlabel('Year')
    plt.ylabel('Temperature (°C)')
    plt.grid(True, linestyle='--', alpha=0.7)
    plt.savefig('plots/global_temperature_trend.png')
    plt.close()  # release the figure once it has been written to disk

def plot_regional_heatmap(regional_data):
    """Create heatmap of regional temperature changes."""
    # Reshape long-format rows into a region x decade grid
    pivot = regional_data.pivot(index='region', columns='decade', values='temp_change')
    plt.figure(figsize=(12, 8))
    sns.heatmap(pivot, cmap='RdBu_r', center=0, annot=True)
    plt.title('Regional Temperature Changes by Decade')
    plt.savefig('plots/regional_heatmap.png')
    plt.close()
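
`plot_regional_heatmap` expects a long-format frame with `region`, `decade`, and `temp_change` columns. How the project builds that frame is not shown, so the sketch below is one possible approach: it assumes an input with `region`, `date`, and `temperature` columns and defines change relative to each region's earliest decade. The `build_regional_changes` name and the sample data are hypothetical.

```python
import pandas as pd

def build_regional_changes(df):
    """Mean temperature per region and decade, as change from the region's first decade."""
    decade = (df["date"].dt.year // 10) * 10
    means = (
        df.assign(decade=decade)
          .groupby(["region", "decade"], as_index=False)["temperature"]
          .mean()
          .sort_values(["region", "decade"])
    )
    # Baseline: the earliest decade mean available for each region
    baseline = means.groupby("region")["temperature"].transform("first")
    means = means.assign(temp_change=means["temperature"] - baseline)
    return means[["region", "decade", "temp_change"]]

# Hypothetical input: one reading per region and decade
sample = pd.DataFrame({
    "region": ["North", "North", "North", "South", "South", "South"],
    "date": pd.to_datetime(["1990-06-01", "2000-06-01", "2010-06-01",
                            "1990-06-01", "2000-06-01", "2010-06-01"]),
    "temperature": [10.0, 10.5, 11.0, 20.0, 20.0, 21.0],
})
regional_data = build_regional_changes(sample)
pivot = regional_data.pivot(index="region", columns="decade", values="temp_change")
print(pivot)
```

Other baselines, such as a fixed reference period shared by all regions, would fit the same column layout and may be closer to what the project intends.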
          

Session 3: Interactive Dashboard with FastAPI

Objectives:

  • Introduction to web frameworks and FastAPI
  • Creating an interactive dashboard
  • Serving visualizations through a web interface
  • Implementing interactive data filtering
  • Deployment and testing

Code Overview:

# Sample code from app.py
from fastapi import FastAPI, Request
from fastapi.templating import Jinja2Templates
from fastapi.staticfiles import StaticFiles
import pandas as pd

app = FastAPI()
app.mount("/static", StaticFiles(directory="static"), name="static")
app.mount("/plots", StaticFiles(directory="plots"), name="plots")
templates = Jinja2Templates(directory="templates")

@app.get("/")
async def dashboard(request: Request):
    """Serve the main dashboard page."""
    return templates.TemplateResponse(
        "dashboard.html", 
        {"request": request, "title": "Climate Analysis Dashboard"}
    )

@app.get("/api/temperature_data")
def get_temperature_data():
    """API endpoint to return temperature data."""
    df = pd.read_csv("results/annual_global_avg.csv")
    return df.to_dict(orient="records")
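
The interactive filtering objective can be approached by narrowing the records before they are returned. The sketch below keeps the filtering logic in a plain function so it can be checked without starting a server; the `filter_by_year_range` name, the `year` and `temperature` column names, and the sample rows are assumptions, since the layout of `results/annual_global_avg.csv` is not shown.

```python
from typing import Optional

import pandas as pd

def filter_by_year_range(df, start_year: Optional[int] = None, end_year: Optional[int] = None):
    """Return records whose 'year' falls inside the inclusive range; None leaves a bound open."""
    mask = pd.Series(True, index=df.index)
    if start_year is not None:
        mask &= df["year"] >= start_year
    if end_year is not None:
        mask &= df["year"] <= end_year
    return df[mask].to_dict(orient="records")

# Hypothetical stand-in for the annual averages file
sample = pd.DataFrame({
    "year": [1999, 2000, 2001, 2002],
    "temperature": [14.1, 14.2, 14.4, 14.5],
})
records = filter_by_year_range(sample, start_year=2000, end_year=2001)
print(records)
```

In the FastAPI app, `start_year` and `end_year` could be declared as optional parameters of the endpoint function, which FastAPI treats as query parameters, and passed through to a helper like this one.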
          

Homework Assignment

Task: Enhance the climate analysis project by implementing one of the following features:

  1. Add a new visualization type (e.g., polar plot for seasonal data)
  2. Implement additional data filtering options in the interactive dashboard
  3. Create a prediction model for future temperature trends
  4. Compare the GHCN-M dataset with another climate dataset
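
For the third option, a straight-line fit is about the simplest possible starting point. The sketch below uses `numpy.polyfit` on synthetic annual averages; the data, the `fit_linear_trend` helper, and the choice of a linear model are illustrative assumptions, and a real temperature record would call for a more careful model.

```python
import numpy as np

def fit_linear_trend(years, temps):
    """Least-squares line through (year, temperature) points; returns (slope, intercept)."""
    slope, intercept = np.polyfit(years, temps, deg=1)
    return slope, intercept

# Synthetic series: 14.0 °C in 2000, rising 0.02 °C per year
years = np.arange(2000, 2021)
temps = 14.0 + 0.02 * (years - 2000)

slope, intercept = fit_linear_trend(years, temps)
# Extrapolate the fitted line to a future year
predicted_2030 = slope * 2030 + intercept
print(f"Trend: {slope:.3f} °C per year, projected 2030 value: {predicted_2030:.2f} °C")
```

Extrapolating far beyond the observed years amplifies any error in the fitted slope, so projections from a fit like this are best treated as a rough baseline.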

Resources: