Data and Methodology


Four types of climatic datasets, each with three parameters (maximum temperature, minimum temperature, and rainfall) over Pakistan, are used in this study:

  1. PMD observed data
  2. GCM simulated data
  3. High resolution statistically downscaled GCM data
  4. High resolution dynamically downscaled GCM data

PMD observed data

The PMD observed daily data are taken from 50 stations spread across the country. Details of the 50 stations are given in Table 2.1. The stations were selected to cover a range of elevations and to represent all climatic zones of the country (Salma et al., 2012).

GCM simulated data

Daily GCM-simulated data for the historical period (1970-2004) are taken from models run under Coupled Model Intercomparison Project Phase 5 (CMIP5) protocols. Details of the models are given in Table 2.2. GCMs describe the three-dimensional geometry of the atmosphere and other components of Earth’s climate system. Atmospheric GCMs numerically solve the equations of physics (e.g., dynamics, thermodynamics, radiative transfer) and chemistry applied to the atmosphere and its constituent components, including the greenhouse gases. CMIP5 provides a multi-model context for assessing the mechanisms responsible for model differences in poorly understood feedbacks associated with the carbon cycle and with clouds. It also examines climate “predictability”, explores the ability of models to predict climate on decadal time scales, and, more generally, determines why similarly forced models produce a range of responses (Taylor et al., 2012).

High resolution statistically downscaled GCM data

Downscaling is a method for obtaining high-resolution climate or climate change information from relatively coarse-resolution global climate models (GCMs). Typically, GCMs have a resolution of 150-300 km by 150-300 km. Many impact models require information at scales of 50 km or less, so some method is needed to estimate the smaller-scale information. Statistical downscaling first derives statistical relationships between observed small-scale (often station-level) variables and larger (GCM) scale variables, using analogue methods (circulation typing), regression analysis, or neural network methods. Future values of the large-scale variables obtained from GCM projections of future climate are then used to drive the statistical relationships and so estimate the smaller-scale details of future climate (see e.g., Burhan et al., 2015a).
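The regression variant of statistical downscaling described above can be illustrated with a minimal sketch. It assumes a single large-scale predictor and a simple linear relationship. The function names and all numbers are hypothetical, and this is not the procedure used to produce the datasets in this study.

```python
from statistics import mean


def fit_linear(x, y):
    """Ordinary least squares fit of y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def downscale(a, b, gcm_values):
    """Apply the fitted relationship to large-scale (GCM) values."""
    return [a + b * v for v in gcm_values]


# Hypothetical historical pairs: GCM grid-cell value vs. station observation.
gcm_hist = [30.0, 31.0, 32.0, 33.0]
station_hist = [32.0, 34.0, 36.0, 38.0]

# Step 1: derive the statistical relationship from the historical period.
a, b = fit_linear(gcm_hist, station_hist)

# Step 2: drive the relationship with projected large-scale values.
gcm_future = [34.0, 35.0]
station_future = downscale(a, b, gcm_future)
print(a, b, station_future)
```

In practice several predictors and a more flexible model would usually be needed; the sketch only shows the two-step structure of fitting on the historical period and then applying the fit to projections.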

Table 2.1: PMD stations with coordinates and elevations.

This study uses the ensemble mean of three NEX Global Daily Downscaled climate models, viz. “CanESM2” of the Canadian Centre for Climate Modelling and Analysis, “CNRM-CM5” of the Centre National de Recherches Meteorologiques, France, and “MRI-CGCM3” of the Meteorological Research Institute, Japan, all bias corrected with quantile mapping (Thrasher et al., 2012). A common historical period of 35 years (1970-2004) is taken for the analysis. The NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP) dataset comprises downscaled climate scenarios for the globe that are derived from the GCM runs conducted under CMIP5 and across two of the four greenhouse gas emissions scenarios known as Representative Concentration Pathways (RCPs). The CMIP5 GCM runs were developed in support of the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR5). The NEX-GDDP dataset includes downscaled projections for RCP 4.5 and RCP 8.5 from the 21 models and scenarios for which daily scenarios were produced and distributed under CMIP5. Each of the climate projections includes daily maximum temperature, minimum temperature, and precipitation for the period from 1950 through 2100. The spatial resolution of the dataset is 0.25 degrees (~25 km x 25 km).
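As an illustration of the two operations mentioned above, the following sketch computes a day-by-day ensemble mean across models and applies a basic empirical quantile mapping. It is a simplified, assumed form of quantile mapping, not the NEX-GDDP implementation, and all values are hypothetical.

```python
from bisect import bisect_right


def ensemble_mean(series_by_model):
    """Day-by-day mean across models; all series must cover the same days."""
    lengths = {len(s) for s in series_by_model.values()}
    if len(lengths) != 1:
        raise ValueError("all model series must have the same length")
    n_models = len(series_by_model)
    return [sum(day) / n_models for day in zip(*series_by_model.values())]


def quantile_map(value, model_hist, obs_hist):
    """Replace a model value with the observed value of the same empirical rank."""
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(obs_hist)
    # Empirical non-exceedance probability of the value in the model record.
    prob = bisect_right(model_sorted, value) / len(model_sorted)
    # Position of the same probability in the observed record, clamped to range.
    idx = int(round(prob * len(obs_sorted))) - 1
    idx = min(max(idx, 0), len(obs_sorted) - 1)
    return obs_sorted[idx]


# Hypothetical daily maximum temperatures (degrees C) for two days.
tmax = {
    "CanESM2": [30.0, 31.0],
    "CNRM-CM5": [32.0, 33.0],
    "MRI-CGCM3": [34.0, 35.0],
}
ens = ensemble_mean(tmax)

# Hypothetical historical samples used to build the mapping.
model_hist = [10.0, 12.0, 14.0, 16.0]
obs_hist = [11.0, 14.0, 17.0, 20.0]
corrected = quantile_map(14.0, model_hist, obs_hist)
print(ens, corrected)
```

Operational bias correction typically interpolates between quantiles and treats each calendar period separately; the rank lookup here only conveys the idea.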

Table 2.2: GCMs with their modeling institutes and horizontal resolutions.

High resolution dynamically downscaled GCM data

Dynamical downscaling uses a limited-area, high-resolution model (a regional climate model, or RCM) driven by boundary conditions from a GCM to derive smaller-scale information. RCMs generally have a domain area of 10⁶ to 10⁷ km² and a resolution of 20 to 60 km. This study uses Coordinated Regional Climate Downscaling Experiment (CORDEX) data from six dynamically downscaled GCMs, viz. CSIRO CCAM ACCESS-1, CSIRO CCAM CCSM4, CSIRO CCAM GFDL-CM3, CSIRO CCAM MPI-ESM-LR, REMO2009 MPI-ESM-LR, and SMHI RCA4 ICHEC-EC-EARTH, for the South Asian region at 25 km horizontal resolution. For the South Asian region, CORDEX presents an unprecedented opportunity to advance knowledge of regional climate responses to global climate change (see e.g., Burhan et al., 2015b), and for these insights to feed into Working Groups One and Two of the IPCC Fifth Assessment Report as well as into ongoing climate adaptation and risk assessment research and policy planning in the region.


Climate data from the 50 PMD stations were first subjected to quality control using RClimDex (1.0) (Zhang et al., 2004). Missing years, months, and days were identified and logged for each station, and a missing-data percentage was calculated to guide the selection of stations (Figure 2.1). Years prior to 1970 had higher percentages of missing climate data. Moreover, higher percentages of missing data were concentrated in the south-western regions of the country. Stations with a high share of continuously available climate data, i.e. a missing-data percentage below 5% over 54 years, were selected. The selected stations (Figure 2.2) were then formatted for further quality control in the RClimDex (1.0) software.
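The station screening described above can be sketched as follows. The sketch assumes missing days are stored either as None or as the code -99.9, and it uses hypothetical station names and records; it is an illustration of the 5% rule, not the script used in this study.

```python
def missing_percentage(values, missing_code=-99.9):
    """Percentage of daily records that are missing."""
    if not values:
        raise ValueError("empty record")
    n_missing = sum(1 for v in values if v is None or v == missing_code)
    return 100.0 * n_missing / len(values)


def select_stations(records, threshold=5.0):
    """Names of stations whose missing percentage is below the threshold."""
    return sorted(
        name
        for name, values in records.items()
        if missing_percentage(values) < threshold
    )


# Hypothetical daily records for two stations (100 days each).
records = {
    "station_a": [25.0] * 99 + [-99.9],        # 1% missing
    "station_b": [25.0] * 90 + [None] * 10,    # 10% missing
}
selected = select_stations(records)
print(selected)
```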

Figure 2.1: Station-wise missing climate data percentage shown both graphically and spatially.

The following checks were applied during quality control of the data:

  • Outliers in maximum and minimum temperature were identified using a 3 × standard deviation criterion and set to missing.
  • For precipitation, the daily values corresponding to monthly maxima were looked up in “Monthly Climatic Normals of Pakistan” and used as upper-limit thresholds for identifying precipitation outliers.
  • Records in which minimum temperature ≥ maximum temperature were also checked.
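The three checks above can be sketched as below. This is a simplified illustration: the standard deviation is computed over the whole series (how the data were grouped in practice is not stated here), the precipitation threshold is a hypothetical number, and the function names are assumptions rather than RClimDex routines.

```python
from statistics import mean, stdev


def flag_temperature_outliers(values, n_sd=3.0):
    """Set values outside mean +/- n_sd * SD to missing (None)."""
    valid = [v for v in values if v is not None]
    mu, sd = mean(valid), stdev(valid)
    low, high = mu - n_sd * sd, mu + n_sd * sd
    return [v if v is not None and low <= v <= high else None for v in values]


def flag_precip_outliers(values, upper_limit):
    """Set precipitation values above the upper-limit threshold to missing."""
    return [v if v is not None and v <= upper_limit else None for v in values]


def inconsistent_days(tmax, tmin):
    """Indices of days on which minimum temperature >= maximum temperature."""
    return [
        i
        for i, (hi, lo) in enumerate(zip(tmax, tmin))
        if hi is not None and lo is not None and lo >= hi
    ]


# Hypothetical series: 29 plausible days and one suspicious spike.
tmax_raw = [30.0] * 29 + [60.0]
tmax_qc = flag_temperature_outliers(tmax_raw)

# Hypothetical precipitation record with an assumed 150 mm upper limit.
precip_qc = flag_precip_outliers([0.0, 12.5, 400.0, None], upper_limit=150.0)

# Day 1 has tmin above tmax; day 2 is skipped because tmax is missing.
bad_days = inconsistent_days([30.0, 25.0, None], [20.0, 26.0, 10.0])
print(tmax_qc[-1], precip_qc, bad_days)
```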

After the quality control procedure, extreme climate indices were calculated from PMD observed data for 27 stations grouped by province. Each station was chosen so as to give maximum coverage of the province it represents. For the GCMs and RCMs, province-wise indices were calculated and compared with the observed indices to identify matching climate extremes in the region.

Figure 2.2: Province-wise selected stations with relief in meters.