USE OF GIS IN ANALYZING ENVIRONMENTAL CANCER RISKS AS A FUNCTION OF GEOGRAPHIC SCALE
Mid-semester Report
By Zheng Cai
The overall goal of the research project is to provide an estimate of the arsenic concentration in groundwater for each residence that each subject has occupied over the course of his or her lifetime.
During the first few weeks, I studied several articles on spatial analysis and learned from Dr. Myers about several software packages used for spatial analysis, as well as the theory behind them. Each package has its own strengths and weaknesses, so we will need to choose the one best suited to this research.
The vast majority of effort on any GIS project generally goes into data acquisition and into preparing the data for analysis in the GIS. To date, I have been preparing water quality data from a variety of sources for geostatistical and geospatial analysis. These data have required a significant amount of processing before they can be imported into a GIS for exploratory spatial data analysis, deterministic spatial modeling and geostatistical analysis. I transferred the raw data from the different sources into Excel and then converted them into SPSS files. We are trying to keep the data in a format that will be convenient for the analyses in the next steps.
The next steps will provide further insight into the nature of arsenic in groundwater and the potential for human exposure. I will move on to exploratory spatial data analysis, deterministic spatial modeling and geostatistical modeling, as follows:
1. Exploratory Spatial Data Analysis (ESDA)
I will compile summary measures of central tendency and dispersion for each county in AZ. Where the data permit, I will also summarize these measures for certain communities within counties. We will need univariate and bivariate descriptive statistics by county, community and well. Some of these analyses will require further use of SPSS software. The analyses will focus on arsenic, as well as on other variables, such as well depth and other contaminants, that may correlate with arsenic.
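As a rough illustration of the county-level summaries described above, the following Python sketch computes univariate measures (mean, median, standard deviation) and one bivariate measure (the correlation of arsenic with well depth) for each county. The report itself uses SPSS for this step; the records, county labels, units and the helper name pearson are hypothetical and only serve the example.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical well records: (county, arsenic in ug/L, well depth in m).
wells = [
    ("A", 2.0, 30.0), ("A", 4.0, 60.0), ("A", 6.0, 90.0),
    ("B", 10.0, 40.0), ("B", 20.0, 80.0), ("B", 30.0, 120.0),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Group the records by county.
by_county = defaultdict(list)
for county, arsenic, depth in wells:
    by_county[county].append((arsenic, depth))

# Univariate and bivariate summaries for each county.
summary = {}
for county, rows in by_county.items():
    arsenic = [r[0] for r in rows]
    depth = [r[1] for r in rows]
    summary[county] = {
        "n": len(rows),
        "mean": mean(arsenic),
        "median": median(arsenic),
        "stdev": stdev(arsenic),  # sample standard deviation
        "arsenic_depth_r": pearson(arsenic, depth),
    }

for county, stats in summary.items():
    print(county, stats)
```

The same grouping could be repeated with a community or well identifier as the key to obtain the finer-scale summaries.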
2. Deterministic Spatial Modeling
After characterizing and describing the arsenic data for each well, we will interpolate arsenic concentrations at unmeasured locations. These interpolations will be conducted using ArcGIS and ArcInfo. We will interpolate arsenic concentration using inverse distance weighting (IDW), splines, radial basis functions, and local and global polynomials.
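To make the idea behind IDW concrete, here is a minimal Python sketch of the estimator: the value at an unmeasured location is a weighted average of the measured values, with weights proportional to the inverse of distance raised to a power. It is not the ArcGIS implementation; the function name idw, the power of 2 and the well coordinates are assumptions made for illustration.

```python
from math import hypot


def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (x, y, value) tuples; power controls how quickly
    a sample's influence decays with distance.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a sample point: return its value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical wells: (easting, northing, arsenic in ug/L).
wells = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]

print(idw(wells, 1.0, 0.0))  # estimate between the first two wells
print(idw(wells, 0.0, 0.0))  # estimate at a measured well
```

Because the weights depend only on distance, the estimate always lies between the smallest and largest measured values, and the method gives no measure of prediction uncertainty.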
3. Geostatistical Modeling
Kriging is an interpolation method that assumes that the distance or direction between sample points reflects a spatial correlation that can be used to explain variation in the surface. Kriging fits a mathematical function to a specified number of points, or to all points within a specified radius, to determine the output value for each location. It is a multistep process that includes exploratory statistical analysis of the data, variogram modeling, creating the surface, and exploring a variance surface. The method is most appropriate when the data are known to have a spatially correlated distance or directional bias.
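The variogram modeling step mentioned above starts from an empirical semivariogram, which summarizes how dissimilar pairs of measurements are as a function of their separation. The Python sketch below computes it by binning all pairs by distance and taking half the mean squared difference in each bin. It ignores direction, and the function name, lag settings and sample values are hypothetical.

```python
from math import hypot


def empirical_semivariogram(samples, lag_width, n_lags):
    """Return (mean pair distance, semivariance, pair count) per lag bin.

    samples is a list of (x, y, value) tuples. Bin k holds pairs whose
    separation h satisfies k * lag_width <= h < (k + 1) * lag_width.
    Empty bins are omitted from the result.
    """
    sq_diff = [0.0] * n_lags
    dist = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            h = hypot(xi - xj, yi - yj)
            k = int(h // lag_width)
            if k < n_lags:
                sq_diff[k] += (zi - zj) ** 2
                dist[k] += h
                counts[k] += 1
    return [
        (dist[k] / counts[k], sq_diff[k] / (2 * counts[k]), counts[k])
        for k in range(n_lags)
        if counts[k] > 0
    ]


# Hypothetical samples along a transect: (x, y, arsenic in ug/L).
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0), (3.0, 0.0, 7.0)]

for h, gamma, n in empirical_semivariogram(samples, lag_width=1.0, n_lags=3):
    print(f"h={h:.2f}  gamma={gamma:.3f}  pairs={n}")
```

A semivariance that rises with distance, as in this toy example, indicates that nearby measurements are more alike than distant ones, which is the spatial correlation that kriging exploits.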
Kriging has several advantages over the deterministic interpolation methods; for example, it produces a variance surface that quantifies the uncertainty of the predictions. In addition to kriging, however, we will explore stochastic simulation. In this type of analysis, observations are re-sampled a large number of times, and point estimates with confidence intervals may be developed from the re-sampled data (much as in the bootstrap). While kriging has a tendency to smooth distributions, these simulations maintain a closer resemblance to the true 'shape' of the data. The simulations will be conducted using the GSLIB freeware package.
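The bootstrap analogy above can be illustrated with a short Python sketch of a percentile bootstrap confidence interval for a mean. This is the ordinary, non-spatial bootstrap, not the geostatistical simulation that GSLIB performs: it treats the observations as independent and ignores their locations. The function name, the number of resamples, the seed and the arsenic values are all hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean.

    Returns (point estimate, lower bound, upper bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        mean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(values), lo, hi


# Hypothetical arsenic measurements in ug/L.
arsenic = [2.0, 3.5, 4.0, 5.5, 7.0, 8.0, 12.0, 15.0]

estimate, lower, upper = bootstrap_ci(arsenic)
print(f"mean={estimate:.3f}  95% CI=({lower:.3f}, {upper:.3f})")
```

Each resample keeps the full spread of the observed values, which is why resampling-based estimates do not smooth the distribution in the way an interpolated surface does.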