Research in Cluster Analysis
Daniel Reich and Professor Marc W. Spiegelman
Fall 2003 - Spring 2004




[Figure: world map showing sample locations as red dots]

A Little Background On Where the Data Comes From
Rock samples have been collected from volcanic structures throughout the ocean. These locations are shown in the map above by red dots. The NSF (National Science Foundation) funded a research project, which started in March 1996 and is being carried out by LDEO (Lamont-Doherty Earth Observatory) of Columbia University in the City of New York, sponsored by RIDGE (Ridge Inter-Disciplinary Global Experiments). The project's ultimate goal is to provide a robust, complete reference data library for the petrology and whole earth science communities. Data sets consisting of samples and their compositions can be found on the PETDB webpage: http://www.petdb.org/pg1.jsp

A Little Background on the Science that Will Bring Us to Our Questions
Oceanic crust is formed at mid-ocean ridges and spreads at rates that vary from approximately 1 to 10 centimeters per year. The crust re-enters the deep earth at subduction zones; one such zone lies along the coast of Japan. If these two processes did not counterbalance one another, drastic changes to the earth would occur over long periods of time. (Some scientists might consider these short periods, so to be clear: we are talking about millions of years.)

Our Questions in Technical Terms
What relationships, if any, are evident between the chemical variability of volcanic rocks and the dynamics of plate tectonics and magma migration? What is the statistical description of global chemical variability? Are there correlations between elements? Or between elements and isotopes? What can we discover about trace elements, and in particular about their isotope ratios? Is clustering in chemical space possible? And will it lead to an accurate system of classification?
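To make the question about correlations between elements concrete, here is a minimal sketch that computes the Pearson correlation coefficient between the concentrations of two elements across a handful of samples. It is written in Python rather than in the Matlab used for the project, the function name `pearson` is our own, and the numbers are made up for illustration. They are not taken from PETDB.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weight percentages of two oxides in five samples (not real data).
mgo = [7.2, 8.1, 6.5, 9.0, 7.8]
feo = [10.1, 9.4, 10.9, 8.7, 9.6]

r = pearson(mgo, feo)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 would suggest that the two concentrations rise and fall together or in opposition, while a value near 0 would suggest no linear relationship. In this invented data the second column falls as the first rises, so r comes out negative.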

And for You Non-Science People
Can we find important relationships between elements in the rock samples from these volcanic structures? Are there correlations between where the rock is taken from and its chemical composition? Are certain compositions found in areas where the oceanic crust is forming? Or where it's re-entering the deep earth? Do certain percentages of elements present in a given sample imply that other elements will also be present in specific percentages?

How We're Going About Finding the Answers
This problem is about working through data sets and finding clusters in them. There are two main ways to find groups or relationships: the first is to find similarities between samples, and the second is to find relationships between the elements in the samples. We are trying to sort the data in several ways and make use of all of the possible groupings. We are in the process of designing a system of Matlab functions that will cluster the data and allow the clusters to be inspected, tested for accuracy, and revised.
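As an illustration of the first approach, grouping samples by similarity, here is a minimal sketch of k-means clustering. This is only an assumption about what such a step could look like: the text above does not say which clustering algorithm the Matlab functions use, the sketch is in Python rather than Matlab, and the function name `kmeans`, its parameters, and the sample values are all hypothetical.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Basic k-means (Lloyd's algorithm). Returns (centroids, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Start from k distinct samples chosen at random as initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each sample to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, new_labels

# Hypothetical samples described by two concentrations each (not real data).
samples = [
    (1.0, 1.1), (0.9, 1.0), (1.2, 0.8),   # one invented group
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),   # a second invented group
]

centroids, labels = kmeans(samples, k=2)
print(labels)
print(centroids)
```

Each sample here is a point in "chemical space", with one coordinate per measured element. Real samples would have many more coordinates, and the number of clusters k would itself have to be chosen and tested, which is part of what inspecting and revising the clusters would involve.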

Our Goal
We hope that the results will yield useful information about the statistical variability of global geochemistry that can be compared to the output of theoretical models.