Self-Organizing Maps

The Self-Organizing Map (SOM) is an artificial neural network that models biological brain functions. The algorithm and its variants have been applied widely in domains such as machine vision, image analysis, optical character recognition, speech analysis, and engineering applications in general.

The SOM is a powerful tool that can be used in most data-mining processes, especially in data exploration. Moreover, the SOM is very efficient compared to other non-linear alternatives such as the Generative Topographic Mapping, Sammon's Mapping, or Multidimensional Scaling in general. An example of the efficiency of the SOM is the WebSOM project.

The algorithm is illustrated in Figures 1 to 3. The dataset (cf. Figure 1) is 2-dimensional and contains three groups, each represented by 100 members normally distributed around its center. Figure 2 depicts the form of the map in the data space: starting from a random initialization, 8 subsequent training iterations are shown. After these 8 iterations the map fits the data rather well (cf. Figure 3). Several interesting aspects of the SOM can be seen in this example; a minimal code sketch of the setup follows the figure captions.
Figure 1: A simple 2-dimensional toy data set.
Figure 2: Starting from a random initialization, 8 training iterations are calculated and the positions of the model vectors in the data space are visualized.
Figure 3: The map units are labeled with the most frequent class they represent.
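The following is a minimal sketch of this setup, assuming a small rectangular map trained with the standard online SOM rule; the cluster centers, grid size, learning-rate schedule, and radius schedule are illustrative choices, not the exact values behind the figures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three 2-dimensional Gaussian clusters of 100 points each (cf. Figure 1).
centers = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centers])
labels = np.repeat(np.arange(3), 100)

# A 6x6 map; the model vectors start at random positions in the data space.
rows, cols = 6, 6
grid = np.array([[r, c] for r in range(rows) for c in range(cols)], dtype=float)
W = rng.uniform(X.min(0), X.max(0), size=(rows * cols, 2))

n_iter = 8 * len(X)          # 8 passes over the data, as in Figure 2
sigma0, sigma1 = 3.0, 0.3    # neighborhood radius shrinks over training
alpha0, alpha1 = 0.5, 0.02   # learning rate shrinks as well

for t in range(n_iter):
    frac = t / n_iter
    sigma = sigma0 * (sigma1 / sigma0) ** frac
    alpha = alpha0 * (alpha1 / alpha0) ** frac

    x = X[rng.integers(len(X))]                      # pick one data item
    bmu = np.argmin(((W - x) ** 2).sum(axis=1))      # best-matching unit
    d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)       # grid distance to the BMU
    h = np.exp(-d2 / (2.0 * sigma ** 2))             # Gaussian neighborhood
    W += alpha * h[:, None] * (x - W)                # pull units toward x
```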
First of all, the SOM does not map the data linearly; a linear mapping could be obtained with, for example, Principal Component Analysis (PCA). The SOM uses its units efficiently: as few units as possible are wasted on regions that contain no data. Furthermore, areas of the data space with a high density of data items are represented by more units than areas with a lower density. This effect is known as magnification; the model vectors are densest in the area of the 'o' class (cf. Figure 2, 8th iteration).
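Continuing from the sketch above, one way to see both effects is to count how many data points each unit wins and to label each unit with the most frequent class among those points, as in Figure 3; the hit counts are highest for units covering the densest region.

```python
# Best-matching unit for every data point of the trained map.
bmus = np.argmin(((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2), axis=1)

# Hits per unit: dense regions of the data are covered by more, busier units.
hits = np.bincount(bmus, minlength=rows * cols).reshape(rows, cols)
print(hits)

# Majority-class label per unit (cf. Figure 3); -1 marks units with no data.
unit_labels = np.full(rows * cols, -1)
for u in range(rows * cols):
    won = labels[bmus == u]
    if len(won):
        unit_labels[u] = np.bincount(won).argmax()
print(unit_labels.reshape(rows, cols))
```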

The training itself is also interesting. Notice how the map slowly unfolds to fit the data. While the neighborhood radius is still large, each map unit represents a large portion of the data, so the model vectors tend to lie near the mean of the whole dataset. As the neighborhood radius decreases, the units become more flexible and are able to fit the data better. Notice also how the smoothness decreases drastically in the last two training iterations: these are the final steps of the fine-tuning phase, in which each unit tries to match its own data as well as possible, largely independent of its neighbors.
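A small standalone illustration of this point: with a large neighborhood radius almost every unit is pulled toward each input, so all model vectors gravitate to the data mean, while with a small radius only the best-matching unit and its immediate neighbors move. The radii below are illustrative values, not those used for the figures.

```python
import numpy as np

grid_dist = np.arange(6)            # distance from the BMU on the map grid
for sigma in (3.0, 0.3):            # early vs. late neighborhood radius
    h = np.exp(-grid_dist ** 2 / (2 * sigma ** 2))
    print(f"sigma={sigma}: {np.round(h, 3)}")
# sigma=3.0: weights stay close to 1 across the map -> units move together
# sigma=0.3: weights drop to ~0 beyond the BMU      -> units fine-tune locally
```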