
Clustering

Clustering Process

After calculating the topological metrics of each artist, we applied a clustering algorithm to group artists with similar topological features and, consequently, similar collaboration profiles. We use the K-means clustering algorithm, one of the simplest and most commonly used clustering methods. One of the first steps when using this algorithm is to define the number of clusters, k. To identify the optimal number of clusters, we use a common solution: the Elbow method. In the chart below, the curve decreases as k increases, but a bend (or "elbow") can be seen at k = 3. This bend indicates that clusters beyond the third yield only marginal improvements, which do not justify increasing k. Therefore, k = 3 is our optimal number of clusters.
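To make the Elbow method concrete, here is a minimal sketch in plain Python. It is not the code used in this study: the data are synthetic 2-D points (three well-separated blobs standing in for the artists' topological metrics), the K-means routine is a bare Lloyd's iteration with a deterministic farthest-first seeding chosen only for reproducibility, and picking the elbow by the largest second difference of the curve is just one possible heuristic. The sketch assumes the curve tracks the within-cluster sum of squared distances, which is the usual choice. All function and variable names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def farthest_first(points, k):
    """Assumed seeding: start at the first point, then repeatedly take
    the point farthest from all seeds chosen so far."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(sq_dist(p, s) for s in seeds)))
    return seeds

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia

# Synthetic data: three blobs of 30 points each (illustrative only).
rng = random.Random(42)
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in centers for _ in range(30)]

# Within-cluster sum of squares for k = 1..6; inertias[i] belongs to k = i + 1.
inertias = [kmeans(points, k)[2] for k in range(1, 7)]

# Elbow heuristic: the k where the curve bends most sharply
# (largest second difference), evaluated for k = 2..5.
curvature = {k: inertias[k - 2] - 2 * inertias[k - 1] + inertias[k]
             for k in range(2, 6)}
best_k = max(curvature, key=curvature.get)

for k, value in enumerate(inertias, start=1):
    print(f"k={k}: inertia={value:.1f}")
print("elbow at k =", best_k)
```

On real data the bend is rarely this sharp, so the curve is usually inspected visually as well, as done with the chart in this section.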

Results of Clustering

The figure on the right shows the clustering results, with each data point colored according to its cluster assignment. The results show a natural division between communities.

This cluster presents high metric values related to the Interaction, Distance and Influence categories. However, it has a median value for Similarity.

This cluster presents high values only in Distance and Similarity.

This cluster presents null values for all categories.

K-Means

Identifying Clusters' Collaboration Profiles

The radar chart below characterizes the profile of each cluster.
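As a rough illustration of how such a profile can be derived, the sketch below averages each metric category per cluster, giving one value per radar axis. The rows and scores are hypothetical placeholders that only mirror the qualitative descriptions above (high, median, null); they are not the study's data, and the helper name cluster_profiles is an assumption.

```python
from collections import defaultdict
from statistics import mean

CATEGORIES = ("Interaction", "Distance", "Influence", "Similarity")

# Hypothetical rows: (cluster label, normalized score per category).
rows = [
    (0, {"Interaction": 0.9, "Distance": 0.8, "Influence": 0.9, "Similarity": 0.5}),
    (0, {"Interaction": 0.7, "Distance": 0.8, "Influence": 0.7, "Similarity": 0.5}),
    (1, {"Interaction": 0.1, "Distance": 0.9, "Influence": 0.1, "Similarity": 0.8}),
    (1, {"Interaction": 0.1, "Distance": 0.7, "Influence": 0.1, "Similarity": 0.8}),
    (2, {"Interaction": 0.0, "Distance": 0.0, "Influence": 0.0, "Similarity": 0.0}),
    (2, {"Interaction": 0.0, "Distance": 0.0, "Influence": 0.0, "Similarity": 0.0}),
]

def cluster_profiles(rows):
    """Mean score per category for each cluster (one radar polygon each)."""
    grouped = defaultdict(list)
    for label, scores in rows:
        grouped[label].append(scores)
    return {
        label: {cat: mean(s[cat] for s in members) for cat in CATEGORIES}
        for label, members in sorted(grouped.items())
    }

profiles = cluster_profiles(rows)
for label, profile in profiles.items():
    print(label, profile)
```

Each resulting dictionary corresponds to one polygon on the radar chart, with one axis per category.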


Comparing the results

Now we can compare and identify which collaboration profile each cluster belongs to.