I recently introduced the method of hyperlink analysis. In this post, I would like to show one way to apply this method, together with social network analysis, in web analytics.
Hyperlink analysis is based on the passive presence of a hyperlink between two websites. It does not matter whether any user has ever followed that hyperlink. Hyperlink analysis is therefore a suitable method for revealing intentional relationships between the actors in a given hyperlink network.
On the other hand, if we focus on a single website and have access to its web analytics data, we can use the same method to see how the website's network of pages actually performs.
Let's demonstrate that with an example.
First, I downloaded the browsing activity of a website's users via the Google Analytics API. I requested the Page and Previous Page Path dimensions and the Pageviews metric. (Which metrics and dimensions are the most appropriate is open to discussion.)
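As an illustration only, here is roughly what such a request could look like. The post does not say which version of the API was used, so this sketch assumes the Universal Analytics Reporting API v4, where the two dimensions are named ga:previousPagePath and ga:pagePath and the metric is ga:pageviews. The view ID and date range are made-up placeholders, and the function only builds the request body without sending it.

```python
def build_report_request(view_id, start_date, end_date):
    """Build a Reporting API v4 request body for page-to-page transitions.

    Assumption: Universal Analytics field names. The function name and
    arguments are hypothetical; nothing is sent over the network here.
    """
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                # One row per (previous page, page) pair.
                "dimensions": [
                    {"name": "ga:previousPagePath"},
                    {"name": "ga:pagePath"},
                ],
                # Number of pageviews recorded for each pair.
                "metrics": [{"expression": "ga:pageviews"}],
                "pageSize": 10000,
            }
        ]
    }


# Placeholder view ID and a relative date range.
body = build_report_request("123456789", "30daysAgo", "today")
```

Each row of the response then describes one transition: where the user came from, where they went, and how often that happened.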
Second, as with any data analysis, the data had to be cleaned, for example by grouping similar page URLs.
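The exact cleaning rules depend on the website, so the following is only a sketch of what this step might involve. It drops query strings and fragments, unifies case and trailing slashes, and collapses URLs that match a pattern into one group. The patterns in GROUP_RULES and the sample rows are invented for the example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical grouping rules: the first matching pattern wins.
GROUP_RULES = [
    (re.compile(r"^/blog/\d{4}/\d{2}/[^/]+$"), "/blog/{post}"),
    (re.compile(r"^/product/\d+$"), "/product/{id}"),
]


def normalize_path(raw):
    """Strip query and fragment, unify case and trailing slash, then group."""
    path = urlsplit(raw).path.lower()
    if len(path) > 1:
        path = path.rstrip("/")
    for pattern, label in GROUP_RULES:
        if pattern.match(path):
            return label
    return path or "/"


def clean_rows(rows):
    """Sum pageviews over rows whose cleaned (previous page, page) pair is equal."""
    totals = Counter()
    for previous, page, pageviews in rows:
        totals[(normalize_path(previous), normalize_path(page))] += int(pageviews)
    return totals


# Invented sample rows: (previous page path, page path, pageviews).
rows = [
    ("/", "/Product/42/?utm_source=mail", "3"),
    ("/", "/product/7", "2"),
    ("/product/7", "/about/", "1"),
]
cleaned = clean_rows(rows)
```

In this sample, the first two rows end up in the same pair because both product URLs fall into the /product/{id} group.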
Third, I created node and edge tables. Every page URL that appeared in the Page or Previous Page Path column became a node. Every pair of Previous Page Path and Page with at least one pageview became an edge.
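The two tables can be sketched in a few lines. The code below assumes the cleaned data is a dictionary that maps (previous page, page) pairs to pageviews, and it writes the tables as CSV with the column names that Gephi's spreadsheet import recognises (Id and Label for nodes; Source, Target, Type and Weight for edges). The function names and the sample data are hypothetical.

```python
import csv
import io


def build_tables(transitions):
    """Build node and edge tables from {(previous_page, page): pageviews}."""
    # Every URL seen in either column becomes a node.
    nodes = sorted({url for pair in transitions for url in pair})
    # Every pair with at least one pageview becomes a directed, weighted edge.
    edges = [
        (previous, page, views)
        for (previous, page), views in sorted(transitions.items())
        if views >= 1
    ]
    return nodes, edges


def to_csv(header, rows):
    """Serialise a header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample data; "(entrance)" is the value Google Analytics reports
# as the previous page of the first page in a session.
transitions = {
    ("(entrance)", "/"): 10,
    ("/", "/about"): 4,
    ("/about", "/contact"): 0,
}
nodes, edges = build_tables(transitions)
nodes_csv = to_csv(["Id", "Label"], [(n, n) for n in nodes])
edges_csv = to_csv(
    ["Source", "Target", "Type", "Weight"],
    [(source, target, "Directed", weight) for source, target, weight in edges],
)
```

Using pageviews as the edge weight keeps the information about how heavily each transition is used, which a layout or a ranking can later take into account.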
Finally, I imported these tables into Gephi, applied basic algorithms and visualized the network. The resulting network looked like this (the page titles were anonymised):
This is the point at which we can talk about traffic hubs and dead ends. Even the simplest visual analysis of the network reveals several browsing paths that end with the user leaving the website (for example, the blue path on the left or the brown path at the bottom of the diagram). These paths might be worth exploring in more detail.
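The same two ideas can also be checked numerically. The sketch below is not the visual analysis described above but a simple stand-in for it: it treats a page as a dead end when no onward transition was recorded from it, and ranks hubs by the total pageviews flowing in and out of a page. Both criteria, the function name and the sample edges are assumptions made for the example.

```python
from collections import defaultdict


def hubs_and_dead_ends(edges, top=2):
    """Return (hubs, dead_ends) for a list of (source, target, weight) edges."""
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    for source, target, weight in edges:
        outflow[source] += weight
        inflow[target] += weight
    nodes = set(inflow) | set(outflow)

    # Dead end: traffic arrives, but no onward transition was recorded.
    dead_ends = sorted(n for n in nodes if outflow.get(n, 0) == 0)

    # Hub: highest combined incoming and outgoing pageviews (ties broken by name).
    def total(node):
        return inflow.get(node, 0) + outflow.get(node, 0)

    hubs = sorted(nodes, key=lambda n: (-total(n), n))[:top]
    return hubs, dead_ends


# Invented sample edges: (previous page, page, pageviews).
edges = [
    ("/", "/pricing", 40),
    ("/", "/blog", 30),
    ("/", "/old-promo", 20),
    ("/blog", "/pricing", 10),
    ("/pricing", "/signup", 5),
]
hubs, dead_ends = hubs_and_dead_ends(edges)
```

A dead end is not necessarily a problem: in this sample, /signup may simply be the intended last step, while /old-promo is the kind of page that deserves a closer look.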
Let's look at the centre of the network in more detail.
Again, even the most basic visual analysis suggests several leads for further analysis:
Several blue nodes sit at the very centre of the network. It might be worth exploring why these pages eventually lead to users leaving the website.
The centre of the network is very important for user behavior, as the majority of nodes are connected to it.
Several pages connect the green cluster with the main pink cluster.
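The last observation, pages that connect two clusters, can be approximated in code once every page has a cluster label. The sketch below assumes such labels are available as a dictionary (for example, taken from a community detection run in Gephi) and simply collects the endpoints of every edge that crosses between the two clusters. The function name, cluster labels and sample data are invented.

```python
def bridging_pages(edges, cluster_of, cluster_a, cluster_b):
    """Return pages that sit on an edge running between the two clusters."""
    bridges = set()
    for source, target, _weight in edges:
        # An edge crosses when its endpoints carry exactly the two labels.
        if {cluster_of[source], cluster_of[target]} == {cluster_a, cluster_b}:
            bridges.update((source, target))
    return sorted(bridges)


# Hypothetical cluster labels per page.
cluster_of = {
    "/": "pink",
    "/pricing": "pink",
    "/signup": "pink",
    "/blog": "green",
    "/blog/post": "green",
    "/docs": "green",
}
# Invented sample edges: (previous page, page, pageviews).
edges = [
    ("/", "/pricing", 40),
    ("/", "/blog", 30),
    ("/blog", "/blog/post", 25),
    ("/blog/post", "/docs", 5),
    ("/docs", "/pricing", 3),
    ("/pricing", "/signup", 5),
]
bridges = bridging_pages(edges, cluster_of, "green", "pink")
```

This is a deliberately crude criterion. A centrality measure such as betweenness centrality would rank pages by how much they connect the rest of the network, at the cost of being harder to read off directly.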
In this way, social network analysis gives us several insights into user behavior, which we can then test with more detailed tools or methods. There is still plenty to improve in this method, however, so feel free to leave a comment, or stay tuned for further articles on this topic.