The anatomy of a population-scale social network
July 21, 20:00 – 21:00 UTC+2
Conference poster presentation by Eszter Bokányi at IC2S2 at the Harper Center of the Booth School of Business at the University of Chicago.
Authors: Eszter Bokányi, Yuliia Kazmina, Rachel de Jong, Frank Takes and Eelke Heemskerk
The analysis of large-scale societal networks has recently seen tremendous growth, in part because of the relative abundance of digital data sources such as online social networks or mobile communication datasets [1–3]. However, most of these data sources lack demographic data on users, or it is uncertain how representative the user sample is. Moreover, it is often unclear which social relations these online or communication ties represent, which makes findings difficult to interpret [4].
We overcome a number of these drawbacks by presenting a thorough overview of the structure of a 17M-node population-scale social network of a European country containing roughly 1.6B edges. This network is derived from highly curated official data sources of the country’s national statistics institute. As such, it includes every resident registered on a certain day in 2018. In addition, rich individual-level demographic and socio-economic attributes on the nodes are available alongside the network structure, as well as the precise type of each social relationship we observe: family, household, work, school, or neighbor relationship, each extracted from country-level register data. Just as typical (online) social network data may suffer from missing connections, the population-scale network studied here may miss informal friendship connections that its formal ties do not capture. However, we know that the network contains all nodes (people), and that the data are very complete for the types of connections it covers, which is a unique setting in social network analysis research. In this work, we present first results on how such a high-quality population-scale social network is markedly different from many of the large-scale social networks we typically study. In particular, we do so by revisiting the well-known concept of closure in a population-scale social network context.
First, we show how the degree distribution of this network is a composition of the degree distributions of the different types of edges. In the overall degree distribution, we find a characteristic value that is in sharp contrast to the scale-free or other fat-tailed distributions found in online social networks or communication networks [5]. Second, we discuss different types of clustering in this multilayer network, and show how closed or open network structures emerge for people of certain ages. In particular, we introduce a normalized multilayer clustering coefficient, which we call excess closure, that captures the fraction of triangles in people’s social circles that span multiple types of relationships.
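As a rough illustration of the two measurements described above, the following Python sketch computes per-layer degrees and the share of triangles around a person that span more than one relationship type. The abstract does not specify how excess closure is normalized, so `mixed_triangle_share` is only an unnormalized stand-in and not the authors' definition. The function names, the toy layers, and the rule that a triangle counts as single-layer only when all three of its edges share a layer are assumptions made for this example.

```python
from itertools import combinations


def build_adjacency(layers):
    """Map node -> neighbor -> set of layer names in which the tie exists.

    `layers` is a dict of layer name -> iterable of undirected (u, v) pairs.
    """
    adj = {}
    for name, edges in layers.items():
        for u, v in edges:
            adj.setdefault(u, {}).setdefault(v, set()).add(name)
            adj.setdefault(v, {}).setdefault(u, set()).add(name)
    return adj


def layer_degrees(adj, ego):
    """Return (overall degree, degree per layer) for one person."""
    ties = adj.get(ego, {})
    per_layer = {}
    for layer_names in ties.values():
        for name in layer_names:
            per_layer[name] = per_layer.get(name, 0) + 1
    # Overall degree counts distinct neighbors, so overlapping ties
    # (e.g. family and household) are counted once.
    return len(ties), per_layer


def mixed_triangle_share(adj, ego):
    """Share of closed triangles through `ego` that need more than one layer.

    Assumption: a triangle is single-layer if some layer contains all
    three of its edges; otherwise it spans multiple relationship types.
    """
    ties = adj.get(ego, {})
    total = mixed = 0
    for a, b in combinations(ties, 2):
        if b not in adj[a]:
            continue  # open triad: the two neighbors are not connected
        total += 1
        shared = ties[a] & ties[b] & adj[a][b]
        if not shared:
            mixed += 1
    return mixed / total if total else None


# Hypothetical toy data, not taken from the register network.
toy_layers = {
    "family": [("ann", "bob"), ("ann", "cat"), ("bob", "cat")],
    "household": [("ann", "bob")],
    "work": [("ann", "dan")],
    "school": [("bob", "dan")],
}
adj = build_adjacency(toy_layers)
print(layer_degrees(adj, "ann"))
print(mixed_triangle_share(adj, "ann"))
```

In the toy data, the overall degree of `ann` is smaller than the sum of her per-layer degrees because one tie appears in two layers, which is why the overall distribution is a composition of the layer distributions rather than a plain sum.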
Figure 1 shows how degree and excess closure change with age (a demographic attribute) in the population. Young children have low degrees and very high excess closure, since they are only part of family, neighborhood, and household structures. Subsequent levels of education, paired with working opportunities, come with an increasing median degree and a decreasing excess closure, which reaches its minimum around university age. Working years are characterized by a slight increase in closure and a gradually decreasing degree, giving way to low degrees and increased closure in retirement years. Finally, we find that long-range ties that span large distances are very scarce in this network: only 0.02% of all edges are not part of any triangle. This is in contrast to findings in online social networks, and it does not promote fast and efficient diffusion processes over this structure.
Figure 1. Median degree (red) and median excess closure (blue) in ego networks of people of a certain age. Shaded areas correspond to the 25th and 75th percentiles for each age year.
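The share of edges that are not part of any triangle, reported above as 0.02%, can be measured along the following lines. This is a minimal sketch that assumes all relationship layers are collapsed into one undirected graph; the function name and the toy edge list are hypothetical, and the code says nothing about how the authors handled the 1.6B-edge network in practice.

```python
def share_of_edges_without_triangle(edges):
    """Fraction of distinct undirected edges whose endpoints share no neighbor."""
    adj = {}
    unique_edges = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
        unique_edges.add(frozenset((u, v)))
    if not unique_edges:
        return None
    without_triangle = 0
    for edge in unique_edges:
        u, v = tuple(edge)
        # An edge lies on a triangle exactly when its endpoints
        # have at least one common neighbor.
        if not (adj[u] & adj[v]):
            without_triangle += 1
    return without_triangle / len(unique_edges)


# Hypothetical toy graph: one triangle (a, b, c) plus a bridge to d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(share_of_edges_without_triangle(toy_edges))
```

Only the bridge to `d` has endpoints without a common neighbor, so one of the four toy edges lies outside every triangle.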
In conclusion, our results show a sharp transition from closed to open network structures as young adults engage in higher levels of education, and a reverse process as people retire. The findings empirically confirm, using large-scale data, that individuals have very different resource structures throughout their lives, which affects their access to opportunities and information [6]. Our measurements are first steps in building both methods and universal insights on the rich network structure of highly curated population-level network datasets.