Chapter 6 Homophily and Heterophily
6.1 Assortativity
Assortativity is a statistical measure of how strongly vertices tend to connect to other vertices that share a given attribute. The assortativity coefficient ranges from -1 to +1. A value of 0 indicates no overall preference for individuals of the same category. A value of +1 indicates that individuals attach only to individuals of the same category, while a value of -1 indicates that individuals attach only to individuals of other categories.
## [1] 0.008153553
## [1] 0.041119
In the US Politician network, the assortativity coefficients for both categorical variables under consideration (class and party) are positive.
The tendency towards connection is very balanced. The assortativity coefficient is positive if vertices with a similar attribute tend to connect with each other, and negative otherwise. In the US Politician network, the assortativity by party and by class is close to 0, signaling that, given the distribution of nodes across categories, ties between members of the same category are barely more frequent than ties between members of different categories.
Because both coefficients are close to 0, the network shows almost no assortative mixing by either attribute.
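The coefficient described in this section can be illustrated with a minimal Python sketch. It assumes an undirected edge list and a dictionary of node labels (both hypothetical inputs, not the chapter's data), and it uses Newman's mixing-matrix formulation of categorical assortativity. It is not the code that produced the values above.

```python
from collections import defaultdict


def categorical_assortativity(edges, label):
    """Assortativity coefficient for a categorical node attribute.

    edges: iterable of (u, v) pairs, treated as undirected (assumption).
    label: dict mapping every node to its category.
    """
    mixing = defaultdict(float)
    total = 0
    for u, v in edges:
        # Count each undirected edge in both directions so the
        # mixing matrix is symmetric.
        mixing[(label[u], label[v])] += 1
        mixing[(label[v], label[u])] += 1
        total += 2

    categories = {c for pair in mixing for c in pair}
    # Normalise counts into fractions of edge ends.
    e = {pair: count / total for pair, count in mixing.items()}

    # Fraction of edge ends that join two nodes of the same category.
    trace = sum(e.get((c, c), 0.0) for c in categories)
    # Row sums: fraction of edge ends attached to each category.
    a = {c: sum(e.get((c, d), 0.0) for d in categories) for c in categories}
    # The matrix is symmetric, so row and column sums coincide.
    expected = sum(a[c] ** 2 for c in categories)

    # Undefined (division by zero) when every edge end has the same category.
    return (trace - expected) / (1 - expected)
```

On a toy graph where every edge joins two nodes of the same category the function returns +1, and where every edge joins nodes of different categories (with two equally sized categories) it returns -1.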
6.2 Homophily
In sociology, homophily is the principle that contact between similar people occurs at a higher rate than contact between dissimilar people. It can be summed up with the idiomatic expression “birds of a feather flock together”, meaning that individuals of similar character, taste, or background tend to stay together.
In network analysis, a graph is homophilic if nodes with similar characteristics tend to connect to each other. Whether a graph is homophilic or not is determined by computing three measures: connectance, dyadicity, and heterophilicity.
In the political context, and in the case of Twitter, we hypothesize that politicians who belong to the same party, or who work together in the same chamber, tend to follow each other.
A useful measure to evaluate the degree of homophily is connectance. It measures the strength of the connections between the elements of a network. In statistical terms, it represents the average probability that two vertices are connected.
It is computed by multiplying the number of edges by two and then dividing this value by the product of the number of nodes and the number of nodes minus 1, that is, 2m / (n(n - 1)) for a network with n nodes and m edges.
## [1] 0.474833
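As a minimal sketch of the connectance formula described above, the following Python function takes the node and edge counts as plain integers. The function name and the toy numbers are illustrative assumptions. The formula treats the network as undirected, as the description in this section does.

```python
def connectance(n_nodes, n_edges):
    """Connectance of an undirected network: 2m / (n(n - 1))."""
    if n_nodes < 2:
        raise ValueError("connectance needs at least two nodes")
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Toy example: 4 nodes joined by 3 edges out of 6 possible pairs.
p = connectance(4, 3)
```

Here `p` is 0.5, because half of the possible node pairs are connected.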
6.3 Dyadicity
Dyadicity is the connectedness between nodes with the same label compared to what is expected in a random configuration of the network. It is computed by dividing the actual number of same-label edges by the expected number of same-label edges.
When dyadicity is higher than 1, the community is said to be dyadic, meaning that nodes with that property tend to connect with each other more than expected. If it is lower than 1, the community is said to be anti-dyadic, meaning that nodes with that property connect with each other less than expected. If the dyadicity is close to 1, the same-label connections are distributed as in a random network.
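The ratio described above can be sketched in Python as follows. The sketch assumes the usual definition of the expected count, namely the number of possible pairs among the labelled nodes multiplied by the connectance `p`. The function and parameter names are hypothetical, and returning NaN when no same-label pair is possible is a design choice of this sketch.

```python
import math


def dyadicity(n_with_label, n_same_label_edges, p):
    """Observed same-label edges divided by the expected number.

    n_with_label: number of nodes carrying the label.
    n_same_label_edges: observed edges joining two labelled nodes.
    p: connectance of the whole network.
    """
    # Possible pairs among labelled nodes, each linked with probability p.
    expected = n_with_label * (n_with_label - 1) / 2 * p
    if expected == 0:
        # No same-label pair can exist, so the ratio is undefined.
        return math.nan
    return n_same_label_edges / expected
```

For example, with 4 labelled nodes, 6 observed same-label edges, and a connectance of 0.5, the expected count is 3 and the dyadicity is 2, so the group would be dyadic.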
6.4 Counting nodes and edges
In order to compute the dyadicity among the various party affiliations and politician classes, I first need to count the number of nodes for each party and class, and the number of edges that connect vertices with the same property.
While it is straightforward to retrieve the number of nodes for each community, counting the same-label edges requires some additional transformation. Using dplyr, I created new edge attributes that describe the politician class and the party affiliation of both the node of origin and the node of arrival.
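The counting step can be illustrated with a short Python sketch. It is not the dplyr code used for this analysis: the node names, labels, and edges are toy data invented for the example. The idea is the same, though: look up the label of both endpoints of every edge, then tally edges whose endpoints share a label separately from edges whose endpoints differ.

```python
from collections import Counter


def count_nodes_and_edges(edges, label):
    """Count nodes per label, same-label edges, and cross-label edges.

    edges: iterable of (origin, arrival) node pairs.
    label: dict mapping every node to its label (e.g. party or class).
    """
    node_counts = Counter(label.values())
    same_label = Counter()
    cross_label = Counter()
    for origin, arrival in edges:
        lab_from, lab_to = label[origin], label[arrival]
        if lab_from == lab_to:
            same_label[lab_from] += 1
        else:
            # A cross-label edge counts once for each of the two labels.
            cross_label[lab_from] += 1
            cross_label[lab_to] += 1
    return node_counts, same_label, cross_label


# Toy data (hypothetical): four politicians and three ties.
toy_label = {"a": "Dem", "b": "Dem", "c": "Rep", "d": "Ind"}
toy_edges = [("a", "b"), ("a", "c"), ("c", "d")]
nodes, same, cross = count_nodes_and_edges(toy_edges, toy_label)
```

In the toy data there is one same-label edge (between the two Democrats), while the Republican node takes part in two cross-label edges.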
Now it is possible to compute the dyadicity for party and class.
6.5 Dyadicity by party
## DemDyad IndDyad RepDyad
## NaN NaN NaN
All of the dyadicity values computed for the different parties are higher than 1. This means that for each of the properties taken into consideration (Democratic, Independent, or Republican), nodes tend to be connected with each other. On the other hand, the lower value of the Republican dyadicity underlines that Republicans tend to be less connected with each other than Democrats or Independents.
6.6 Dyadicity by class
## CabDyad HouDyad SenDyad
## 1.230002 1.240750 2.531034
The dyadicity values computed for the different politician classes are higher than 1. This means that within each class (Senate, House, Cabinet), nodes tend to be connected with each other. However, the higher value of the Senate dyadicity (2.53, against 1.23 for the Cabinet and 1.24 for the House) underlines that Senators tend to be more connected with each other than Cabinet or House of Representatives members.
6.7 Heterophilicity
Heterophilicity is the connectedness between nodes with different labels compared to what is expected in a random configuration of the network. It is computed by dividing the actual number of cross-label edges by the expected number of cross-label edges.
When heterophilicity is higher than 1, the community is said to be heterophilic, meaning that nodes with that property tend to connect with nodes with other labels. If it is lower than 1, the community is said to be heterophobic, meaning that its vertices tend not to connect with vertices with different labels. If the heterophilicity is 1, the cross-label connections are distributed as in a random network.
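The ratio described above can be sketched in Python in the same spirit. The sketch assumes the usual definition of the expected count: the number of possible pairs between labelled and unlabelled nodes multiplied by the connectance `p`. The function and parameter names are hypothetical, and returning NaN when no cross-label pair is possible is a design choice of this sketch.

```python
import math


def heterophilicity(n_with_label, n_total, n_cross_label_edges, p):
    """Observed cross-label edges divided by the expected number.

    n_with_label: number of nodes carrying the label.
    n_total: number of nodes in the whole network.
    n_cross_label_edges: observed edges joining a labelled node
        to a node with any other label.
    p: connectance of the whole network.
    """
    # Possible labelled/unlabelled pairs, each linked with probability p.
    expected = n_with_label * (n_total - n_with_label) * p
    if expected == 0:
        # No cross-label pair can exist, so the ratio is undefined.
        return math.nan
    return n_cross_label_edges / expected
```

For example, with 2 labelled nodes in a network of 6, a connectance of 0.5, and 2 observed cross-label edges, the expected count is 4 and the heterophilicity is 0.5, so the group would be heterophobic.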
6.8 Heterophilicity by party
## DemHeter IndHeter RepHeter
## NaN NaN NaN
The heterophilicity values computed for Democrats and Republicans are lower than 1, meaning that these two groups of nodes are heterophobic: their nodes tend to avoid connecting with nodes with different labels. On the other hand, the heterophilicity for Independents is higher than 1, meaning that this group is heterophilic: Independent nodes connect with nodes of a different party affiliation more often than expected. The heterophilic character of Independent politicians may be explained by their small number (4) compared to the rest of the sample population.
6.9 Heterophilicity by class
## CabHeter HouHeter SenHeter
## 0.3139436 0.1077531 0.4270091
The heterophilicity values computed for members of the Presidential Cabinet, the House of Representatives, and the Senate are all lower than 1, meaning that all three groups of nodes are heterophobic: their nodes tend to avoid connecting with nodes with different labels. The effect is strongest for the House of Representatives (0.11), while Cabinet members (0.31) and Senators (0.43) connect with other classes somewhat more often, though still less than expected in a random configuration of the network.