Thinking the unthinkable: can humans do it with the help of machines? This study group investigates the duality between reasoning and the geometry of data representations learned by artificial neural networks. Based on this investigation, we will discuss how to achieve intelligence augmentation.
“Just as there are odors that dogs can smell and we cannot, as well as sounds that dogs can hear and we cannot, so too there are wavelengths of light we cannot see and flavors we cannot taste. Why then, given our brains wired the way they are, does the remark ‘Perhaps there are thoughts we cannot think,’ surprise you?
Evolution, so far, may possibly have blocked us from being able to think in some directions; there could be unthinkable thoughts.”
- Richard Hamming, The Unreasonable Effectiveness of Mathematics
1. Machines, and not humans, define computability
1.1 Leibniz: symbolizing thought to settle disputes
1.2 Church-Turing thesis on "effective computability": Turing machines, the lambda calculus, and general recursive functions
1.3 Three schools of the foundations of mathematics: D. Hilbert - formalism, L.E.J. Brouwer - intuitionism, and B. Russell - logicism
2. Shapes know the answers: landscape is intelligence
2.1 Two worldviews: Newtonian (complex dynamics + simple geometry) vs Einsteinian (simple dynamics + complex geometry)
2.2 Introduction to artificial neural networks: machines see worlds differently to solve puzzles
What is intelligence? How does it emerge from such a disordered universe? Can we design a system that exhibits intelligent behavior by extracting information from the environment and hard-wiring it into topology? The Hopfield network is an innovative brainchild of these questions. It suggests looking at machine learning tasks from the perspective of statistical mechanics and creates the tradition of energy-based learning. It connects information and energy by representing information with a linking structure on which an energy function is defined: "correct" structures have low energy, whereas "wrong" structures have high energy. The system evolves stochastically and converges to low-energy states, exhibiting resilience to perturbation, or "remembering."
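As a concrete illustration, the Hebbian storage rule and the energy descent described above can be sketched in a few lines of plain Python (a minimal toy, not Hopfield's original presentation; the function names are my own):

```python
def train_hebbian(patterns):
    """Store binary (+1/-1) patterns via the Hebbian rule: w_ij ∝ Σ_p s_i s_j."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def energy(w, s):
    """E(s) = -1/2 Σ_ij w_ij s_i s_j; stored patterns sit in low-energy basins."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def recall(w, s, sweeps=10):
    """Asynchronous updates: set each unit to the sign of its input field.
    Every flip lowers (or keeps) the energy, so the state settles into a minimum."""
    s = list(s)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s

pattern = [1, -1, 1, -1, 1, -1]
w = train_hebbian([pattern])
noisy = [-pattern[0]] + pattern[1:]        # corrupt one unit
print(recall(w, noisy) == pattern)         # True: the network "remembers"
```

The corrupted state has higher energy than the stored pattern, and the update dynamics roll the state downhill into the pattern's basin of attraction.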
How do we evolve an energy-based learning system toward the state of minimum energy or maximum entropy? A naive way is to run a Monte Carlo simulation and count on chance. In this paper, Hinton proved that the Hebbian rule, also known as "fire together, wire together," can be used to evolve the system. It minimizes the Kullback-Leibler divergence (information gain) between the model (system) and the data (environment). This paper also inspires us to ask: if we evolve (neural) networks to learn, then what are evolving (social) networks learning?
Modularity (Q) measures the degree to which a network has community structure: nodes are more likely to connect within communities than between them. Random networks have low Q values, whereas most real-world networks have high Q values. Note that the formula for modularity is very similar to the energy function defined in the Hopfield network. Can we assume that social networks are continuously learning from the environment and forming communities to store information? According to the principle of energy-based learning, systems converge to the lowest-energy state. Can Q modularity be understood as an energy formula describing networks?
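For reference, the standard Newman-Girvan formula is Q = (1/2m) Σ_ij [A_ij - k_i k_j / (2m)] δ(c_i, c_j): the A_ij term rewards realized same-community ties, while the k_i k_j term subtracts the random-graph expectation, much like an energy. A direct plain-Python computation (the toy graph and function name are my own):

```python
def modularity(edges, community):
    """Q = (1/2m) Σ_ij [A_ij - k_i*k_j/(2m)] δ(c_i, c_j).
    edges: undirected (u, v) pairs; community: dict node -> label."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = 0.0
    for u, v in edges:                     # A_ij term; each edge counted twice
        if community[u] == community[v]:
            q += 2.0
    for i in degree:                       # null-model expectation k_i*k_j/(2m)
        for j in degree:
            if community[i] == community[j]:
                q -= degree[i] * degree[j] / (2.0 * m)
    return q / (2.0 * m)

# Two triangles joined by a single bridge edge, split into their natural halves:
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(modularity(edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}))  # 5/14 ≈ 0.357
```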
Understanding individual behavior through the social groups people belong to is one of the dominant perspectives of the social sciences. Blau space is a geometric model of this perspective: a Euclidean space whose dimensions are demographic variables such as race, gender, and income. Linear regression models under the i.i.d. assumption are applications of this model. An individual is a data point in this high-dimensional space, and the statement that "birds of a feather flock together" (McPherson et al., 2001) assumes that points close to each other are more likely to connect. Meanwhile, social organizations compete with one another for the same niches in Blau space.
Harrison White was among the first scholars to introduce the concept of blockmodeling (White et al., 1976). This model suggests that the linking behavior of nodes is determined by the underlying social "blocks" to which they belong; within a block, nodes are interchangeable. Stochastic block models (SBMs) connect blockmodels to the Ising model and use Monte Carlo simulation to infer the most likely block structure.
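Generating a graph from an SBM makes the "interchangeable within a block" idea concrete, and inference reverses the process by searching for the block assignment that makes the observed edges most likely. A generative sketch in plain Python (the parameter names p_in/p_out and the example are my own):

```python
import random

def sample_sbm(block_of, p_in, p_out, seed=0):
    """Draw one graph from a stochastic block model: an edge between two nodes
    appears with probability p_in if they share a block and p_out otherwise,
    so nodes within a block are statistically interchangeable."""
    rng = random.Random(seed)
    nodes = sorted(block_of)
    edges = []
    for idx, i in enumerate(nodes):
        for j in nodes[idx + 1:]:
            p = p_in if block_of[i] == block_of[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return edges

# 20 nodes in two blocks of 10, dense within blocks and sparse between them:
edges = sample_sbm({i: i // 10 for i in range(20)}, p_in=0.9, p_out=0.05)
within = sum(1 for u, v in edges if u // 10 == v // 10)
print(within, len(edges) - within)   # far more within-block than between-block edges
```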
The DeepWalk model is based on a neural language model called word2vec (Mikolov et al., 2013). Word2vec represents words as vectors by embedding them in a high-dimensional Euclidean space; it uses an artificial neural network as an optimization algorithm to find positions for words that best predict the words nearby. DeepWalk simulates random walks on social networks, generating sequences of nodes that can be fed to the word2vec model to predict community structure. The word2vec model follows the "distributional hypothesis," defining the meaning of a word by its neighbors. Similarly, DeepWalk defines the social role of an individual by the surrounding nodes.
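The walk-generation step can be sketched as follows (a toy adjacency list and my own function name; in practice the resulting "sentences" are passed to an off-the-shelf word2vec implementation):

```python
import random

def random_walks(adj, walk_len=5, walks_per_node=2, seed=0):
    """Truncated random walks on a graph. Each walk is a 'sentence' of node
    ids, so nodes that co-occur on walks end up with nearby embeddings, just
    as co-occurring words do under the distributional hypothesis."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
for walk in random_walks(adj):
    print(walk)   # each walk is one "sentence" of node ids
```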
Word embeddings capture the cultural dimensions (e.g., man vs. woman, rich vs. poor, Black vs. White, liberal vs. conservative) underlying collective narratives and reveal how society thinks and what society knows. They also model the hierarchy of language. For example, nouns like "man" or "woman" can be modeled as vectors, while adjectives and verbs such as "masculine" or "feminine," which operate on nouns, can be represented as vector differences. This provides insight into the hierarchy of society and the functions of individuals.
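The "adjective as vector difference" idea can be illustrated with hand-constructed toy vectors (these numbers are purely illustrative, not trained embeddings):

```python
# Toy 2-d "embeddings": dimension 0 is a gender axis, dimension 1 a status axis.
vec = {
    "man":   (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "king":  (1.0, 1.0),
    "queen": (-1.0, 1.0),
}

def diff(a, b):
    """Vector difference a - b: the 'operation' turning one noun into another."""
    return tuple(x - y for x, y in zip(a, b))

# The masculine-to-feminine operation is the same displacement whether it is
# applied to "man" or to "king":
print(diff(vec["woman"], vec["man"]))    # (-2.0, 0.0)
print(diff(vec["queen"], vec["king"]))   # (-2.0, 0.0)
```

In trained embeddings, the analogous observation is that such difference vectors are approximately, not exactly, shared across word pairs.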
Blau space and blockmodels use demographic variables to predict social connections. Graph neural networks (GNNs), in contrast, infer demographic variables (node features) from social networks. Unlike DeepWalk, a GNN is a supervised machine learning technique: it assumes and models the diffusion of labels between nodes.
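A single message-passing step, the core operation a GNN repeats layer by layer, can be sketched as neighborhood averaging (a simplified, weight-free GCN-style layer; the names are my own):

```python
def message_pass(adj, features):
    """One aggregation step: each node's new feature is the mean of its own
    and its neighbors' features, so information diffuses one hop per step."""
    new = {}
    for v in adj:
        pool = [features[u] for u in adj[v]] + [features[v]]
        new[v] = sum(pool) / len(pool)
    return new

# A chain 0-1-2 where only node 0 carries the signal:
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = message_pass(adj, {0: 1.0, 1: 0.0, 2: 0.0})
print(feats)   # node 0 keeps half its signal, node 1 picks up a third, node 2 none yet
```

A real GNN interleaves such aggregation steps with learned weight matrices and nonlinearities, trained against observed node labels.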
PhD Student
Department of Sociology
University of Chicago
PhD Student
Kellogg School of Management
Northwestern University
MS Student
Department of Statistics
Columbia University
PhD Student
School of Systems Science
Beijing Normal University
MA Student
Masters in Computational Social Science
University of Chicago
PhD Student
ASU-Santa Fe Institute Center for Biosocial Complex Systems
Arizona State University
MA Student
School of Journalism and Communication
Nanjing University
PhD Student
Department of Sociology
University of Chicago
MA Student
Masters in Computational Social Science
University of Chicago