Welcome to PHYS 7332 (Network Science Data)

Course Overview

This course offers an introduction to network analysis and is designed to provide students with an overview of the core data science skills required to analyze complex networks. Through hands-on lectures, labs, and projects, students will learn practical network analysis techniques using Python (in particular, the networkx library). The course covers network data collection, data input/output, network statistics, dynamics, and visualization. Students also learn about random graph models and algorithms for computing network properties like path lengths, clustering, degree distributions, and community structure. In addition, students will develop web scraping skills and will be introduced to the vast landscape of software tools for analyzing complex networks. The course ends with a large-scale final project that demonstrates students' proficiency in network analysis. This course builds on years of work and development by Matteo Chinazzi and Qian Zhang for earlier iterations of Network Science Data. This syllabus may be updated; the latest version can be found here: https://brennanklein.com/phys7332-fall24.

Our course is a Jupyter Book! Find it here: https://asmithh.github.io/network-science-data-book/.

Course Learning Outcomes:

  • Proficiency in Python and networkx for network analysis.
  • Strong foundation of complex network algorithms and their applications.
  • Skills in statistical description of networks.
  • Experience in collecting and analyzing online data.
  • Broad knowledge of various network libraries and tools.

Materials:

There are no required materials for this course, but we will periodically draw from:

Additionally, we recommend engagement with other useful network science and/or Python materials:

Coursework, Class Structure, Grading

This is a twice-weekly, hands-on class that emphasizes building experience with coding. Not every second of every class will be live-coding, but coding will feature heavily in how the class is taught. We are always looking for ways to improve our pedagogical approach to this material, and we welcome feedback on class structure. The course will be co-taught, featuring lectures from the core instructors as well as outside experts. Grading in this course will be as follows:

  • Class Attendance & Participation: 10%
  • Problem Sets: 45%
  • Mid-Semester Project Presentation: 15%
  • Final Project — Presentation & Report: 30%

Final Project

The final project for this course is a chance for students to synthesize their knowledge of network analysis into pedagogical materials around a topic of their choosing. Modeled after chapters in the Jupyter book for this course, students will be required to make a new "chapter" for our class's textbook; this requires creating a thoroughly documented, informative Python notebook that explains an advanced topic that was not deeply explored in the course. For these projects, students are required to conduct their own research into the background of the technique, the original paper(s) introducing the topic, and how/if it is currently used in today's network analysis literature. Students will demonstrate that they have mastered this technique by using informative data for illustrating the usefulness of the topic they've chosen. Every chapter should contain informative data visualizations that build on one another, section-by-section. The purpose of this assignment is to demonstrate the coding skills gained in this course, doing so by learning a new network analysis technique and sharing it with members of the class. Over time, these lessons may find their way into the curriculum for future iterations of this class. Halfway through the semester, there will be project update presentations where students receive class and instructor feedback on their project topics. Throughout, we will be available to brainstorm students' ideas for project topics.

Ideas for Final Project Chapters (non-exhaustive):

  1. Graph Embedding (or other ML technique)
  2. Network Reconstruction from Dynamics
  3. Link Prediction
  4. Graph Distances and Network Comparison
  5. Motifs in Networks
  6. Network Sparsification
  7. Spectral Properties of Networks
  8. Mechanistic vs Statistical Network Models
  9. Robustness / Resilience of Network Structure
  10. Network Game Theory (Prisoner’s Dilemma, Schelling Model, etc.)
  11. Homophily in Networks
  12. Network Geometry and Random Hyperbolic Graphs
  13. Information Theory in/of Complex Networks
  14. Discrete Models of Network Dynamics (Voter model, Ising model, SIS, etc.)
  15. Continuous Models of Network Dynamics (Kuramoto model, Lotka-Volterra model, etc.)
  16. Percolation in Networks
  17. Signed Networks
  18. Coarse Graining Networks
  19. Mesoscale Structure in Networks (e.g. core-periphery)
  20. Graph Isomorphism and Approximate Isomorphism
  21. Inference in Networks: Beyond Community Detection
  22. Activity-Driven Network Models
  23. Forecasting with Networks
  24. Higher-Order Networks
  25. Introduction to Graph Neural Networks
  26. Hopfield Networks and Boltzmann Machines
  27. Graph Curvature or Topology
  28. Reservoir Computing
  29. Adaptive Networks
  30. Multiplex/Multilayer Networks
  31. Simple vs. Complex Contagion
  32. Graph Summarization Techniques
  33. Network Anomalies
  34. Modeling Cascading Failures
  35. Topological Data Analysis in Networks
  36. Self-organized Criticality in Networks
  37. Network Rewiring Dynamics
  38. Fitting Distributions to Network Data
  39. Hierarchical Networks
  40. Ranking in Networks
  41. Deeper Dive: Random Walks on Networks
  42. Deeper Dive: Directed Networks
  43. Deeper Dive: Network Communities
  44. Deeper Dive: Network Null Models
  45. Deeper Dive: Network Paths and their Statistics
  46. Deeper Dive: Network Growth Models
  47. Deeper Dive: Network Sampling
  48. Deeper Dive: Spatially-Embedded and Urban Networks
  49. Deeper Dive: Hypothesis Testing in Social Networks
  50. Deeper Dive: Working with Massive Data
  51. Deeper Dive: Bipartite Networks
  52. Many more possible ideas! Send us whatever you come up with.

Instructors

Brennan Klein is an associate research scientist at the Network Science Institute, with a joint affiliation at the Institute for Experiential AI. He is the director of the Complexity & Society Lab. His research spans two broad topics: 1) Information, emergence, and inference in complex systems -- developing tools and theory for characterizing dynamics, structure, and scale in networks, and 2) Public health and public safety -- creating and analyzing large-scale datasets that reveal inequalities in the United States, from epidemics to mass incarceration. Dr. Klein received a PhD in Network Science from Northeastern University in 2020 and a BA in Cognitive Science & Psychology from Swarthmore College in 2014. Website: http://brennanklein.com/.

Alyssa Smith is a fourth-year PhD student in Network Science at Northeastern University. Her current work focuses on the ways that structure and agency interact in social networks to encourage mobilization. She is interested in making big data and computational tools usable by academics without specialized technical training. She uses mixed methods, ranging from terabyte-scale datasets to autoethnography, to make sense of the world. Her dissertation work revolves around structure -- the place one occupies in a social network -- and agency -- an individual’s characteristics and proclivities -- which are thought to be the two main driving forces behind engagement in social movements. We can think of structure and agency as two separate, competing factors, or we can think of them as a duality: in much the same way that light is both a particle and a wave, the interplay of structure and agency is what governs mobilization. Before joining the Network Science Institute, Alyssa received a BS in Humanities and Engineering with Comparative Media Studies and Computer Science from MIT in 2017; after that, she worked in tech for 4 years. Website: https://asmithh.github.io/.

What You'll Learn

Students should leave this class with an ever-growing codebase of resources for analyzing and deriving insights from complex networks using Python. These skills range from coding graph algorithms from scratch (including path length calculations, network sampling, dynamical processes, and network null models) to working through standard data science tasks around storing, querying, and analyzing large, complex datasets.
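As a rough illustration of what "from scratch" can look like for path length calculations, here is a minimal sketch using only the Python standard library. It is not taken from the course materials: the function names (`bfs_path_lengths`, `average_path_length`), the dictionary-of-sets graph representation, and the toy graph are all assumptions made for this example, and it only handles unweighted, undirected graphs.

```python
from collections import deque


def bfs_path_lengths(adjacency, source):
    """Shortest-path lengths (in hops) from `source` to every reachable node.

    `adjacency` is assumed to map each node to the set of its neighbors.
    """
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            # A node's first visit in breadth-first order gives its shortest distance.
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def average_path_length(adjacency):
    """Mean shortest-path length over all ordered pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for source in adjacency:
        for target, dist in bfs_path_lengths(adjacency, source).items():
            if target != source:
                total += dist
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical toy graph: a four-node cycle (a-b-d-c-a) with a pendant node e attached to d.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

lengths_from_a = bfs_path_lengths(toy_graph, "a")
mean_length = average_path_length(toy_graph)
```

Running one breadth-first search per node keeps the code short, and writing it out by hand makes the cost visible: roughly one pass over all edges for each node. That cost is one reason library implementations and sampling strategies matter for large networks.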

Schedule

DATE CLASS
Wed, Sep 4, 24 Class 0: Introduction to the Course, GitHub, Computing Setup
Thu, Sep 5, 24 Class 1: Python Refresher (Data Structures, Numpy)
Fri, Sep 6, 24 ---
Wed, Sep 11, 24 Class 2: Introduction to Networkx 1 — Loading Data, Basic Statistics
Thu, Sep 12, 24 Class 3: Introduction to Networkx 2 — Graph Algorithms
Fri, Sep 13, 24 Announce Assignment 1
Wed, Sep 18, 24 Class 4: Distributions of Network Properties & Centralities
Thu, Sep 19, 24 Class 5: Scraping Web Data 1 — BeautifulSoup, HTML, Pandas
Fri, Sep 20, 24 ---
Wed, Sep 25, 24 Class 6: Scraping Web Data 2 — Creating a Network from Scraped Data
Thu, Sep 26, 24 Class 7: Big Data 1 — Algorithmic Complexity & Computing Paths
Fri, Sep 27, 24 Assignment 1 due September 27
Wed, Oct 2, 24 Class 8: Data Science 1 — Pandas, SQL, Regressions
Thu, Oct 3, 24 Class 9: Data Science 2 — Querying SQL Tables for Network Construction
Fri, Oct 4, 24 Announce Assignment 2
Wed, Oct 9, 24 Class 10: Clustering & Community Detection 1 — Traditional
Thu, Oct 10, 24 Class 11: Clustering & Community Detection 2 — Contemporary
Fri, Oct 11, 24 ---
Wed, Oct 16, 24 Class 12: Visualization 1 — Python + Gephi
Thu, Oct 17, 24 Class 13: Project Update Presentations
Fri, Oct 18, 24 Assignment 2 due October 18
Wed, Oct 23, 24 Class 14: Introduction to Machine Learning 1 — General
Thu, Oct 24, 24 Class 15: Introduction to Machine Learning 2 — Networks
Fri, Oct 25, 24 Announce Assignment 3
Wed, Oct 30, 24 Class 16: Visualization 2 — Guest Lecture (Pedro Cruz, Northeastern)
Thu, Oct 31, 24 Class 17: Dynamics on Networks 1 — Diffusion and Random Walks
Fri, Nov 1, 24 ---
Wed, Nov 6, 24 Class 18: Dynamics on Networks 2 — Compartmental Models
Thu, Nov 7, 24 Class 19: Dynamics on Networks 3 — Agent-Based Models
Fri, Nov 8, 24 Assignment 3 due November 8
Wed, Nov 13, 24 Class 20: Big Data 2 — Scalability
Thu, Nov 14, 24 Class 21: Network Sampling
Fri, Nov 15, 24 ---
Wed, Nov 20, 24 Class 22: Network Filtering/Thresholding
Thu, Nov 21, 24 Class 23: Dynamics of Networks: Temporal Networks
Fri, Nov 22, 24 ---
Wed, Nov 27, 24 Thanksgiving Break (No Class)
Thu, Nov 28, 24 ---
Fri, Nov 29, 24 ---
Wed, Dec 4, 24 Class 24: Instructor's Choice: Spatial Data, OSMNX, GeoPandas
Thu, Dec 5, 24 Class 25: Wiggle Room Class / Office Hours
Fri, Dec 6, 24 ---
Wed, Dec 11, 24 Class 26: Final Presentations