BiDirect is a minimalist implementation of a distributed social graph service, designed to demonstrate the core concepts of building scalable social networking systems like LinkedIn or Facebook's social graph infrastructure.
This implementation is a micro-instance of how large-scale social networks manage connection data and calculate relationship distances. While production systems handle billions of connections across thousands of nodes, this implementation uses a smaller scale to demonstrate the key architectural concepts.
- Distributed graph storage using Redis
- Partitioned data architecture
- Connection degree calculation (1st, 2nd, 3rd degree connections)
- Shared connection discovery
- Network distance computation
-
Partitioning Strategy
- Current: Simple modulo-based partitioning across 3 Redis nodes
- Production: Would use consistent hashing or range-based sharding across thousands of nodes
-
Caching Layer
- Current: Single Redis instance for second-degree connections
- Production: Multi-tiered caching with in-memory, near-memory, and disk-based caches
-
Node Management
- Current: Static node configuration
- Production: Dynamic node discovery and automated rebalancing
GET /api/connections/{memberID} # Get member's direct connections
GET /api/shared-connections/{id1}/{id2} # Find shared connections
POST /api/distances # Calculate network distances
- Uses partition-based distribution to demonstrate how social graphs can be split across multiple nodes
- Each node manages multiple partitions to show how load can be distributed
- Simplified partitioning function for demonstration purposes
- Implements efficient 2nd and 3rd-degree connection discovery
- Uses Redis sorted sets for quick connection lookups
- Demonstrates caching of frequently accessed paths
- Implements a greedy approach for finding minimum node sets
- Shows how to optimize multi-node queries in a distributed system
- Go 1.21+
- Docker and Docker Compose
- Start the infrastructure:
docker-compose up -d
-
The service will be available at
http://localhost:8080
-
Load sample data:
python imp.py
This implementation is intentionally simplified. In a production environment, you would need to consider:
-
Scalability
- Current: 3 Redis nodes, 10 partitions per node
- Production: Thousands of nodes, dynamic partition allocation
-
Reliability
- Current: Basic error handling
- Production: Circuit breakers, fallbacks, redundancy
-
Performance
- Current: Simple caching strategy
- Production: Multi-level caching, pre-computation of common paths
-
Monitoring
- Current: Basic logging
- Production: Comprehensive metrics, tracing, alerting
-
Security
- Current: No authentication
- Production: OAuth, rate limiting, encryption
- Demonstrates in-memory graph storage principles
- Sorted sets provide efficient connection lookups
- Easy to understand and set up for demonstration
- Shows basic concepts of data sharding
- Demonstrates how to handle cross-partition queries
- Simplified version of production-grade distribution strategies
This is an educational project designed to demonstrate distributed systems concepts. Contributions that help clarify these concepts or add new educational examples are welcome.
MIT License
This implementation draws inspiration from real-world social graph systems while maintaining a focus on educational value and clarity over production-grade features.