System Design Interview

Posted by ylxn on 2024-04-14

1 Proximity Service

1) Requirements

Functional requirements

1. Return all businesses based on the user's location (latitude and longitude pair) and radius (5 km).

2. Business owners can use a RESTful API to manage a business.

  • changes do not need to be reflected in real time

3. Customers can view business details.

Non-functional requirements

1. Low latency.

  • users see nearby businesses quickly

2. Data privacy.

  • comply with GDPR and CCPA
  • location info is private

3. High availability and scalability.

  • the proximity service can handle traffic spikes during peak hours

2) Basic Calculation

QPS

Seconds in a day = 24 * 60 * 60 = 86400, rounded up to 10**5.

Users: 100 million

Searches: 5 per user per day.

QPS: 100m * 5 / 10**5 = 5000

Summary:

  • QPS: 5000
  • Users: 100 million
  • Businesses: 200 million
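The estimate above can be reproduced with a quick back-of-the-envelope script. This is only a sketch: the variable names are illustrative, and the exact figure is shown next to the rounded one to make clear how much the 10**5 shortcut changes the result.

```python
SECONDS_PER_DAY = 24 * 60 * 60        # 86,400
ROUNDED_SECONDS_PER_DAY = 10**5       # rounded up to simplify the arithmetic

daily_users = 100_000_000
searches_per_user_per_day = 5         # assumption stated in the notes

daily_searches = daily_users * searches_per_user_per_day
qps_rounded = daily_searches // ROUNDED_SECONDS_PER_DAY
qps_exact = daily_searches / SECONDS_PER_DAY

print(qps_rounded)       # 5000
print(round(qps_exact))  # 5787
```

The rounded estimate is slightly lower than the exact value, which is fine for capacity planning at this level of detail.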

3) High Level Design

User API design

Users search for businesses.

  • RESTful API
  • Pagination

Parameters.

  • latitude
  • longitude
  • radius

Endpoint:

GET /search/nearby

Response:

{
  "radius": 10,
  "business": [business object]
}

Business API design

RESTful API.

  • GET
  • POST
  • PUT (update)
  • DELETE

Data model

Read / Write Ratio

Read:

  • Read-heavy system

Write:

  • Writes are infrequent.

Schema

  • Geo index table (geohash and businessid)
  • Business table (details about each business)

MySQL (or PostgreSQL).
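The two tables can be sketched as follows. This is a minimal illustration, not the schema from the notes: it uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs anywhere, and the column names (business_id, name, latitude, longitude) and the helper businesses_in_cell are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the real database
conn.executescript("""
CREATE TABLE business (
    business_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    latitude    REAL NOT NULL,
    longitude   REAL NOT NULL
);
CREATE TABLE geo_index (
    geohash     TEXT NOT NULL,
    business_id INTEGER NOT NULL REFERENCES business(business_id),
    PRIMARY KEY (geohash, business_id)  -- one row per (cell, business) pair
);
""")
conn.execute("INSERT INTO business VALUES (1, 'Cafe A', 57.64911, 10.40744)")
conn.execute("INSERT INTO geo_index VALUES ('u4pruy', 1)")


def businesses_in_cell(geohash_prefix):
    """Return (business_id, name) rows whose geohash starts with the prefix."""
    return conn.execute(
        "SELECT b.business_id, b.name FROM geo_index g "
        "JOIN business b ON b.business_id = g.business_id "
        "WHERE g.geohash LIKE ? ORDER BY b.business_id",
        (geohash_prefix + "%",),
    ).fetchall()


print(businesses_in_cell("u4pru"))
```

Using (geohash, business_id) as a compound key keeps each row small and makes adding or removing a single business a one-row operation.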

Algorithms to find nearby businesses

1) Geohash

Problem:

  • Not enough businesses in the cell. Options: 1) return the results as they are; 2) remove the last character of the geohash to widen the search area.
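Both the encoding and the "remove the last character" fallback can be sketched in a few lines. The encoder below follows the standard geohash scheme (interleaved longitude and latitude bits, base-32 alphabet); the function search_with_fallback, its parameters, and the dict-based index are hypothetical and only illustrate the idea. A fuller version would also query the neighboring cells.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    use_lon = True  # geohash interleaves bits, starting with longitude
    while len(bits) < precision * 5:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append(1)
            rng[0] = mid
        else:
            bits.append(0)
            rng[1] = mid
        use_lon = not use_lon
    chars = []
    for i in range(0, len(bits), 5):
        index = 0
        for bit in bits[i:i + 5]:
            index = (index << 1) | bit
        chars.append(BASE32[index])
    return "".join(chars)


def search_with_fallback(index, geohash, min_results, min_precision=4):
    """Widen the cell by dropping trailing characters until enough results.

    index: dict mapping a full-precision geohash to a list of business ids.
    Returns the prefix that was finally used and the sorted ids found.
    """
    prefix = geohash
    while True:
        found = sorted(
            business_id
            for cell, ids in index.items()
            if cell.startswith(prefix)
            for business_id in ids
        )
        if len(found) >= min_results or len(prefix) <= min_precision:
            return prefix, found
        prefix = prefix[:-1]  # drop the last character to widen the cell


index = {"u4pruy": [1], "u4prux": [2, 3], "u4prg0": [4]}
cell = geohash_encode(57.64911, 10.40744, precision=6)
print(cell)
print(search_with_fallback(index, cell, min_results=3))
```

Each dropped character makes the cell roughly 32 times larger, so the min_precision floor keeps the search from growing without bound.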

2) Quadtree
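A quadtree recursively splits a region into four quadrants until each leaf holds only a small number of businesses. The sketch below is one possible in-memory version, not something taken from the notes: the class name, the capacity and max_depth parameters, and the bounding-box query are all assumptions.

```python
class QuadTreeNode:
    """Leaf nodes hold points; a full leaf splits into four child quadrants."""

    def __init__(self, min_lat, min_lon, max_lat, max_lon,
                 capacity=2, depth=0, max_depth=16):
        self.bounds = (min_lat, min_lon, max_lat, max_lon)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth  # guards against endless splits on duplicates
        self.points = []            # (lat, lon, business_id), leaves only
        self.children = None        # four child nodes once subdivided

    def _contains(self, lat, lon):
        min_lat, min_lon, max_lat, max_lon = self.bounds
        return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon

    def insert(self, lat, lon, business_id):
        if not self._contains(lat, lon):
            return False
        if self.children is None:
            self.points.append((lat, lon, business_id))
            if len(self.points) > self.capacity and self.depth < self.max_depth:
                self._subdivide()
            return True
        # any() stops at the first child that accepts the point
        return any(c.insert(lat, lon, business_id) for c in self.children)

    def _subdivide(self):
        min_lat, min_lon, max_lat, max_lon = self.bounds
        mid_lat = (min_lat + max_lat) / 2
        mid_lon = (min_lon + max_lon) / 2
        quadrants = [
            (min_lat, min_lon, mid_lat, mid_lon),
            (min_lat, mid_lon, mid_lat, max_lon),
            (mid_lat, min_lon, max_lat, mid_lon),
            (mid_lat, mid_lon, max_lat, max_lon),
        ]
        self.children = [
            QuadTreeNode(*q, capacity=self.capacity,
                         depth=self.depth + 1, max_depth=self.max_depth)
            for q in quadrants
        ]
        points, self.points = self.points, []
        for point in points:
            self.insert(*point)  # re-insert into the new children

    def query(self, min_lat, min_lon, max_lat, max_lon):
        """Return ids of businesses inside the given bounding box."""
        b = self.bounds
        if b[2] < min_lat or b[0] > max_lat or b[3] < min_lon or b[1] > max_lon:
            return []  # this node does not overlap the query box
        if self.children is None:
            return [bid for lat, lon, bid in self.points
                    if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon]
        found = []
        for child in self.children:
            found.extend(child.query(min_lat, min_lon, max_lat, max_lon))
        return found


tree = QuadTreeNode(-90, -180, 90, 180, capacity=2)
for lat, lon, bid in [(10, 10, 1), (10.5, 10.5, 2), (-40, 100, 3), (50, -70, 4)]:
    tree.insert(lat, lon, bid)
print(sorted(tree.query(9, 9, 11, 11)))
```

Unlike geohash, the cell size adapts to density: crowded areas end up with many small leaves, sparse areas with a few large ones.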

4) Design Diagram

Load Balancer

Receives requests (latitude, longitude, radius) from users.

Location-based service

Responsibility

  • calculate the user's geohash and the neighboring geohashes
  • call Redis for nearby businessids and business_objects
  • calculate the exact distance to each business
  • rank the results to return
  • paginate
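The distance, ranking, and pagination steps above can be sketched as follows. The haversine formula is a common choice for great-circle distance; the function names, the candidate dict shape ('id', 'lat', 'lon'), and the page and page_size parameters are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(candidates, lat, lon, radius_km, page=1, page_size=10):
    """Filter candidates to the radius, rank by distance, return one page."""
    ranked = sorted(
        ({**c, "distance_km": haversine_km(lat, lon, c["lat"], c["lon"])}
         for c in candidates),
        key=lambda c: c["distance_km"],
    )
    within = [c for c in ranked if c["distance_km"] <= radius_km]
    start = (page - 1) * page_size
    return within[start:start + page_size]


candidates = [
    {"id": "c", "lat": 0.0, "lon": 0.10},  # about 11 km away, outside 5 km
    {"id": "b", "lat": 0.0, "lon": 0.03},  # about 3.3 km away
    {"id": "a", "lat": 0.0, "lon": 0.01},  # about 1.1 km away
]
print([c["id"] for c in nearby(candidates, 0.0, 0.0, radius_km=5)])
```

The exact distance check is needed because geohash cells are rectangular, so the candidate set fetched by cell usually includes businesses slightly outside the requested radius.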

Characteristics:

  • read-heavy service
  • QPS is high during peak hours
  • the service is stateless, so it is easy to scale horizontally

It is deployed in multiple regions.

  • users are physically close to their local service
  • regional DNS routing can be set up to follow local laws or requirements
Business Service
  • writes are infrequent
Database Cluster
  • primary-secondary setup
  • data is saved in the primary database, then replicated to the replicas
  • some discrepancy between the replicas and the primary database is not an issue

Scale:

1) Business Table

  • Good for sharding

2) Geo index Table

  • do not shard (sharding is not a good choice for this table)
  • use read replicas
Redis Cluster

Caching is not strictly necessary, because reading the geo index from the database is fast enough.

However, we can use a cache to absorb the spike during peak hours and to improve performance.

1) key: geohash, value: [business_list]

storage for value: 200 million * 32 bytes * 3 precisions ≈ 19.2 GB (about 17.9 GiB)

storage for key: negligible.
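As a quick check of the storage figure, the arithmetic can be written out. This is only a sketch; the 32 bytes per entry and the 3 geohash precisions are the assumptions from the notes. The product is about 19.2 GB in decimal units, which is roughly 17.9 GiB in binary units.

```python
businesses = 200_000_000
bytes_per_entry = 32   # assumed size of one cached entry
precisions = 3         # each business is stored once per geohash precision

total_bytes = businesses * bytes_per_entry * precisions
total_gb = total_bytes / 10**9    # decimal gigabytes
total_gib = total_bytes / 2**30   # binary gibibytes

print(total_bytes)          # 19200000000
print(round(total_gib, 1))  # 17.9
```

Either way, the data set is small enough to fit in the memory of a single modern server.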

We deploy this cache globally to ensure high availability.

2) key: businessid, value: business_object
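The two key spaces can be combined in a cache-aside lookup. The sketch below uses plain dicts as a stand-in for Redis so that it runs without a server; the class ProximityCache, the loader callbacks, and the sample data are all hypothetical.

```python
class ProximityCache:
    """Cache-aside wrapper; plain dicts stand in for the two Redis key spaces."""

    def __init__(self, load_ids, load_business):
        self._load_ids = load_ids            # geohash -> ids, from the database
        self._load_business = load_business  # id -> object, from the database
        self._ids_by_geohash = {}            # 1) key: geohash, value: id list
        self._business_by_id = {}            # 2) key: businessid, value: object

    def business_ids(self, geohash):
        if geohash not in self._ids_by_geohash:
            self._ids_by_geohash[geohash] = list(self._load_ids(geohash))
        return self._ids_by_geohash[geohash]

    def business(self, business_id):
        if business_id not in self._business_by_id:
            self._business_by_id[business_id] = self._load_business(business_id)
        return self._business_by_id[business_id]

    def nearby_businesses(self, geohash):
        return [self.business(bid) for bid in self.business_ids(geohash)]


# Hypothetical in-memory "database" used only to exercise the sketch.
db_calls = []
GEO_INDEX = {"u4pruy": [1, 2]}
BUSINESSES = {1: {"id": 1, "name": "Cafe A"}, 2: {"id": 2, "name": "Bakery B"}}


def load_ids(geohash):
    db_calls.append(("ids", geohash))
    return GEO_INDEX.get(geohash, [])


def load_business(business_id):
    db_calls.append(("business", business_id))
    return BUSINESSES[business_id]


cache = ProximityCache(load_ids, load_business)
first = cache.nearby_businesses("u4pruy")   # misses: loads from the database
second = cache.nearby_businesses("u4pruy")  # hits: served from the cache
print(len(db_calls))
```

Keeping ids and business objects in separate key spaces means an update to one business only touches one cached object, while the geohash lists change only when a business is added, removed, or moved.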

2 Nearby Friends
