Web function request multi-concurrency is now live: faster, more economical web service deployment!

Tencent Cloud Developer, published 2022-11-24

A web function is one type of cloud function, distinct from an event function. Because it natively supports the HTTP and WebSocket protocols, a web function is compatible with web services written with any native web framework. Traditional projects can be deployed to functions while keeping the same experience as a locally developed service.

Since their launch in June 2021, web functions have become very popular: tens of thousands of traditional web projects have been migrated to functions with little effort, letting existing code keep running on serverless infrastructure. However, as web functions are applied in more scenarios, some problems have gradually surfaced.


1. Single-concurrency requests in web functions

By default, when a function is called, the cloud function platform allocates one concurrent instance to handle the request or event. After the function code finishes running and returns, the instance can handle another request. If all instances are busy when a request arrives, the platform allocates a new concurrent instance. Each instance processes only one event at a time, which keeps the processing of each event efficient and stable.

In most cases, single concurrency per request is the recommended mode: when writing code, you do not need to consider the typical problems of handling multiple requests at once, such as thread safety, blocking calls, and exception handling.

Typical web application workloads, however, are IO-intensive: a function that accesses downstream services, such as a database or the interfaces of other systems, spends most of its time waiting for their responses. This waiting is generally iowait and consumes little CPU. In this situation, the single-instance, single-concurrency mode causes two problems:

Computing resources are wasted, which is especially noticeable with long-lived WebSocket connections;
Billing is somewhat higher. Because each incoming request is allocated to a different instance and billed separately, the overall cost has no advantage over traditional container or host solutions.
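
To make the IO-wait argument concrete, here is a minimal sketch in Python. It is an illustration only, not how the platform is implemented: `handle_request` is a hypothetical handler, and `asyncio.sleep` stands in for a downstream database or HTTP call. It compares serving requests one at a time with letting them overlap inside a single process.

```python
import asyncio
import time
from typing import Tuple


async def handle_request(request_id: int, io_wait_s: float) -> int:
    """Hypothetical IO-bound handler: it only waits on a downstream call."""
    await asyncio.sleep(io_wait_s)  # stands in for a database or HTTP call
    return request_id


async def serve_one_at_a_time(n: int, io_wait_s: float) -> float:
    """Process n requests strictly one after another on one worker."""
    start = time.perf_counter()
    for i in range(n):
        await handle_request(i, io_wait_s)
    return time.perf_counter() - start


async def serve_overlapping(n: int, io_wait_s: float) -> float:
    """Process n requests concurrently on one worker; the waits overlap."""
    start = time.perf_counter()
    await asyncio.gather(*(handle_request(i, io_wait_s) for i in range(n)))
    return time.perf_counter() - start


def compare(n: int = 20, io_wait_s: float = 0.05) -> Tuple[float, float]:
    """Return (sequential_seconds, overlapping_seconds) for the same workload."""
    sequential = asyncio.run(serve_one_at_a_time(n, io_wait_s))
    overlapping = asyncio.run(serve_overlapping(n, io_wait_s))
    return sequential, overlapping


if __name__ == "__main__":
    seq, conc = compare()
    print(f"one at a time: {seq:.2f}s, overlapping: {conc:.2f}s")
```

Because the handler spends its time waiting rather than computing, the overlapping run takes roughly as long as a single request, while the sequential run grows with the number of requests.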

2. Multi-concurrency requests in web functions


Web functions now support multiple concurrent requests per instance, which you can enable and configure according to your business needs. Request multi-concurrency offers two modes: custom static concurrency and intelligent dynamic concurrency.

(1) Custom static concurrency

When this mode is enabled and multiple requests arrive at the same time, requests up to the specified concurrency value are dispatched to the same function instance for execution. Higher concurrency increases the CPU and memory consumption of the function instance, so it is recommended to run a stress test and choose a reasonable value to avoid abnormal function execution. The currently supported concurrency range is 2 to 100.
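
The dispatch rule described above can be sketched as a toy model. This is an assumption-laden illustration, not the platform's scheduler: `Instance`, `Dispatcher`, and `instances_needed` are hypothetical names, and the model ignores cold-start time and instance recycling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """Hypothetical function instance tracking its in-flight requests."""
    in_flight: int = 0


@dataclass
class Dispatcher:
    """Toy dispatcher: fill an instance up to max_concurrency, then add one."""
    max_concurrency: int  # 1 models single concurrency; the article's range is 2 to 100
    instances: List[Instance] = field(default_factory=list)

    def dispatch(self) -> Instance:
        # Reuse the first instance that still has spare concurrency.
        for inst in self.instances:
            if inst.in_flight < self.max_concurrency:
                inst.in_flight += 1
                return inst
        # Every instance is full: allocate a new one for this request.
        inst = Instance(in_flight=1)
        self.instances.append(inst)
        return inst

    def finish(self, inst: Instance) -> None:
        # A completed request frees one slot on its instance.
        inst.in_flight -= 1


def instances_needed(simultaneous_requests: int, max_concurrency: int) -> int:
    """Count instances allocated when all requests arrive at the same moment."""
    dispatcher = Dispatcher(max_concurrency)
    for _ in range(simultaneous_requests):
        dispatcher.dispatch()
    return len(dispatcher.instances)
```

Under this model, 100 simultaneous requests need 100 instances with a concurrency of 1, but only one instance with a concurrency of 100.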

(2) Intelligent dynamic concurrency

When this mode is enabled, the platform intelligently and dynamically schedules more requests onto the same function instance whenever the instance load allows it. This mode will be launched later, so stay tuned.

3. Stress test

Multi-concurrency vs. single-concurrency stress test results: the maximum request latency is reduced by 82.7%, and the cost is reduced by 98.8%.

Test setup:

Prepare a web function in the Beijing region that simulates a request response time of 500 ms, and configure the function with 512 MB of memory.

Prepare a public-network stress test environment in Beijing, and test the performance both with multi-concurrency disabled and with multi-concurrency set to 100. The stress test conditions are 100 concurrent clients and 50,000 consecutive requests.

Multi-concurrency disabled: 60 instances are required, the maximum response time is 4177 ms, and the billed memory time is 12,500 GBs

Client stress test results: the maximum response time is 4177 ms, and the average response time is 569 ms. At the beginning of the stress test, the function received 100 concurrent requests and immediately began cold starts to pull up function instances. During the cold starts, the first batch of requests had to wait, which produced the longest response time of 4177 ms. Once the instances were up, requests were processed normally: new concurrent requests were immediately and evenly distributed across the instances, so most subsequent requests waited only briefly and the overall average response time settled back to a normal level.


Function monitoring screenshot: 100 instances were pulled up instantly at the beginning of the cold start; afterwards instances were rotated and reused, and the count stabilized at 60 instances. The billed memory time is 0.5 GB × 0.5 s × 50,000 = 12,500 GBs.


Multi-concurrency enabled: only one instance is required, the maximum response time is 723 ms, and the billed memory time is about 150 GBs

Client stress test results: the maximum response time is 723 ms, and the average response time is 565 ms. At the beginning of the stress test, the function received 100 concurrent requests and immediately began a cold start to pull up a function instance. During the cold start, the first batch of requests had to wait, which produced the longest response time of 723 ms. Once the first instance was up, new concurrent requests were immediately allocated to it for processing. Far fewer requests had to wait for a cold start than with multi-concurrency disabled.

Function monitoring screenshot: as concurrent requests come in, all of them are assigned to a single function instance and processed immediately.


Function billed memory time screenshot: only one instance is needed. In multi-concurrency mode, time during which requests overlap on the same instance is billed only once, so the billed memory time is 0.5 GB × 282.728 s ≈ 141 GBs, a saving of 98.8% in fees.
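
The billing figures quoted in this test can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above (0.5 GB of memory, 0.5 s per request, 50,000 requests, and 282.728 s of billed time on a single instance); the function name `billed_gbs` is a hypothetical helper, and the formula is simply memory multiplied by billed seconds, as in the text.

```python
def billed_gbs(memory_gb: float, billed_seconds: float) -> float:
    """Billed memory time in GBs: configured memory times billed duration."""
    return memory_gb * billed_seconds


# Single concurrency: every request is billed for its own 0.5 s.
single_concurrency_gbs = billed_gbs(0.5, 0.5 * 50_000)

# Multi-concurrency: one instance, billed for 282.728 s in total.
multi_concurrency_gbs = billed_gbs(0.5, 282.728)

# Fraction of the single-concurrency bill that is saved.
saving = 1 - multi_concurrency_gbs / single_concurrency_gbs

if __name__ == "__main__":
    print(f"single concurrency: {single_concurrency_gbs:.0f} GBs")
    print(f"multi-concurrency:  {multi_concurrency_gbs:.3f} GBs")
    print(f"saving:             {saving:.2%}")
```

With these inputs the saving comes out just under 98.9%, in line with the 98.8% quoted in the article.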


4. Advantages of multiple concurrent requests

(1) Lower cost

When multi-concurrency is disabled, a single function instance processes only one request at a time, and the next request is processed only after the previous one finishes. The billed memory time is the sum of the execution times of all requests, as shown in the following figure:

When multi-concurrency is enabled, a single function instance processes multiple requests at the same time. If a second request arrives before the first one has finished, both requests are processed simultaneously for a period of time, and that overlapping period is billed only once, as shown below:

It follows that in IO-intensive scenarios, such as WebSocket long-connection services, the billed execution time can be reduced and costs can be saved.
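
The difference between the two billing rules can be sketched as interval arithmetic: without multi-concurrency the billed time is the sum of the request durations, while with it the billed time is the length of the union of the request intervals on one instance. This is a simplified model based on the description above, not the platform's billing code; `billed_single` and `billed_multi` are hypothetical names, and billing granularity and rounding are ignored.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (start_second, end_second) of one request


def billed_single(requests: List[Interval]) -> float:
    """Single concurrency: each request is billed for its full duration."""
    return sum(end - start for start, end in requests)


def billed_multi(requests: List[Interval]) -> float:
    """Multi-concurrency on one instance: overlapping time is counted once."""
    total = 0.0
    cur_start: Optional[float] = None
    cur_end: Optional[float] = None
    for start, end in sorted(requests):
        if cur_end is None or start > cur_end:
            # A gap with no request in flight: close the previous busy span.
            if cur_start is not None and cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # The request overlaps the current busy span: extend it.
            cur_end = max(cur_end, end)
    if cur_start is not None and cur_end is not None:
        total += cur_end - cur_start
    return total


if __name__ == "__main__":
    # Two overlapping requests followed by an isolated one.
    sample = [(0.0, 0.5), (0.25, 0.75), (2.0, 2.5)]
    print(billed_single(sample), billed_multi(sample))
```

In the sample, the first two requests overlap for a quarter of a second, so the merged busy time is shorter than the sum of the three durations.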

(2) Performance improvement

Multiple concurrent requests can reuse the database connection pool in the same instance, reducing the pressure on downstream services.

When requests arrive in dense bursts, a single instance can process many of them, so there is no need to pull up multiple instances; this lowers the probability of instance cold starts and reduces response latency.
