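[Worth Sharing] Why server programs need to manage their own memory, starting from reworking std::string concatenation

Posted by 一隻會鏟史的貓 on 2021-07-27

Original article: https://www.cnblogs.com/softlee/p/15064740.html

Why should a server program manage its own memory? To get a glimpse of the whole picture from one small detail, let's start with string operations.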

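new/delete are the C++ operators for dynamic memory management, while malloc/free can be used in both C and C++. In most implementations new/delete are ultimately built on top of malloc/free. Whichever of them you use, two problems remain:
1. Efficiency: every allocation and release on the heap costs time. A program that allocates and frees memory frequently, a server program in particular, makes a large number of new/malloc and delete/free calls, and the time spent in them inevitably lowers the program's throughput.

2. Memory fragmentation: repeatedly allocating small blocks chops the heap into small pieces. Because blocks are not freed in the order they were allocated, free memory ends up scattered in small holes, which can lead to the situation where "there is free memory, but a large block cannot be allocated".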

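For client software, memory management matters less; at worst you can restart the machine. For server programs that must run 24 hours a day without interruption, it is especially important. Take the ubiquitous web server: it speaks HTTP, a request-response hypertext transfer protocol. This question-and-answer protocol is very simple, and in HTTP/1.x both the request headers and the response headers are plain-text strings rather than binary. When the server receives a GET or POST request from a client, it first builds a response header and then appends the response body, as follows: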

	// Build the response header
	string strHttpResponse;
	strHttpResponse += "HTTP/1.1 200 OK\r\n";
	strHttpResponse += "Server: HttpServer \r\n";
	strHttpResponse += "Content-Type: text/html; charset=utf-8\r\n";
	strHttpResponse += "Content-Length: 9527\r\n";
	strHttpResponse += "Last-Modified: Sat, 13 Apr 2019 14:27:06 GMT\r\n";
	strHttpResponse += "\r\n";				// Blank line; the actual response body follows it
	
	// Build the response body
	strHttpResponse += "<html><head><title>Hello, I am 9527!</title>"
						"</head><body>Hello, I am the body of 9527; pretend I am 9527 bytes long!</body></html>";

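For dynamic pages or back-end applications, the server usually has to query a database and run various business logic first, and then concatenate the results into semi-structured data such as JSON or XML to return to the client.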
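As a rough illustration of that flow, the sketch below wraps an already-serialized JSON body in a response and computes Content-Length from the real body size. The function name BuildJsonResponse and the 128-byte header allowance are assumptions made for this example, not part of the original HttpServer. It calls reserve() once up front so the appends that follow do not need to grow the buffer.

```cpp
#include <string>

// Hypothetical helper (not from the original article): builds a complete
// HTTP/1.1 response around an already-serialized JSON body.
std::string BuildJsonResponse(const std::string& strBody)
{
    const std::string strLength = std::to_string(strBody.size());

    std::string strHttpResponse;
    // Reserve roughly the final size once. 128 bytes is an assumed
    // allowance for the header lines written below.
    strHttpResponse.reserve(128 + strBody.size());

    strHttpResponse += "HTTP/1.1 200 OK\r\n";
    strHttpResponse += "Server: HttpServer\r\n";
    strHttpResponse += "Content-Type: application/json; charset=utf-8\r\n";
    strHttpResponse += "Content-Length: ";
    strHttpResponse += strLength;
    strHttpResponse += "\r\n";
    strHttpResponse += "\r\n";      // blank line separates headers from body
    strHttpResponse += strBody;
    return strHttpResponse;
}
```

Calling BuildJsonResponse("{\"id\":9527}") yields a response whose Content-Length header matches the 11-byte body.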

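Of course, this article is not an introduction to HTTP; there are plenty of those already. Instead, we want to use an ordinary HTTP exchange to look at how string operations are used in practice, and whether there is room for optimization.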

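How big a deal can string operations be?

For a client, not much. But for a web server that runs around the clock, they can become a performance problem. Appending to a string may cause memory to be allocated and freed over and over, which produces the memory fragmentation described above.

How big a deal can memory fragmentation be?

Under high concurrency it can hurt performance: frequent malloc/free calls by themselves consume a lot of CPU time, and too many fragments leave the heap so broken up that a larger contiguous block can no longer be allocated.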
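One way to see how often appending really moves a string to a new buffer is to watch std::string::capacity() while appending. The helper below is a small sketch written for this purpose; its name CountCapacityChanges is made up for the example. The growth policy of std::string is implementation-defined, so the count differs between standard libraries.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends pszPiece iTimes times and counts how often
// capacity() changed, i.e. how often the string moved to a bigger buffer.
int CountCapacityChanges(const char* pszPiece, int iTimes)
{
    std::string str;
    std::size_t nLastCapacity = str.capacity();
    int iChanges = 0;
    for (int i = 0; i < iTimes; ++i)
    {
        str += pszPiece;
        if (str.capacity() != nLastCapacity)
        {
            nLastCapacity = str.capacity();
            ++iChanges;     // a new, larger buffer was allocated
        }
    }
    return iChanges;
}
```

On mainstream standard libraries the count for 20000 appends of a 32-character piece stays small, because the buffer grows geometrically rather than by the exact amount needed each time.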

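Both std::string in the standard library and CString in Microsoft's MFC maintain an internal character buffer. When the concatenated string fits in that buffer, the new text is simply appended in place; when it does not, a new, larger buffer has to be allocated and the string copied into it. To make a direct comparison, we write our own string wrapper class, CFastString (the full implementation is at the end of the article), and overload its "+=" operator: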


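const CFastString& CFastString::operator+=(const char *pszSrc)
{
	assert(pszSrc);
	
	// strlen rather than _tcslen: pszSrc is always a narrow char string
	int iLenSrc = (int)strlen(pszSrc);
	int iNewSize = iLenSrc + length() + 1;	// +1 for the terminating '\0'

	// If the internal buffer is large enough, append in place; otherwise allocate a new buffer
	if(m_iBuffSize >= iNewSize)
	{
		memcpy(m_pszStr+m_iStrLen, pszSrc, iLenSrc);
		*(m_pszStr+iNewSize-1) = 0;
	}
	else
	{
		// Allocate a new buffer of exactly the required size
		char* pszNew = AllocBuffer(iNewSize);
		// Copy both strings into the newly allocated buffer
		// Method 1: strcpy + strcat
 		strcpy(pszNew, m_pszStr);		
 		strcat(pszNew, pszSrc);
	
		// Method 2: plain memory copies (both lengths are already known)
//		memcpy(pszNew, m_pszStr, m_iStrLen);
//		memcpy(pszNew+m_iStrLen, pszSrc, iLenSrc);
//		pszNew[iNewSize-1] = 0;		// memcpy does not write the terminator
		
		free(m_pszStr);
		m_pszStr = pszNew;
	}
	m_iStrLen = iNewSize-1;
	return *this;
}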

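As the code above shows, when the internal buffer is too small a new one is allocated, so repeatedly appending to a string can cause memory to be allocated and freed again and again. How can we improve performance?

We write a test function to compare the performance of the append operator (+=) of CFastString and std::string. The test code is as follows: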

void TestFastString()
{
	int i = 0;
	int iTimes = 5000;

	// Test CFastString
	printf("CFastString test:\r\n");
	CFastString fstr = "Hello";
	DWORD dwStart = ::GetTickCount();
	for(i = 0; i < iTimes; i++)
	{
		
		fstr += "10000000000000000000000000000000";
		fstr += "20000000000000000000000000000000";
		fstr += "30000000000000000000000000000000";
		fstr += "40000000000000000000000000000000";
	}
	DWORD dwSpan1 = ::GetTickCount()-dwStart;
	printf("CFastString Span = %lu\n", dwSpan1);	// DWORD is unsigned long, hence %lu

	// Test std::string
	printf("std::string test:\r\n");
	string str = "Hello";
	dwStart = ::GetTickCount();
	for(i = 0; i < iTimes; i++)
	{
		str += "10000000000000000000000000000000";
		str += "20000000000000000000000000000000";
		str += "30000000000000000000000000000000";
		str += "40000000000000000000000000000000";
	}
	DWORD dwSpan2 = ::GetTickCount()-dwStart;
	printf("std::string Span = %lu\n", dwSpan2);

	printf("Test finished!\r\n");
}

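Running it gives the following result:
[Figure: timing output of the test run]
CFastString turns out not to be fast at all; it is actually quite slow. Our re-wrapped string class is worse than no wrapper. Could strcpy and strcat be the slow part?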

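Improvement 1:

We modify CFastString::operator+=(const char *pszSrc) and change the following concatenation statements:

// Method 1: strcpy + strcat
strcpy(pszNew, m_pszStr);		
strcat(pszNew, pszSrc);

to:

// Method 2: plain memory copies
memcpy(pszNew, m_pszStr, m_iStrLen);
memcpy(pszNew+m_iStrLen, pszSrc, iLenSrc);
pszNew[iNewSize-1] = 0;		// memcpy does not append the terminator, so write it explicitly

Run it again and look at the result:
[Figure: timing output after Improvement 1]

Not bad: a little faster than string, though not dramatically so. Both lengths are already known, so memcpy avoids scanning for the terminator the way strcpy and strcat must. In the overloaded += function, however, each allocation is sized exactly to the old string plus the appended one. As a result, once the internal buffer is full, every subsequent append triggers another allocation and release. As an extreme example, suppose the internal buffer of str is already full before these appends: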

str += "0";
str += "1";
str += "2";
str += "3";
str += "4";
str += "5";
str += "6";
str += "7";
str += "8";
str += "9";

and

str += "0123456789";

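Both produce the same result, but with exact-size allocation the first form triggers 10 allocations and releases, while the second triggers only one.
What happens if we allocate a bit extra each time we request memory?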
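The 10-versus-1 count can be checked with a small model of an exact-fit buffer. The function below is a hypothetical sketch (its name and parameters are invented for illustration); it only tracks sizes and does not allocate anything itself.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of exact-fit growth: the buffer starts out exactly full
// and every reallocation requests only what the current append needs.
int CountExactFitAllocations(std::size_t nStartLen,
                             const std::vector<std::size_t>& appendLens)
{
    std::size_t nLen = nStartLen;
    std::size_t nBuffSize = nStartLen + 1;      // exactly full, including '\0'
    int iAllocs = 0;
    for (std::size_t nAppend : appendLens)
    {
        std::size_t nNewSize = nLen + nAppend + 1;
        if (nNewSize > nBuffSize)
        {
            nBuffSize = nNewSize;               // exact fit: no spare room left
            ++iAllocs;
        }
        nLen += nAppend;
    }
    return iAllocs;
}
```

Ten one-character appends give 10 reallocations in this model, while a single ten-character append gives 1.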

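Improvement 2:

We change:

char* pszNew = AllocBuffer(iNewSize);

to:

// Allocate a new buffer; instead of the exact size, scale the request by 1.5
char* pszNew = AllocBuffer(iNewSize, 1.5);

When appending, we no longer allocate exactly what is needed but 50% more. The result:
[Figure: timing output after Improvement 2]
CFastString now practically flies. If iTimes in the test function were not a loop count but a level of concurrency, that is, a server handling 5000 HTTP requests at the same time, the gain in processing speed would be substantial: the CPU is spared frequent malloc and free calls, and along with the speedup, memory fragmentation is reduced as well.

You may object that 50% more memory is allocated. But that 50% buys a large performance gain, and trading space for time is perfectly normal in server programming: idle memory is just sitting there anyway, and it is given back in the end. As for AllocBuffer(int iAllocSize, double dScaleOut), all we did was add one controlling parameter, dScaleOut.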
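To put rough numbers on this space-for-time trade, the sketch below models the size bookkeeping of the += logic for a run of equal-length appends. SimulateAppends and GrowthStats are hypothetical names introduced for this example; the model counts reallocations and unused bytes but does not measure time.

```cpp
#include <cstddef>

struct GrowthStats
{
    int         iAllocs;    // number of buffer reallocations
    std::size_t nBuffSize;  // final buffer size in bytes
    std::size_t nUnused;    // bytes allocated but unused at the end
};

// Hypothetical model: a string of nStartLen characters in an exactly full
// buffer receives iTimes appends of nAppendLen characters each. Every
// reallocation requests (needed size * dScaleOut) bytes.
GrowthStats SimulateAppends(std::size_t nStartLen, std::size_t nAppendLen,
                            int iTimes, double dScaleOut)
{
    if (dScaleOut < 1.0)
        dScaleOut = 1.0;

    std::size_t nLen = nStartLen;
    std::size_t nBuffSize = nStartLen + 1;
    int iAllocs = 0;
    for (int i = 0; i < iTimes; ++i)
    {
        std::size_t nNewSize = nLen + nAppendLen + 1;
        if (nNewSize > nBuffSize)
        {
            nBuffSize = static_cast<std::size_t>(nNewSize * dScaleOut);
            ++iAllocs;
        }
        nLen += nAppendLen;
    }
    return { iAllocs, nBuffSize, nBuffSize - (nLen + 1) };
}
```

For the benchmark's workload (a 5-character start, then 20000 appends of 32 characters), a factor of 1.0 reallocates on every append and wastes nothing, while a factor of 1.5 needs only a few dozen reallocations at the price of some unused bytes at the end of the buffer.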

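Strictly speaking, the above is not memory management, only an allocation technique. Real memory management pre-allocates many contiguous memory blocks (a memory pool); when a string needs memory it takes a block from the pool, and when it is done it returns the block to the pool. There are many ways to implement a memory pool, and this article is already long, so that is left for next time.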
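Until then, here is a minimal sketch of the idea, assuming fixed-size blocks and a single thread. CFixedBlockPool and its member names are hypothetical and are not the pool used in the author's HttpServer; a real server would also need locking or per-thread pools, plus a policy for what to do when the pool runs dry.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-size block pool. Not thread-safe.
class CFixedBlockPool
{
public:
    CFixedBlockPool(std::size_t nBlockSize, std::size_t nBlockCount)
        : m_nBlockSize(nBlockSize), m_storage(nBlockSize * nBlockCount)
    {
        // Carve the single contiguous allocation into equal blocks.
        m_freeBlocks.reserve(nBlockCount);
        for (std::size_t i = 0; i < nBlockCount; ++i)
            m_freeBlocks.push_back(m_storage.data() + i * nBlockSize);
    }

    // The free list holds pointers into m_storage, so copying is disabled.
    CFixedBlockPool(const CFixedBlockPool&) = delete;
    CFixedBlockPool& operator=(const CFixedBlockPool&) = delete;

    // Hands out one free block, or nullptr when the pool is exhausted.
    char* Acquire()
    {
        if (m_freeBlocks.empty())
            return nullptr;
        char* pBlock = m_freeBlocks.back();
        m_freeBlocks.pop_back();
        return pBlock;
    }

    // Returns a block previously obtained from Acquire() to the pool.
    void Release(char* pBlock)
    {
        if (pBlock)
            m_freeBlocks.push_back(pBlock);
    }

    std::size_t BlockSize() const { return m_nBlockSize; }
    std::size_t FreeCount() const { return m_freeBlocks.size(); }

private:
    std::size_t        m_nBlockSize;
    std::vector<char>  m_storage;     // one big allocation made up front
    std::vector<char*> m_freeBlocks;  // blocks currently available
};
```

After construction, Acquire() and Release() never call malloc or free, which is the property that makes a pool attractive for a long-running server.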
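Back to the topic: writing a high-performance server program means paying attention to many details, even something as unremarkable as string operations, and even something as unremarkable as appending to a string.

My HttpServer uses the custom CFastString together with real memory management. IOCP is only the precondition for high concurrency; the server reaches its best performance only once memory is truly under management.

Below is the simple source code of the CFastString example. Help yourself!
Header file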


#include <string.h>
#include <TCHAR.h>
#define DEFAULT_BUFFER_SIZE		256
class CFastString  
{
public:
	CFastString();
	CFastString(const CFastString& cstrSrc);
	CFastString(const char* pszSrc);
	virtual ~CFastString();

public:

	// Returns the cached length in O(1)
	int length() const{
		return m_iStrLen;
	}

	// Slower than length(): strlen rescans the buffer on every call
	int GetLength() const{
		return m_pszStr ? (int)strlen(m_pszStr) : -1;	
	}
	char* c_str() const{
		return m_pszStr;
	}

	// =============Operator overloads=============
	const CFastString& operator=(const CFastString& cstrSrc);
	const CFastString& operator=(const char* pszSrc);
	const CFastString& operator+=(const CFastString& cstrSrc);	
	const CFastString& operator+=(const char *pszSrc);
	
	// =============Friend functions=============
	friend CFastString operator+(const CFastString& cstr1, const CFastString& cstr2);
	friend CFastString operator+(const CFastString& cstr, const char* psz);
	friend CFastString operator+(const char* psz, const CFastString& cstr);

	// Conversion operators	
	operator char*() const{
		return m_pszStr;
	}
	operator const char*() const{
		return m_pszStr;
	}

	
protected:
	// =============Concatenate two strings=============
	void Concat(const char* psz1, const char* psz2);

protected:
	char* AllocBuffer(int iAllocSize, double dScaleOut = 1.0);
	void  ReAllocBuff(int iNewSize);

protected:
	char*	m_pszStr;		// String buffer
	int		m_iStrLen;		// String length, excluding the terminating '\0'
	int		m_iBuffSize;	// Size of the buffer that holds the string
};

Implementation file


#include "stdafx.h"
#include "FastString.h"
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <TCHAR.h>

//////////////////////////////////////////////////////////////////////
// Construction/Destruction
//////////////////////////////////////////////////////////////////////

CFastString::CFastString()
{
	m_iBuffSize = DEFAULT_BUFFER_SIZE;
	m_pszStr = (char*)malloc(m_iBuffSize);
	memset(m_pszStr, 0, m_iBuffSize);
	
	m_iStrLen = 0;
}

CFastString::CFastString(const CFastString& cstrSrc)
{
	int iSrcSize = cstrSrc.length()+1;
	m_iBuffSize = 0;	// AllocBuffer reads m_iBuffSize, so initialize it first
	m_pszStr = AllocBuffer(iSrcSize);
	
	// Copy the characters together with the terminating '\0'
	memcpy(m_pszStr, cstrSrc.c_str(), iSrcSize);
	m_iStrLen = iSrcSize-1;
}

CFastString::CFastString(const char* pszSrc)
{
	assert(pszSrc);
	
	int iSrcSize = (int)strlen(pszSrc) + 1;
	m_iBuffSize = 0;	// AllocBuffer reads m_iBuffSize, so initialize it first
	m_pszStr = AllocBuffer(iSrcSize);
	
	// Copy the characters together with the terminating '\0'
	memcpy(m_pszStr, pszSrc, iSrcSize);
	m_iStrLen = iSrcSize-1;
}

CFastString::~CFastString()
{
	free(m_pszStr);
	m_pszStr = NULL;
	m_iStrLen = 0;
	m_iBuffSize = 0;
}

char* CFastString::AllocBuffer(int iAllocSize, double dScaleOut)
{
	if(dScaleOut < 1.0)
		dScaleOut = 1.0;

	int iNewBuffSize = int(iAllocSize*dScaleOut);
	if(iNewBuffSize > m_iBuffSize)
		m_iBuffSize = iNewBuffSize;
	char* pszNew = (char*)malloc(m_iBuffSize);
	return pszNew;
}

void CFastString::ReAllocBuff(int iNewSize)
{
	if(iNewSize <= 0)
	{
		assert(0);
		return ;
	}

	if(iNewSize <= m_iBuffSize)
		return ;

	m_iStrLen = 0;
	// Free the old buffer and allocate a new one (the old contents are discarded)
	free(m_pszStr);
	m_pszStr = (char*)malloc(iNewSize);
	m_iBuffSize = iNewSize;
}

void CFastString::Concat(const char* psz1, const char* psz2)
{
	assert(psz1);
	assert(psz2);
	if(NULL == psz1 || NULL == psz2)
		return;
	
	int iLen1 = (int)strlen(psz1);
	int iLen2 = (int)strlen(psz2);
	int iNewSize = iLen1 + iLen2 + 1;
	if(m_iBuffSize < iNewSize)
		ReAllocBuff(iNewSize);
	
	// Copy string 1
	memcpy(m_pszStr, psz1, iLen1);
	// Copy string 2
	memcpy(m_pszStr+iLen1, psz2, iLen2);
	m_iStrLen = iNewSize-1;
	
	*(m_pszStr+m_iStrLen) = 0;
}

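const CFastString& CFastString::operator=(const char* pszSrc)
{
	assert(pszSrc);
	
	int iSrcSize = (int)strlen(pszSrc)+1;
	if(m_iBuffSize < iSrcSize)
		ReAllocBuff(iSrcSize);
	
	// Copy the characters together with the terminating '\0'
	memcpy(m_pszStr, pszSrc, iSrcSize);
	m_iStrLen = iSrcSize - 1;
	return *this;
}

// Declared in the header but missing from the original listing;
// this minimal version reuses the const char* overload above.
const CFastString& CFastString::operator=(const CFastString& cstrSrc)
{
	if(this != &cstrSrc)
		*this = cstrSrc.c_str();
	return *this;
}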

const CFastString& CFastString::operator+=(const CFastString& cstrSrc)
{
	int iNewSize = cstrSrc.length() + length() + 1;
	if(m_iBuffSize >= iNewSize)
	{
		// Enough room: append in place
		memcpy(m_pszStr+m_iStrLen, cstrSrc.c_str(), cstrSrc.length());
		*(m_pszStr+iNewSize-1) = 0;
	}
	else
	{
		// Not enough room: allocate 1.5 times the required size
		char* pszNew = AllocBuffer(iNewSize, 1.5);
		memcpy(pszNew, m_pszStr, m_iStrLen);	
		memcpy(pszNew+m_iStrLen, cstrSrc.c_str(), cstrSrc.length());
		pszNew[iNewSize-1] = 0;		// memcpy does not write the terminator
		
		free(m_pszStr);
		m_pszStr = pszNew;
	}
	m_iStrLen = iNewSize-1;
	return *this;
}

const CFastString& CFastString::operator+=(const char *pszSrc)
{
	assert(pszSrc);
	
	int iLenSrc = (int)strlen(pszSrc);
	int iNewSize = iLenSrc + length() + 1;

	// If the internal buffer is large enough, append in place; otherwise allocate a new buffer
	if(m_iBuffSize >= iNewSize)
	{
		memcpy(m_pszStr+m_iStrLen, pszSrc, iLenSrc);
		*(m_pszStr+iNewSize-1) = 0;
	}
	else
	{
		// Allocate a new buffer, scaling the request by 1.5 instead of using the exact size
//		char* pszNew = AllocBuffer(iNewSize);
		char* pszNew = AllocBuffer(iNewSize, 1.5);

		// Copy both strings into the newly allocated buffer

		// Method 1: strcpy + strcat
// 		strcpy(pszNew, m_pszStr);		
// 		strcat(pszNew, pszSrc);
	
		// Method 2: plain memory copies
		memcpy(pszNew, m_pszStr, m_iStrLen);
		memcpy(pszNew+m_iStrLen, pszSrc, iLenSrc);
		pszNew[iNewSize-1] = 0;		// memcpy does not write the terminator
		
		free(m_pszStr);
		m_pszStr = pszNew;
	}
	m_iStrLen = iNewSize-1;
	return *this;
}

// ===============Friend functions===================
CFastString operator+(const CFastString& cstr1, const CFastString& cstr2)
{
	CFastString cstrNew;
	cstrNew.Concat(cstr1, cstr2);
	return cstrNew;
}
CFastString operator+(const CFastString& cstr, const char* psz)
{
	CFastString cstrNew;
	cstrNew.Concat(cstr, psz);
	return cstrNew;
}
CFastString operator+(const char* psz, const CFastString& cstr)
{
	CFastString cstrNew;
	cstrNew.Concat(psz, cstr);
	return cstrNew;
}
