移除函式呼叫能有更好的效能

發糞塗牆發表於2012-07-31

 

原文出自:

http://www.mssqltips.com/sqlservertip/2727/removing-function-calls-for-better-performance-in-sql-server/?utm_source=dailynewsletter&utm_medium=email&utm_content=headline&utm_campaign=2012726

 

問題:

         大部分人都知道不要在where子句中呼叫函式,這樣會影響你的效能。但是如果在SELECT中使用呢?本文將嘗試移除select中的函式呼叫能戲劇性地增強效能,特別在返回大資料量時。

 

解決方案:

示例表和函式:

在此例中,我們將建立兩個示例表和兩個訪問這些表的函式。至於填充表,你將不得不使用一種工具,Visual Studio來填充他們以提供一些合理的真實資料。在本例中,將對每個表填充20萬資料。其中一件需要注意的事是,這些示例函式只返回每個買家的一筆記錄。幾時存在多個買家。

下面是程式碼:

-- Table creationlogic

CREATE TABLE[dbo].[CarSale](

 [CarSaleID] [int] IDENTITY(1,1) NOT NULL,

 [PurchaseDate] [smalldatetime] NOT NULL,

    CONSTRAINT [PK_CarSale] PRIMARY KEYCLUSTERED ([CarSaleID] ASC)

);

CREATE TABLE[dbo].[Buyer](

 [BuyerID] [int] IDENTITY(1,1) NOT NULL,

 [CarSaleID] [int] NOT NULL,

 [LastName] [varchar](50) NULL,

 [FirstName] [varchar](100) NULL,

 [CompanyName] [varchar](200) NULL,

 CONSTRAINT [PK_Buyer] PRIMARY KEY NONCLUSTERED([BuyerID] ASC)

);

ALTER TABLE[dbo].[Buyer]  WITH CHECK ADD CONSTRAINT[FK_Buyer_CarSale] FOREIGN KEY([CarSaleID])

REFERENCES[dbo].[CarSale] ([CarSaleID]) ON UPDATE CASCADE ON DELETE CASCADE;

CREATE CLUSTEREDINDEX [IX_Buyer_CarSalelID] ON [dbo].[Buyer]([CarSaleID] ASC);

-- Function creationlogic

CREATE FUNCTION[dbo].[fnGetBuyerFirstName]

(@CarSaleID INT)

RETURNS VARCHAR (500)

AS

BEGIN

RETURN (SELECT Top 1FirstName

FROM Buyer

WHERE CarSaleID=@CarSaleID

ORDER BY BuyerID)

END

GO

CREATE FUNCTION[dbo].[fnGetBuyerLastName]

(@CarSaleID INT)

RETURNS VARCHAR (500)

AS

BEGIN

RETURN (SELECT Top 1coalesce(LastName,CompanyName)

FROM Buyer

WHERE CarSaleID=@CarSaleID

ORDER BY BuyerID)

END

GO

 

原始查詢:

SELECT cs.PurchaseDate,
       dbo.fnGetBuyerFirstName(cs.CarSaleID),
       dbo.fnGetBuyerLastName(cs.CarSaleID)
FROM CarSale cs
ORDER BY CarSaleID;

從上面程式碼中可以看出,每條記錄都呼叫一次函式。並且查詢了Buyer表兩次。當CarSale表有大量資料時,這種做法並不高效。執行計劃如下:


即使我們使用where子句限制查詢並只查詢一條資料,通過檢視執行計劃,如下,可以看到,依舊要對Buyer表做兩次搜尋。


修改後的查詢:

SELECT cs.PurchaseDate,
       dbo.fnGetBuyerFirstName(cs.CarSaleID),
       dbo.fnGetBuyerLastName(cs.CarSaleID)
FROM CarSale cs
WHERE CarSaleID=5
ORDER BY CarSaleID;

值得注意的是,在這個例子中,只返回了一條記錄。一下帶有更廣where條件從而返回更多資料的查詢會變得越來越慢。

 

去除函式的例子:

現在移除select中的函式呼叫,並使用表關聯來實現同樣結果,其中一個是使用了where子句,另外一個沒有限制:

SELECT cs.PurchaseDate,FirstName,LastName
FROM CarSale cs 
INNER JOIN (SELECT CarSaleID,MIN(BuyerID) AS SingleBuyerID FROM Buyer GROUP BY CarSaleID) m2 
          ON cs.CarSaleID=m2.CarSaleID
INNER JOIN Buyer m ON m2.CarSaleID=cs.CarSaleID AND m2.SingleBuyerID=m.BuyerID
ORDER BY cs.CarSaleID;

SELECT cs.PurchaseDate,FirstName,LastName
FROM CarSale cs 
INNER JOIN (SELECT CarSaleID,MIN(BuyerID) AS SingleBuyerID FROM Buyer GROUP BY CarSaleID) m2 
          ON cs.CarSaleID=m2.CarSaleID
INNER JOIN Buyer m ON m2.CarSaleID=cs.CarSaleID AND m2.SingleBuyerID=m.BuyerID
WHERE cs.CarSaleID=5
ORDER BY cs.CarSaleID;

通過檢視執行計劃,可以得出不用函式以後,不再需要每條記錄都去重新查詢。這是通過merge join來處理的。


為了確認這點,我們看看剛才去掉了函式之後的查詢,通過sql Profiler的跟蹤,可以得到多大的效能提升:

Query

WHERE Clause

CPU (ms)

Reads

Writes

Duration

Original

NO

10734

1239655

0

25879

YES

0

9

0

0

No Function Call

NO

578

16337

0

2457

YES

0

11

0

0

通過上面的結果可以看出,當返回的結果很大時,能從中得到相當大的好處,包括CPU、邏輯讀、持續時間等。當只返回一個結果時,效能更好。

 

最終版本,使用CTE:

因為在本例中,使用函式來返回單獨的買家,所以可以使用CTE來取得進一步的效能:

WITH summary AS (SELECT CarSaleID,            
                        BuyerID,           
                        FirstName,
                        LastName,
                        ROW_NUMBER() OVER(PARTITION BY CarSaleID ORDER BY CarSaleID) AS rk      
                 FROM Buyer) 
SELECT PurchaseDate,s.FirstName,s.LastName FROM CarSale cs INNER JOIN summary s  
ON s.CarSaleID=cs.CarSaleID WHERE s.rk = 1;

WITH summary AS (SELECT CarSaleID,            
                        BuyerID,           
                        FirstName,
                        LastName,
                        ROW_NUMBER() OVER(PARTITION BY CarSaleID ORDER BY CarSaleID) AS rk      
                 FROM Buyer) 
SELECT PurchaseDate,s.FirstName,s.LastName FROM CarSale cs INNER JOIN summary s  
ON s.CarSaleID=cs.CarSaleID WHERE s.rk = 1 AND cs.CarSaleID=5;

 

通過執行計劃和sqlprofiler對比得到:

Query

WHERE Clause

CPU (ms)

Reads

Writes

Duration

No Function Call add WITH statement

NO

266

15796

0

1931

YES

0

6

0

0

總結:

我同意第一種方式容易實現並容易閱讀,但是對效能提升來說,效能上的提升比程式碼量更重要。

相關文章