Optimal system performance begins with design and continues throughout the life of your system. Carefully consider performance issues during the initial design phase so that you can tune your system more easily during production.
2.1 Oracle Methodology
System performance has become increasingly important as computer systems get larger and more complex and as the Internet plays a bigger role in business applications. To accommodate this, Oracle has produced a performance methodology based on years of design and performance experience. This methodology explains clear and simple activities that can dramatically improve system performance.
Performance strategies vary in their effectiveness, and systems with different purposes—such as operational systems and decision support systems—require different performance skills. This book examines the considerations that any database designer, administrator, or performance expert should focus their efforts on.
System performance is designed and built into a system. It does not just happen. Performance problems are usually the result of contention for, or exhaustion of, some system resource. When a system resource is exhausted, the system cannot scale to higher levels of performance. This new performance methodology is based on careful planning and design of the database, to prevent system resources from becoming exhausted and causing down-time. By eliminating resource conflicts, systems can be made scalable to the levels required by the business.
2.2 Understanding Investment Options
With the availability of relatively inexpensive, high-powered processors, memory, and disk drives, there is a temptation to buy more system resources to improve performance. In many situations, new CPUs, memory, or more disk drives can indeed provide an immediate performance improvement. However, any performance increases achieved by adding hardware should be considered a short-term relief to an immediate problem. If the demand and load rates on the application continue to grow, then the same problem is likely to recur soon.
In other situations, additional hardware does not improve the system's performance at all. Poorly designed systems perform poorly no matter how much extra hardware is allocated. Before purchasing additional hardware, ensure that serialization or single threading is not occurring within the application. Long-term, it is generally more valuable to increase the efficiency of your application in terms of the number of physical resources used for each business transaction.
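But sometimes adding hardware does no good. Some may ask how that can be. Here is a simple example: the only game I can play is Spider Solitaire (the four-suit kind), and I don't care for anything else. So how much would an 8-core machine with 32 GB of memory improve the performance of Spider Solitaire? By the same logic, I once saw a system where a single program on the server handled all the business processing (it was hard to run concurrently; after reading the system design I found that making it concurrent really would take a great deal of work). In that case one POWER CPU with a higher clock speed is the better buy, because the program can use only one CPU. Giving it two is a waste, and the spare might as well be partitioned off for my Spider Solitaire. Another example is batch programs, which usually have to run in a serialized order: first bulk-load the newly registered personal records, then run the payroll-posting preprocessing for companies while the preprocessing for personal business runs concurrently. None of this work can come after the actual posting step; it has to be done beforehand, and in the right order. That is serialization. So however large the transaction volume, only so many resources can be used at any one moment. Put more simply, you have only one mouth, and you can't talk and eat at the same time! (Perhaps God invented the nose so that we could still breathe while eating?)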
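The point about serialization can be made concrete with Amdahl's law, which is not named in the text above but is the standard way to bound the benefit of extra processors. The following is a minimal sketch; the function name and the sample serial fractions are illustrative assumptions, not measurements from any real system.

```python
def amdahl_speedup(serial_fraction: float, cpus: int) -> float:
    """Upper bound on speedup when only the non-serial part can use extra CPUs."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


if __name__ == "__main__":
    # Hypothetical workloads: fully serial, half serial, and 5% serial.
    for serial in (1.0, 0.5, 0.05):
        row = ", ".join(
            f"{n} CPUs: {amdahl_speedup(serial, n):.2f}x" for n in (1, 2, 8)
        )
        print(f"serial fraction {serial:.2f} -> {row}")
```

A fully serial program, like the single-process system described above, gains nothing from a second CPU, which is why a faster single CPU can be the better purchase.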
2.3 Understanding Scalability
The word scalability is used in many contexts in development environments. The following section provides an explanation of scalability that is aimed at application designers and performance specialists.
What is Scalability?
System Scalability
Factors Preventing Scalability
Scalability is a system's ability to process more workload, with a proportional increase in system resource usage. In other words, in a scalable system, if you double the workload, then the system uses twice as many system resources. This sounds obvious, but due to conflicts within the system, the resource usage might more than double when the workload doubles.
Examples of poor scalability due to resource conflicts include the following:
Applications requiring significant concurrency management as user populations increase
Increased locking activities
Increased data consistency workload
Increased operating system workload
Transactions requiring increases in data access as data volumes increase
Poor SQL and index design resulting in a higher number of logical I/Os for the same number of rows returned
Reduced availability, because database objects take longer to maintain
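The item on poor SQL and index design can be illustrated with a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle, so the plan text is SQLite's and not Oracle's; the table, column, and index names are hypothetical. It shows the same query switching from a full scan to an index lookup once a suitable index exists.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return "; ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

lookup = "SELECT id, amount FROM orders WHERE customer_id = ?"
plan_without_index = query_plan(conn, lookup, (42,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_with_index = query_plan(conn, lookup, (42,))

print("before index:", plan_without_index)
print("after index: ", plan_with_index)
```

Both plans return the same rows; the difference is how many rows the engine has to visit to find them, which is the cost that grows with data volume.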
An application is said to be unscalable if it exhausts a system resource to the point where no more throughput is possible when its workload is increased. Such applications result in fixed throughputs and poor response times.
Examples of resource exhaustion include the following:
Hardware exhaustion
Table scans in high-volume transactions causing inevitable disk I/O shortages
Excessive network requests, resulting in network and scheduling bottlenecks
Memory allocation causing paging and swapping
Excessive process and thread allocation causing operating system thrashing
This means that application designers must create a design that uses the same resources, regardless of user populations and data volumes, and does not put loads on the system resources beyond their limits.
Applications that are accessible through the Internet have more complex performance and availability requirements. Some applications are designed and written only for Internet use, but even typical back-office applications—such as a general ledger application—might require some or all data to be available online.
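Characteristics of Internet age applications include the following:
Availability 24 hours a day, 365 days a year
Unpredictable and imprecise number of concurrent users (Tianhong Fund probably did not foresee the demand when it launched Yu'e Bao; users turned out in force, and I was one of them)
Difficulty in capacity planning (a system built for 10 million users that attracts only 100,000 wastes money; the reverse is even worse)
Availability for any type of query (query conditions can be all over the place)
Multitier architectures (this part is usually in decent shape; frameworks such as WebLogic and Spring are still necessary)
Stateless middleware (the middle tier holds no per-session state between requests; this is a different matter from the needless middle layers that C-based architectures sometimes pick up for the sake of flexibility)
Rapid development timescale (even shorter in banks!)
Minimal time for testing (do Internet applications get tested at all? Many ship three updates a day, which suggests that we users are the guinea pigs)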
Figure 2-1 illustrates the classic workload growth curve, with demand growing at an increasing rate. Applications must scale with the increase of workload and also when additional hardware is added to support increasing demand. Design errors can cause the implementation to reach its maximum, regardless of additional hardware resources or re-design efforts.
Figure 2-1 Workload Growth Curve
Applications are challenged by very short development timeframes with limited time for testing and evaluation. However, bad design typically means that you must later rearchitect and reimplement the system. If you deploy an application with known architectural and implementation limitations on the Internet, and if the workload exceeds the anticipated demand, then failure is a real possibility. From a business perspective, poor performance can mean a loss of customers. If Web users do not get a response in seven seconds, then the user's attention could be lost forever.
In many cases, the cost of re-designing a system with the associated downtime costs in migrating to new implementations exceeds the costs of properly building the original system. The moral of the story is simple: design and implement with scalability in mind from the start.
When building applications, designers and architects should aim for as close to perfect scalability as possible. This is sometimes called linear scalability, where system throughput is directly proportional to the number of CPUs.
In real life, linear scalability is impossible for reasons beyond a designer's control. However, making the application design and implementation as scalable as possible should ensure that current and future performance objectives can be achieved through expansion of hardware components and the evolution of CPU technology.
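(Microsoft SQL Server, that piece of junk, is a cautionary example when it comes to concurrency: under concurrent load it takes locks all over the place. Could it not learn from Oracle's design? DB2 seems to escalate locks as well, which a DBA who has only ever worked with Oracle would find hard to imagine. Fortunately, bank core systems use nothing but Oracle and DB2. Informix and Sybase are gradually being replaced by Oracle, SQL Server might be used for marginal, take-it-or-leave-it projects, and MySQL, being an Internet-application database, is in my view simply not suited to large-transaction financial systems. My OCM instructor said MySQL exists mainly to wipe out SQL Server, haha.)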
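To make the idea of linear scalability concrete, the following sketch compares measured throughput at several CPU counts with what perfect linear scaling from the smallest configuration would give. The function name and the throughput figures are hypothetical and chosen only for illustration.

```python
from typing import Dict


def scaling_efficiency(throughput_by_cpus: Dict[int, float]) -> Dict[int, float]:
    """Compare measured throughput with perfect linear scaling from the smallest run."""
    base_cpus = min(throughput_by_cpus)
    per_cpu_baseline = throughput_by_cpus[base_cpus] / base_cpus
    return {
        cpus: throughput / (per_cpu_baseline * cpus)
        for cpus, throughput in sorted(throughput_by_cpus.items())
    }


# Hypothetical measurements: transactions per second at each CPU count.
measured = {1: 100.0, 2: 190.0, 4: 340.0, 8: 480.0}
for cpus, efficiency in scaling_efficiency(measured).items():
    print(f"{cpus} CPUs: {efficiency:.0%} of linear")
```

An efficiency that falls steadily as CPUs are added is a hint that some shared resource, rather than raw processing power, is limiting throughput.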
Factors that may prevent linear scalability include:
Poor application design, implementation, and configuration
The application has the biggest impact on scalability. For example:
Poor schema design can cause expensive SQL that does not scale. (So long as Oracle can make the feature work, that's good enough, right?)
Poor transaction design can cause locking and serialization problems.
Poor connection management can cause poor response times and unreliable systems. (In practice this is mostly not a problem, because C programs usually hold long-lived connections and Java uses connection pools.)
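The connection pooling mentioned in the last item can be sketched in a few lines. This is a simplified illustration that uses sqlite3 in place of a real database driver; the class name, pool size, and timeout are assumptions, and production pools add health checks and reconnection that are omitted here.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and handed out on request."""

    def __init__(self, size: int, database: str = ":memory:") -> None:
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets worker threads share pooled connections.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout: float = 5.0):
        # Blocks up to `timeout` seconds, then raises queue.Empty instead of
        # opening an unbounded number of new connections.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)


pool = ConnectionPool(size=2)
with pool.connection() as conn:
    print(conn.execute("SELECT 1").fetchone())
```

Capping the pool size is the design choice that matters: under heavy load, callers wait briefly for a free connection instead of overwhelming the database with new sessions.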
However, the design is not the only problem. The physical implementation of the application can be the weak link. For example:
Systems can move to production environments with bad I/O strategies. (Test environments are often poorly configured, and production then ends up configured the same way.)
The production environment could use different execution plans than those generated in testing. (This is normal, because test data is usually made up by hand. We should still analyze the differences, and during the maintenance phase the techniques covered in later chapters can be applied.)
Memory-intensive applications that allocate a large amount of memory without much thought for freeing the memory at run time can cause excessive memory usage.
Inefficient memory usage and memory leaks put a high stress on the operating system's virtual memory subsystem. This impacts performance and availability. (Too little memory, or a leak, pushes memory out to swap on disk.)
Incorrect sizing of hardware components
Bad capacity planning of all hardware components is becoming less of a problem as relative hardware prices decrease. However, too much capacity can mask scalability problems as the workload is increased on a system.
Limitations of software components
All software components have scalability and resource usage limitations. This applies to application servers, database servers, and operating systems. Application design should not place demands on the software beyond what it can handle. (For example, ask Oracle to run with processes=10000 and the operating system itself will fall over, never mind the database.)
Limitations of hardware components
Hardware is not perfectly scalable. Most multiprocessor computers can get close to linear scaling with a finite number of CPUs, but after a certain point each additional CPU can increase performance overall, but not proportionately. There might come a time when an additional CPU offers no increase in performance, or even degrades performance. This behavior is very closely linked to the workload and the operating system setup.
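These factors are based on the Oracle Server Performance group's experience of tuning unscalable systems.
(In other words, these points are not the personal views of whoever wrote the Oracle document; they were compiled from the Oracle Server Performance group's experience with unscalable systems.)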
2.4 System Architecture
There are two main parts to a system's architecture:
Hardware and Software Components
Configuring the Right System Architecture for Your Requirements
This section discusses:
Hardware Components
Software Components
Today's designers and architects are responsible for sizing and capacity planning of hardware at each tier in a multitier environment. It is the architect's responsibility to achieve a balanced design. This is analogous to a bridge designer who must consider all the various payload and structural requirements for the bridge. A bridge is only as strong as its weakest component. As a result, a bridge is designed in balance, such that all components reach their design limits simultaneously.
The main hardware components include:
CPU
Memory
I/O Subsystem
Network
There can be one or more CPUs, and they can vary in processing power from simple CPUs found in hand-held devices to high-powered server CPUs. Sizing of other hardware components is usually a multiple of the CPUs on the system. See Chapter 9, "Managing Operating System Resources".
Database and application servers require considerable amounts of memory to cache data and avoid time-consuming disk access. See Chapter 7, "Configuring and Using Memory".
The I/O subsystem can vary between the hard disk on a client PC and high performance disk arrays. Disk arrays can perform thousands of I/Os each second and provide availability through redundancy in terms of multiple I/O paths and hot pluggable mirrored disks. See Chapter 8, "I/O Configuration and Design".
All computers in a system are connected to a network, from a modem line to a high speed internal LAN. The primary concerns with network specifications are bandwidth (volume) and latency (speed).
The same way computers have common hardware components, applications have common functional components. By dividing software development into functional components, it is possible to better comprehend the application design and architecture. Some components of the system are performed by existing software bought to accelerate application implementation, or to avoid re-development of common components.
The difference between software components and hardware components is that while hardware components only perform one task, a piece of software can perform the roles of various software components. For example, a disk drive only stores and retrieves data, but a client program can manage the user interface and perform business logic.
Most applications involve the following components:
Managing the User Interface (usually done in Java)
Implementing Business Logic (usually done in C)
Managing User Requests and Resource Allocation (for online banking this is handled by Java or perhaps WebLogic; most other systems handle it in C)
Managing Data and Transactions (mostly done in C; stored procedures also work)
The user interface component is the most visible to application users, and includes the following functions:
Displaying the screen to the user (a good-looking screen, unlike the bare black sqlplus window)
Collecting user data and transferring it to business logic
Validating data entry
Navigating through levels or states of the application
The business logic component implements core business rules that are central to the application function. Errors made in this component can be very costly to repair. This component is implemented by a mixture of declarative and procedural approaches. An example of a declarative activity is defining unique and foreign keys. An example of procedure-based logic is implementing a discounting strategy.
(Errors in these core modules have to be fixed right away, just like errors that show up in the alert log. As for the two terms: declarative means stating a rule and letting the database enforce it, as with unique and foreign key constraints; procedural means writing code that carries out the rule step by step, as with a discount strategy.)
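The difference between the declarative and procedural approaches described above is easier to see side by side. The sketch below uses sqlite3 as a stand-in database; the tables and the discount rule (10% off orders of 100 or more for repeat customers) are hypothetical examples, not rules taken from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE                                -- declarative: unique key
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),   -- declarative: foreign key
        amount      REAL NOT NULL
    );
""")


def discounted_total(amount: float, is_repeat_customer: bool) -> float:
    """Procedural rule (hypothetical): 10% off orders of 100 or more for repeat customers."""
    if is_repeat_customer and amount >= 100:
        return round(amount * 0.9, 2)
    return amount


def violates_constraint(sql: str) -> bool:
    """Return True if the database itself rejects the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


conn.execute("INSERT INTO customers (id, email) VALUES (1, 'a@example.com')")
conn.execute(
    "INSERT INTO orders (customer_id, amount) VALUES (1, ?)",
    (discounted_total(200.0, True),),
)

duplicate_rejected = violates_constraint(
    "INSERT INTO customers (id, email) VALUES (2, 'a@example.com')"
)
orphan_rejected = violates_constraint(
    "INSERT INTO orders (customer_id, amount) VALUES (99, 10.0)"
)
print(duplicate_rejected, orphan_rejected)
```

The constraints are enforced by the database no matter which program inserts the rows, whereas the discount rule applies only where the application code calls it.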
Common functions of this component include:
Moving a data model to a relational table structure (for example, from an older network-model database or from Excel spreadsheets)
Defining constraints in the relational table structure (in practice this is often skipped, because much of the syntax does not port well between databases)
Coding procedural logic to implement business rules
Managing user requests and resource allocation is implemented in all pieces of software. However, there are some requests and resources that can be influenced by the application design and some that cannot.
In a multiuser application, most resource allocation for user requests is handled by the database server or the operating system. However, in a large application where the number of users and their usage patterns are unknown or growing rapidly, the system architect must be proactive to ensure that no single software component becomes overloaded and unstable.
Common functions of this component include:
Connection management with the database (for example, dedicated, shared, or mixed)
Executing SQL efficiently (cursors and SQL sharing)
Managing client state information
Balancing the load of user requests across hardware resources (for example, giving VIP customers a machine of their own)
Setting operational targets for hardware and software components
Persistent queuing for asynchronous execution of tasks
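The item on cursors and SQL sharing in the list above comes down to keeping statement text identical across executions, so that the database can reuse one parsed statement. The sketch below only counts distinct statement texts and does not talk to a database; the table and column names are hypothetical.

```python
def literal_sql(customer_id: int) -> str:
    # Anti-pattern: embedding the value makes every statement text unique.
    return f"SELECT amount FROM orders WHERE customer_id = {customer_id}"


# With a bind variable, the text is the same for every customer.
BIND_SQL = "SELECT amount FROM orders WHERE customer_id = :customer_id"

customer_ids = range(1, 1001)
distinct_literal_texts = {literal_sql(cid) for cid in customer_ids}
distinct_bind_texts = {BIND_SQL for _ in customer_ids}

print(len(distinct_literal_texts), "distinct statements with literals")
print(len(distinct_bind_texts), "distinct statement with a bind variable")
```

Each distinct text is a separate statement for the database to parse and cache, so the literal version does a thousand times the parsing work for the same logical query.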
Managing data and transactions is largely the responsibility of the database server and the operating system.
Common functions of this component include:
Providing concurrent access to data using locks and transactional semantics
Providing optimized access to the data using indexes and memory cache
Ensuring that data changes are logged in the event of a hardware failure
Enforcing any rules defined for the data
Configuring the initial system architecture is a largely iterative process. System architects must satisfy the system requirements within budget and schedule constraints. If the system requires interactive users who transact business and make decisions based on the contents of a database, then user requirements drive the architecture. If there are few interactive users on the system, then the architecture is process-driven.
Examples of interactive user applications:
Accounting and bookkeeping applications (accountants log in to run queries or post entries)
Order entry systems
Email servers
Web-based retail applications (Taobao, for example)
Trading systems
Examples of process-driven applications:
Utility billing systems
Fraud detection systems (or risk management systems)
Direct mail
In many ways, process-driven applications are easier to design than multiuser applications because the user interface element is eliminated. However, because the objectives are process-oriented, system architects not accustomed to dealing with large data volumes and different success factors can become confused. Process-driven applications draw from the skill sets used in both user-based applications and data warehousing. Therefore, this book focuses on evolving system architectures for interactive users.
Generating a system architecture is not a deterministic process. It requires careful consideration of business requirements, technology choices, existing infrastructure and systems, and actual physical resources, such as budget and manpower.
The following questions should stimulate thought on system architecture, though they are not a definitive guide to system architecture. These questions demonstrate how business requirements can influence the architecture, ease of implementation, and overall performance and availability of a system. For example:
How many users must the system support?
Most applications fall into one of the following categories:
Very few users on a lightly-used or exclusive computer
For this type of application, there is usually one user. The focus of the application design is to make the single user as productive as possible by providing good response time, yet make the application require minimal administration. Users of these applications rarely interfere with each other and have minimal resource conflicts. (Presumably like my Spider Solitaire, which never needs to talk to anyone else's.)
A medium to large number of users in a corporation using shared applications
For this type of application, the users are limited by the number of employees in the corporation actually transacting business through the system. Therefore, the number of users is predictable. However, delivering a reliable service is crucial to the business. The users must share a resource, so design efforts must address response time under heavy system load, escalation of resource usage for each session, and room for future growth. (With expense claims, for example, perhaps only about 2% of employees are logged in and entering claims at the same moment. Our company's lousy finance system was written by two people with two years of experience and supports only IE. I told them that a small CSS change would fix it, and they still won't make it. I can hardly curse in the office, especially since they are not in my department, and I can't criticize them in front of their manager either.)
An infinite user population distributed on the Internet (online games, Taobao, and so on)
For this type of application, extra engineering effort is required to ensure that no system component exceeds its design limits. Exceeding a limit creates a bottleneck that halts or destabilizes the system. These applications require complex load balancing, stateless application servers, and efficient database connection management. In addition, use statistics and governors to ensure that the user receives feedback if the database cannot satisfy their requests because of system overload. (Notes: load balancing spreads requests across servers and is distinct from load testing, which in China is usually done with LoadRunner; connection management is generally handled for application developers by WebLogic; and the feedback is much like a friendly error page, at least some reassurance for the user.)
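The governor idea above, giving the user feedback instead of letting an overloaded system stall, can be sketched as simple admission control. This is a minimal illustration and not how any particular Oracle feature works; the class name, the limit, and the message text are assumptions.

```python
import threading


class RequestGovernor:
    """Admit at most `max_active` requests; turn the rest away with a message."""

    def __init__(self, max_active: int) -> None:
        self._slots = threading.BoundedSemaphore(max_active)

    def handle(self, work):
        # Non-blocking acquire: when the system is saturated, fail fast
        # so the user gets an answer instead of an indefinite wait.
        if not self._slots.acquire(blocking=False):
            return "busy: please retry shortly"
        try:
            return work()
        finally:
            self._slots.release()


governor = RequestGovernor(max_active=100)
print(governor.handle(lambda: "query result"))
```

Rejecting excess requests quickly keeps the admitted ones within the design limits of the components behind the governor.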
What will be the user interaction method?
The choices of user interface range from a simple Web browser to a custom client program.
Where are the users located?
The distance between users influences how the application is engineered to cope with network latencies. The location also affects which times of the day are busy, when it is impossible to perform batch or system maintenance functions.
What is the network speed?
Network speed affects the amount of data and the conversational nature of the user interface with the application and database servers. A highly conversational user interface can communicate with back-end servers on every keystroke or field-level validation. A less conversational interface works on a screen-sent and a screen-received model. On a slow network, it is impossible to achieve high data entry speeds with a highly conversational user interface.
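A rough calculation shows why a conversational interface suffers on a slow network. The sketch below assumes that every server round trip blocks the user until it completes; the function name and all figures (fields, keystrokes, typing speed, round-trip time) are hypothetical.

```python
def entry_time_seconds(
    fields: int,
    keystrokes_per_field: int,
    typing_seconds_per_key: float,
    round_trip_seconds: float,
    per_keystroke_round_trip: bool,
) -> float:
    """Estimate the time to fill one screen, counting blocking server round trips."""
    keystrokes = fields * keystrokes_per_field
    # Conversational UI: one round trip per keystroke. Screen model: one per screen.
    round_trips = keystrokes if per_keystroke_round_trip else 1
    return keystrokes * typing_seconds_per_key + round_trips * round_trip_seconds


# Hypothetical screen: 10 fields of 10 keystrokes, 0.2 s per key, 0.5 s round trip.
conversational = entry_time_seconds(10, 10, 0.2, 0.5, per_keystroke_round_trip=True)
screen_model = entry_time_seconds(10, 10, 0.2, 0.5, per_keystroke_round_trip=False)
print(f"conversational: {conversational:.1f} s, screen model: {screen_model:.1f} s")
```

The typing time is the same in both cases; the difference comes entirely from how many times the user waits on the network.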
How much data will the user access, and how much of that data is largely read only?
The amount of data queried online influences all aspects of the design, from table and index design to the presentation layers. Design efforts must ensure that user response time is not a function of the size of the database. If the application is largely read only, then replication and data distribution to local caches in the application servers become a viable option. This also reduces workload on the core transactional server.
(When we design systems we often overdo this: we cache as much as we can, in shared memory or inside our own programs, to avoid bothering the database, because the database has to cope with many users and high concurrency, whereas a purpose-built cache serving a single application is very efficient. Now that database servers are becoming more and more powerful, some systems have moved this caching back to the database side and no longer write such code. Oracle, of course, recommends doing so.)
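A local cache for largely read-only data, as described above, can be as simple as a dictionary with an expiry time. The following is a minimal sketch; the class name, the time-to-live policy, and the currency lookup used as a stand-in for a database query are all assumptions.

```python
import time


class ReadMostlyCache:
    """Cache lookups of rarely changing data for `ttl_seconds` per key."""

    def __init__(self, loader, ttl_seconds: float, clock=time.monotonic) -> None:
        self._loader = loader      # function that reads from the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # served locally, no database call
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def load_currency_name(code):
    # Hypothetical stand-in for a database query.
    calls.append(code)
    return {"USD": "US Dollar"}.get(code, "unknown")


cache = ReadMostlyCache(load_currency_name, ttl_seconds=60.0)
cache.get("USD")
cache.get("USD")
print(len(calls), "database call(s) for 2 lookups")
```

The expiry time bounds how stale a cached value can become, which is the trade-off to weigh before caching anything that is not strictly read only.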
What is the user response time requirement?
Consideration of the user type is important. If the user is an executive who requires accurate information to make split-second decisions, then user response time cannot be compromised. Other types of users, such as users performing data entry activities, might not need such a high level of performance.
Do users expect 24 hour service?
This is mandatory for today's Internet applications where trade is conducted 24 hours a day. However, corporate systems that run in a single time zone might be able to tolerate after-hours downtime. You can use this after-hours downtime to run batch processes or to perform system administration. In this case, it might be more economical not to run a fully-available system.
Must all changes be made in real time?
It is important to determine whether transactions must be executed within the user response time, or if they can be queued for asynchronous execution.
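Queuing work for asynchronous execution can be sketched with a worker thread and a queue. This in-memory version is only an illustration: a production system would use a persistent queue so that accepted work survives a restart. The function names and order identifiers are hypothetical.

```python
import queue
import threading

tasks = queue.Queue()
completed = []


def worker() -> None:
    while True:
        task = tasks.get()
        if task is None:           # sentinel: stop the worker
            tasks.task_done()
            break
        completed.append(f"posted {task}")
        tasks.task_done()


def submit(order_id: str) -> str:
    """Return to the user immediately; the real work happens later."""
    tasks.put(order_id)
    return f"accepted {order_id}"


thread = threading.Thread(target=worker, daemon=True)
thread.start()
print(submit("order-1"))
print(submit("order-2"))
tasks.put(None)
tasks.join()                       # wait until the queued work has drained
print(completed)
```

The user's response time covers only the call to submit, while the slower posting step runs outside it.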
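In short, the design should consider how many people will use the system, how they will access it, how demanding it is on the network, roughly how many transactions it handles per day, what TPS it must sustain, and whether it must run 24x7.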
The following are secondary questions, which can also influence the design, but really have more impact on budget and ease of implementation. For example:
How big will the database be?
This influences the sizing of the database server. On servers with a very large database, it might be necessary to have a bigger computer than dictated by the workload. This is because the administration overhead with large databases is largely a function of the database size. As tables and indexes grow, it takes proportionately more CPUs to allow table reorganizations and index builds to complete in an acceptable time limit.
What is the required throughput of business transactions?
What are the availability requirements? (for example, is Data Guard needed?)
Do skills exist to build and administer this application? (in banks, much of the work is done by fresh graduates)
What compromises are forced by budget constraints? (usually there is no room to negotiate)
