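14 High-Performance Java Persistence Tips
Zhuzhu has recently been studying performance optimization of the database persistence layer and has built up a lot of background knowledge. Today's post shares a technical article in this area by the well-known author of flexy-pool, last updated on January 22, 2019. The translation comes first, followed by the original English text. Where the translation falls short, please refer to the English original.
A high-performance data access layer requires a lot of knowledge about database internals, JDBC, JPA, and Hibernate. This article summarizes some of the most important techniques you can use to optimize an enterprise application.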
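1. SQL statement logging
If you use a framework that generates statements on your behalf, you should always validate the effectiveness and efficiency of each statement. An assertion mechanism that runs at test time is even better, because it catches N+1 query problems before you commit your code.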
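The test-time assertion idea can be sketched in plain Java. This is a minimal illustration, not the article's tooling: the class `StatementCountSketch`, its methods, and the `post` and `post_comment` tables are all hypothetical, and the statements are simulated rather than captured from a real JDBC driver. In a real test suite, a JDBC proxy or the framework's SQL log would feed the recorder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records executed SQL so a test can assert on statement counts.
final class StatementCountSketch {

    private final List<String> executed = new ArrayList<>();

    // Called once per executed statement (by a proxy or logger in a real setup).
    void record(String sql) {
        executed.add(sql);
    }

    // Counts the recorded statements that start with SELECT.
    long selectCount() {
        return executed.stream()
                .filter(sql -> sql.trim().toUpperCase().startsWith("SELECT"))
                .count();
    }

    // Fails the test when the number of SELECT statements differs from the expectation.
    void assertSelectCount(long expected) {
        long actual = selectCount();
        if (actual != expected) {
            throw new AssertionError(
                    "Expected " + expected + " SELECT statement(s) but recorded " + actual);
        }
    }

    // Simulates the N+1 pattern: one query for the posts, then one query per post.
    static StatementCountSketch simulateLazyLoading(int postCount) {
        StatementCountSketch counter = new StatementCountSketch();
        counter.record("SELECT * FROM post");
        for (int id = 1; id <= postCount; id++) {
            counter.record("SELECT * FROM post_comment WHERE post_id = " + id);
        }
        return counter;
    }
}
```

A test that expects a single query would call `assertSelectCount(1)` after the data access code runs, and the extra per-row queries would make it fail before the code is committed.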
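2. Connection management
Database connections are expensive, so you should always use a connection pooling mechanism.
Because the number of connections is bounded by the capabilities of the underlying database cluster, you need to release connections as quickly as possible.
In performance tuning you always have to measure, and setting the right pool size is no exception. A tool like FlexyPool can help you find the right size, even after the application has been deployed to production.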
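3. JDBC batching
JDBC batching lets us send multiple SQL statements in a single database round trip. The performance gain is significant on both the driver and the database side. PreparedStatements are very good candidates for batching, and some database systems (e.g. Oracle) support batching only for prepared statements.
Because JDBC defines a distinct API for batching (e.g. PreparedStatement.addBatch and PreparedStatement.executeBatch), if you generate statements manually you should know from the start whether you will be using batching. With Hibernate, you can switch to batching with a single configuration setting.
Hibernate 5.2 offers Session-level batching, so it is even more flexible in this regard.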
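The two batching routes can be sketched as follows. This is a minimal example under stated assumptions: the `post` table, the class name `BatchInsertSketch`, and the helper names are hypothetical, and whether each `executeBatch` call really maps to one round trip depends on the driver. The `hibernate.jdbc.batch_size` property is the Hibernate setting that enables batching.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import java.util.Properties;

final class BatchInsertSketch {

    // Number of executeBatch() calls needed for `rows` rows: ceil(rows / batchSize).
    static int roundTrips(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    // Inserts all titles into a hypothetical "post" table, flushing every batchSize rows.
    static void insertPosts(Connection connection, List<String> titles, int batchSize)
            throws SQLException {
        String sql = "INSERT INTO post (title) VALUES (?)"; // hypothetical table
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            int pending = 0;
            for (String title : titles) {
                statement.setString(1, title);
                statement.addBatch();          // queue the row on the client side
                if (++pending == batchSize) {
                    statement.executeBatch();  // send the queued rows together
                    pending = 0;
                }
            }
            if (pending > 0) {
                statement.executeBatch();      // send the final, partial batch
            }
        }
    }

    // The Hibernate counterpart: batching is switched on through configuration.
    static Properties hibernateBatchSettings(int batchSize) {
        Properties settings = new Properties();
        settings.setProperty("hibernate.jdbc.batch_size", String.valueOf(batchSize));
        return settings;
    }
}
```

With 100 rows and a batch size of 30, `insertPosts` calls `executeBatch` four times instead of executing 100 separate statements.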
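4. Statement caching
Statement caching is one of the least-known performance optimizations that you can easily take advantage of. Depending on the underlying JDBC driver, you can cache PreparedStatements on the client side (the driver) or on the database side (either the syntax tree or even the execution plan).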
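5. Hibernate identifiers
When using Hibernate, the IDENTITY generator is not a good choice because it disables JDBC batching.
The TABLE generator is even worse because it uses a separate transaction to fetch a new identifier. This puts pressure on the underlying transaction log and on the connection pool, since a separate connection is required every time we need a new identifier.
SEQUENCE is the right choice, and even SQL Server has supported it since version 2012. For SEQUENCE identifiers, Hibernate has long offered optimizers such as pooled and pooled-lo, which reduce the number of database round trips needed to fetch a new entity identifier value.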
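To see why such an optimizer saves round trips, here is a simplified simulation of the pooled-lo idea. It is not Hibernate's implementation: `PooledLoSketch` and `FakeSequence` are hypothetical names, and the in-memory `FakeSequence` merely stands in for a database sequence that was created with an increment equal to the allocation size.

```java
final class PooledLoSketch {

    // Stands in for a database sequence created with INCREMENT BY incrementSize.
    static final class FakeSequence {
        private final int incrementSize;
        private long nextValue = 1;
        private int calls = 0;

        FakeSequence(int incrementSize) {
            this.incrementSize = incrementSize;
        }

        // Each call represents one database round trip.
        long next() {
            calls++;
            long value = nextValue;
            nextValue += incrementSize;
            return value;
        }

        int calls() {
            return calls;
        }
    }

    private final FakeSequence sequence;
    private final int incrementSize;
    private long current = 0;             // next identifier to hand out
    private long upperBoundExclusive = 0; // end of the range allocated so far

    PooledLoSketch(FakeSequence sequence, int incrementSize) {
        this.sequence = sequence;
        this.incrementSize = incrementSize;
    }

    // Hits the sequence only when the in-memory range is exhausted.
    long nextId() {
        if (current >= upperBoundExclusive) {
            current = sequence.next();    // sequence value is the low end of the range
            upperBoundExclusive = current + incrementSize;
        }
        return current++;
    }
}
```

With an increment size of 5, generating 12 identifiers costs 3 sequence calls rather than 12.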
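6. Choosing the right column types
You should always use the right column types on the database side. The more compact the column type, the more entries fit in the database working set, and the better indexes fit into memory. For this purpose, you should take advantage of database-specific types (e.g. inet for IPv4 addresses in PostgreSQL), especially since Hibernate is very flexible when it comes to implementing a new custom Type.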
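7. Relationships
Hibernate comes with many relationship mapping types, but not all of them are equal in terms of efficiency.
Unidirectional collections and @ManyToMany Lists should be avoided. If you really need entity collections, prefer bidirectional @OneToMany associations. For a @ManyToMany relationship, use Sets, since they are more efficient in this case, or simply map the link table of the many-to-many relationship as well and turn the @ManyToMany relationship into two bidirectional @OneToMany associations.
However, unlike queries, collections are less flexible because they cannot be easily paginated, which means we cannot use them when the number of child associations is rather high. For this reason, you should always question whether a collection is really necessary. In many situations, an entity query might be the better alternative.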
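8. Inheritance
When it comes to inheritance, the impedance mismatch between object-oriented languages and relational databases becomes even more apparent. JPA offers SINGLE_TABLE, JOINED, and TABLE_PER_CLASS to deal with inheritance mapping, and each of these strategies has its pluses and minuses.
SINGLE_TABLE performs best in terms of SQL statements, but we lose on the data integrity side because we cannot use NOT NULL constraints.
JOINED addresses the data integrity limitation at the cost of more complex statements. As long as you don't use polymorphic queries or @OneToMany associations against base types, this strategy is fine. Its true power comes from polymorphic @ManyToOne associations backed by a Strategy pattern on the data access layer side.
TABLE_PER_CLASS should be avoided because it does not render efficient SQL statements.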
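9. Persistence Context size
When using JPA and Hibernate, you should always mind the size of the Persistence Context. For this reason, you should never bloat it with large numbers of managed entities. By restricting the number of managed entities, we gain better memory management, and the default dirty checking mechanism becomes more efficient as well.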
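10. Fetching only what's necessary
Fetching too much data is probably the number one cause of performance issues in the data access layer. One issue is that entity queries are used exclusively, even for read-only projections.
DTO projections are better suited to fetching custom views, while entities should be fetched only when the business flow needs to modify them.
EAGER fetching is the worst, and you should avoid anti-patterns such as Open Session in View.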
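11. Caching
Relational database systems use many in-memory buffer structures to avoid disk access. Database caching is very often overlooked. We can lower response time significantly by properly tuning the database engine so that the working set resides in memory and is not fetched from disk all the time.
Application-level caching is not optional for many enterprise applications. It can reduce response time while offering a read-only secondary store for when the database is down for maintenance or because of a serious system failure.
The second-level cache is very useful for reducing read-write transaction response time, especially in Master-Slave replication architectures. Depending on application requirements, Hibernate lets you choose between READ_ONLY, NONSTRICT_READ_WRITE, READ_WRITE, and TRANSACTIONAL.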
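12. Concurrency control
The choice of transaction isolation level is of paramount importance when it comes to performance and data integrity. For multi-request web flows, to avoid lost updates, you should use optimistic locking with detached entities or an EXTENDED Persistence Context.
To avoid optimistic locking false positives, you can use versionless optimistic concurrency control or split entities based on write-based property sets.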
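The lost-update protection behind optimistic locking can be illustrated with a plain-Java simulation of the version check. This is a sketch of the idea only, not how Hibernate implements it: `OptimisticLockSketch`, `Row`, and the in-memory map standing in for a `post` table are hypothetical. The `update` method mirrors a statement of the form `UPDATE post SET title = ?, version = version + 1 WHERE id = ? AND version = ?`.

```java
import java.util.HashMap;
import java.util.Map;

final class OptimisticLockSketch {

    // Immutable snapshot of a row, playing the role of a detached entity.
    static final class Row {
        final String title;
        final int version;

        Row(String title, int version) {
            this.title = title;
            this.version = version;
        }
    }

    // In-memory stand-in for a hypothetical "post" table keyed by id.
    private final Map<Long, Row> table = new HashMap<>();

    void insert(long id, String title) {
        table.put(id, new Row(title, 0));
    }

    Row load(long id) {
        return table.get(id);
    }

    // Succeeds only when the stored version still matches the version that was read.
    boolean update(long id, String newTitle, int expectedVersion) {
        Row current = table.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // another request changed the row: reject instead of overwriting
        }
        table.put(id, new Row(newTitle, expectedVersion + 1));
        return true;
    }
}
```

If two web requests load the same row at version 0, the first update succeeds and bumps the version to 1, while the second one is rejected instead of silently overwriting the first change.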
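13. Unleash database query capabilities
Just because you use JPA or Hibernate does not mean you should not use native queries. You should take advantage of Window Functions, CTEs (Common Table Expressions), CONNECT BY, and PIVOT queries.
These constructs let you avoid fetching too much data only to transform it later in the application layer. If you let the database do the processing, you fetch just the end result, which saves a lot of disk I/O and networking overhead. To avoid overloading the Master node, you can use database replication and keep multiple Slave nodes available, so that data-intensive tasks run on a Slave rather than on the Master.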
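14. Scale up and scale out
Relational databases scale very well. If Facebook, Twitter, Pinterest, or StackOverflow can scale their database systems, there is a good chance you can scale an enterprise application to its particular business requirements.
Database replication and sharding are very good ways to increase throughput, and you should take full advantage of these battle-tested architectural patterns to scale your enterprise application.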
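Conclusion
A high-performance data access layer must resonate with the underlying database system. Knowing the inner workings of a relational database and of the data access frameworks in use can make the difference between a high-performance enterprise application and one that barely crawls.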
Original text:
A high-performance data access layer requires a lot of knowledge about database internals, JDBC, JPA, Hibernate, and this post summarizes some of the most important techniques you can use to optimize your enterprise application.
1. SQL statement logging
If you’re using a framework that generates statements on your behalf, you should always validate each statement’s effectiveness and efficiency. A testing-time assertion mechanism is even better because you can catch N+1 query problems even before you commit your code.
2. Connection management
Database connections are expensive, therefore you should always use a connection pooling mechanism.
Because the number of connections is given by the capabilities of the underlying database cluster, you need to release connections as fast as possible.
In performance tuning, you always have to measure, and setting the right pool size is no different. A tool like FlexyPool can help you find the right size even after you deployed your application into production.
3. JDBC batching
JDBC batching allows us to send multiple SQL statements in a single database roundtrip. The performance gain is significant both on the Driver and the database side. PreparedStatements are very good candidates for batching, and some database systems (e.g. Oracle) support batching only for prepared statements.
Since JDBC defines a distinct API for batching (e.g. PreparedStatement.addBatch and PreparedStatement.executeBatch), if you’re generating statements manually, then you should know right from the start whether you should be using batching or not. With Hibernate, you can switch to batching with a single configuration.
Hibernate 5.2 offers Session-level batching, so it’s even more flexible in this regard.
4. Statement caching
Statement caching is one of the least-known performance optimizations that you can easily take advantage of. Depending on the underlying JDBC Driver, you can cache PreparedStatements either on the client-side (the Driver) or on the database-side (either the syntax tree or even the execution plan).
5. Hibernate identifiers
When using Hibernate, the IDENTITY generator is not a good choice since it disables JDBC batching.
The TABLE generator is even worse since it uses a separate transaction for fetching a new identifier, which can put pressure on the underlying transaction log, as well as the connection pool since a separate connection is required every time we need a new identifier.
SEQUENCE is the right choice, and even SQL Server has supported it since version 2012. For SEQUENCE identifiers, Hibernate has long been offering optimizers like pooled or pooled-lo which can reduce the number of database roundtrips required for fetching a new entity identifier value.
6. Choosing the right column types
You should always use the right column types on the database side. The more compact the column type is, the more entries can be accommodated in the database working set, and indexes will better fit into memory. For this purpose, you should take advantage of database-specific types (e.g. inet for IPv4 addresses in PostgreSQL), especially since Hibernate is very flexible when it comes to implementing a new custom Type.
7. Relationships
Hibernate comes with many relationship mapping types, but not all of them are equal in terms of efficiency.
Unidirectional collections and @ManyToMany List(s) should be avoided. If you really need to use entity collections, then bidirectional @OneToMany associations are preferred. For the @ManyToMany relationship, use Set(s) since they are more efficient in this case or simply map the linked many-to-many table as well and turn the @ManyToMany relationship into two bidirectional @OneToMany associations.
However, unlike queries, collections are less flexible since they cannot be easily paginated, meaning that we cannot use them when the number of child associations is rather high. For this reason, you should always question if a collection is really necessary. An entity query might be a better alternative in many situations.
8. Inheritance
When it comes to inheritance, the impedance mismatch between object-oriented languages and relational databases becomes even more apparent. JPA offers SINGLE_TABLE, JOINED, and TABLE_PER_CLASS to deal with inheritance mapping, and each of these strategies has pluses and minuses.
SINGLE_TABLE performs the best in terms of SQL statements, but we lose on the data integrity side since we cannot use NOT NULL constraints.
JOINED addresses the data integrity limitation while offering more complex statements. As long as you don’t use polymorphic queries or @OneToMany associations against base types, this strategy is fine. Its true power comes from polymorphic @ManyToOne associations backed by a Strategy pattern on the data access layer side.
TABLE_PER_CLASS should be avoided since it does not render efficient SQL statements.
9. Persistence Context size
When using JPA and Hibernate, you should always mind the Persistence Context size. For this reason, you should never bloat it with tons of managed entities. By restricting the number of managed entities, we gain better memory management, and the default dirty checking mechanism is going to be more efficient as well.
10. Fetching only what’s necessary
Fetching too much data is probably the number one cause for data access layer performance issues. One issue is that entity queries are used exclusively, even for read-only projections.
DTO projections are better suited for fetching custom views, while entities should only be fetched when the business flow requires to modify them.
EAGER fetching is the worst, and you should avoid anti-patterns such as Open-Session in View.
11. Caching
Relational database systems use many in-memory buffer structures to avoid disk access. Database caching is very often overlooked. We can lower response time significantly by properly tuning the database engine so that the working set resides in memory and is not fetched from disk all the time.
Application-level caching is not optional for many enterprise applications. Application-level caching can reduce response time while offering a read-only secondary store for when the database is down for maintenance or because of some serious system failure.
The second-level cache is very useful for reducing read-write transaction response time, especially in Master-Slave replication architectures. Depending on application requirements, Hibernate allows you to choose between READ_ONLY, NONSTRICT_READ_WRITE, READ_WRITE, and TRANSACTIONAL.
12. Concurrency control
The choice of transaction isolation level is of paramount importance when it comes to performance and data integrity. For multi-request web flows, to avoid lost updates, you should use optimistic locking with detached entities or an EXTENDED Persistence Context.
To avoid optimistic locking false positives, you can use versionless optimistic concurrency control or split entities based on write-based property sets.
13. Unleash database query capabilities
Just because you use JPA or Hibernate, it does not mean that you should not use native queries. You should take advantage of Window Functions, CTE (Common Table Expressions), CONNECT BY, PIVOT.
These constructs allow you to avoid fetching too much data just to transform it later in the application layer. If you can let the database do the processing, you can fetch just the end result, therefore, saving lots of disk I/O and networking overhead. To avoid overloading the Master node, you can use database replication and have multiple Slave nodes available so that data-intensive tasks are executed on a Slave rather than on the Master.
14. Scale up and scale out
Relational databases do scale very well. If Facebook, Twitter, Pinterest or StackOverflow can scale their database system, there is a good chance you can scale an enterprise application to its particular business requirements.
Database replication and sharding are very good ways to increase throughput, and you should totally take advantage of these battle-tested architectural patterns to scale your enterprise application.
From "ITPUB Blog", link: http://blog.itpub.net/31555484/viewspace-2636142/. If you repost this article, please credit the source; otherwise legal liability will be pursued.