mysql 關於大資料量日誌表的最佳化過程

datapeng發表於2017-04-12
我們知道很多網站都會有關於記錄網站搜尋的日誌表,用於記錄會員的基本登入資訊,用於後期資料分析或者防攻擊使用,現在我們有一張表logs,每天產生大概60萬的資料。
環境介紹:10.0.21-MariaDB-log
我們的業務主要分成三塊:
1)查詢當天的logs;
2)檢視最近2個月的logs;
3)檢視最近一年的logs.
我們當前的表是不管點那個按鈕查詢的都是這個表logs,這個表的資料一直存在,直到收到反饋報錯504 timeout,然後才去處理這個表,就是刪除一年之前的資料,倆月就得處理一次,很蛋疼,於是想著最佳化下。
整體的思路就是:分表和分割槽表的結合。
1)查詢當天和近兩個月的資料的業務去查詢logs;
2)檢視最近一年的資料的業務去查詢另一個表logs_oneyear,
3)然後利用儲存過程+event的方式定時去處理logs和logs_oneyear,使這兩個表裡始終只保留最近2個月以及近10個月的資料,這樣保證了可以查詢到一年資料的目的。
具體處理過程如下:
一:首先處理儲存2個月的表logs,打算按時間分割槽,7天一個分割槽,利用儲存過程logs_twomonths_procedure刪除一個老分割槽,建立一個新分割槽,並且把刪除的老分割槽的資料,插入到logs_oneyear,然後做一個event每7天定時執行,
1)建立表logs的語句如下,這裡需要注意的是針對timestamp型別的時間屬性需要藉助函式unix_timestamp才能建立分割槽表,但是DATETIME 型別不需要。
CREATE TABLE `logs` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`member_id` int(11) DEFAULT NULL,
`jsession` int(11) DEFAULT NULL,
`ip` bigint(20) DEFAULT NULL,
`shijian` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` int(11) NOT NULL DEFAULT '0' COMMENT 'When the page(html) is open ,this attribute will set 1',
PRIMARY KEY (`id`,`shijian`),
KEY `logs_member_idx` (`member_id`,`jsession`,`shijian`),
KEY `logs_ip_idx` (`ip`),
KEY `logs_shijian_idx` (`shijian`)
) ENGINE=MyISAM AUTO_INCREMENT=847368831411249480 DEFAULT CHARSET=utf8
PARTITION BY RANGE(unix_timestamp(shijian))
(
PARTITION p20170201 VALUES LESS THAN (unix_timestamp('2017-02-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20170207 VALUES LESS THAN (unix_timestamp('2017-02-07 00:00:00')) ENGINE = MyISAM,
PARTITION p20170214 VALUES LESS THAN (unix_timestamp('2017-02-14 00:00:00')) ENGINE = MyISAM,
PARTITION p20170221 VALUES LESS THAN (unix_timestamp('2017-02-21 00:00:00')) ENGINE = MyISAM,
PARTITION p20170228 VALUES LESS THAN (unix_timestamp('2017-02-28 00:00:00')) ENGINE = MyISAM,
PARTITION p20170307 VALUES LESS THAN (unix_timestamp('2017-03-07 00:00:00')) ENGINE = MyISAM,
PARTITION p20170314 VALUES LESS THAN (unix_timestamp('2017-03-14 00:00:00')) ENGINE = MyISAM,
PARTITION p20170321 VALUES LESS THAN (unix_timestamp('2017-03-21 00:00:00')) ENGINE = MyISAM,
PARTITION p20170328 VALUES LESS THAN (unix_timestamp('2017-03-28 00:00:00')) ENGINE = MyISAM,
PARTITION p20170404 VALUES LESS THAN (unix_timestamp('2017-04-04 00:00:00')) ENGINE = MyISAM,
PARTITION pmax VALUES LESS THAN (MAXVALUE) ENGINE = MyISAM);
2)如下建立儲存過程logs_twomonths_procedure的語句:
MariaDB [log]> drop procedure if exists logs_twomonths_procedure;
MariaDB [log]>
DELIMITER $$
create procedure logs_twomonths_procedure()
begin
/* 事務回滾*/
declare exit handler for sqlexception rollback;
start TRANSACTION;
/* 到系統表查出這個表的最大分割槽,得到最大分割槽的日期。在建立分割槽的時候,名稱就以日期格式存放,方便後面維護,p12_name是自己定義變數,注意需要條件partition_name!='pmax'*/
select REPLACE(partition_name,'p','') into @P12_Name from INFORMATION_SCHEMA.PARTITIONS where TABLE_SCHEMA='log' and table_name='logs' and partition_name!='pmax' order by partition_ordinal_position DESC limit 1;
/*直接加7天,就是得到7天之後的日期。 +0 是為了把日期都格式化成YYYYMMDD這樣的格式*/
set @Max_date= date(DATE_ADD(@P12_Name+0, INTERVAL 7 DAY))+0;
/* 修改表,在最大分割槽的後面增加一個分割槽,時間範圍新增7天 ,DATE函式是把20110101改成日期格式2011-01-01,語句中兩個單引號代表一個單引號 ,repare預處理固定語法 */
SET @s1=concat('ALTER TABLE logs_oneyear ADD PARTITION (PARTITION p',@Max_date,' VALUES LESS THAN (unix_timestamp(''',date(@Max_date),''')))');
PREPARE hezi FROM @s1;
EXECUTE hezi ;
DEALLOCATE PREPARE hezi ;
/* 將最小分割槽中的資料插入到logs_oneyear表中,因為接下來要刪除這個最小分割槽會同時刪除分割槽內的資料,慎重 */
select REPLACE(partition_name,'p','') into @min_date from INFORMATION_SCHEMA.PARTITIONS where TABLE_SCHEMA='log' and table_name='logs' order by partition_ordinal_position asc limit 1;
SET @s2=concat('insert into logs_oneyear select * from logs where shijian<''',date(@min_date),'''');
PREPARE hezi1 FROM @s2;
EXECUTE hezi1 ;
DEALLOCATE PREPARE hezi1 ;
/* 取出最小的分割槽的名稱,並刪除掉 */
select partition_name into @P0_Name from INFORMATION_SCHEMA.PARTITIONS where TABLE_SCHEMA='log' and table_name='logs' order by partition_ordinal_position limit 1;
SET @s3=concat('ALTER TABLE logs DROP PARTITION ',@P0_Name);
PREPARE hezi3 FROM @s3;
EXECUTE hezi3 ;
DEALLOCATE PREPARE hezi3;
/* 提交 */
COMMIT ;
end;
$$
3)建立event,7天執行一次,應該在log庫下建立,mysql的庫相當於oracle的schema,
MariaDB [log]> CREATE EVENT logs_Partition7
ON SCHEDULE
EVERY 7 day STARTS '2017-03-31 08:00:00'
DO
call logs_twomonths_procedure();
二:處理報錯10個月資料的那個表logs_oneyear.
1)建立logs_oneyear表的語句,半月一個分割槽
MariaDB [log]>drop table if exists logs_oneyear;
MariaDB [log]>CREATE TABLE `logs_oneyear` (
`disc` int(11) NOT NULL,
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`member_id` int(11) DEFAULT NULL,
`jsession` int(11) DEFAULT NULL,
`ip` bigint(20) DEFAULT NULL,
`keyword` varchar(255) DEFAULT NULL,
`page` int(11) DEFAULT NULL,
`rp` int(11) DEFAULT NULL,
`area` varchar(255) DEFAULT NULL,
`category` varchar(255) DEFAULT NULL,
`fenlei` int(11) DEFAULT NULL,
`shijianduan` int(11) DEFAULT NULL,
`type` int(11) DEFAULT NULL,
`agency_company` varchar(55) DEFAULT NULL,
`biaoxun_type` int(11) DEFAULT NULL,
`caigou_type` int(11) DEFAULT NULL,
`shijian` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`total` int(11) DEFAULT NULL,
`xmxx_jieduan` int(11) DEFAULT NULL,
`zhanguan` varchar(255) DEFAULT NULL,
`main_product` varchar(255) DEFAULT NULL,
`fazx_fenlei1` int(11) DEFAULT NULL,
`fazx_fenlei2` int(11) DEFAULT NULL,
`jiaoyileixing` int(11) DEFAULT NULL,
`mailid` bigint(20) DEFAULT NULL,
`referer` int(11) DEFAULT NULL,
`search_id` bigint(20) DEFAULT NULL,
`info_id` bigint(20) DEFAULT NULL,
`status` int(11) NOT NULL DEFAULT '0' COMMENT 'When the page(html) is open ,this attribute will set 1',
PRIMARY KEY (`id`,`shijian`),
KEY `logs_disc_idx` (`disc`),
KEY `logs_member_idx` (`member_id`,`jsession`,`shijian`),
KEY `logs_ip_idx` (`ip`),
KEY `logs_shijian_idx` (`shijian`),
KEY `logs_mail_idx` (`mailid`)
) ENGINE=MyISAM AUTO_INCREMENT=847368831411249480 DEFAULT CHARSET=utf8
PARTITION BY RANGE(unix_timestamp(shijian))
(
PARTITION p20160501 VALUES LESS THAN (unix_timestamp('2016-05-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20160515 VALUES LESS THAN (unix_timestamp('2016-05-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20160601 VALUES LESS THAN (unix_timestamp('2016-06-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20160615 VALUES LESS THAN (unix_timestamp('2016-06-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20160701 VALUES LESS THAN (unix_timestamp('2016-07-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20160715 VALUES LESS THAN (unix_timestamp('2016-07-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20160801 VALUES LESS THAN (unix_timestamp('2016-08-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20160815 VALUES LESS THAN (unix_timestamp('2016-08-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20160901 VALUES LESS THAN (unix_timestamp('2016-09-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20160915 VALUES LESS THAN (unix_timestamp('2016-09-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20161001 VALUES LESS THAN (unix_timestamp('2016-10-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20161015 VALUES LESS THAN (unix_timestamp('2016-10-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20161101 VALUES LESS THAN (unix_timestamp('2016-11-01 00:00:00')) ENGINE = MyISAM ,
PARTITION p20161115 VALUES LESS THAN (unix_timestamp('2016-11-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20161201 VALUES LESS THAN (unix_timestamp('2016-12-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20161215 VALUES LESS THAN (unix_timestamp('2016-12-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20170101 VALUES LESS THAN (unix_timestamp('2017-01-01 00:00:00')) ENGINE = MyISAM,
PARTITION p20170115 VALUES LESS THAN (unix_timestamp('2017-01-15 00:00:00')) ENGINE = MyISAM,
PARTITION p20170201 VALUES LESS THAN (unix_timestamp('2017-02-01 00:00:00'))
ENGINE = MyISAM,
PARTITION p20170215 VALUES LESS THAN (unix_timestamp('2017-02-15 00:00:00'))
ENGINE = MyISAM,
PARTITION pmax1 VALUES LESS THAN (MAXVALUE) ENGINE = MyISAM));
);
2)建立儲存過程logs_oneyear_procedure,
MariaDB [log]>drop procedure if exists logs_oneyear_procedure;
DELIMITER $$
MariaDB [log]>create procedure logs_oneyear_procedure()
begin
declare exit handler for sqlexception rollback;
start TRANSACTION;
select REPLACE(partition_name,'p','') into @liu_Name from INFORMATION_SCHEMA.PARTITIONS where TABLE_SCHEMA='log' and table_name='logs_oneyear' and partition_name!='pmax1' order by partition_ordinal_position DESC limit 1;
/* 判斷最大分割槽的時間段,如果是15號,那麼根據情況需要加,14,15,16,17天,,如果是1號就直接加14即可,這樣的目的就是保證產生的分割槽間隔是每個月的15號和下個月的的1號,說白了 就是為了規範*/
IF (DAY(@P12_Name)=15) THEN
CASE day(LAST_DAY(@P12_name))
WHEN 31 THEN set @Max_date= date(DATE_ADD(@P12_Name+0,INTERVAL 17 DAY))+0 ;
WHEN 30 THEN set @Max_date= date(DATE_ADD(@P12_Name+0,INTERVAL 16 DAY))+0 ;
WHEN 29 THEN set @Max_date= date(DATE_ADD(@P12_Name+0,INTERVAL 15 DAY))+0 ;
WHEN 28 THEN set @Max_date= date(DATE_ADD(@P12_Name+0,INTERVAL 14 DAY))+0 ;
END CASE;
ELSE
set @Max_date= date(DATE_ADD(@P12_Name+0, INTERVAL 14 DAY))+0;
END IF;
SET @s1=concat('ALTER TABLE logs_oneyear ADD PARTITION (PARTITION p',@Max_date,' VALUES LESS THAN (unix_timestamp(''',date(@Max_date),''')))');
PREPARE stmt2 FROM @s1;
EXECUTE stmt2;
DEALLOCATE PREPARE stmt2;
/* 將最小分割槽中的資料插入到歷史庫下的備份表logs_back_start20160501中,因為接下來要刪除這個最小分割槽會同時刪除分割槽內的資料,慎重 */
select REPLACE(partition_name,'p','') into @min_date from INFORMATION_SCHEMA.PARTITIONS where TABLE_SCHEMA='log' and table_name='logs_oneyear' order by partition_ordinal_position asc limit 1;
SET @s2=concat('insert into history.logs_back_start20160501 select * from logs_oneyear where shijian<''',date(@min_date),'''');
PREPARE stmt3 FROM @s2;
EXECUTE stmt3;
DEALLOCATE PREPARE stmt3;
select partition_name into @P0_Name from INFORMATION_SCHEMA.PARTITIONS where TABLE_SCHEMA='log' and table_name='logs_oneyear' order by partition_ordinal_position limit 1;
SET @s=concat('ALTER TABLE logs_oneyear DROP PARTITION ',@P0_Name);
PREPARE stmt1 FROM @s;
EXECUTE stmt1;
DEALLOCATE PREPARE stmt1;
COMMIT ;
end;
$$
3)建立event,每隔15天執行一次。
計劃任務程式碼:
CREATE EVENT logs_Partition_oneyear
ON SCHEDULE
EVERY 15 day STARTS '2017-04-01 22:30:00'
DO
call logs_oneyear_procedure();
最後把相應的資料insert到logs和logs_oneyear即可,至此處理結束,觀察一段時間,效能果然有所提升,需要注意的是,如果查詢的語句中where條件中沒有使用shijian欄位,那麼表做分割槽效能提升不大,還影響了insert的效率,所以具體該不該分割槽,或者應該使用什麼欄位分割槽,需要自己去探究。
引申內容:
1)關於時間欄位datetime和timestamp的異同點,如果可以儘量使用TIMESTAMP
相同
顯示
TIMESTAMP列的顯示格式與DATETIME列相同。換句話說,顯示寬度固定在19字元,並且格式為YYYY-MM-DD HH:MM:SS。
不同
範圍
datetime 以'YYYY-MM-DD HH:MM:SS'格式檢索和顯示DATETIME值。支援的範圍為'1000-01-01 00:00:00'到'9999-12-31 23:59:59'TIMESTAMP值不能早於1970或晚於2037
儲存
TIMESTAMP
1.4個位元組儲存(Time stamp value is stored in 4 bytes)
2.值以UTC格式儲存( it stores the number of milliseconds)
3.時區轉化 ,儲存時對當前的時區進行轉換,檢索時再轉換回當前的時區。
datetime
1.8個位元組儲存(8 bytes storage)
2.實際格式儲存(Just stores what you have stored and retrieves the same thing which you have stored.)
3.與時區無關(It has nothing to deal with the TIMEZONE and Conversion.)
2)關於mysql預處理語句
其用法十分簡單,其中stmt_name 是隨便起的名字。
PREPARE stmt_name FROM preparable_stmt
EXECUTE stmt_name
[USING @var_name [, @var_name] ...] -
{DEALLOCATE | DROP} PREPARE stmt_name
使用PAREPARE STATEMENT可以減少每次執行SQL的語法分析,比如用於執行帶有WHERE條件的SELECT和DELETE,或者UPDATE,或者INSERT,只需要每次修改變數值即可。同樣可以防止SQL隱碼攻擊,引數值可以包含轉義符和定界符。適用在應用程式中,或者SQL指令碼中均可

















來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29371470/viewspace-2137069/,如需轉載,請註明出處,否則將追究法律責任。

相關文章