mysql学习记录¶
1.MySQL 使用Profile¶
1.开启Profile 记录
2.执行查询
set @project='ceph/ceph';select c.pull_number,timestampdiff(hour,p.request_date,c.comment_date) from (select * from pull_request where project=@project and request_date between '2018-06-01' and '2019-09-01') p inner join (select * from comments where project=@project) c on c.pull_number=p.number group by c.pull_number order by c.pull_number limit 10;
3.查看Profile记录
show profiles\G;
*************************** 1. row ***************************
Query_ID: 1
Duration: 0.00017875
Query: set @project='ceph/ceph'
*************************** 2. row ***************************
Query_ID: 2
Duration: 1.06222625
Query: select c.pull_number,timestampdiff(hour,p.request_date,c.comment_date) from (select * from pull_request where project=@project and request_date between '2018-06-01' and '2019-09-01') p inner join (select * from comments where project=@project) c on c.pull_number=p.number group by c.pull_number orde
2 rows in set, 1 warning (0.00 sec)
4.查询SQL 执行细节
mysql> SHOW PROFILE FOR QUERY 1;
+----------------+----------+
| Status | Duration |
+----------------+----------+
| starting | 0.000110 |
| Opening tables | 0.000028 |
| query end | 0.000007 |
| closing tables | 0.000005 |
| freeing items | 0.000014 |
| cleaning up | 0.000015 |
+----------------+----------+
6 rows in set, 1 warning (0.01 sec)
mysql> SHOW PROFILE FOR QUERY 2;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000154 |
| Executing hook on transaction | 0.000008 |
| starting | 0.004463 |
| checking permissions | 0.000042 |
| checking permissions | 0.000011 |
| Opening tables | 0.000163 |
| init | 0.000014 |
| System lock | 0.000015 |
| optimizing | 0.000033 |
| statistics | 0.000046 |
| preparing | 0.000032 |
| Creating tmp table | 0.000169 |
| executing | 1.055802 |
| end | 0.000026 |
| query end | 0.000015 |
| waiting for handler commit | 0.000336 |
| removing tmp table | 0.000794 |
| waiting for handler commit | 0.000018 |
| closing tables | 0.000024 |
| freeing items | 0.000046 |
| cleaning up | 0.000016 |
+--------------------------------+----------+
21 rows in set, 1 warning (0.00 sec)
5.SQL执行结果按照消耗时间排序
以上结果是按照执行顺序返回的结果,但是此结果不能用order by 进行排序。但是我们可以直接查询INFORMATION_SCHEMA中对应的表,按照格式化输出
mysql> select state,sum(duration) as total_r,round(100*sum(duration)/(select sum(duration) from information_schema.profiling where query_id=@query_id),2) as pct_r,count(*) as calls,sum(duration)/count(*) as "r/calls" from information_schema.profiling where query_id=@query_id group by state order by total_r desc;
+--------------------------------+----------+-------+-------+--------------+
| state | total_r | pct_r | calls | r/calls |
+--------------------------------+----------+-------+-------+--------------+
| executing | 1.055802 | 99.40 | 1 | 1.0558020000 |
| starting | 0.004617 | 0.43 | 2 | 0.0023085000 |
| removing tmp table | 0.000794 | 0.07 | 1 | 0.0007940000 |
| waiting for handler commit | 0.000354 | 0.03 | 2 | 0.0001770000 |
| Creating tmp table | 0.000169 | 0.02 | 1 | 0.0001690000 |
| Opening tables | 0.000163 | 0.02 | 1 | 0.0001630000 |
| checking permissions | 0.000053 | 0.00 | 2 | 0.0000265000 |
| freeing items | 0.000046 | 0.00 | 1 | 0.0000460000 |
| statistics | 0.000046 | 0.00 | 1 | 0.0000460000 |
| optimizing | 0.000033 | 0.00 | 1 | 0.0000330000 |
| preparing | 0.000032 | 0.00 | 1 | 0.0000320000 |
| end | 0.000026 | 0.00 | 1 | 0.0000260000 |
| closing tables | 0.000024 | 0.00 | 1 | 0.0000240000 |
| cleaning up | 0.000016 | 0.00 | 1 | 0.0000160000 |
| query end | 0.000015 | 0.00 | 1 | 0.0000150000 |
| System lock | 0.000015 | 0.00 | 1 | 0.0000150000 |
| init | 0.000014 | 0.00 | 1 | 0.0000140000 |
| Executing hook on transaction | 0.000008 | 0.00 | 1 | 0.0000080000 |
+--------------------------------+----------+-------+-------+--------------+
18 rows in set, 19 warnings (0.01 sec)
以上结果可以看到主要消耗的时间是在执行上,因此不需要进行额外的优化
[1].高性能MySQL (https://book.douban.com/subject/23008813/)
2.MySQL 字符串处理函数¶
1.string_index函数
2.locate函数
mysql> set @project='pytorch/pytorch';select pull_number,group_concat(distinct substring_index(file_name,'/',1)) as module_name from request_file where project=@project and pull_number in (select number from pull_request where project=@project and request_date between '2018-06-01' and '2019-09-01') and locate('/',file_name)>0 group by pull_number limit 10;
Query OK, 0 rows affected (0.01 sec)
+-------------+---------------+
| pull_number | module_name |
+-------------+---------------+
| 8013 | aten |
| 8014 | torch |
| 8018 | .jenkins,aten |
| 8020 | .jenkins |
| 8029 | docs |
| 8030 | aten,torch |
| 8031 | torch |
| 8033 | aten,test |
| 8034 | torch |
| 8037 | caffe2 |
+-------------+---------------+
10 rows in set (0.69 sec)
3. awk 格式化输出¶
[2].https://stackoverflow.com/questions/1447809/
4.将多个列表中的值合并到一个列表中¶

因此最好的方法是使用python自带的 itertools 中的chain()函数
[3].https://stackoverflow.com/questions/952914
5.count(1),count(*),count(列名)¶
执行效果上:
count(*)包括了所有的列,相当于行数,在统计结果的时候,不会忽略列值为NULL
count(1)包括了忽略所有列,用1代表代码行,在统计结果的时候,不会忽略列值为NULL
count(列名)只包括列名那一列,在统计结果的时候,会忽略列值为空(这里的空不是只空字符串或者0,而是表示null)的计数,即某个字段值为NULL时,不统计。
执行效率上:
列名为主键,count(列名)会比count(1)快
列名不为主键,count(1)会比count(列名)快
如果表多个列并且没有主键,则 count(1) 的执行效率优于 count(*)
如果有主键,则 select count(主键)的执行效率是最优的
如果表只有一个字段,则 select count(*)最优。
6.使用Counter函数¶
[4].https://www.itread01.com/articles/1475818990.html
[5].https://www.jb51.net/article/85542.htm