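Simple profiling of a Python script

I've been busy lately and haven't updated the blog for a long time. Today I'm writing about a Python performance problem I solved recently.

Background

I previously wrote a Python script for a project that preprocesses a SQLite database; it mainly creates a new table and fills it with data from other tables. At the time I cared mostly about maintainability and clear structure, and did not think much about performance. QA later found that the module ran 20 times slower than before, but since it was still fairly fast, I didn't fix it right away.

Profiling

This release has to fix the problem. My rule is that any performance improvement must start with profiling, so that the effort is aimed at the real bottleneck.

Much like VisualVM for Java, Python 2.7 ships with several built-in modules for profiling; I chose cProfile. The basic command is: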

python -m cProfile -s tottime python_script

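"-s tottime" sorts the results by the total time spent inside each function itself, excluding sub-calls (the output labels this "internal time").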
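The same measurement can also be taken from inside a script, which helps when only one function is of interest. The following is a minimal sketch, assuming Python 3; `slow_sum` and `profile_call` are hypothetical names used only for illustration and are not part of the original script.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical workload standing in for the real script's entry point.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile; return its result and a report sorted by tottime."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Same ordering as "-s tottime" on the command line; keep the top 10 rows.
    stats.sort_stats("tottime").print_stats(10)
    return result, buffer.getvalue()


result, report = profile_call(slow_sum, 100000)
print(report)
```

Profiling a single call this way keeps interpreter start-up and import time out of the report, which can make a small hot spot easier to see.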

The result before optimization:

        4012898 function calls (4011847 primitive calls) in 139.670 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   281311  131.300    0.000  131.300    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
   121621    4.296    0.000    4.296    0.000 {method 'fetchall' of 'sqlite3.Cursor' objects}
   121618    0.733    0.000    7.110    0.000 a_module.py:510(get_columns)
        1    0.642    0.642  126.630  126.630 a_module.py:32(execute)
   121616    0.470    0.000    7.580    0.000 a_module.py:495(has_field)
    15202    0.440    0.000    7.500    0.000 a_module.py:406(update_parameters)
   364921    0.365    0.000    0.365    0.000 {method 'format' of 'str' objects}
    30405    0.254    0.000    0.254    0.000 {method 'fetchone' of 'sqlite3.Cursor' objects}
        1    0.253    0.253  139.553  139.553 a_module.py:83(preprocess)
  2322518    0.194    0.000    0.194    0.000 {method 'append' of 'list' objects}
...

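Clearly the culprit is the execute() method of sqlite3's Cursor, which matches my original guess. The fix is equally clear: reduce the number of execute() calls by batching and by merging SQL statements. That made it easy to trade space for time.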
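The original script is not shown, so the following is only a sketch of what "batching and merging SQL statements" can look like with the standard sqlite3 module. The table names (`source`, `target_slow`, `target_fast`) and both functions are assumptions made up for the example.

```python
import sqlite3


def fill_row_by_row(conn):
    # One execute() per source row: the pattern that dominated the profile.
    cur = conn.cursor()
    rows = cur.execute("SELECT id, value FROM source").fetchall()
    for row_id, value in rows:
        cur.execute(
            "INSERT INTO target_slow (id, value) VALUES (?, ?)", (row_id, value)
        )
    conn.commit()


def fill_batched(conn):
    # A single INSERT ... SELECT copies every row with one execute() call.
    conn.execute("INSERT INTO target_fast (id, value) SELECT id, value FROM source")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE target_slow (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE target_fast (id INTEGER PRIMARY KEY, value TEXT)")
# executemany() is the other batching tool: one call, many parameter sets.
conn.executemany(
    "INSERT INTO source (id, value) VALUES (?, ?)",
    [(i, "v{}".format(i)) for i in range(1000)],
)

fill_row_by_row(conn)
fill_batched(conn)
slow_rows = conn.execute("SELECT id, value FROM target_slow ORDER BY id").fetchall()
fast_rows = conn.execute("SELECT id, value FROM target_fast ORDER BY id").fetchall()
```

Both functions produce the same table contents; the difference is only in how many statements SQLite has to prepare and run.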

The result after optimization:

        1048606 function calls (1047555 primitive calls) in 11.653 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    15235   10.101    0.001   10.101    0.001 {method 'execute' of 'sqlite3.Cursor' objects}
    15202    0.309    0.000    1.448    0.000 a_module.py:422(update_a_parameters)
        1    0.264    0.264   11.557   11.557 a_module.py:88(preprocess)
   250907    0.230    0.000    0.230    0.000 {method 'format' of 'str' objects}
        8    0.226    0.028    0.226    0.028 {method 'fetchall' of 'sqlite3.Cursor' objects}
     7601    0.090    0.000    0.627    0.000 a_module.py:391(update_b_parameters)
   121616    0.075    0.000    1.766    0.000 a_module.py:460(check_and_add_column_if_not_existed)
   167222    0.067    0.000    0.067    0.000 {method 'keys' of 'sqlite3.Row' objects}
...

After optimization the runtime is only about 8% of the original (11.653 s versus 139.670 s).
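In the two profiles, fetchall() drops from 121621 calls to 8, which suggests that the per-call schema lookups behind has_field() and get_columns() were replaced by something cheaper. The actual change is not shown in the post, so the class below is only a hypothetical sketch of one way to do it: read a table's column names once and answer later checks from memory. `ColumnCache` and its methods are invented for illustration.

```python
import sqlite3


class ColumnCache:
    """Remembers each table's column names so the schema is queried only once."""

    def __init__(self, conn):
        self._conn = conn
        self._columns = {}       # table name -> set of column names
        self.schema_queries = 0  # counts real schema lookups, for illustration

    def has_field(self, table, column):
        if table not in self._columns:
            self.schema_queries += 1
            # PRAGMA table_info returns one row per column; index 1 is the name.
            # Table names cannot be bound as parameters, so `table` must be trusted.
            rows = self._conn.execute(
                "PRAGMA table_info({})".format(table)
            ).fetchall()
            self._columns[table] = {row[1] for row in rows}
        return column in self._columns[table]

    def add_column(self, table, column, sql_type="TEXT"):
        # Keep the cache in sync when the script itself changes the schema.
        if not self.has_field(table, column):
            self._conn.execute(
                "ALTER TABLE {} ADD COLUMN {} {}".format(table, column, sql_type)
            )
            self._columns[table].add(column)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE params (id INTEGER PRIMARY KEY, name TEXT)")
cache = ColumnCache(conn)
checks = [cache.has_field("params", "name") for _ in range(1000)]
cache.add_column("params", "unit")
```

The cache is only safe if nothing else alters the schema while the script runs, which fits a one-shot preprocessing script.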

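Summary

Keep using a profiler to drive performance improvements.
Re-run the profiler after each change to check whether performance actually improved.
Don't over-optimize, or the code becomes unreadable. 😀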

References

The Python Profilers

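A case study in improving SQLite performance

Today I received a ticket saying that access to a SQLite database was too slow. A look in VisualVM showed that two access functions took ~200 seconds and ~300 seconds respectively. That is hard to accept, so let's think about how to fix it.

Problem analysis

The SQLite database holds about 40K of data, which is not much, so the main problem must lie in the table structure. In this application we have to compute many new columns on the fly from the existing data and the user's input, and also support sorting, filtering and similar operations. To avoid disk IO, we defined many views instead of generating new tables. I could not find a specific statement about this on the SQLite website, but the problem must be in the definition of these views. Precisely because no tables were generated, the views have to be recomputed on every access, which ends up costing more time.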
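The effect described above can be made visible with a small experiment. This is a sketch in Python's sqlite3 module rather than the Java application from the ticket, and `expensive`, `base`, `derived_view` and `derived_table` are hypothetical names: a counting function stands in for a costly computed column, and the counter shows when SQLite actually evaluates it.

```python
import sqlite3

calls = {"count": 0}


def expensive(x):
    # Hypothetical stand-in for a costly computed column; counts its invocations.
    calls["count"] += 1
    return x * 2


conn = sqlite3.connect(":memory:")
# Some hardened SQLite builds refuse application-defined functions inside views
# unless the schema is trusted; older versions silently ignore this pragma.
conn.execute("PRAGMA trusted_schema = ON")
conn.create_function("expensive", 1, expensive)
conn.execute("CREATE TABLE base (id INTEGER PRIMARY KEY, x INTEGER)")
conn.executemany("INSERT INTO base (x) VALUES (?)", [(i,) for i in range(100)])

# A view stores only the query text; the computation happens at read time.
conn.execute("CREATE VIEW derived_view AS SELECT id, expensive(x) AS y FROM base")
conn.execute("SELECT y FROM derived_view").fetchall()
conn.execute("SELECT y FROM derived_view").fetchall()
calls_after_view_reads = calls["count"]

# A table stores the computed values; later reads do no recomputation.
conn.execute("CREATE TABLE derived_table AS SELECT id, y FROM derived_view")
calls_after_materializing = calls["count"]
conn.execute("SELECT y FROM derived_table").fetchall()
conn.execute("SELECT y FROM derived_table").fetchall()
calls_after_table_reads = calls["count"]
```

Every read of the view runs the function again for each row, while reads of the table leave the counter untouched. With views stacked on views, as in the application described here, that repeated work can add up quickly.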

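Solution

So how can we generate tables while avoiding heavy reads and writes? SQLite supports in-memory databases, so all of this work should be able to happen in memory. The Java sample code below shows how to use an in-memory DB.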

// Needs java.sql.Connection, java.sql.DriverManager, java.sql.Statement
// and the sqlite-jdbc driver on the classpath.
// Create a memory database
Connection conn = DriverManager.getConnection("jdbc:sqlite:");
Statement stmt = conn.createStatement();
// Do some updates
stmt.executeUpdate("create table sample(id, name)");
stmt.executeUpdate("insert into sample values(1, \"leo\")");
stmt.executeUpdate("insert into sample values(2, \"yui\")");
// Dump the database contents to a file
// ("backup to" and "restore from" are extensions of the sqlite-jdbc driver)
stmt.executeUpdate("backup to backup.db");

Restore the database from a backup file:

// Create a memory database
Connection conn = DriverManager.getConnection("jdbc:sqlite:");
// Restore the database from a backup file
Statement stat = conn.createStatement();
stat.executeUpdate("restore from backup.db");

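The approach:

Create a new in-memory database.
Restore the target database into it (this assumes the data set is small and does not change once generated).
Use tables everywhere instead of views; whenever the user's input changes, rewrite the tables.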
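The steps above can be sketched as follows. This is not the Java code from the ticket but an illustration using Python's sqlite3 module (Connection.backup() requires Python 3.7 or later); the `items` and `derived` tables, the `factor` input and both helper functions are assumptions invented for the example.

```python
import os
import sqlite3
import tempfile


def load_into_memory(path):
    """Copy an on-disk SQLite database into a fresh in-memory connection."""
    disk = sqlite3.connect(path)
    memory = sqlite3.connect(":memory:")
    try:
        disk.backup(memory)  # copies every page of the file into memory
    finally:
        disk.close()
    return memory


def rebuild_derived(memory, factor):
    # Rewrite the derived table whenever the (hypothetical) user input changes.
    memory.execute("DROP TABLE IF EXISTS derived")
    memory.execute("CREATE TABLE derived (id INTEGER PRIMARY KEY, scaled REAL)")
    memory.execute(
        "INSERT INTO derived (id, scaled) SELECT id, value * ? FROM items",
        (factor,),
    )
    memory.commit()


# Build a small on-disk database to stand in for the real one.
tmp_dir = tempfile.mkdtemp()
db_path = os.path.join(tmp_dir, "source.db")
disk = sqlite3.connect(db_path)
disk.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value REAL)")
disk.executemany("INSERT INTO items (value) VALUES (?)", [(1.0,), (2.0,), (3.0,)])
disk.commit()
disk.close()

memory = load_into_memory(db_path)
rebuild_derived(memory, 10)
first = memory.execute("SELECT scaled FROM derived ORDER BY id").fetchall()
rebuild_derived(memory, 0.5)
second = memory.execute("SELECT scaled FROM derived ORDER BY id").fetchall()
```

Because the derived table lives in memory, rewriting it on each input change costs no disk IO, and sorting or filtering then works on stored values instead of recomputed ones.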

Result

When I saw the access time of the improved functions drop to 0.0 ms in VisualVM, I knew this ticket could be resolved. ^_^

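Some preparation for building a desktop client with Web technology

Once I had the idea of building the client with Web technology, a question naturally followed: how capable is an embedded browser, really?

So I ran a comparison against the popular desktop browsers. For the embedded browser I picked JavaFX's WebBrowser, which is also a WebKit-based browser.

I'm no expert, so I don't know how representative these test pages are. Here are the results: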

Result Comparison

Test Page                          Firefox 25  Safari 17  Chrome 30  JavaFX  Safari (iOS 7.0.3)
http://browsermark.rightware.com   4051        5163       3701       3323    2627
http://html5test.com               424         385        463        310     415
http://css3test.com                58%         61%        60%        51%     61%

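Clearly, JavaFX's WebBrowser is at or near the bottom in every test. (Its Performance is slightly better than Safari on the iPhone, but the iPhone is a mobile device after all, and its score still beats 91% of phones.) For our current use case this should be enough: we won't use many advanced HTML5 or CSS3 features and won't load very large pages, so Performance should not be a problem.

Next, let's build a demo that is closer to the product!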