A tricky thing in C++: will calling a member function through a raw nullptr crash? Maybe not.

After about 10 years as a Java/Python developer, I had always assumed that invoking a member function through a null pointer must raise an exception, as the following shows.

You may wonder why I am discussing such a naive question. I recently found that this may not be true for C++; it depends on how the C++ class is defined.

Take a look at the following C++ code and its execution results.

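Surprisingly, the code ran without crashing! It was tested with VC 2019 and GCC 4.8.4/7.3. Strictly speaking, calling a non-static member function through a null pointer is undefined behavior in C++, so this is what these compilers happen to do, not something the standard guarantees.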

I had never noticed this, but after thinking about it for a while the reason becomes clear: no member field is dereferenced in the C++ class “A”.

Let me explain it simply. The function A::hello() is renamed (there is a full set of name-mangling conventions) and compiled into something like a C function, with the “this” pointer added as a parameter (it is not quite that simple, but you can think of it that way). So A::hello() is compiled into a function like _ZN1A5helloEv(A* this). When you call “a->hello()”, it actually looks like “_ZN1A5helloEv(a)”. Calling a C function with a null argument does not cause a segmentation fault by itself; dereferencing the null pointer does.

The following code snippet will give you a better understanding.

In OOP, this kind of class “A” is not useful. No member fields means an object has no properties. In this case, you should use a static function and invoke it as a class function.

But this tricky behavior is a good way to understand how the C++ language works.

Use boost::asio to implement a simple thread pool in C++

For a personal machine learning project, I would like to train a model that can recognize my son.

The first step is to extract my son’s face from about 20K pictures. OpenCV helps a lot here: it is very handy and already ships with a face cascade. The question is how to make the image processing faster. As a Java programmer, processing in multiple threads is the first solution I would try. After some searching, a lot of comments led me to boost::asio.

I was a C++ developer before C++11. I have to say that C++ has improved greatly over the last several years. With boost::asio, it becomes much easier to implement a simple thread pool, similar to Java concurrency.

thread_pool.h

#ifndef THREAD_POOL_H
#define THREAD_POOL_H

#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <cstddef>
#include <memory>
#include <vector>

class ThreadPool;

// Functor executed by each worker thread; it runs the pool's io_service loop.
class Worker
{
public:
  Worker(ThreadPool&);
  void operator()();

private:
  ThreadPool& m_pool;
};

class ThreadPool
{
public:
  explicit ThreadPool(size_t);
  ~ThreadPool();

  // Queues a callable for execution on one of the worker threads.
  template<typename F>
  void enqueue(F f);

private:
  std::vector<std::unique_ptr<boost::thread>> m_workThreads;
  // m_ioService must be declared before m_work, which is constructed from it.
  boost::asio::io_service m_ioService;
  // Keeps io_service::run() from returning while the queue is empty.
  boost::asio::io_service::work m_work;

  friend class Worker;
};

template<typename F>
void ThreadPool::enqueue(F f)
{
  m_ioService.post(f);
}

#endif

thread_pool.cpp

#include "thread_pool.h"

Worker::Worker(ThreadPool& aPool) : m_pool(aPool)
{
}

// Each worker thread blocks in run() and executes the posted handlers.
void Worker::operator()()
{
  m_pool.m_ioService.run();
}

ThreadPool::ThreadPool(size_t sizeOfWorkerThreads) : m_work(m_ioService)
{
  for (size_t i = 0; i < sizeOfWorkerThreads; ++i)
  {
    m_workThreads.push_back(
      std::unique_ptr<boost::thread>(new boost::thread(Worker(*this))));
  }
}

ThreadPool::~ThreadPool()
{
  // Note: stop() makes run() return as soon as possible, so handlers that
  // are still queued at this point are discarded rather than executed.
  m_ioService.stop();

  for (auto& workThread : m_workThreads)
  {
    workThread->join();
  }
}

Image processing code to make pictures smaller.

// Assumes: namespace fs = std::filesystem (or boost::filesystem),
// using namespace cv, using namespace std, and #include "thread_pool.h".
int doMain5(int, char**)
{
  ThreadPool threadPool(10);

  fs::path imgFolderPath("H:\\export_files");

  auto index = 1;
  for(auto& p : fs::directory_iterator(imgFolderPath))
  {
    // Skip files that were already resized.
    if (p.path().string().find("smaller") != string::npos)
    {
      cout << index << ": Skip the path of " << p.path() << endl;
      index++;
      continue;
    }

    threadPool.enqueue([p, imgFolderPath, index] {

      Mat img = imread(p.path().string());
      if (img.empty())
      {
        return; // not a readable image; leave the file alone
      }

      Mat imgSmallerOne;
      resize(img, imgSmallerOne, Size(), 0.4, 0.4);

      // stem().string() avoids the quotes that streaming a path adds.
      string newName = p.path().stem().string() + "_smaller.jpg";

      cout << index << ": Write resized img into file of " << newName << endl;
      fs::path newPath = imgFolderPath / newName;
      imwrite(newPath.string(), imgSmallerOne);

      fs::remove(p.path());
    });

    index++;
  }

  // ThreadPool's destructor runs here; it calls io_service::stop(), so
  // tasks that have not started yet are not run.
  return 0;
}

Introduction to Boost Regex

I stumbled upon a PPT I made in 2007, when I had just started working. It feels like it happened yesterday; how time flies! 😂

Introduction to Boost regex from Yongqiang Li

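Checking file permissions with the access function

The access function is declared in the <unistd.h> header. It makes it very easy to check the permissions of the real user (the check uses the real UID/GID, not the effective ones).

Given the following files: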

yli@yli-desktop ~/test $ ll
total 8
drwxrwxr-x  2 yli yli 4096 Sep 18 00:14 .
drwxr-xr-x 70 yli yli 4096 Sep 18 00:13 ..
-rw-rw-r--  1 yli yli    0 Sep 18 00:14 a.txt

Run the following C++ program:

#include <cerrno>
#include <cstdlib>
#include <iostream>
#include <unistd.h>

using namespace std;

int main(int argc, char** argv) {

    // Check whether the file exists
    if (access("/home/yli/test/a.txt", F_OK) == 0) {
        cout << "the file exists" << endl;
    }

    if (access("/home/yli/test/a.txt", R_OK) == 0) {
        cout << "has read permission" << endl;
    }

    if (access("/home/yli/test/a.txt", W_OK) == 0) {
        cout << "has write permission" << endl;
    }

    if (access("/home/yli/test/a.txt", R_OK | W_OK) == 0) {
        cout << "has read and write permission" << endl;
    }

    if (access("/home/yli/test/a.txt", X_OK) == 0) {
        cout << "has exec permission" << endl;
    } else {
        const int err = errno; // save errno before any other call can change it
        cout << "no exec permission" << ((err == EACCES) ? " EACCES" : "") << endl;
    }
    return 0;
}

It produces the following output:

the file exists
has read permission
has write permission
has read and write permission
no exec permission EACCES

Clear and simple.

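Bundling C/C++ native libraries in a plain Java library

How can a Java library bundle C/C++ native libraries and still load them correctly?

In an OSGi environment, we can declare in MANIFEST.MF which native libraries to load for each platform (operating system, CPU architecture). The OSGi runtime then locates them, and we mainly just call System.loadLibrary() in the start() method to load them correctly.

A plain (POJO) Java library has none of these conveniences. So what can we do? I recently used a library called sqlite-jdbc, which has a fairly complete approach. Here is a short summary.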

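Code structure

Put the native libraries for each platform into separate folders inside the source folder. This way the native libraries are packaged into the *.jar together with the .class files (of course you can also use Maven for finer control), and they are easy to access later with Class.getResourceAsStream().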

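Loading the native library

In Java, C/C++ native libraries are used to back native APIs. For the native methods to work, the libraries must first be loaded correctly. But a native library inside a jar cannot be loaded by System.loadLibrary(), so the basic idea is to extract it to the real file system.

The loading flow is as follows:

1. Users can supply their own native library through an environment variable. The first step checks whether the environment variable for the .so that the jar needs is set; if it is, the copy inside the jar is not loaded.
2. If it is not set, find the matching library inside the package based on the operating system (System.getProperty("os.name")) and the CPU architecture (System.getProperty("os.arch")), and prepare to copy it to the temporary folder (System.getProperty("java.io.tmpdir")).
3. Before copying, check whether the native library already exists in the target directory, comparing MD5 checksums.
4. If the MD5 differs or the library does not exist, write the library to the target directory using getResourceAsStream.
5. On non-Windows systems, set the permissions to 755.
6. Finally, use System.load() to load the native library into the JVM runtime.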

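Summary

This is a fairly complete approach. With it, we could also use Ant or Maven to compile the C/C++ sources on the fly and then package them. It looks like one could write a library that implements the whole flow. 😀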

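Lambdas in Visual Studio 2012 C++

In the years since I moved from C++ to Java development, I have often lamented that Java lacks handy C++ features such as function objects. I recently installed VS2012 Express to try it out and happened to play with lambdas. I could not help exclaiming, "C++ can do this now! When will Java support native function objects?", coughing up a few hundred taels of blood, and then, thinking of Java's tragic generics, coughing up a few hundred more. 😀

Here are a few simple examples.

An example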

auto x = [] (int a) { return 2 * a; };

cout << x(2) << endl; // 4

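A simple example: with very concise syntax you can write a lambda expression and assign it to an auto variable or a function variable. I also found that the VS2012 Express IDE already supports these features very well, and auto-completion works great.

In the example, x has a unique, compiler-generated closure type; it can also be stored in a function<int(int)>.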

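Basic syntax of a lambda

1. Lambda introducer, or capture clause
2. Parameter list: declares the lambda's parameters
3. Mutable specification
4. Exception specification: declares exceptions
5. Return type clause: the return type
6. Lambda body: the function body

Apart from items 1 and 3, the rest looks a lot like writing an ordinary function, so this post focuses on items 1 and 3.

First, a lambda expression can access all variables in its enclosing context (those within the same {} scope). The capture clause specifies whether the lambda body accesses them by value or by reference.
For example: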

int a = 1;
int b = 2;
[a, &b] { // access a by value, access b by reference.
   cout << a << endl;
   b = 3;
}(); // invoke the lambda immediately

// a is still 1.
// b is 3 now.

If you want all variables captured by value, use [=]; if you want all of them captured by reference, use [&].

int a;
int b;

[=] { // equivalent to [a, b]
 ...
}

[&] { // equivalent to [&a, &b]
 ...
}

int c; // if only c should be captured by reference
[=, &c] { // then only c is captured by reference
 ...
}

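The examples above also show that the parameter list can be omitted when there are no parameters.

The mutable keyword relates to by-value capture. By default, a variable captured by value cannot be assigned to inside the lambda; declaring the lambda mutable allows it. Note that what gets modified is only the lambda's own copy, not the original variable.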

int a = 1;

[=] {
   a = 2; // compile error: a is captured by value and is read-only here
};

[=] () mutable { // the parameter list can't be omitted here (before C++23)
   a = 2; // compiles: modifies the lambda's own copy
}();

cout << a << endl; // still a == 1

Using lambdas with STL algorithms

vector<int> vec(10);
generate_n(vec.begin(), 10, rand); // pass the function itself, not the result of calling it

for_each(vec.begin(), vec.end(), [] (int a) { cout << a << endl; });

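The addition of lambdas makes it easier to write concise C++ code. They combine with many STL algorithms and are convenient and powerful. They are not limited to VS2012 either: recent versions of LLVM and GCC support lambdas well too!

Will Java 8 have these? Who knows?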


Accelerate the Compiling Speed by Using Pre-compiled Header

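Preface

This is an improvement I made for my project team in 2009. The project had about 800,000 lines of code (now over a million) and compiled very slowly. Everyone suffered and productivity was low. Pre-compiled headers can greatly improve compile efficiency. After some experiments, I wrote a Python script that extracts the commonly used header files in the project into a pre-compiled header, and I also became one of the few engineers on the team who checked out the entire code base. You can find the Python script here.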

Introduction

This article explains why and how to use a pre-compiled header to speed up the compilation (not the linking) of our projects.

Motivation

In our daily work, we modify code to fix bugs or implement new features, and then build a new binary and test it. The build usually takes between 3 and 20 minutes, and when we have to sync our colleagues’ code it takes much longer. So speeding up the build is very important for our efficiency. The build process has two steps: one compiles the source files into object files, the other links all object files into a binary. This article focuses on speeding up the first step.

How to use pre-compiled header

Many compilers today, including the VC++ compiler and GCC, provide a technology called pre-compiled headers, which can greatly improve compile efficiency. Assume the source files are structured as follows.

A.cpp includes wx/wx.h, and so do B.cpp and C.cpp. When compiling A.cpp, B.cpp and C.cpp, the compiler parses wx/wx.h three times. wx/wx.h is a very frequently used header file in our project, and there are hundreds of source files, so wx/wx.h will be parsed hundreds of times. There are many other frequently used header files too, such as those of the C++ standard library (including the STL) and wxWidgets.

Pre-compiled header technology uses a common header file to generate a pre-compiled header file (.pch for VC++, .gch for GCC). The common header file includes as many frequently used header files as possible, and we seldom change it. The former structure should be changed as in the following schematic.

Now the header file called “stdwx.h” includes the other frequently used header files. Since a header file cannot be compiled on its own, stdwx.cpp is used to generate the pre-compiled header file. We also need to change the project build configuration as shown below.

For VC++

For whole project,

NOTE: We could change the configuration for every single source (.cpp) file using the same method, but the project-level setting is much more convenient: it applies to all source files except those that already have their own precompiled header setting.

For stdwx.cpp,

NOTE: stdwx.cpp is special, since a header (.h) file cannot be compiled by the compiler on its own. We need a .cpp file for the compiler to compile. The content of stdwx.cpp is very simple: it just includes stdwx.h. The compiler then uses stdwx.cpp to generate the precompiled header.

For GCC

GCC supports the compile option -x c++-header (for C++) and -x c-header (for C), which is used to generate the precompiled header file. See the following makefile rule.

stdwx.h.gch: stdwx.h
	g++ -o $@ -x c++-header -c $< -Wall -I"..."

After that, when the compiler compiles A.cpp, B.cpp and C.cpp, stdwx.h is parsed only once, when the pre-compiled header is generated. That is how the time is saved!

NOTE: Source files that use the pre-compiled header must include “stdwx.h” on the first line of the source, or there will be compile errors!

Test Result

The test result is very exciting! The project has about 800,000 LoC.

	With Pre-compiled Header	Without Pre-compiled Header
Re-build Time(NOT including linking time)	30~35min	130min+

The re-build time with the pre-compiled header is only about 23% to 27% of the time without it (30~35 min versus 130+ min). In other words, compile time is reduced by more than 70%.