Function Composition and Monoid

Article link: https://blog.youkuaiyun.com/springasa111/article/details/52903158

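This article explores the algebraic properties of function composition and their relation to the Monoid structure. Using C examples, it shows how to abstract function composition into a data structure and how to handle the side effects of functions.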

Let us start with arithmetic.

Consider the following equations:

1 + 2 + 3 = 1 + 3 + 2
1 + 0 = 0 + 1

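Nobody doubts these results. As children we would still "compute" 1 + 2 + 3 in our heads or on scratch paper, so why do we now know they are correct without computing anything?

Because we know that addition in arithmetic satisfies the commutative and associative laws.

Now look at the following algebraic expressions: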

a + b + c = a + ( b + c ) = a + c + b = b + c + a
a + 0 = 0 + a 
a + 0 = a

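Nobody doubts these results either, even though we did not "compute" anything. Why do we know they are correct without computing?

Because we know that addition in algebra satisfies the commutative and associative laws.

That raises a deeper question: what structure lets us reason about addition this way?

Mathematicians give us a definition: addition forms a Monoid, because it satisfies the following two properties. (Commutativity is an extra property of addition; a Monoid only requires the two below.)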

Associativity: (a + b) + c = a + (b + c)
Identity element: a + 0 = a, so 0 is the identity element of addition.

The more general definition of a Monoid is:

Suppose that S is a set and • is some binary operation S × S → S, then S with • is a monoid if it satisfies the following two axioms:

Associativity
    For all a, b and c in S, the equation (a • b) • c = a • (b • c) holds.
Identity element
    There exists an element e in S such that for every element a in S, the equations e • a = a • e = a hold.
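
The two axioms can be spot-checked in code. The following is a minimal sketch for the monoid (int, +, 0); the names op, IDENTITY, and monoid_laws_hold_on_samples are hypothetical and chosen only for illustration. Checking a handful of samples is a test, not a proof.

```c
#include <stdbool.h>
#include <stddef.h>

/* The binary operation of the monoid (int, +, 0). */
int op(int a, int b) { return a + b; }

/* The identity element for op. */
const int IDENTITY = 0;

/* Check (a . b) . c == a . (b . c) for one triple. */
bool assoc_holds(int a, int b, int c)
{
    return op(op(a, b), c) == op(a, op(b, c));
}

/* Check e . a == a . e == a for one element. */
bool identity_holds(int a)
{
    return op(IDENTITY, a) == a && op(a, IDENTITY) == a;
}

/* Spot-check both laws over a small sample set. */
bool monoid_laws_hold_on_samples(void)
{
    static const int samples[] = { -3, 0, 1, 2, 7 };
    const size_t n = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < n; i++) {
        if (!identity_holds(samples[i]))
            return false;
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                if (!assoc_holds(samples[i], samples[j], samples[k]))
                    return false;
    }
    return true;
}
```

Swapping op and IDENTITY for another pair (for example multiplication and 1) would let the same checker exercise a different candidate monoid.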

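What does this have to do with programming? Quite a lot.

If the functions we define in a program also fit this Monoid structure, it becomes easy to reason about the correctness of complex function compositions without having to run them.

Let us use C to express functions and the way they compose.

Functions: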

/* add2: int -> int */
int add2(int i)
{
    return i + 2;
}

/* square: int -> int */
int square(int i)
{
    return i*i;
}

/* time4: int -> int */
int time4(int i)
{
    return i*4;
}

The comments give the type of each function.

Here is how they compose:

int r = square(3);
add2(r);
or
add2(square(3)) --> 11

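This is really the following composition:

(int -> int) • (int -> int) = int -> int

That is, add2 is a function of type int -> int and square is also a function of type int -> int, so their composition add2 • square is again a function of type int -> int. We can give the composed function a name, squareAndAdd2.

int squareAndAdd2(int i)
{
    return add2(square(i));
}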

To express this composition more generally, we can write the following code:

typedef int (*i2i)(int);

int add2(int i)
{
    return i + 2;
}

int square(int i)
{
    return i*i;
}


int compose(i2i f1, i2i f2, int i)
{
    return f1(f2(i));
}

It can then be used to compose like this:

compose(add2, square, 3)

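But here is the problem: how can this compose go on to combine time4 with compose(add2, square, 3)? And what is the root cause?

We performed the computation while composing. In algebra, however, composing does not require computing; we only need to express what the composition means.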
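
To make the limitation concrete, the sketch below reuses the eager compose from above. Because it returns an int rather than a function, the only way to bring in time4 is to wrap the already computed value by hand. The helper name time4_after_add2_square is hypothetical.

```c
typedef int (*i2i)(int);

int add2(int i)   { return i + 2; }
int square(int i) { return i * i; }
int time4(int i)  { return i * 4; }

/* Eager composition: applies f2, then f1, and returns the resulting value. */
int compose(i2i f1, i2i f2, int i)
{
    return f1(f2(i));
}

/*
 * compose(add2, square, i) is already a number, not a function, so it
 * cannot be passed to compose again as f1 or f2. Adding time4 therefore
 * has to be written out by hand around the computed value.
 */
int time4_after_add2_square(int i)
{
    return time4(compose(add2, square, i));
}
```

Every additional stage needs another hand-written wrapper like this one, which is the symptom of composing and computing in the same step.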

Now consider the following algebraic correspondence:

int                 <---> int -> int, identity element: unit(i) = i
+                   <---> compose
int + int           <---> int -> int

compose:: (int -> int) -> (int -> int) -> (int -> int)

We need to turn the computation into data. Following the algebraic structure above, we define data and a composition function that fit it.

typedef int (*i2i)(int);

/* int -> int */
struct primitive
{
    i2i f;
};

/* (int -> int) -> (int -> int) -> (int -> int) */
struct compose
{
    struct primitive* f1;
    struct primitive* f2;
};

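This is still not enough. The elements must be closed under the operation, meaning that the result of a composition must have the same type as its operands. We define it as follows: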

struct primitive
{
    i2i f;
};

struct func;

struct compose
{
    struct func* f1;
    struct func* f2;
};

enum F_TYPE {PRIMITIVE=0, COMPOSE=1};
struct func
{
    enum F_TYPE type;
    union
    {
        struct primitive p;
        struct compose c;
    } f;
};

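To convert operations that cannot be composed at the C language level into the composable ones defined above (a process known as lifting), we need constructors: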

struct func* make_primitive(i2i f)
{
    struct func* func = (struct func*)malloc(sizeof(struct func));
    func->type = PRIMITIVE;
    func->f.p.f = f;
    return func;
}

struct func* make_compose(struct func* f1, struct func* f2)
{
    struct func* func = (struct func*)malloc(sizeof(struct func));
    func->type = COMPOSE;
    func->f.c.f1 = f1;
    func->f.c.f2 = f2;
    return func;
}

With the definitions above we can already represent a computation. To actually run it, we only need an interpreter:

int apply(struct func* f, int i)
{
    int res;
    switch(f->type) 
    {
        case PRIMITIVE:
            res = (f->f.p.f)(i);
            break;
        case COMPOSE:
            res = apply(f->f.c.f1, apply(f->f.c.f2, i));
            break;
        default:
            assert(0);
    }
    return res;
}

Here is how to run a computation:

struct func* ADD2 = make_primitive(add2);
struct func* SQUARE = make_primitive(square);
struct func* TIME4 = make_primitive(time4);

struct func* TIME4_ADD2_SQUARE=make_compose(TIME4,
                                        make_compose(ADD2, SQUARE));
apply(TIME4_ADD2_SQUARE, 3);
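
Putting the pieces together, the following is a self-contained sketch of the whole pipeline. It follows the definitions above, with three additions that are assumptions of this sketch and not part of the original code: allocation failure is treated as fatal, free_func is a hypothetical cleanup helper that assumes no node is shared between two parents, and run_time4_add2_square is a hypothetical wrapper around the usage shown above.

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*i2i)(int);

int add2(int i)   { return i + 2; }
int square(int i) { return i * i; }
int time4(int i)  { return i * 4; }

enum F_TYPE { PRIMITIVE = 0, COMPOSE = 1 };

struct func;
struct primitive { i2i f; };
struct compose   { struct func *f1; struct func *f2; };

struct func
{
    enum F_TYPE type;
    union
    {
        struct primitive p;   /* a lifted C function */
        struct compose c;     /* f1 applied after f2 */
    } f;
};

struct func *make_primitive(i2i f)
{
    struct func *func = malloc(sizeof *func);
    if (func == NULL)
        abort();              /* assumption: out of memory is fatal */
    func->type = PRIMITIVE;
    func->f.p.f = f;
    return func;
}

struct func *make_compose(struct func *f1, struct func *f2)
{
    struct func *func = malloc(sizeof *func);
    if (func == NULL)
        abort();              /* assumption: out of memory is fatal */
    func->type = COMPOSE;
    func->f.c.f1 = f1;
    func->f.c.f2 = f2;
    return func;
}

/* Interpreter: walks the description and performs the computation. */
int apply(const struct func *f, int i)
{
    switch (f->type) {
    case PRIMITIVE:
        return f->f.p.f(i);
    case COMPOSE:
        return apply(f->f.c.f1, apply(f->f.c.f2, i));
    }
    assert(0 && "unknown node type");
    return 0;
}

/* Hypothetical helper: frees a tree in which no node is shared. */
void free_func(struct func *f)
{
    if (f == NULL)
        return;
    if (f->type == COMPOSE) {
        free_func(f->f.c.f1);
        free_func(f->f.c.f2);
    }
    free(f);
}

/* Builds time4 . (add2 . square), runs it on i, and cleans up. */
int run_time4_add2_square(int i)
{
    struct func *pipeline =
        make_compose(make_primitive(time4),
                     make_compose(make_primitive(add2),
                                  make_primitive(square)));
    int result = apply(pipeline, i);
    free_func(pipeline);
    return result;
}
```

For the input 3 the stages run as square (9), add2 (11), and time4 (44), so run_time4_add2_square(3) evaluates to 44. Nothing is computed until apply is called.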

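As you can see, this also separates the algebraic description from its execution.

Mapping this back to the Monoid structure (below, = means that both sides give the same result under apply for every input, not C assignment):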

int unit(int i)
{
    return i;
}

0  <------> make_primitive(unit)
+  <------> make_compose

struct func* UNIT = make_primitive(unit);
make_compose(ADD2, UNIT) = make_compose(UNIT, ADD2);
make_compose(TIME4, make_compose(ADD2, SQUARE)) =
make_compose(make_compose(TIME4, ADD2), SQUARE);
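
These equalities can be spot-checked by running both sides through the interpreter. The sketch below is a variant of the code above under stated assumptions: nodes are built by value on the stack through the hypothetical helpers prim and comp instead of malloc, so that no cleanup code is needed, and same_on_samples, identity_law_holds, and associativity_law_holds are hypothetical names. Agreement on a few sample inputs is evidence, not a proof.

```c
#include <stdbool.h>
#include <stddef.h>

typedef int (*i2i)(int);

int add2(int i)   { return i + 2; }
int square(int i) { return i * i; }
int time4(int i)  { return i * 4; }
int unit(int i)   { return i; }

enum F_TYPE { PRIMITIVE = 0, COMPOSE = 1 };

struct func;
struct primitive { i2i f; };
struct compose   { const struct func *f1; const struct func *f2; };

struct func
{
    enum F_TYPE type;
    union { struct primitive p; struct compose c; } f;
};

/* Hypothetical by-value constructors (the article uses malloc instead). */
struct func prim(i2i f)
{
    struct func n;
    n.type = PRIMITIVE;
    n.f.p.f = f;
    return n;
}

struct func comp(const struct func *f1, const struct func *f2)
{
    struct func n;
    n.type = COMPOSE;
    n.f.c.f1 = f1;
    n.f.c.f2 = f2;
    return n;
}

int apply(const struct func *f, int i)
{
    if (f->type == PRIMITIVE)
        return f->f.p.f(i);
    return apply(f->f.c.f1, apply(f->f.c.f2, i));
}

/* Two descriptions count as equal here if they agree on every sample. */
bool same_on_samples(const struct func *f, const struct func *g)
{
    static const int samples[] = { -2, 0, 1, 3, 10 };
    for (size_t k = 0; k < sizeof samples / sizeof samples[0]; k++)
        if (apply(f, samples[k]) != apply(g, samples[k]))
            return false;
    return true;
}

/* compose(ADD2, UNIT) = ADD2 = compose(UNIT, ADD2) */
bool identity_law_holds(void)
{
    struct func ADD2 = prim(add2), UNIT = prim(unit);
    struct func left = comp(&ADD2, &UNIT);
    struct func right = comp(&UNIT, &ADD2);
    return same_on_samples(&left, &ADD2) && same_on_samples(&right, &ADD2);
}

/* compose(TIME4, compose(ADD2, SQUARE)) = compose(compose(TIME4, ADD2), SQUARE) */
bool associativity_law_holds(void)
{
    struct func TIME4 = prim(time4), ADD2 = prim(add2), SQUARE = prim(square);
    struct func inner_right = comp(&ADD2, &SQUARE);
    struct func right = comp(&TIME4, &inner_right);
    struct func inner_left = comp(&TIME4, &ADD2);
    struct func left = comp(&inner_left, &SQUARE);
    return same_on_samples(&left, &right);
}
```

Note that the two sides of each law are different trees in memory; they are equal only in the sense that the interpreter gives the same answer for both.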

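All of the above is the ideal case: every function is pure, meaning that the same input always produces the same output and there are no side effects. Reality is different.

Here is the real world: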

int add2(int i)
{
    log("add2");
    return i + 2;
}

int square(int i)
{
    log("square");
    return i*i;
}

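Clearly these functions are no longer pure. Logging has been added to the computation, so the functions have side effects, and the logging implementation cannot guarantee deterministic output.

Logging can itself be viewed as a kind of computation. Since it is a computation, we can follow the same approach as before and turn it into data: we only represent the computation and leave the actual execution to the interpreter.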

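/* Assumed definition: the original does not show struct LogM.
   The fields follow their use below; the capacity of log is a guess. */
struct LogM
{
    int i;
    char log[128];
};

struct LogM make_logM(int i, const char* log)
{
    struct LogM lm;
    lm.i = i;
    strcpy(lm.log, log);    /* assumes log fits in lm.log */
    return lm;
}

struct LogM add2M(int i)
{
    return make_logM(add2(i), "add2");
}

struct LogM squareM(int i)
{
    return make_logM(square(i), "square");
}

struct LogM time4M(int i)
{
    return make_logM(time4(i), "time4");
}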

How do we compose this new kind of data? Look at the new function type:

int -> LogM

We rework the earlier code accordingly.

typedef struct LogM (*i2LogM)(int);

struct primitiveM
{
    i2LogM f;
};

struct funcM;

struct composeM
{
    struct funcM* f1;
    struct funcM* f2;
};

enum F_TYPE {PRIMITIVE=0, COMPOSE=1};

struct funcM
{
    enum F_TYPE type;
    union
    {
        struct primitiveM p;
        struct composeM c;
    } f;
};

/******************************************************
 * CONSTRUCTOR
 ******************************************************/
struct funcM* make_primitiveM(i2LogM f)
{
    struct funcM* func = (struct funcM*)malloc(sizeof(struct funcM));
    func->type = PRIMITIVE;
    func->f.p.f = f;
    return func;
}

struct funcM* make_composeM(struct funcM* f1, struct funcM* f2)
{
    struct funcM* func = (struct funcM*)malloc(sizeof(struct funcM));
    func->type = COMPOSE;
    func->f.c.f1 = f1;
    func->f.c.f2 = f2;
    return func;
}

/******************************************************
 * INTERPRETER
 ******************************************************/
struct LogM applyM(struct funcM* f, int i)
{
    struct LogM res;
    struct LogM temp;

    switch(f->type) 
    {
        case PRIMITIVE:
            res = (f->f.p.f)(i);
            break;
        case COMPOSE:
            temp = applyM(f->f.c.f2, i);
            res = applyM(f->f.c.f1, temp.i);
            strcat(res.log, temp.log);
            break;
        default:
            assert(0);
    }
    return res;
}

And its usage:

struct funcM* ADD2M = make_primitiveM(add2M);
struct funcM* SQUAREM = make_primitiveM(squareM);
struct funcM* TIME4M = make_primitiveM(time4M);

struct funcM* TIME4_ADD2_SQUAREM=make_composeM(TIME4M,
                                        make_composeM(ADD2M, SQUAREM));
struct LogM lm = applyM(TIME4_ADD2_SQUAREM, 3);
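
The following self-contained sketch runs the logged pipeline end to end. Several details are assumptions, since the article does not show them: the layout of struct LogM and its capacity LOG_CAP, the bounded copies used in place of strcpy and strcat, the by-value constructors primM and compM (used instead of malloc so that no cleanup is needed), and the wrapper run_time4_add2_squareM.

```c
#include <stdio.h>
#include <string.h>

#define LOG_CAP 128   /* assumed capacity; the article does not give one */

struct LogM
{
    int i;               /* the computed value */
    char log[LOG_CAP];   /* the log collected while computing it */
};

struct LogM make_logM(int i, const char *log)
{
    struct LogM lm;
    lm.i = i;
    snprintf(lm.log, sizeof lm.log, "%s", log);   /* bounded copy */
    return lm;
}

/* Pure functions: the log entry is returned as data, not written anywhere. */
struct LogM add2M(int i)   { return make_logM(i + 2, "add2"); }
struct LogM squareM(int i) { return make_logM(i * i, "square"); }
struct LogM time4M(int i)  { return make_logM(i * 4, "time4"); }

typedef struct LogM (*i2LogM)(int);

enum F_TYPE { PRIMITIVE = 0, COMPOSE = 1 };

struct funcM;
struct primitiveM { i2LogM f; };
struct composeM   { const struct funcM *f1; const struct funcM *f2; };

struct funcM
{
    enum F_TYPE type;
    union { struct primitiveM p; struct composeM c; } f;
};

/* Hypothetical by-value constructors (the article uses malloc instead). */
struct funcM primM(i2LogM f)
{
    struct funcM n;
    n.type = PRIMITIVE;
    n.f.p.f = f;
    return n;
}

struct funcM compM(const struct funcM *f1, const struct funcM *f2)
{
    struct funcM n;
    n.type = COMPOSE;
    n.f.c.f1 = f1;
    n.f.c.f2 = f2;
    return n;
}

/* Interpreter: threads the value through f2 then f1 and joins the logs. */
struct LogM applyM(const struct funcM *f, int i)
{
    if (f->type == PRIMITIVE)
        return f->f.p.f(i);

    struct LogM inner = applyM(f->f.c.f2, i);
    struct LogM res = applyM(f->f.c.f1, inner.i);
    /* Append the inner log after the outer one, as the article's strcat does. */
    strncat(res.log, inner.log, sizeof res.log - strlen(res.log) - 1);
    return res;
}

/* Builds time4M . (add2M . squareM) and runs it on i. */
struct LogM run_time4_add2_squareM(int i)
{
    struct funcM ADD2M = primM(add2M);
    struct funcM SQUAREM = primM(squareM);
    struct funcM TIME4M = primM(time4M);
    struct funcM inner = compM(&ADD2M, &SQUAREM);
    struct funcM all = compM(&TIME4M, &inner);
    return applyM(&all, i);
}
```

For the input 3 the result carries the value 44 and the log "time4add2square". The log lists the outermost function first because the interpreter appends the inner log to the outer one; swapping the two would give execution order instead.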

Now the Monoid structure again:

struct LogM unitM(int i)
{
    struct LogM lm = {i, {0}};
    return lm;
}
0  <------> make_primitiveM(unitM)
+  <------> make_composeM

struct funcM* UNITM = make_primitiveM(unitM);
make_composeM(ADD2M, UNITM) = make_composeM(UNITM, ADD2M);
make_composeM(TIME4M, make_composeM(ADD2M, SQUAREM)) =
make_composeM(make_composeM(TIME4M, ADD2M), SQUAREM);

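In this way we have turned a function with side effects into one without side effects, and we can define an algebraic structure on it that satisfies the Monoid laws.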