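Preface
Books on program design often say that a module should have "high cohesion and low coupling", or that a design should "isolate change and reduce complexity". These phrases mean roughly the same thing: reduce the dependencies between modules, make each module more independent, and as far as possible keep it open for extension but closed for modification. Polymorphism in C++ is one way to isolate change and reduce coupling. In essence it works much like function pointers in C: a single call site can invoke different functions, so the same thing can be handled in different ways. I recently wrote a word-parsing exercise that uses function pointers for exactly this purpose. This article records what I learned and shares it with anyone who wants to learn about function pointers.
I. Background and Purpose
I previously wrote an exercise that uses a [state machine to count the total number of words and the number of occurrences of each word](https://blog.youkuaiyun.com/woody218/article/details/109563187). Today's exercise builds on that state-machine word parser and uses a function pointer to isolate change and reduce coupling. The state machine is responsible for splitting the input into individual words; the function pointer specifies how each word is handled, for example counting all words or counting how many times a particular word appears.
II. C Implementation
1. Parsing words with a state machine
This part is solely responsible for parsing the input string: it uses a state machine to extract the words one by one. The code includes a header, types_def.h, which comes from Mr. Li Xianjing's [AWTK](https://github.com/zlgopen/awtk/blob/master/src/tkc/types_def.h). If you want to compile the source, you can download that header, and take a look at AWTK while you are there.
parse_words.h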
/**
* File: parse_words.h
* Author:
* Brief: word_parse
*
* Copyright (c) 2019 - 2020 Guangzhou ZHIYUAN Electronics Co.,Ltd.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* License file for more details.
*
*/
/**
* History:
* ================================================================
* 2020-11-01 ZHANG ZHONGJI <zhangzhongji@zlgmcu.cn> created
*
*/
#ifndef PARSE_WORDS_H
#define PARSE_WORDS_H
#include "types_def.h"
BEGIN_C_DECLS
/**
 * @class one_word_handle_t
 * Callback invoked each time a word has been parsed.
 */
typedef void (*one_word_handle_t)(void* ctx, const char* one_word);
/**
 * @method parse_words
 * Parse the words in a string.
 *
 * @param {const char*} input, the string to process;
 * @param {one_word_handle_t} on_word_handle, function pointer that handles one word;
 * @param {void*} ctx, context, passed as the first argument of the callback.
 *
 */
void parse_words(const char* input, one_word_handle_t on_word_handle, void* ctx);
END_C_DECLS
#endif /*PARSE_WORDS_H*/
parse_words.c
/**
* File: parse_words.c
* Author:
* Brief: parse_words
*
* Copyright (c) 2019 - 2020 Guangzhou ZHIYUAN Electronics Co.,Ltd.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* License file for more details.
*
*/
/**
* History:
* ================================================================
* 2020-11-01 ZHANG ZHONGJI <zhangzhongji@zlgmcu.cn> created
*
*/
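#include <string.h> /* strlen, memset */
#include "parse_words.h"
/* States of the parser. */
typedef enum _state_t {
  INIT,     /* nothing read yet */
  IN_WORD,  /* inside a word */
  OUT_WORD, /* between words */
  FINAL     /* end of input */
} state_t;
/* Events produced by classifying one input character. */
typedef enum _event_t {
  SEPARATION = 1, /* separator character */
  NOT_SEPARATION, /* word character */
  STOP            /* terminating '\0' */
} event_t;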
/* Classify one character as a separator, a word character, or the terminator. */
static event_t current_event(const char* one_char) {
  static const char separate_char[] = {',', '.', ' '};
  if (one_char[0] == '\0') {
    return STOP;
  }
  for (size_t i = 0; i < sizeof(separate_char); i++) {
    if (one_char[0] == separate_char[i]) {
      return SEPARATION;
    }
  }
  return NOT_SEPARATION;
}
void parse_words(const char* input, one_word_handle_t on_word_handle, void* ctx) {
  if (input == NULL || on_word_handle == NULL) {
    return;
  }
  const char* one_char = input; /* const is kept: the input is never modified */
  state_t next_state = INIT;
  size_t index = 0;
  size_t str_len = strlen(input) + 1; /* +1 so the terminating '\0' is processed too */
  char name[50 + 1] = {'\0'};         /* buffer for the current word */
  while (str_len > 0) {
    state_t current_state = next_state;
    event_t event = current_event(one_char);
    switch (current_state) {
      case INIT:
        if (event == NOT_SEPARATION) {
          next_state = IN_WORD;
          memset(name, 0x00, sizeof(name));
          name[index++] = one_char[0];
        } else if (event == SEPARATION) {
          next_state = OUT_WORD;
        } else {
          return; /* STOP */
        }
        break;
      case IN_WORD:
        if (event == NOT_SEPARATION) {
          next_state = IN_WORD;
          /* characters beyond the buffer size are dropped instead of overflowing */
          if (index < sizeof(name) - 1) {
            name[index++] = one_char[0];
          }
        } else {
          /* SEPARATION or STOP: the word is complete, hand it to the callback */
          name[index] = '\0';
          index = 0;
          on_word_handle(ctx, name);
          if (event == STOP) {
            return;
          }
          next_state = OUT_WORD;
        }
        break;
      case OUT_WORD:
        if (event == NOT_SEPARATION) {
          next_state = IN_WORD;
          memset(name, 0x00, sizeof(name));
          name[index++] = one_char[0];
        } else if (event == SEPARATION) {
          next_state = OUT_WORD;
        } else {
          return; /* STOP */
        }
        break;
      default:
        return;
    }
    one_char++; /* the original "one_char = one_char++" is undefined behavior */
    str_len--;
  }
}
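The transitions implemented by the switch statement above can also be written as a lookup table, which makes the shape of the state machine easier to see at a glance. The following is only a sketch under stated assumptions, not part of the exercise: the names sketch_state_t, sketch_event_t, next_state_table, classify and count_words_with_table are hypothetical, and the function only counts words instead of calling a handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state and event names, mirroring the enums in parse_words.c. */
typedef enum { S_INIT, S_IN_WORD, S_OUT_WORD, S_FINAL, S_COUNT } sketch_state_t;
typedef enum { E_SEPARATION, E_NOT_SEPARATION, E_STOP, E_COUNT } sketch_event_t;

/* next_state_table[current state][event] gives the next state. */
static const sketch_state_t next_state_table[S_COUNT][E_COUNT] = {
  /* S_INIT     */ {S_OUT_WORD, S_IN_WORD, S_FINAL},
  /* S_IN_WORD  */ {S_OUT_WORD, S_IN_WORD, S_FINAL},
  /* S_OUT_WORD */ {S_OUT_WORD, S_IN_WORD, S_FINAL},
  /* S_FINAL    */ {S_FINAL, S_FINAL, S_FINAL},
};

/* Map one character to an event, using the same separators as the article. */
static sketch_event_t classify(char c) {
  if (c == '\0') {
    return E_STOP;
  }
  if (c == ',' || c == '.' || c == ' ') {
    return E_SEPARATION;
  }
  return E_NOT_SEPARATION;
}

/* Count words: a word ends on any transition that leaves S_IN_WORD. */
int count_words_with_table(const char* input) {
  sketch_state_t state = S_INIT;
  int words = 0;
  assert(input != NULL);
  while (state != S_FINAL) {
    sketch_state_t next = next_state_table[state][classify(*input)];
    if (state == S_IN_WORD && next != S_IN_WORD) {
      words++;
    }
    state = next;
    input++;
  }
  return words;
}
```

Written this way, it becomes visible that the next state depends only on the event; what differs between states is the action taken on the transition, which is where the word handler is called in parse_words.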
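2. Counting words
This part of the code handles what changes most often in word counting: counting all words, counting the occurrences of one particular word, counting words of a given length, and so on. It is the part whose functionality can keep growing. To reduce coupling with the word-parsing code, it uses a function pointer. Function pointers cope well with change: the parsing code stays closed to modification while remaining open to extension, because every new feature can be added by writing another handler function. From Mr. Li's review I also learned a useful technique: the first parameter of the callback should be void* ctx, meaning "context". The ctx pointer plays the same role as the implicit this pointer of a C++ member function: it identifies the context the call operates on. As the code below shows, ctx is very useful: it can be an int32_t*, a pointer to a struct, or a pointer to whatever other type you need. Even if you do not need it yet, keep it, because you can never be sure that the requirements will not change or that new features will not be added later.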
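To make the role of ctx concrete, here is a minimal sketch that is independent of the files in this article. All names in it (char_handle_t, for_each_char, count_handle, char_match_t, match_handle, count_chars, count_char_matches) are hypothetical and exist only for illustration. The same driver function is reused with two handlers: one receives a plain int32_t counter through ctx, the other receives a struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: ctx comes first, like an explicit "this". */
typedef void (*char_handle_t)(void* ctx, char c);

/* Hypothetical driver: calls handle once for every character of text. */
static void for_each_char(const char* text, char_handle_t handle, void* ctx) {
  assert(text != NULL && handle != NULL);
  for (; *text != '\0'; text++) {
    handle(ctx, *text);
  }
}

/* Handler 1: ctx points to a plain int32_t counter. */
static void count_handle(void* ctx, char c) {
  int32_t* count = (int32_t*)ctx;
  (void)c; /* the character itself is not needed here */
  (*count)++;
}

/* Handler 2: ctx points to a struct that carries both input and result. */
typedef struct {
  char target;  /* character to look for */
  int32_t hits; /* how many times it was seen */
} char_match_t;

static void match_handle(void* ctx, char c) {
  char_match_t* match = (char_match_t*)ctx;
  if (c == match->target) {
    match->hits++;
  }
}

/* Wrappers that hide the callback plumbing from callers. */
int32_t count_chars(const char* text) {
  int32_t count = 0;
  for_each_char(text, count_handle, &count);
  return count;
}

int32_t count_char_matches(const char* text, char target) {
  char_match_t match = {target, 0};
  for_each_char(text, match_handle, &match);
  return match.hits;
}
```

The driver never needs to know what ctx points to; only the handler and the wrapper that created the context agree on its type. That is why a second feature could be added here without touching for_each_char.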
word_count.h
/**
* File: word_count.h
* Author:
* Brief: word_count
*
* Copyright (c) 2019 - 2020 Guangzhou ZHIYUAN Electronics Co.,Ltd.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* License file for more details.
*
*/
/**
* History:
* ================================================================
* 2020-11-01 ZHANG ZHONGJI <zhangzhongji@zlgmcu.cn> created
*
*/
#ifndef WORD_COUNT_H
#define WORD_COUNT_H
#include "types_def.h"
BEGIN_C_DECLS
/**
 * @class word_info_t
 * Information about one word; names are case-sensitive.
 */
typedef struct _word_info_t {
  char name[50 + 1];              /* the word itself */
  uint32_t count;                 /* number of occurrences of the word */
  struct _word_info_t* next_word; /* next word in the list */
} word_info_t;
/**
 * @method get_words_count
 * Get the total number of words.
 *
 * @param {const char*} input, the string to process.
 *
 * @return {int32_t} the total number of words.
 */
int32_t get_words_count(const char* input);
/**
 * @method get_one_word_count
 * Get the number of times one word occurs.
 *
 * @param {const char*} input, the string to process.
 * @param {const char*} one_word, the word whose occurrences are counted.
 *
 * @return {int32_t} the number of occurrences of one_word.
 */
int32_t get_one_word_count(const char* input, const char* one_word);
/**
 * @method get_every_word_count
 * Get the number of occurrences of every word.
 *
 * @param {const char*} input, the string to process.
 *
 * @return {word_info_t*} the list of per-word statistics; free it with release_memory.
 */
word_info_t* get_every_word_count(const char* input);
/**
 * @method release_memory
 * Free the memory of a word list.
 *
 * @param {word_info_t*} word_info_list, the value returned by get_every_word_count.
 *
 */
void release_memory(word_info_t* word_info_list);
END_C_DECLS
#endif /*WORD_COUNT_H*/
word_count.c
/**
* File: word_count.c
* Author:
* Brief: word_count
*
* Copyright (c) 2019 - 2020 Guangzhou ZHIYUAN Electronics Co.,Ltd.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* License file for more details.
*
*/
/**
* History:
* ================================================================
* 2020-11-01 ZHANG ZHONGJI <zhangzhongji@zlgmcu.cn> created
*
*/
#include "word_count.h"
#include "parse_words.h"
/* Look up a word in the list; returns NULL if it is not present. */
static word_info_t* find_word(word_info_t* word_info_list, const char* one_word_buff) {
  if (word_info_list == NULL || one_word_buff == NULL) {
    return NULL;
  }
  word_info_t* word_info_temp = word_info_list;
  while (word_info_temp != NULL) {
    if (strcmp(word_info_temp->name, one_word_buff) == 0) {
      return word_info_temp;
    }
    word_info_temp = word_info_temp->next_word;
  }
  return NULL;
}
/* Find the last node of the list; returns NULL for an empty list. */
static word_info_t* find_last_word(word_info_t* word_info_list) {
  word_info_t* word_info_temp = word_info_list;
  while (word_info_temp != NULL && word_info_temp->next_word != NULL) {
    word_info_temp = word_info_temp->next_word;
  }
  return word_info_temp;
}
/* Allocate a zeroed node with count 1; returns NULL if the allocation fails. */
static word_info_t* create_one_word(void) {
  word_info_t* one_word = (word_info_t*)malloc(sizeof(word_info_t));
  if (one_word == NULL) {
    return NULL;
  }
  /* sizeof(*one_word) is the struct size; sizeof(one_word) would only be the pointer size */
  memset(one_word, 0x00, sizeof(*one_word));
  one_word->count = 1;
  one_word->next_word = NULL;
  return one_word;
}
/* Handler: ctx is an int32_t counter that is incremented for every word. */
static void words_count_handle(void* ctx, const char* one_word) {
  int32_t* count = (int32_t*)ctx;
  (void)one_word; /* only the number of words matters here */
  (*count)++;
}
int32_t get_words_count(const char* input) {
  int32_t count = 0;
  parse_words(input, words_count_handle, &count);
  return count;
}
/* Handler: ctx is a word_info_t holding the word to match and its counter. */
static void one_word_count_handle(void* ctx, const char* one_word) {
  word_info_t* input_one_word = (word_info_t*)ctx;
  if (strcmp(input_one_word->name, one_word) == 0) {
    input_one_word->count++;
  }
}
int32_t get_one_word_count(const char* input, const char* one_word) {
  word_info_t input_one_word;
  memset(&input_one_word, 0, sizeof(input_one_word));
  size_t len = strlen(one_word) > 50 ? 50 : strlen(one_word);
  memcpy(input_one_word.name, one_word, len);
  input_one_word.count = 0;
  input_one_word.next_word = NULL;
  parse_words(input, one_word_count_handle, &input_one_word);
  return (int32_t)input_one_word.count;
}
/* Handler: ctx is a word_info_t** pointing at the head of the result list. */
static void every_word_count_handle(void* ctx, const char* one_word) {
  word_info_t** word_info_list = (word_info_t**)ctx;
  word_info_t* one_word_info_ptr = find_word(*word_info_list, one_word);
  if (one_word_info_ptr != NULL) {
    /* the word is already in the list */
    one_word_info_ptr->count++;
    return;
  }
  /* the word was not found (or the list is still empty): append a new node */
  word_info_t* one_new_word = create_one_word();
  if (one_new_word == NULL) {
    return; /* allocation failed, skip this word */
  }
  memset(one_new_word->name, 0, sizeof(one_new_word->name));
  size_t len = strlen(one_word) > 50 ? 50 : strlen(one_word);
  memcpy(one_new_word->name, one_word, len);
  one_new_word->count = 1;
  one_new_word->next_word = NULL;
  word_info_t* last_word_info = find_last_word(*word_info_list);
  if (last_word_info == NULL) {
    *word_info_list = one_new_word; /* empty list: the new node becomes the head */
  } else {
    last_word_info->next_word = one_new_word;
  }
}
word_info_t* get_every_word_count(const char* input) {
  word_info_t* ret_list = NULL;
  parse_words(input, every_word_count_handle, &ret_list);
  return ret_list;
}
void release_memory(word_info_t* word_info_list) {
  if (word_info_list == NULL) {
    return;
  }
  word_info_t* word_info_temp = word_info_list;
  while (word_info_temp != NULL) {
    word_info_t* word_info_new_temp = word_info_temp->next_word;
    free(word_info_temp);
    word_info_temp = word_info_new_temp;
  }
}
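3. Testing with gtest
The key to designing test cases is to cover every test point; the method used here is equivalence partitioning. If you have studied discrete mathematics you will probably remember the ideas; if not, it is worth reviewing equivalence, equivalence relations, equivalence classes and equivalence partitioning. In this example, the characters of an input string fall into three equivalence classes: word characters, separators, and the terminator. When designing the test cases, one or two cases per equivalence class are enough; once every class is covered, all the test points are covered.
test.cpp
/**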
* File: test.cpp
* Author:
* Brief: word_statistics
*
* Copyright (c) 2019 - 2020 Guangzhou ZHIYUAN Electronics Co.,Ltd.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* License file for more details.
*
*/
/**
* History:
* ================================================================
* 2020-11-01 ZHANG ZHONGJI <zhangzhongji@zlgmcu.cn> created
*
*/
#include <string.h> /* strcmp */
#include "word_count.h"
#include "gtest/gtest.h"
TEST(TEST_WORD_STATISTICS, get_words_count) {
  ASSERT_EQ(1, get_words_count("ONE"));
  ASSERT_EQ(2, get_words_count("ONE TWO"));
  ASSERT_EQ(1, get_words_count("ONE,,,,"));
  ASSERT_EQ(2, get_words_count(" ONE ,,,,TWO...."));
  ASSERT_EQ(0, get_words_count(",,,, ..."));
  ASSERT_EQ(0, get_words_count(","));
  ASSERT_EQ(0, get_words_count("."));
  ASSERT_EQ(0, get_words_count(" "));
  ASSERT_EQ(0, get_words_count(""));
}
TEST(TEST_WORD_STATISTICS, get_one_word_count) {
  ASSERT_EQ(2, get_one_word_count("one two three one", "one"));
  ASSERT_EQ(1, get_one_word_count("one,two. three. ", "one"));
  ASSERT_EQ(0, get_one_word_count(",,,, . ", "one"));
  ASSERT_EQ(0, get_one_word_count("", "one"));
}
TEST(TEST_WORD_STATISTICS, get_every_word_count) {
  word_info_t* word_list_ptr = get_every_word_count("ONE ONE TWO");
  ASSERT_TRUE(word_list_ptr != NULL);
  int nodes = 0; /* number of distinct words in the list */
  word_info_t* next_node = word_list_ptr;
  while (next_node != NULL) {
    if (strcmp(next_node->name, "ONE") == 0) {
      ASSERT_EQ(2u, next_node->count);
    } else if (strcmp(next_node->name, "TWO") == 0) {
      ASSERT_EQ(1u, next_node->count);
    }
    nodes++;
    next_node = next_node->next_word;
  }
  ASSERT_EQ(2, nodes);
  release_memory(word_list_ptr);
}
int main(int argc, char* argv[]) {
  testing::InitGoogleTest(&argc, argv);
  return RUN_ALL_TESTS();
}
#if 0
/* Manual driver, disabled; kept for stepping through the code in a debugger. */
int main(int argc, char* argv[]) {
  word_info_t* word_list_ptr = get_every_word_count("one two three one");
  int32_t total_count = get_words_count("one two three one");
  int32_t one_word_count = get_one_word_count("one two three one", "one");
  int32_t two_word_count = get_one_word_count("one two three one", "two");
  int32_t three_word_count = get_one_word_count("one two three one", "three");
  release_memory(word_list_ptr);
  return 0;
}
#endif
Test results: (screenshot of the gtest run, not preserved in this copy)
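Summary
I wrote this exercise in my spare time; feel free to use it however you like, so we can learn and improve together. The article covers two main points: using a state machine, and using a function pointer to isolate the code that changes. In this example, parsing words with the state machine is the part that stays the same, while handling each word is the part that varies. When we develop a module we should likewise separate what changes from what does not, so that later requirement changes do not force large modifications to the module. In this example, if a new requirement asks for counting words of a given length, it is enough to add one more handler function; the rest of the code does not need to change at all, which is what the open-closed principle asks for.
The key design point for the function pointer: make the first parameter void* ctx, representing the context, for example:
typedef void (*one_word_handle_t)(void* ctx, const char* one_word);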
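As a sketch of the extension mentioned in the summary, counting words of a given length might look like the following. This is an assumption-laden illustration rather than code from the exercise: word_length_ctx_t, word_length_handle and get_word_length_count are hypothetical names, and sketch_parse_words is a simplified stand-in for parse_words that is included only so the block compiles on its own. In the real project only the context struct, the handler and the wrapper would be new.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*one_word_handle_t)(void* ctx, const char* one_word);

/* Simplified stand-in for parse_words, only here to keep the sketch self-contained. */
static void sketch_parse_words(const char* input, one_word_handle_t on_word_handle, void* ctx) {
  char name[50 + 1];
  size_t index = 0;
  assert(input != NULL && on_word_handle != NULL);
  for (;; input++) {
    char c = *input;
    int is_end = (c == '\0');
    int is_separator = (c == ',' || c == '.' || c == ' ');
    if (!is_end && !is_separator) {
      if (index < sizeof(name) - 1) {
        name[index++] = c; /* extra characters of very long words are dropped */
      }
    } else if (index > 0) {
      name[index] = '\0';
      index = 0;
      on_word_handle(ctx, name);
    }
    if (is_end) {
      break;
    }
  }
}

/* Hypothetical context: the length to match and the running count. */
typedef struct {
  size_t length;
  int32_t count;
} word_length_ctx_t;

/* Hypothetical handler: counts words whose length equals ctx->length. */
static void word_length_handle(void* ctx, const char* one_word) {
  word_length_ctx_t* info = (word_length_ctx_t*)ctx;
  if (strlen(one_word) == info->length) {
    info->count++;
  }
}

/* Hypothetical wrapper, in the style of get_one_word_count. */
int32_t get_word_length_count(const char* input, size_t length) {
  word_length_ctx_t info = {length, 0};
  sketch_parse_words(input, word_length_handle, &info);
  return info.count;
}
```

The parser is not touched at all: the new behavior lives entirely in the handler, and the struct passed through ctx carries both the parameter (the length) and the result (the count).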