Idioms for C programmers

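This document collects examples of idiomatic C programming, covering key topics such as file input and output, string comparison, error handling, and memory allocation.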


This document collects some idiomatic examples of the C way of doing things. None of these examples have been tested. Please report errors or difficulties to comp40-staff.

For the include statements, I've inserted a backslash before the statement, because the generator for this document deletes them otherwise. When you put include statements in your programs, don't put a backslash in.

Table of Contents

    Reading from standard input or from one or more files named on the command line
    Reading one line of input
        Design poisons for line-oriented input
    Getting rid of "Unused variable" warnings
    Comparing strings
    Hanson's idiom for constant-time string comparison
    Printing strings separated by commas
    Using List_map to print strings separated by newlines
        More detail on the indirections:
    Idioms for void ** pointers
    Type abbreviations for structure types
    Using an abstraction defined in an interface Foo
    Handling void * values of known type
    Using unboxed arrays
        Initializing array elements
        Storing values into an unboxed array
        Example of array of arrays
    Allocating memory
    Writing long string literals
    Printing integers of known width

Reading from standard input or from one or more files named on the command line

The idea is to separate out the processing of an open file handle from the process of finding one or more open file handles. Idiom adapted from Kernighan and Ritchie, page 162, for a program without options:

#include <stdio.h>
#include <stdlib.h>

extern void do_something(FILE *fp);

int main(int argc, char *argv[])
{
        if (argc == 1) {
                do_something(stdin);
        } else {
                for (int i = 1; i < argc; i++) {
                        FILE *fp = fopen(argv[i], "r");
                        if (fp == NULL) {
                                fprintf(stderr,
                                        "%s: Could not open file %s for reading\n",
                                        argv[0], argv[i]);
                                exit(1);
                        }
                        do_something(fp);
                        fclose(fp);
                }
        }
}

Here's another approach:

\#include <stdio.h>
\#include <errno.h>
\#include <stdlib.h>


extern void do_something(FILE *fp);

static FILE *open_or_abort(char *fname, char *mode);

int main(int argc, char *argv[])
{
        if (argc == 1) {
                do_something(stdin);
        } else {
                for (int i = 1; i < argc; i++) {
                        FILE *fp = open_or_abort(argv[i], "r");
                        do_something(fp);
                        fclose(fp);
                }
        }
}

static FILE *open_or_abort(char *fname, char *mode)
{
        FILE *fp = fopen(fname, mode);
        if (fp == NULL) {
                int rc = errno;
                fprintf(stderr,
                        "Could not open file %s with mode %s\n",
                        fname,
                        mode);
                exit(rc);
        }
        return fp;
}

Idiom for OS-specific error message on failing to open a file:

perror(argv[i]);  /* print the filename with message about *why* fopen() failed */

Idiom for elaborate error message:

\#include <errno.h>
\#include <string.h>
...
fprintf(stderr, "%s: could not open %s (%s)\n",
        argv[0], argv[i], strerror(errno));

Reading one line of input

The correct idiom for reading input is to

    Allocate a buffer

    Call fgets

Allocation may be static or dynamic. The main issue is how to recover if fgets does not return an entire line. Assuming you can't just halt the program with an error message, these are your options:

    If your buffer is dynamically allocated, you can enlarge it and continue to read.

    If your buffer is statically allocated, you should
        Consume any characters remaining in the line
        Indicate some sort of error condition and continue
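For the statically allocated case, the recovery steps above might be sketched as follows. The name read_line and its return codes are my own invention for illustration, not part of any course library, and the sketch assumes the input contains no embedded '\0' bytes:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: reads one line from fp into buf, which holds
 * `size` bytes, and strips the trailing newline.
 * Returns  1 if the whole line fit in the buffer,
 *          0 if the line was too long (the excess is read and discarded),
 *         -1 at end of file or on a read error.
 */
static int read_line(FILE *fp, char *buf, size_t size)
{
        assert(fp != NULL && buf != NULL && size >= 2);

        if (fgets(buf, (int)size, fp) == NULL) {
                return -1;
        }

        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
                buf[len - 1] = '\0';    /* complete line: drop the newline */
                return 1;
        }

        /* No newline in the buffer: look at what is left of the line */
        int c = getc(fp);
        if (c == EOF || c == '\n') {
                return 1;               /* only the line ending was missing */
        }
        while (c != '\n' && c != EOF) { /* consume the rest of the line */
                c = getc(fp);
        }
        return 0;
}
```

A caller would typically declare something like char buf[256], call read_line(stdin, buf, sizeof(buf)) in a loop, and report an error for any line on which it returns 0 before carrying on with the next line.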

Design poisons for line-oriented input

Here are some things to avoid when doing input one line at a time:

    Never use gets; it's unsafe (because it can walk over more memory than you expect if the input is long).

    Don't use scanf, especially not for interactive programs. It's too easy for scanf to become greedy and gobble up more than one line, especially if the input doesn't meet specifications.

    If you have the urge to use the scanf interface, which can be quite useful, use fgets to read the line and then sscanf (note the extra 's') to read the pieces.
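Here is a minimal sketch of the fgets-then-sscanf approach, assuming a made-up input format of two integers per line. The function name parse_dimensions is hypothetical. The line is assumed to have been read already with fgets, so a malformed line can never cause the parser to run on into the next one:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: parses a line of the form "<width> <height>".
 * Returns 1 and stores both values on success; returns 0 and leaves
 * *width and *height untouched if the line does not match.
 */
static int parse_dimensions(const char *line, int *width, int *height)
{
        assert(line != NULL && width != NULL && height != NULL);

        int  w, h;
        char extra;

        /*
         * The trailing " %c" detects junk after the second number:
         * a well-formed line converts exactly two items.
         */
        if (sscanf(line, "%d %d %c", &w, &h, &extra) != 2) {
                return 0;
        }
        *width  = w;
        *height = h;
        return 1;
}
```

Because sscanf works on a string that is already in memory, a bad line costs you only that line; the next call to fgets starts cleanly at the following one.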

Getting rid of "Unused variable" warnings

Sometimes a contract insists that a function have certain arguments, but in some implementations you may not use them. It may be that you don't use argc and argv in main, for example. COMP 40 does not permit you to have compile-time warnings, so what can you do? This:

int main(int argc, char *argv[])
{
        (void) argc;
        (void) argv;
        ...
}

Comparing strings

Gotcha alert! The C++ strings you are used to do not exist in C. In C, you simulate a string by a char * pointer, which points to a sequence of bytes ending in '\0'. This style causes all sorts of problems, most notably

const char *s1, *s2;  /* strings in the neighborhood  */

if (s1 == s2) {       /* SILENTLY GIVES WRONG ANSWERS */
        ...
        /* go here only if *pointers* are identical */
        ...
}

The standard idiom for comparing strings is

if (strcmp(s1, s2) == 0) {
        ...
        /* strings are equal here */
        ...
}

Old-fashioned C programmers often write (but you should not)

if (!strcmp(s1, s2)) {
        ...
        /* strings are equal here */
        ...
}

The exclamation mark is easy to overlook, and the uncertain relationship between booleans, success/failure codes, and other finite conditions makes such code less clear. If the return value is intended as a boolean, then omit an explicit test; otherwise, put in an explicit test.

Problem: comparing equal strings costs time proportional to the length of the string.

Hanson's idiom for constant-time string comparison

Hanson's Atom_new or Atom_string functions use a shared hash table to ensure that equal strings are represented by identical pointers. A single Atom_new is more expensive than a single strcmp, but when you are using strings in data structures, you will recover the cost by saving comparisons down the road. You may also save memory. One idiom is

const char *s1, *s2;  /* strings in the neighborhood */

s1 = Atom_string(s1); /* hash strings to unique pointers */
s2 = Atom_string(s2);

...

if (s1 == s2) {
        ...
        /* strings are guaranteed equal */
        ...
}

The behavior is neatly expressed mathematically:

    Atom_string(s1) == Atom_string(s2) if and only if strcmp(s1, s2) == 0.

You use Atom_new if you want to create atomic strings that contain zeroes, as you might in some binary network protocols.

Printing strings separated by commas

This idiom is notable for its simple control flow. It can be generalized to other separators besides commas and other things to print besides strings.

\#include <stdio.h>
...
const char *prefix = "";

for (int i = 0; i < nthings; i++) {
        printf("%s%s", prefix, things[i]);
        prefix = ", ";
}

Notice that to reuse this code, you'll have to reset prefix. If this code is in a function, it will get reset.

Using List_map to print strings separated by newlines

For a list we'd like to write

const char *prefix = "";

foreach name in list { /* the iteration abstraction does not exist in C */
        printf("%s%s", prefix, name);
        prefix = "\n";
}

Using the List_map interface, name will be pointed to by a parameter, and prefix will have to be stored in the "closure" state which persists across iterations.

Here's the state:

struct inner_state {
        const char *prefix;
};

And here's the apply function:

static void inner_apply(void **x, void *cl)
{
        struct inner_state *s = cl;
        char *name = *x;  /* x is &p->first, so *x is p->first */

        printf("%s%s", s->prefix, name);

        s->prefix = "\n";
}

Now we rewrite the code to assign the initial state and then call List_map in place of the loop:

struct inner_state s;
s.prefix = "";

List_map(list, inner_apply, &s);

Again, the prefix is still a newline after the call to List_map. A good idea is to write a print function that encapsulates the state structure and the call to List_map.

More detail on the indirections:

The indirections look like this:

p->first points to the sequence of characters "hello"

&p->first points to p->first

the void **x that is passed to inner_apply is &p->first

*x has type void * and is p->first, so we can assign it to char *name without a cast

Idioms for void ** pointers

A void ** pointer almost always means "pass by reference a pointer to an unknown type." Here are a couple of idioms:

    To produce a value of type void ** always use &p, where p is a pointer of type void *.

    To consume or observe a value of type void **, first you have to know what the unknown type is. For sake of argument let us assume that the unknown type is 'struct date'. Then you dereference the void ** pointer and put the result in a pointer of correct type:

    static struct date *d;


    void set_d_by_reference(void **ref)
    {
            d = *ref;
    }

    No cast is needed, because *ref has type void * and so can be assigned to any pointer variable.

Type abbreviations for structure types

Idiom #1: Hanson style (the type abbreviation includes a pointer):

typedef struct foo *Foo;
...

Foo f;

Idiom #2: Bell Labs style (the type abbreviation does not include a pointer):

typedef struct foo foo;
...

foo *f;

Poison: mixing the two styles!

Note: you'll sometimes see capitalized names used with Bell Labs style. I've never seen lowercase names used with Hanson style.

Using an abstraction defined in an interface Foo

Many abstractions require memory allocated dynamically on the heap. To use such abstractions correctly, without leaking memory, you must balance every allocation with a free. Here is a recipe:

Foo_T foo = Foo_new(...arguments...);
assert(foo != NULL);

...
/* operations on foo, including calling functions that use foo */
...

Foo_free(&foo); /* free *foo and set foo to NULL */
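To make the recipe concrete, here is a sketch of what the implementation side of such an interface might look like. The Counter interface is invented purely for illustration; the point is that the free function takes the address of the client's pointer, which is what lets it set that pointer to NULL:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical interface "Counter", Hanson style (typedef includes pointer) */
typedef struct Counter_T *Counter_T;

struct Counter_T {
        int count;
};

Counter_T Counter_new(int start)
{
        Counter_T c = malloc(sizeof(*c));
        assert(c != NULL);
        c->count = start;
        return c;
}

/* Returns the current count, then advances it by one */
int Counter_next(Counter_T c)
{
        assert(c != NULL);
        return c->count++;
}

/* Frees *cp and sets the client's pointer to NULL */
void Counter_free(Counter_T *cp)
{
        assert(cp != NULL && *cp != NULL);
        free(*cp);
        *cp = NULL;
}
```

A client follows exactly the recipe above: Counter_T c = Counter_new(5); then some calls to Counter_next(c); then Counter_free(&c), after which c is NULL and any accidental reuse fails an assertion instead of touching freed memory.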

Handling void * values of known type

Suppose we are using qsort to sort an array of strings, where each string is represented by a char * pointer. No sane person wants to deal with typecasting, so we write the comparison function this way:

static int compare(const void *p1, const void *p2)
{
        const char * const *ps1 = p1;
        const char * const *ps2 = p2;

        return strcmp(*ps1, *ps2);
}

The idiom is that if you have a void * value of known type, you immediately assign it to a variable of that type. No explicit cast is needed. Here's another example:

struct node *p = Table_get(t, Atom_string("root"));
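For completeness, here is how the comparison function above might be handed to qsort. The wrapper sort_strings is a hypothetical helper of my own; the call to qsort itself is standard C:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to array elements; each element is a char * */
static int compare(const void *p1, const void *p2)
{
        const char * const *ps1 = p1;
        const char * const *ps2 = p2;

        return strcmp(*ps1, *ps2);
}

/* Hypothetical helper: sorts an array of n strings in place */
static void sort_strings(const char **strings, size_t n)
{
        assert(strings != NULL);
        /* sizeof(*strings) keeps the element size tied to the array type */
        qsort(strings, n, sizeof(*strings), compare);
}
```

Note that qsort gives compare the address of each element, so p1 is really a pointer to a char *, not the char * itself. Assigning it to a variable of the right type makes that extra level of indirection visible.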

Using unboxed arrays

Be alert that I have retired Hanson's Array abstraction. I have replaced it with `UArray', an abstraction of unboxed arrays. (The implementation is exactly what's in the book, but the interface is different.)

Suppose array a is an unboxed UArray_T containing values of type struct pixel. Here's how you get a pointer to the ith element:

The Ramsey approach:

struct pixel *p = UArray_at(a, i);   /* capture pointer into the array
                                      * (valid until resized or freed)
                                      */
assert(sizeof(*p) == UArray_size(a));
... *p ...        /* use expression of struct type */

This idiom is robust against changes in type. Why? It has a single point of truth, that is, the type is mentioned in only one place. In Hanson's book, you will see types mentioned in multiple places, and it is easier to write inconsistent code. That's one reason I've retired Hanson's arrays. (The other is that students found them hard to learn.)

Initializing array elements

Here's an idiom for initializing element i of an array with an empty Set_T:

Set_T *setp = UArray_at(array, i);
assert(sizeof(*setp) == UArray_size(array));
*setp = Set_new(10, NULL, NULL);

Storing values into an unboxed array

Suppose function f() returns a value of type struct pixel you want to store in an unboxed array. Here's how you do it:

struct pixel *p = UArray_at(a, i);    /* capture pointer into the array
                                       * (valid until resized or freed)
                                       */
assert(UArray_size(a) == sizeof(*p)); /* detects some errors */
*p = f();

Example of array of arrays

Here I initialize and use an array of arrays. Suppose you want a UArray_T of UArray_T of double:

/* n elements of type UArray_T */
UArray_T outer = UArray_new(n, sizeof(UArray_T));

for (int i = 0; i < n; i++) {
        UArray_T inner_array = UArray_new(length_of_row(i),
                                          sizeof(double));

        /* variable number of elements in each inner array */
        for (int j = 0; j < UArray_length(inner_array); j++) {
                double *elemp = UArray_at(inner_array, j);
                *elemp = 0.0; /* initial value of element */
        }

        /* point to slot for inner array */
        UArray_T *innerp = UArray_at(outer, i);
        *innerp = inner_array;
}

Now to access element (i, j), I remember that UArray_at returns a pointer to an element:

UArray_T *p = UArray_at(outer, i);
assert(sizeof(*p) == UArray_size(outer));

UArray_T inner_array = *p;

double *q = UArray_at(inner_array, j);
assert(sizeof(*q) == UArray_size(inner_array));

return *q;

Allocating memory

The following anti-idiom, although seen frequently in the code of those who have not been taught better, is anathema to C programmers:

Thing_T p = malloc(sizeof(struct Thing_T)); /* not acceptable */
assert(p != NULL);

Such code is not acceptable for COMP 40:

    It's easy to leave out the struct, in which case you have a memory error. valgrind will probably catch it, but it shouldn't happen in the first place.

    There is no single point of truth about what the type of p is. In particular, suppose the program evolves to this:

    NewThing_T p = malloc(sizeof(struct Thing_T)); /* actually wrong */
    assert(p != NULL);

The correct way to write these allocations is with a single point of truth:

Thing_T p = malloc(sizeof(*p));  /* established C idiom */
assert(p != NULL);

This code is good because

    There is a single point of truth about the type of p. If that type changes, the code adjusts automatically and is still correct.

    If the name of p changes, you have at least a fighting chance of getting a decent error message from the compiler.
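The same single-point-of-truth idiom carries over to arrays allocated on the heap: multiply the element count by sizeof(*p). Here is a small sketch; the helper name new_zeroed_row is hypothetical, and the sketch assumes n is small enough that the multiplication cannot overflow:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated array of n doubles, all 0.0 */
static double *new_zeroed_row(size_t n)
{
        assert(n > 0);

        /* the element type is named only once, in the declaration of row */
        double *row = malloc(n * sizeof(*row));
        assert(row != NULL);

        for (size_t i = 0; i < n; i++) {
                row[i] = 0.0;
        }
        return row;   /* caller is responsible for calling free(row) */
}
```

If the element type later changes from double to, say, float, only the declarations need editing; the size computation follows along without being touched.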

This business of allocation and deallocation is so tricky that we recommend you use Hanson's macros:

Thing_T p;
NEW(p); // the assertion is included in NEW

Writing long string literals

It can be difficult to cram a long printf or fprintf call into 80 columns. Exploit this unusual property of C: adjacent string literals are concatenated at compile time.

Example:

fprintf(stderr, "%s: Things have gone horribly wrong: "
                "%s is a file format I don't recognize, "
                "I can't find any bytes on standard input, "
                "and the dog ate %d pages of my homework!\n",
                argv[0], argv[1], n - 1);

Note especially there are no commas between the literals.
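Concatenation happens after macro expansion, so a macro that expands to a string literal can be spliced between other literals. A small sketch, in which the macro names PROGRAM and USAGE and the program name are made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

#define PROGRAM "imgtool"    /* hypothetical program name */

/* three adjacent literals after expansion, joined into one at compile time */
#define USAGE   "usage: " PROGRAM " [file ...]\n"

/* Hypothetical helper: writes the usage message to the given stream */
static void print_usage(FILE *out)
{
        assert(out != NULL);
        fputs(USAGE, out);
}
```

Calling print_usage(stderr) writes the single string "usage: imgtool [file ...]" followed by a newline.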

This idiom also enables many scurvy tricks with the C preprocessor.

Printing integers of known width

The designers of C, unlike the designers of Java, decided that programmers did not need to know how many bits are in a type like int or unsigned long. As a result, it is nearly impossible to write code that is portable against changes in the size of a machine word.

This problem was fixed in C99 with the introduction of the <inttypes.h> interface, which contains a variety of integer types and some print macros. Most of you will need it only for printing. Here are some examples of the C99 idiom for printing integers of known sizes:

\#include <stdio.h>
\#include <stdlib.h>
\#include <inttypes.h>

int main(void)
{
        uint64_t big     = (uint64_t)1 << 63;
        int16_t negative = ~(int16_t)0;

        printf("%" PRIu64 " is a large number, as we can see by its"
               " hex\nrepresentation 0x%016" PRIx64 ".\n%" PRId16 " is a"
               " negative number of very small magnitude.\n",
               big, big, negative);

        return 0;
}

Macros PRIu64, PRIx64, and PRId16 are like a regular u, x, or d, except correctly sized for 64-bit, 64-bit, and 16-bit integers respectively.