Real World Haskell - Chapter 3. Defining Types, Streamlining Functions

License: CC 4.0 BY-SA

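These notes cover core Haskell concepts such as type definitions, algebraic data types, and pattern matching, and use examples to show how these features simplify function definitions and error handling.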

Chapter 3. Defining Types, Streamlining Functions

Defining a new type with "data"

-- book.hs
data BookInfo = Book Int String [String]
                deriving (Show)

mybook = Book 9780135072455 "Algebra of Programming"
         ["Richard Bird", "Oege de Moor"]

---------------

:load book

mybook

--> Book 9780135072455 "Algebra of Programming" ["Richard Bird","Oege de Moor"]

:type mybook

--> mybook :: BookInfo

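"BookInfo" is the type constructor. A type name must start with a capital letter.

"Book" is the value constructor (also called the data constructor), used to create values of type BookInfo. A value constructor's name must also start with a capital letter.

"Int", "String", and "[String]" are the components (slots) of the type; the values are stored in these slots.

Constructing a Book in ghci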

ghci> Book 0 "The Book of Imaginary Beings" ["Jorge Luis Borges"]

ghci> let cities = Book 173 "Use of Weapons" ["Iain M. Banks"]

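The ":info" command shows detailed information about a type

:info BookInfo

Naming Types and Values

The name of a type and the names of its value constructors are independent of each other.

A type constructor is used only in a type declaration or a type signature. A value constructor is used only in actual code, that is, in expressions and patterns.

A type constructor and a value constructor can have the same name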

data BookReview = BookReview BookInfo CustomerID String

Type Synonyms

Defining type synonyms

type B = Int

data A = A BookInfo Int B

type C = (A, B)

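Here C is a synonym for the tuple type (A, B). Type synonyms do not introduce value constructors of their own.

Algebraic Data Types

Every type defined with the data keyword is an algebraic data type.

Bool is the simplest algebraic data type

An algebraic data type can have more than one value constructor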

data Bool = False | True

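The Bool type has two value constructors, True and False.

A value constructor of an algebraic data type can take zero or more arguments.

type CustomerID = Int   -- defined later in these notes; repeated here so the snippet compiles on its own
type CardHolder = String
type CardNumber = String
type Address = [String]

data BillingInfo = CreditCard CardNumber CardHolder Address
                 | CashOnDelivery
                 | Invoice CustomerID
                   deriving (Show)

BillingInfo has three value constructors, each taking different arguments.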

Tuples, Algebraic Data Types, and When to Use Each

Pairs (2-tuples) whose components have the same types have the same type; both values below have type (String, String)

a = ("Porpoise", "Grey")

b = ("Table", "Oak")

The "==" operator requires both of its operands to have the same type
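The contrast between tuples and algebraic data types can be sketched as follows. The types Cetacean and Furniture, and the names tuplesEqual and sameCetacean, are hypothetical and chosen only for illustration. As tuples, the two values can be compared by accident; wrapped in distinct types, the same comparison is rejected by the compiler.

```haskell
-- As plain tuples, both values have type (String, String),
-- so comparing them type-checks even though it is meaningless.
a :: (String, String)
a = ("Porpoise", "Grey")

b :: (String, String)
b = ("Table", "Oak")

tuplesEqual :: Bool
tuplesEqual = a == b

-- Hypothetical wrapper types: same components, but distinct types.
data Cetacean = Cetacean String String
                deriving (Eq, Show)

data Furniture = Furniture String String
                 deriving (Eq, Show)

c :: Cetacean
c = Cetacean "Porpoise" "Grey"

d :: Furniture
d = Furniture "Table" "Oak"

-- c == d would be rejected at compile time: Cetacean is not Furniture.
sameCetacean :: Bool
sameCetacean = c == Cetacean "Porpoise" "Grey"
```

tuplesEqual --> False

sameCetacean --> True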

Analogues to Algebraic Data Types in Other Languages

The enumeration

Implementing an enumeration with an algebraic data type

data Roygbiv = Red
             | Orange
             | Yellow
             | Green
             | Blue
             | Indigo
             | Violet
               deriving (Eq, Show)

Red == Yellow --> False

Green == Green --> True

The discriminated union

Implementing a discriminated union with an algebraic data type

type Vector = (Double, Double)

data Shape = Circle Vector Double
           | Poly [Vector]

If we use the Circle constructor, we create and store a Circle value; if we use Poly, we store a Poly value.
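A minimal sketch of how a function can tell the two alternatives apart. It relies on pattern matching, which these notes cover in the next section. The function name describe is hypothetical, and reading the Circle fields as centre and radius is an assumption.

```haskell
type Vector = (Double, Double)

data Shape = Circle Vector Double   -- assumed meaning: centre and radius
           | Poly [Vector]          -- assumed meaning: list of vertices
             deriving (Show)

-- Report which constructor built the value and what it carries.
describe :: Shape -> String
describe (Circle _ r) = "circle of radius " ++ show r
describe (Poly vs)    = "polygon with " ++ show (length vs) ++ " vertices"
```

describe (Circle (0, 0) 2.0) --> "circle of radius 2.0"

describe (Poly [(0, 0), (1, 0), (0, 1)]) --> "polygon with 3 vertices"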

Pattern Matching

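Suppose we are given a value of some type:

1. If the type has more than one value constructor, we should be able to tell which constructor built the value.

2. If the value constructor has data components (slots), we should be able to extract those values.

Haskell has a simple but extremely useful pattern matching facility that handles both tasks.

Implementing the not function

myNot True = False
myNot False = True

myNot True --> False

myNot False --> True

It looks as though we defined two functions named myNot, but Haskell lets us define a function as a series of equations, so the same function can behave differently for different input patterns.

Equations are tried in the order they are written, and the first one whose pattern matches is used.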
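Why the order matters can be seen in a small sketch with overlapping patterns. The functions describeNumber and unreachableZero are hypothetical names introduced only for this illustration.

```haskell
-- The specific equation comes first, so 0 is handled before the general case.
describeNumber :: Int -> String
describeNumber 0 = "zero"
describeNumber n = "something else"

-- The same equations in the opposite order: the variable pattern n matches
-- every argument, so the second equation can never be reached
-- (GHC warns that the pattern match is redundant).
unreachableZero :: Int -> String
unreachableZero n = "something else"
unreachableZero 0 = "zero"
```

describeNumber 0 --> "zero"

unreachableZero 0 --> "something else"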

Summing all the elements of a list

sumList (x:xs) = x + sumList xs

sumList [] = 0

sumList [1,2] --> 3

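(The ":" operator prepends an element to a list: its left operand must be a single element and its right operand must be a list.)

[1,2] is shorthand for (1:(2:[])). In the parameter pattern (x:xs), however, the ":" denotes matching: 1 is bound to x and 2:[] is bound to xs. The pattern thus splits the list into its head and its tail, and the function adds the head to the result of the recursive call on the tail. Equations are tried in order, earlier ones first; here the two patterns can never match the same list, so their order does not change the result.

Finally, the standard function sum does the same job as the function we defined above.

Construction and Deconstruction

Don't be misled by the word "deconstruction": pattern matching does not take anything apart; it only lets us "look inside" a value.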

Further Adventures

A function that returns the third element of a 3-tuple

third (a, b, c) = c

A slightly more complicated pattern matching example

complicated (True, a, x:xs, 5) = (a, xs)

complicated (True, 1, [1,2,3], 5) --> (1,[2,3])

complicated (False, 1, [1,2,3], 5) -- the pattern match fails with a runtime error

data BookInfo = Book Int String [String]
                deriving (Show)

bookID (Book id title authors) = id

bookID (Book 1 "tt" ["dsa"]) --> 1

m3list (3:xs) = 3 + m3list xs -- matches only lists whose elements are all 3; any other element causes a runtime error
m3list [] = 0

m3list [3,3,3] --> 9

The Wild Card Pattern

"_" is called the wild card

In a pattern, "_" matches any value without binding it to a name

data BookInfo = Book Int String [String]
                deriving (Show)

bookID (Book id _ _) = id

bookID (Book 1 "tt" ["dsa"]) --> 1

Exhaustive Patterns and Wild Cards

A default (catch-all) equation

goodExample (x:xs) = x + goodExample xs

goodExample _ = 0

goodExample [1,2] -- > 3

Record Syntax

Code such as nicerID (Book id _ _ ) = id is called boilerplate; it is both bulky and irksome.

Extracting field values from a record

type CustomerID = Int
type Address = [String]

data Customer = Customer {
      customerID      :: CustomerID, -- customerID is also a function: it takes a Customer and returns a CustomerID
      customerName    :: String,
      customerAddress :: Address
    } deriving (Show)

let aa = Customer 1 "dfs" ["dfsa"]

customerID aa --> 1

The same accessor function written out by hand, without record syntax

data Customer = Customer Int String [String]

deriving (Show)

customerID :: Customer -> Int

customerID (Customer id _ _) = id

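Creating a record value with the more verbose field notation

customer2 = Customer {
              customerID = 271828,
              customerAddress = ["1048576 Disk Drive",
                                 "Milpitas, CA 95134",
                                 "USA"],
              customerName = "Jane Q. Citizen"
            }

data CalendarTime = CalendarTime {
      ctYear :: Int,
      ctDay, ctHour, ctMin, ctSec :: Int
    }

ctime = CalendarTime {
          ctYear = 1,
          ctDay=3, ctHour=4, ctMin=5, ctSec=6
        }

With the verbose notation, the fields can be listed in any order. Here customerName and customerAddress are swapped relative to the declaration.

A type defined with record syntax is also printed with more detail: the field names appear in the output.

Without record syntax, extracting a particular field from a value is painful.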
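A small sketch that checks the points about field order and printed output. The values customer1 and customer2 below are illustrative, and Eq is derived here (an addition to the notes' definition) only so the two values can be compared.

```haskell
type CustomerID = Int
type Address = [String]

data Customer = Customer {
      customerID      :: CustomerID,
      customerName    :: String,
      customerAddress :: Address
    } deriving (Eq, Show)

-- Positional construction: arguments follow the declaration order.
customer1 :: Customer
customer1 = Customer 271828 "Jane Q. Citizen" ["USA"]

-- Field notation: the fields may be listed in any order.
customer2 :: Customer
customer2 = Customer {
      customerAddress = ["USA"],
      customerName    = "Jane Q. Citizen",
      customerID      = 271828
    }
```

customer1 == customer2 --> True

customerName customer2 --> "Jane Q. Citizen"

customer1 --> Customer {customerID = 271828, customerName = "Jane Q. Citizen", customerAddress = ["USA"]}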

Parameterized Types

The Maybe a type

data Maybe a = Something a
             | Nonething
               deriving (Eq, Show)

Something 2 --> Something 2

:type Something "string"

--> Something "string" :: Main.Maybe [Char]

Something (Something 2) -- a nested application needs parentheses

Recursive Types

The list type is defined recursively.

The List a type

data List a = Cons a (List a)
            | Nil
              deriving (Show)

fromList (x:xs) = Cons x (fromList xs) -- builds a List a from an ordinary list
fromList [] = Nil

Nil --> Nil

Cons 0 Nil --> Cons 0 Nil

Cons 1 it --> Cons 1 (Cons 0 Nil) -- in ghci, it holds the previous result

fromList "durian"

--> Cons 'd' (Cons 'u' (Cons 'r' (Cons 'i' (Cons 'a' (Cons 'n' Nil)))))

The Cons constructor takes two arguments: a value of type a and a List a.

A binary tree

data Tree a = Node a (Tree a) (Tree a)
            | Empty
              deriving (Show)

Node 1 (Node 2 Empty Empty) (Node 3 Empty Empty)

simpleTree = Node "parent" (Node "left child" Empty Empty)
                           (Node "right child" Empty Empty)

Haskell has no null value. We can use Maybe to get a similar effect, but doing so requires pattern matching on the Maybe; the alternative is a value constructor with no arguments (such as Nil or Empty).
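The two ways of marking a missing child can be compared in a short sketch. MaybeTree, MaybeNode, size, and sizeM are hypothetical names introduced only for this comparison; they do not come from the notes.

```haskell
-- Version from the notes: a constructor with no arguments marks a missing child.
data Tree a = Node a (Tree a) (Tree a)
            | Empty
              deriving (Show)

-- Hypothetical alternative: a single constructor, with Maybe marking a missing child.
data MaybeTree a = MaybeNode a (Maybe (MaybeTree a)) (Maybe (MaybeTree a))
                   deriving (Show)

-- Count the nodes of a Tree.
size :: Tree a -> Int
size Empty        = 0
size (Node _ l r) = 1 + size l + size r

-- Count the nodes of a MaybeTree; the Prelude function maybe returns 0
-- for Nothing and applies sizeM to the subtree inside Just.
sizeM :: MaybeTree a -> Int
sizeM (MaybeNode _ l r) = 1 + maybe 0 sizeM l + maybe 0 sizeM r
```

size (Node 'a' (Node 'b' Empty Empty) (Node 'c' Empty Empty)) --> 3

sizeM (MaybeNode 'a' (Just (MaybeNode 'b' Nothing Nothing)) Nothing) --> 2

One design difference: a MaybeTree always has at least one node, whereas Tree can represent an empty tree with Empty.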

Reporting Errors

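The error function reports an error message and immediately aborts evaluation.

error :: String -> a

Because its result type is a, it can be called anywhere and will always have the right type.

Returning the second element of a list, and reporting an error if the list has fewer than two elements (flawed version)

mySecond :: [a] -> a -- the signature is optional; the type can be inferred because tail takes a list

mySecond xs = if null (tail xs)
              then error "list too short"
              else head (tail xs)

mySecond "ab" --> 'b'

mySecond "a" --> Exception: list too short

head (mySecond [[9]]) --> Exception: list too short

mySecond [] --> Exception raised by tail, which fails on an empty list before our own check runs

The drawback of error is that the caller cannot distinguish a recoverable error from a fatal one that should terminate the program.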

A More Controlled Approach

Returning the second element of a list, or Nothing if the list has fewer than two elements (can still be improved)

safeSecond :: [a] -> Maybe a

safeSecond [] = Nothing

safeSecond xs = if null (tail xs)

then Nothing

else Just (head (tail xs))

safeSecond [] --> Nothing

safeSecond [1] --> Nothing

safeSecond [1,2] -->Just 2

safeSecond [1,2,3] -->Just 2

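Returning the second element of a list, or Nothing if the list has fewer than two elements

tidySecond :: [a] -> Maybe a
tidySecond (_:x:_) = Just x -- the highlight!
tidySecond _ = Nothing

tidySecond [1,2] --> Just 2

The pattern "_:x:_" matches only lists with at least two elements.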
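A sketch of how a caller might consume the Maybe result instead of crashing. The functions report and describeSecond are hypothetical helpers added for illustration; tidySecond is repeated so the block stands on its own.

```haskell
tidySecond :: [a] -> Maybe a
tidySecond (_:x:_) = Just x
tidySecond _       = Nothing

-- Hypothetical helper: turn the result into a message by pattern matching.
report :: Maybe Int -> String
report (Just x) = "second element is " ++ show x
report Nothing  = "list too short"

-- Hypothetical caller: a short list is handled as an ordinary value, not as an exception.
describeSecond :: [Int] -> String
describeSecond xs = report (tidySecond xs)
```

describeSecond [1,2,3] --> "second element is 2"

describeSecond [1] --> "list too short"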

Introducing Local Variables

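The let keyword begins a block of variable declarations, and the keyword in ends it.

Inside a function body, let defines local variables.

lend amount balance = let reserve = 100
                          newBalance = balance - amount -- must line up with reserve, otherwise ghci reports a parse error
                      in if balance < reserve
                         then Nothing
                         else Just newBalance

Each name on the left of a let binding is bound to the expression on its right. Note that the name is bound to an expression, not to a value: the expression is evaluated only when its result is needed.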
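That a let-bound name stands for an unevaluated expression can be checked with a small sketch. The names lazyLet, unused, and used are hypothetical; the binding unused would abort the program if it were ever evaluated.

```haskell
-- unused is bound to an expression that calls error, but nothing ever
-- demands its value, so the call is never made.
lazyLet :: Int
lazyLet = let unused = error "this expression is never evaluated" :: Int
              used   = 40 + 2
          in used
```

lazyLet --> 42

The same holds for lend: newBalance is computed only when the else branch is taken.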

If we can use a name, it is "in scope"; otherwise it is "out of scope". If a name is visible throughout a source file, we say it is at the "top level".

Shadowing

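let expressions can be nested, but doing so is not wise

foo = let a = 1
      in let b = 2
         in a + b

print foo --> 3

Shadowing a variable

bar = let x = 1
      in ((let x = "foo" in x), x)

--> ("foo",1)

The inner x shadows the outer x. A shadowing variable shares the name of the one it hides, but its type and value can differ.

Shadowing a function's parameter

quux a = let a = "foo"
         in a ++ "eek!"

:type quux --> quux :: t -> [Char]

Because the function's parameter is never used, it can have any type t.

GHC's "-fwarn-name-shadowing" option (spelled "-Wname-shadowing" in newer GHC releases) turns on shadowing warnings, which helps avoid puzzling bugs.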

The where Clause

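We can use another mechanism to introduce local variables: the where clause.

Using a where clause to postpone the definition of local variables

lend2 amount balance = if amount < reserve * 0.5
                       then Just newBalance
                       else Nothing
    where reserve    = 100
          newBalance = balance - amount -- must line up with reserve above, or it is an error

A where clause helps keep the reader's attention on the important part of the function; the definitions of the local variables are left until afterwards.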

Local Functions, Global Variables

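Defining a local function is as easy as defining a local variable.

Defining a local function

pluralise :: String -> [Int] -> [String]
pluralise word counts = map plural counts
    where plural 0 = "no " ++ word ++ "s"
          plural 1 = "one " ++ word
          plural n = show n ++ " " ++ word ++ "s"

The local function plural consists of several equations and uses the variable word from the enclosing function pluralise.

Defining a global variable

itemName = "Weighted Companion Cube"

Simply define it at the top level of the source file.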

The Offside Rule and Whitespace in an Expression

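Haskell's use of indentation in this way is called the "offside rule".

The first top-level declaration or definition may start in any column; the Haskell compiler or interpreter remembers that indentation level, and every later top-level declaration must use the same indentation.

bar = let b = 2
          c = True
      in let a = b
         in (a, c)

a is visible only inside the inner let; it is not visible to the outer let.

foo = x
    where x = y
              where y = 2 -- note the indentation of this where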

A Note About Tabs Versus Spaces

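Configure your editor to insert spaces instead of tabs.

The Offside Rule Is Not Mandatory

The "offside rule" is not mandatory.

We can enclose the equations in braces "{}" and separate the items with semicolons ";".

Using braces instead of indentation

bar = let a = 1
          b = 2
          c = 3
      in a + b + c

foo = let { a = 1;  b = 2;
        c = 3 }
      in a + b + c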

The case Expression

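Function definitions are not the only place where patterns can be used. The case expression also performs pattern matching.

The case expression

fromMaybe defval wrapped =
    case wrapped of
      Nothing     -> defval
      Just value  -> value

To the left of "->" is a pattern; if it matches, the expression on the right is evaluated.

The case keyword is followed by an arbitrary expression, whose value is matched against the patterns after "of". Matching proceeds from top to bottom.

All the result expressions to the right of "->" must have the same type.

The wild card pattern "_" can be placed last in a case expression as a default match.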
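A minimal sketch of a case expression with a wild card default. The function describeList is a hypothetical example, not taken from the notes.

```haskell
-- The alternatives are tried from top to bottom; the final "_" catches
-- every list that the first two patterns did not match.
describeList :: [a] -> String
describeList xs = case xs of
                    []  -> "empty"
                    [_] -> "one element"
                    _   -> "several elements"
```

describeList "" --> "empty"

describeList "a" --> "one element"

describeList "abc" --> "several elements"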

Common Beginner Mistakes with Patterns

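The following are common ways in which beginners misuse patterns.

Incorrectly Matching Against a Variable

The alternatives after "of" are patterns, so write the literal values to match, not the names of variables that hold them. A name in a pattern is a new local variable that matches anything; it is not compared with an existing variable of the same name.

The incorrect version

data Fruit = Apple | Orange

apple = "apple"

orange = "orange"

whichFruit :: String -> Fruit
whichFruit f = case f of
                 apple  -> Apple
                 orange -> Orange

The correct version

data Fruit = Apple | Orange

betterFruit f = case f of
                  "apple"  -> Apple
                  "orange" -> Orange

The type of betterFruit can be inferred automatically: its argument is matched against String literals such as "apple", and its results are values of type Fruit such as Apple.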
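The difference between the two versions can be observed directly. In this sketch both functions are placed in one module, and Eq and Show are derived (an addition to the notes' definition) so the results can be compared and printed.

```haskell
data Fruit = Apple | Orange
             deriving (Eq, Show)

apple :: String
apple = "apple"

orange :: String
orange = "orange"

-- Flawed: the pattern apple is a fresh local variable that matches any
-- string and hides the top-level apple, so every input yields Apple.
-- GHC warns that the second alternative is redundant.
whichFruit :: String -> Fruit
whichFruit f = case f of
                 apple  -> Apple
                 orange -> Orange

-- Correct: string literals are compared with the argument.
betterFruit :: String -> Fruit
betterFruit f = case f of
                  "apple"  -> Apple
                  "orange" -> Orange
```

whichFruit "orange" --> Apple

betterFruit "orange" --> Orange

As in the notes, betterFruit has no default alternative, so any other string causes a runtime pattern match failure.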

Incorrectly Trying to Compare for Equality

A name may appear only once within the patterns of a single equation

An incorrect example

bad_nodesAreSame (Node a _ _) (Node a _ _) = Just a
bad_nodesAreSame _ _ = Nothing

Solving this problem requires guards.

Conditional Evaluation with Guards

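Testing conditions on a function's arguments

-- Node is a value constructor of Tree; a node is itself a tree

data Tree a = Node a (Tree a) (Tree a)
            | Empty
              deriving (Show)

nodesAreSame (Node a _ _) (Node b _ _)
    | a == b = Just a
nodesAreSame _ _ = Nothing

"| a == b" is a guard on the pattern. A pattern can be followed by zero or more guards. If the pattern matches but every guard is false, matching falls through to the next equation.

A guard is an expression of type Bool.

Guards are introduced with the "|" symbol.

Using guards to make code clearer

myDrop n xs = if n <= 0 || null xs
              then xs
              else myDrop (n - 1) (tail xs)

niceDrop n xs | n <= 0 = xs
niceDrop _ [] = []
niceDrop n (_:xs) = niceDrop (n - 1) xs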
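Guards also combine well with a where clause. The sketch below rewrites the lending example from earlier in these notes with several guards; the name lend3, the explicit Double types, and the extra check for non-positive amounts are assumptions made for illustration. The Prelude name otherwise is simply bound to True, so it serves as a readable final guard.

```haskell
-- Guards are tried from top to bottom; the first one that is True wins.
lend3 :: Double -> Double -> Maybe Double
lend3 amount balance
    | amount <= 0            = Nothing
    | amount > reserve * 0.5 = Nothing
    | otherwise              = Just newBalance
  where reserve    = 100
        newBalance = balance - amount
```

lend3 0 500 --> Nothing

lend3 60 500 --> Nothing

lend3 40 500 --> Just 460.0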