Introduction to Sockets
Outline
IP, ICMP, UDP, TCP header format, servers, byte ordering, socket file descriptors, POSIX data types and functions, useful tools
Corresponding textbook chapters: Chapters 1, 2 and 3
IP Protocol Suite
Internet Protocol (IP)
Two versions, IPv4 and IPv6:
- IPv4: 32-bit address space / fragmentation
- IPv6: 128-bit address space / no fragmentation by routers
IP Principles
- Unreliable: No guarantees that a datagram will be successfully delivered to its destination. IP provides Best Effort service.
- Connectionless: IP will not maintain state information about successive datagrams. Each datagram is handled as an independent entity.
For the IPv4 and IPv6 header formats, see the lecture slides (PPT).
Internet Main Transport Protocols
ICMP - Error Handling
Full name: Internet Control Message Protocol (ICMP)
UDP - Simple Messaging
Full name: User Datagram Protocol
- Message-oriented transport protocol
- Unreliable
- Provides integrity verification (checksum)
- Simple
- Stateless
TCP - Reliable Transport
Full name: Transmission Control Protocol
- Reliable, ordered and error-checked delivery of a stream of octets (bytes)
- End-to-end
- Full-duplex
Bits, Bytes and Words
- bit = binary digit = 0/1
- byte = a sequence of 8 bits = 0/1 * 8
- word = a sequence of N bits where N = 16,32,64 depending on the computer
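... and on Architectures
- Different architectures use different native widths: 8/16/32/64/128/256-bit
Bit/Byte – Arch – Communications
- Machines with different architectures still have to communicate with each other
-> Byte Ordering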
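Byte order (endianness) is the order in which the bytes of a value larger than one byte are stored in memory. It comes up constantly in cross-platform and network programming.
- Big-endian (network byte order): the most significant byte is stored at the lowest memory address, and the least significant byte at the highest.
- Little-endian (the host byte order of x86 and most ARM systems): the least significant byte is stored at the lowest memory address, and the most significant byte at the highest.
Mnemonic: network byte order is big-endian because the TCP/IP protocols treat the first byte received as the most significant byte. When network data is parsed, the bytes that arrive first are stored at the lower addresses, so the receive buffer is filled contiguously. Hence: big-endian = network byte order = most significant byte at the lowest address.
Related functions
- ntohl() / ntohs(): network byte order to host byte order (big-endian to little-endian on a little-endian host; no change on a big-endian host)
- htonl() / htons(): host byte order to network byte order (little-endian to big-endian on a little-endian host; no change on a big-endian host)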
#include <netinet/in.h>
/* Host to network, short / long */
uint16_t htons(uint16_t host16bitvalue);
uint32_t htonl(uint32_t host32bitvalue);
/* Network to host, short / long */
uint16_t ntohs(uint16_t net16bitvalue);
uint32_t ntohl(uint32_t net32bitvalue);
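To make the byte-order rules concrete, here is a minimal sketch. The helper names `host_is_little_endian` and `to_wire32` are made up for illustration and are not part of any library; only `htonl` comes from the standard headers.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint16_t probe = 0x0102;
    uint8_t first;
    memcpy(&first, &probe, 1);      /* read the byte at the lowest address */
    return first == 0x02;
}

/* Writes a 32-bit host value into out[4] exactly as it would travel
 * on the wire (network byte order, most significant byte first). */
void to_wire32(uint32_t host_value, uint8_t out[4])
{
    uint32_t net = htonl(host_value);   /* no-op on a big-endian host */
    memcpy(out, &net, sizeof net);
}
```

Whatever the host byte order, `to_wire32(0x0A0B0C0D, buf)` should leave `0x0A` in `buf[0]` and `0x0D` in `buf[3]`, which is the point of converting before sending.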
Sockets (many kinds of sockets)
BSD Sockets, POSIX Sockets, Linux Sockets, Unix Sockets, Datalink Sockets, Storage Sockets, Windows Sockets
POSIX (Portable Operating System Interface)
- Maintaining compatibility between operating systems.
- Application Programming Interface
- Command line shells, utilities, etc…
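The socket function
#include <sys/socket.h>
int socket(int domain, int type, int protocol); // domain is also called family
// Common values
// domain: AF_INET, AF_INET6
// type: SOCK_STREAM, SOCK_DGRAM (datagram)
// protocol: IPPROTO_TCP, IPPROTO_UDP, or 0 for the default protocol of the type
// UDP
if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1) {
    die("Socket Error"); // die() is not a library call; it stands for the program's own error handler
}
// TCP
if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
    die("Socket Error");
}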
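The fragments above can be turned into something compilable along these lines. This is only a sketch: `open_ipv4_socket` and `socket_type` are hypothetical helpers, and `perror` stands in for the undefined `die()`.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: opens an IPv4 socket of the given type
 * (SOCK_STREAM or SOCK_DGRAM). Returns the descriptor, or -1. */
int open_ipv4_socket(int type)
{
    int fd = socket(AF_INET, type, 0);  /* 0 = default protocol for this type */
    if (fd == -1)
        perror("socket");
    return fd;
}

/* Asks the kernel which type a socket has; returns -1 on error. */
int socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == -1)
        return -1;
    return type;
}
```

A caller would use `open_ipv4_socket(SOCK_DGRAM)` for UDP and `open_ipv4_socket(SOCK_STREAM)` for TCP, and should `close()` each descriptor when done, since a socket is an ordinary file descriptor.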
Datatypes and Structs
Some data types:
Datatype | Description | Header |
---|---|---|
int8_t | Signed 8-bit integer | <sys/types.h> |
uint8_t | Unsigned 8-bit integer | <sys/types.h> |
int16_t | Signed 16-bit integer | <sys/types.h> |
uint16_t | Unsigned 16-bit integer | <sys/types.h> |
int32_t | Signed 32-bit integer | <sys/types.h> |
uint32_t | Unsigned 32-bit integer | <sys/types.h> |
sa_family_t | Address family of socket address structure | <sys/socket.h> |
socklen_t | Length of socket address structure, normally uint32_t | <sys/socket.h> |
in_addr_t | IPv4 address, normally uint32_t | <netinet/in.h> |
in_port_t | TCP or UDP port, normally uint16_t | <netinet/in.h> |
Datatypes in Memory
Byte 0 | Byte 1 | Byte 2 | Byte 3 |
---|---|---|---|
int8_t | | | |
int16_t (a) | int16_t (b) | | |
int32_t (a) | int32_t (b) | int32_t (c) | int32_t (d) |
int64_t (a) | int64_t (b) | int64_t (c) | int64_t (d) |
int64_t (e) | int64_t (f) | int64_t (g) | int64_t (h) |
Structs
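The order in which a struct's members are declared changes how the struct is laid out in memory, because the compiler inserts padding to keep each member aligned.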
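A small sketch of the effect of member order, using two made-up structs with identical members. Exact sizes are implementation-defined; on a typical ABI where `uint32_t` is 4-byte aligned, the first struct usually occupies 12 bytes and the second 8.

```c
#include <stddef.h>
#include <stdint.h>

struct interleaved {    /* small members separated by a large one */
    uint8_t  a;
    uint32_t b;         /* padding is usually inserted before b ... */
    uint8_t  c;         /* ... and after c */
};

struct grouped {        /* same members, largest first */
    uint32_t b;
    uint8_t  a;
    uint8_t  c;
};

size_t interleaved_size(void) { return sizeof(struct interleaved); }
size_t grouped_size(void)     { return sizeof(struct grouped); }

/* Offset of b inside struct interleaved; nonzero because a comes first. */
size_t interleaved_offset_of_b(void) { return offsetof(struct interleaved, b); }
```

This is why socket address structures are defined with fixed-width members in a fixed order: both ends of a call must agree on the layout.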
Socket Address struct
The POSIX structs are listed below:
sockaddr_in & in_addr
struct sockaddr_in {
uint8_t sin_len; /* length of structure (16); BSD-derived systems only, absent on Linux */
sa_family_t sin_family; /* address family: AF_INET */
in_port_t sin_port; /* port in network byte order */
struct in_addr sin_addr; /* internet address */
char sin_zero[8];/* unused */
};
/* Internet address. */
struct in_addr {
in_addr_t s_addr; /* address in network byte order */
};
sockaddr
struct sockaddr {
uint8_t sa_len; /* length of structure; BSD-derived systems only, absent on Linux */
sa_family_t sa_family; /* address family: AF_xxx value */
char sa_data[14]; /* protocol specific */
};
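To call bind(), a struct sockaddr_in * must be cast to struct sockaddr *, because bind() declares its address parameter as const struct sockaddr *. The prototype of bind() and the cast are shown below.
// bind prototype
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
// the cast
struct sockaddr_in serv;
bind(sockfd, (struct sockaddr *)&serv, sizeof(serv));
The cast works because sin_port, sin_addr and sin_zero in sockaddr_in (2 + 4 + 8 bytes) occupy the same 14 bytes as sa_data[14] in sockaddr.
A more complete example: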
//…
struct sockaddr_in address;
address.sin_family = AF_INET;
address.sin_addr.s_addr = htonl(INADDR_ANY); // INADDR_ANY is 0, but converting keeps the habit for other constants
address.sin_port = htons( PORT );
//…
if (bind(server_fd, (struct sockaddr *)&address, sizeof(address))<0) {
//…
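Filling in the elided parts, a self-contained version might look like the sketch below. The helper name `bind_any_tcp` is hypothetical, and zeroing the structure with `memset` first is an added precaution rather than something the fragment above shows.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: creates a TCP socket bound to INADDR_ANY:port.
 * Passing port 0 asks the kernel to pick a free port.
 * Returns the descriptor, or -1 on failure. */
int bind_any_tcp(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    struct sockaddr_in address;
    memset(&address, 0, sizeof address);        /* also clears sin_zero */
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = htonl(INADDR_ANY);
    address.sin_port = htons(port);             /* port in network byte order */

    /* bind() takes the generic struct sockaddr *, hence the cast. */
    if (bind(fd, (struct sockaddr *)&address, sizeof address) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would typically continue with `listen()` and `accept()` on the returned descriptor.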
There is also an IPv6 socket address structure.
struct sockaddr_in6 {
uint8_t sin6_len; /* length of structure (28) */
sa_family_t sin6_family; /* address family: AF_INET6 */
in_port_t sin6_port; /* port in network byte order */
uint32_t sin6_flowinfo; /* flow information, undefined */
struct in6_addr sin6_addr; /* IPv6 address */
uint32_t sin6_scope_id; /* set of interfaces for a scope */
};
/* Internet IPv6 address. */
struct in6_addr {
uint8_t s6_addr[16]; /* 128-bit IPv6 address, network byte order */
};
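Value-Result Arguments
When a socket address structure is passed to a socket function, it is always passed by reference (as a pointer).
- Three functions pass a socket address structure from the process to the kernel: bind, connect, sendto
  - Both the pointer and the size of what it points to are passed to the kernel, so the kernel knows exactly how much data to copy in from the process
- Four functions pass a socket address structure from the kernel to the process: accept, recvfrom, getsockname, getpeername
  - Two of the arguments to these functions are a pointer to the socket address structure and a pointer to an integer holding its size. The reason: when the function is called, the size is a value that tells the kernel how large the structure is, so the kernel does not write past its end; when the function returns, the size is a result that tells the process how much information the kernel actually stored in the structure.
  - This is what is called a value-result argument
- Other value-result arguments
  - The middle three arguments of select
  - The length argument of getsockopt
  - The msg_namelen and msg_controllen members of the msghdr used with recvmsg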
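The following sketch shows both directions with real calls: `bind` receives the length by value, while `getsockname` receives a pointer to the length and overwrites it. The function `ephemeral_port` and the use of `struct sockaddr_storage` as an oversized buffer are illustrative choices, not something the notes prescribe.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Binds a UDP socket to a kernel-chosen port and reads the port back.
 * Stores the length reported by the kernel in *len_out.
 * Returns the port in host byte order, or -1 on failure. */
int ephemeral_port(socklen_t *len_out)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd == -1)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);                   /* 0 = let the kernel choose */

    /* Process -> kernel: the length is passed by value. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1) {
        close(fd);
        return -1;
    }

    struct sockaddr_storage out;                /* deliberately larger than needed */
    socklen_t len = sizeof out;                 /* value: capacity of our buffer */
    if (getsockname(fd, (struct sockaddr *)&out, &len) == -1) {
        close(fd);
        return -1;
    }
    close(fd);

    *len_out = len;                             /* result: bytes the kernel stored */
    return ntohs(((struct sockaddr_in *)&out)->sin_port);
}
```

On entry `len` holds the buffer capacity; on return it holds the size of the address actually written, which for an IPv4 socket is expected to be `sizeof(struct sockaddr_in)`.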
Byte Manipulation Functions
#include <string.h>
/* ANSI C */
void *memset(void *dest, int c, size_t len); // initialization, e.g. for an int c: memset(&c, 0, sizeof(c));
void *memcpy(void *dest, const void *src, size_t nbytes); // copying
int memcmp(const void *ptr1, const void *ptr2, size_t nbytes); // comparing for equality
#include <strings.h>
/* Deprecated */ // the functions below are deprecated
void bzero(void *dest, size_t bytes);
void bcopy(const void *src, void *dst, size_t nbytes);
int bcmp(const void *ptr1, const void *ptr2, size_t nbytes);
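As an illustration of how the three ANSI C functions are typically used with socket address structures, consider the sketch below. `init_ipv4_endpoint` and `same_ipv4_address` are hypothetical helpers written for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Zeroes dst, then fills in family, port (already in network byte
 * order) and the 4 address bytes copied from ip. */
void init_ipv4_endpoint(struct sockaddr_in *dst,
                        const struct in_addr *ip, in_port_t port_net)
{
    memset(dst, 0, sizeof *dst);                        /* clear every byte */
    dst->sin_family = AF_INET;
    dst->sin_port = port_net;
    memcpy(&dst->sin_addr, ip, sizeof dst->sin_addr);   /* copy the address */
}

/* Returns 1 if both endpoints carry the same IPv4 address bytes. */
int same_ipv4_address(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return memcmp(&a->sin_addr, &b->sin_addr, sizeof a->sin_addr) == 0;
}
```

Comparing only `sin_addr` rather than the whole structure avoids false mismatches caused by differing ports.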
IP Address Conversion Functions
#include <arpa/inet.h>
// IPv4 only
/* Convert 'A.B.C.D' to 32-bit */ // converts a dotted-decimal address to the binary value used on the network; returns 0 if the string is invalid
int inet_aton(const char *strptr, struct in_addr *addptr);
/* Convert 'A.B.C.D' to 32-bit */ // same conversion, but returns the value directly (INADDR_NONE on error)
in_addr_t inet_addr(const char *strptr);
/* Convert 32-bit to 'A.B.C.D' */ // converts the binary network value back to a dotted-decimal string
char *inet_ntoa(struct in_addr inaddr);
// IPv4 and IPv6, recommended
/* Convert 'A.B.C.D' || 'A::D' to 32/128-bit */ // converts a presentation-format address string to the numeric format used on the network
int inet_pton(int family, const char *strptr, void *addptr);
/* Convert 32/128-bit to 'A.B.C.D' || 'A::D' */ // converts the numeric format back to a presentation-format string
const char *inet_ntop(int family, const void *addrptr, char *strptr, socklen_t len);
// p and n stand for presentation and numeric
Domain Name System - DNS
- Translate 'strings' (domain names) to L3 addresses (IP addresses)
Resolving
int inet_pton(int family, const char *strptr, void *addptr);
const char *inet_ntop(int family, const void *addrptr, char *strptr, socklen_t len);
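Note that `inet_pton` and `inet_ntop` only convert between text and binary forms of an address; they do not query DNS. The notes do not name a resolver call, so the sketch below assumes the standard POSIX `getaddrinfo` for the actual name lookup; `resolve_first_ipv4` is a hypothetical helper.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: resolves host (a domain name or a numeric
 * string) to its first IPv4 address and writes it as text into out.
 * Returns 0 on success, -1 on failure. */
int resolve_first_ipv4(const char *host, char *out, socklen_t out_len)
{
    struct addrinfo hints;
    struct addrinfo *results = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;          /* IPv4 results only */
    hints.ai_socktype = SOCK_STREAM;    /* avoid one entry per socket type */

    if (getaddrinfo(host, NULL, &hints, &results) != 0)
        return -1;

    const struct sockaddr_in *sin =
        (const struct sockaddr_in *)results->ai_addr;
    int ok = inet_ntop(AF_INET, &sin->sin_addr, out, out_len) != NULL;

    freeaddrinfo(results);              /* the result list is heap-allocated */
    return ok ? 0 : -1;
}
```

Passing a host name triggers a lookup through the system resolver, while a numeric string such as `127.0.0.1` is converted without any network traffic.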