Algorithms

Algorithm Design Techniques:

Reasoning; recursion; greedy; enumeration; divide and conquer; backtracking; dynamic programming; probabilistic methods

Sorting:

Internal sort: all the data fit in memory, and sorting happens entirely in memory.

Exchange sorts: Bubble sort, Quick sort
Selection sorts: Selection sort, Heap sort
Insertion sorts: Insertion sort, Shell sort
Merge sorts: Merge sort

External sort: the data are too big to fit in memory at once, so they are sorted chunk by chunk and the sorted runs are merged at the end.
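A minimal sketch of the external sort's merge phase in Python, assuming each sorted run already fits in memory as a list (a real system would stream the runs from disk):

```python
import heapq

# Each run stands in for a sorted chunk that was written out separately.
runs = [[1, 4, 9], [2, 3, 8], [5, 6, 7]]

# heapq.merge lazily k-way-merges already-sorted inputs in O(N log k).
merged = list(heapq.merge(*runs))
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```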

(1) Linear List:

Two implementations: the sequential (array-based) list and the linked list.

(2) Queue:

A kind of linear list (first in, first out).

(3) Stack:

A kind of linear list (last in, first out); searching it takes O(n).

(4) Hash:

Searching an unsorted array by quicksorting it first and then binary-searching costs O(N log N + log N); an already-sorted array supports binary search in O(log N); a hash table reaches O(1). Keys should be dispersed as evenly as possible, and the hash function should be kept as simple as possible.

Five hashing approaches:

1. Direct addressing: Y = X + A.

2. Division (modulo): Y = X % A.

3. Digit analysis: e.g., for value1=112233, value2=112633, value3=119033, the middle two digits vary while the other digits stay constant, so take the keys key1=22, key2=26, key3=90.

4. Mid-square: square the value and take its middle digits as the key.

5. Folding: e.g., for value=135790 and a required 2-digit key, split the value into 13+57+90=160, then drop the high-order "1", giving key=60. The point is that the key depends on every digit of the value, so the hash addresses are dispersed as widely as possible.

Resolving hash collisions:

1. Open addressing: slots of the array not yet in use are opened up to store colliding keys.

O(n/3+length)
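A minimal open-addressing sketch in Python, assuming the division method (approach 2 above) as the hash function and linear probing as the probe sequence; the class name and capacity are illustrative choices, and deletion and resizing are omitted:

```python
class OpenAddressingHash:
    """Hash table using the division method and linear probing."""

    def __init__(self, capacity=11):
        self.slots = [None] * capacity  # None marks an unused, open slot

    def _probe(self, key):
        # Division method: h(x) = x % capacity, then probe linearly.
        i = key % len(self.slots)
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) % len(self.slots)  # step to the next slot
        return i

    def insert(self, key):
        self.slots[self._probe(key)] = key

    def contains(self, key):
        return self.slots[self._probe(key)] == key

t = OpenAddressingHash()
for k in (7, 18, 29):  # all hash to 7 mod 11, so they collide
    t.insert(k)
print(t.contains(18), t.contains(30))  # True False
```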

(5) Binary Search Tree

Insertion, deletion, and search: O(log N) (assuming the tree stays balanced).

(6) Bitmap

Fast lookup, duplicate detection, and deletion over large data sets; deduplication also compresses the data. A typical use is URL deduplication in a crawler system. Complexity grows when the data have a high duplication rate.

1 Byte = 8 bits, so one byte can represent 8 numbers; one int occupies 4 bytes (32 bits):

a[0] covers 0-31; a[1] covers 32-63; a[2] covers 64-95.

If a number is present, its bit is set to 1; otherwise it is 0. Declare int a[1 + MAX/32], where MAX is the largest number to be stored.
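A minimal Python sketch of the bitmap, using a bytearray (8 numbers per byte) instead of the int a[1 + MAX/32] array; the same idea with a different word size:

```python
class Bitmap:
    """One bit per integer in [0, max_value]; 8 numbers per byte."""

    def __init__(self, max_value):
        self.bits = bytearray(max_value // 8 + 1)

    def add(self, n):
        self.bits[n // 8] |= 1 << (n % 8)    # set bit n to 1

    def contains(self, n):
        return bool(self.bits[n // 8] & (1 << (n % 8)))

seen = Bitmap(95)
for n in (0, 31, 64, 64):   # the duplicate 64 is absorbed
    seen.add(n)
print(seen.contains(31), seen.contains(32))  # True False
```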

Tree Structures:

(1) Binary Search Tree:

(2) Balanced Search Tree:

The heights of the left and right subtrees of every node differ by at most 1.

Rebalancing on imbalance: rotations, applied after insertions and deletions.

Fast lookups.

(3) Treap Tree:

Each node carries two values: a key and a priority.

Properties:
The keys satisfy the binary-search-tree ordering;
the priorities satisfy a min-heap.

(4) Splay Tree:

An AVL variant: the tree adjusts its own structure through rotations and splaying, moving accessed nodes toward the root.

(5) Trie Tree:

1. The root node contains no character; every other node contains exactly one character.

2. Concatenating the characters along the path from the root to a node yields the string that node represents.
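A minimal Python sketch of a trie obeying the two properties above; the is_word end-of-string flag is an implementation detail added here, not part of the notes:

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # char -> TrieNode (one character per node)
        self.is_word = False  # marks that a stored string ends here

class Trie:
    def __init__(self):
        self.root = TrieNode()  # the root holds no character

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.is_word

t = Trie()
t.insert("tree")
print(t.search("tree"), t.search("tr"))  # True False
```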

Graph:

1. Basics of Graph:

1.1 Representation of Graphs

Edge list

Adjacency lists: for sparse graphs, where |E| is much smaller than |V|*|V| (i.e., |E| is close to |V|).

Adjacency matrix: for dense graphs, where |E| is close to |V|*|V|.

To test quickly whether two vertices are joined by an edge, the adjacency matrix is the better representation, as the sketch below shows.
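A small sketch contrasting the representations on a 4-vertex undirected graph (the example data are mine):

```python
# The same 4-vertex graph in the three representations above.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]          # edge list
n = 4

adj_list = [[] for _ in range(n)]                  # adjacency lists
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)                          # undirected: both ways

adj_matrix = [[0] * n for _ in range(n)]           # adjacency matrix
for u, v in edges:
    adj_matrix[u][v] = adj_matrix[v][u] = 1

print(adj_list[2])        # neighbors of 2: [0, 1, 3]
print(adj_matrix[1][3])   # O(1) edge test: 0 (no edge 1-3)
```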

1.2 BFS

Prim and Dijkstra algorithms use ideas similar to BFS.

BFS(G,s)
  for each vertex u belongs to G.V - {s}
      u.color = WHITE
      u.d = infinity
      u.t = NIL
  s.color = GRAY
  s.d = 0
  s.t = NIL 
  Q = Empty
  ENQUEUE(Q,s) 
  while Q != Empty
      u = DEQUEUE(Q) 
      for each v belongs to G.Adj[u] 
         if  v.color == WHITE  
            v.color = GRAY
            v.d = u.d + 1 
            v.t = u 
            ENQUEUE(Q,v)
      u.color = BLACK

Time complexity: O(V+E)
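A runnable Python translation of the pseudocode, assuming the adjacency-list (dict) representation from above; dist plays the role of u.d, parent plays u.t, and membership in dist replaces the WHITE/GRAY coloring:

```python
from collections import deque

def bfs(adj, s):
    """Return (dist, parent) for every vertex reachable from s."""
    dist = {s: 0}
    parent = {s: None}
    q = deque([s])                 # the GRAY frontier
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:      # WHITE: not yet discovered
                dist[v] = dist[u] + 1
                parent[v] = u
                q.append(v)
    return dist, parent

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(bfs(adj, 0)[0])  # {0: 0, 1: 1, 2: 1, 3: 2}
```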

1.3 DFS

DFS(G)
  for each vertex u belongs to G.V 
    u.color = WHITE 
    u.t = NIL
  time = 0
  for each vertex u belongs to G.V
    if u.color == WHITE 
       DFS-VISIT(G,u)

DFS-VISIT(G,u)
  time = time + 1 // white vertex u has just been discovered
  u.d = time
  u.color = GRAY
  for each v belongs to G.Adj[u] // explore edge (u,v)
     if v.color == WHITE  
        v.t = u
        DFS-VISIT(G,v)
  u.color = BLACK
  time = time + 1 
  u.f = time // blacken u; it is finished

Time complexity: O(V+E)

1.4 Topological Sort

Use DFS to perform a topological sort of a directed acyclic graph (DAG).

TOPOLOGICAL-SORT(G)
1. call DFS(G) to compute finishing times v.f for each vertex v 
2. as each vertex is finished, insert it onto the front of a linked list
3. return  the linked list of vertices

O(V+E)
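A runnable sketch of the three steps, with order.insert(0, u) playing the role of inserting each finished vertex at the front of the linked list (the example edges are mine):

```python
def topological_sort(adj):
    """Vertices of a DAG in topological order, via DFS finish times."""
    vertices = set(adj) | {v for vs in adj.values() for v in vs}
    order, visited = [], set()

    def dfs_visit(u):
        visited.add(u)
        for v in adj.get(u, []):
            if v not in visited:
                dfs_visit(v)
        order.insert(0, u)   # finished: insert at the front of the list

    for u in sorted(vertices):
        if u not in visited:
            dfs_visit(u)
    return order

adj = {"shirt": ["tie"], "tie": ["jacket"], "pants": ["shoes"]}
print(topological_sort(adj))  # ['shirt', 'tie', 'pants', 'shoes', 'jacket']
```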

1.5 Strongly Connected Components:

Vertices u and v such that u -> v and v -> u; that is, u and v are reachable from each other.

STRONGLY-CONNECTED-COMPONENTS(G)
1. call DFS(G) to compute finishing times u.f for each vertex u
2. compute G^T, the transpose of G
3. call DFS(G^T), but in the main loop of DFS, consider the vertices
   in order of decreasing u.f (as computed in line 1)
4. output the vertices of each tree in the depth-first forest formed in line 3 as a
   separate strongly connected component
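A compact runnable version of the same procedure (Kosaraju's algorithm), with the transpose built explicitly; the example graph is mine:

```python
def scc(adj):
    """Strongly connected components of a directed graph (Kosaraju)."""
    vertices = set(adj) | {v for vs in adj.values() for v in vs}

    def dfs(graph, u, seen, out):
        seen.add(u)
        for v in graph.get(u, []):
            if v not in seen:
                dfs(graph, v, seen, out)
        out.append(u)                    # record u at its finish time

    finish, seen = [], set()             # line 1: DFS(G) for finish times
    for u in sorted(vertices):
        if u not in seen:
            dfs(adj, u, seen, finish)

    transpose = {}                       # line 2: compute G^T
    for u in adj:
        for v in adj[u]:
            transpose.setdefault(v, []).append(u)

    comps, seen = [], set()              # line 3: DFS(G^T), decreasing u.f
    for u in reversed(finish):
        if u not in seen:
            comp = []
            dfs(transpose, u, seen, comp)
            comps.append(comp)           # line 4: one tree = one SCC
    return comps

adj = {1: [2], 2: [3], 3: [1], 4: [2]}
print(scc(adj))  # [[4], [2, 3, 1]]
```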

2. Minimum Spanning Tree:

Kruskal's Algorithm:

Sort the edges by weight. Use Find to check whether an edge would create a cycle; if not, Union its two endpoints.

MST-KRUSKAL(G, w)
  A = Empty
  for each vertex v belongs to G.V
     MAKE-SET(v)
  sort the edges of G.E into nondecreasing order by weight w

  for each edge (u,v) belongs to G.E, taken in nondecreasing order by weight
     if FIND-SET(u) != FIND-SET(v)
        A = A U {(u,v)}
        UNION(u,v)
  return A
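A runnable sketch with a bare-bones union-find standing in for MAKE-SET / FIND-SET / UNION (no union-by-rank or path compression):

```python
def kruskal(n, edges):
    """MST edges of a connected graph with vertices 0..n-1.
    edges: list of (weight, u, v)."""
    parent = list(range(n))             # MAKE-SET for every vertex

    def find(x):                        # FIND-SET
        while parent[x] != x:
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):       # nondecreasing order by weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # no cycle: take the edge
            parent[ru] = rv             # UNION
            mst.append((u, v, w))
    return mst

edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (3, 2, 3)]
print(kruskal(4, edges))  # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
```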

Prim's Algorithm:
Similar to Dijkstra: each step adds the edge that increases the tree's total weight the least.

MST-PRIM(G, w, r)
  for each u belongs to G.V
    u.key = infinity
    u.t = NIL

  r.key = 0
  Q = G.V
  while Q != Empty
    u = EXTRACT-MIN(Q)
    for each v belongs to G.Adj[u]
      if v belongs to Q and w(u,v) < v.key
        v.t = u
        v.key = w(u,v)
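A runnable sketch that replaces the EXTRACT-MIN / decrease-key queue with a lazy heapq that skips stale entries, a common Python idiom:

```python
import heapq

def prim(adj, r):
    """MST edges of a connected weighted graph.
    adj: {u: [(v, w), ...]}, with each undirected edge listed both ways."""
    in_tree, mst = {r}, []
    heap = [(w, r, v) for v, w in adj[r]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)   # lightest edge leaving the tree
        if v in in_tree:
            continue                    # stale entry: v already added
        in_tree.add(v)
        mst.append((u, v, w))
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return mst

adj = {0: [(1, 1), (2, 4)], 1: [(0, 1), (2, 2)],
       2: [(0, 4), (1, 2), (3, 3)], 3: [(2, 3)]}
print(prim(adj, 0))  # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
```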

3. Shortest Paths

Dijkstra's Algorithm:
Solves the single-source shortest-path problem in a weighted directed graph; it requires all edge weights to be non-negative.

INITIALIZE-SINGLE-SOURCE(G,s)
  for each vertex v belongs to G.V
     v.d = infinity // record the upper bound of the shortest path weight from source vertex s to vertex v. 
     v.t = NIL
  s.d = 0 
// Time complexity: O(V)

RELAX(u,v,w)
  if v.d > u.d + w(u,v)
    v.d = u.d + w(u,v)
    v.t = u

DIJKSTRA(G,w,s)
  INITIALIZE-SINGLE-SOURCE(G,s)
  S = Empty
  Q = G.V
  while Q != Empty
     u = EXTRACT-MIN(Q)
     S = S U {u}
     for each vertex v belongs to G.Adj[u]
       RELAX(u,v,w)
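A runnable sketch, again using a lazy heapq in place of EXTRACT-MIN and decrease-key:

```python
import heapq

def dijkstra(adj, s):
    """Shortest-path distances from s; adj: {u: [(v, w), ...]}, w >= 0."""
    dist = {s: 0}
    heap = [(0, s)]
    done = set()                        # the set S of finished vertices
    while heap:
        d, u = heapq.heappop(heap)      # EXTRACT-MIN
        if u in done:
            continue                    # stale entry
        done.add(u)
        for v, w in adj.get(u, []):
            if v not in dist or d + w < dist[v]:   # RELAX(u, v, w)
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

adj = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("c", 6)],
       "b": [("c", 3)], "c": []}
print(dijkstra(adj, "s"))  # {'s': 0, 'a': 1, 'b': 3, 'c': 6}
```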

Bidirectional Dijkstra:
Search forward from the source and backward from the target at the same time; this can speed the search up by up to a factor of two.

Bellman-Ford:
Edge weights may be negative; otherwise the structure closely follows Dijkstra's algorithm, relaxing edges repeatedly.
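A minimal sketch: relax every edge |V|-1 times using the RELAX rule above (the extra pass that detects negative cycles is omitted):

```python
def bellman_ford(vertices, edges, s):
    """Shortest paths from s; edges: list of (u, v, w), w may be negative."""
    INF = float("inf")
    dist = {v: INF for v in vertices}
    dist[s] = 0
    for _ in range(len(vertices) - 1):  # |V|-1 rounds of relaxation
        for u, v, w in edges:
            if dist[u] + w < dist[v]:   # RELAX(u, v, w)
                dist[v] = dist[u] + w
    return dist

edges = [("s", "a", 4), ("s", "b", 5), ("a", "c", -3), ("b", "c", -1)]
print(bellman_ford({"s", "a", "b", "c"}, edges, "s"))
# {'s': 0, 'a': 4, 'b': 5, 'c': 1}
```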

A-Star Algorithm:
- Directed (goal-oriented) search can scan fewer vertices.

- A* is a directed search algorithm based on Dijkstra and potential functions.

- A* can also be run bidirectionally.

- Euclidean distance is a valid potential in the plane (e.g., road networks).

- Landmarks can be used to build good potential functions, but they require preprocessing.

Pattern matching

1 Herding Patterns into a Trie:

Build a trie from the patterns and match it against the text at each position.

O(|Text| * |LongestPattern|)

Brute-force approach: O(|Text| * |Patterns|)

Implementation (Aho-Corasick automaton):
1. build the trie; 2. add fail pointers; 3. run the pattern matching (a sketch follows).
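A compact runnable sketch of the three steps, assuming dict-based trie nodes; the node layout and names are illustrative:

```python
from collections import deque

def aho_corasick(text, patterns):
    """Report all (start_index, pattern) occurrences of patterns in text."""
    # 1. Build the trie: each node has children, a fail link, and outputs.
    trie = [{"next": {}, "fail": 0, "out": []}]
    for p in patterns:
        node = 0
        for ch in p:
            if ch not in trie[node]["next"]:
                trie.append({"next": {}, "fail": 0, "out": []})
                trie[node]["next"][ch] = len(trie) - 1
            node = trie[node]["next"][ch]
        trie[node]["out"].append(p)

    # 2. Fail pointers by BFS: longest proper suffix also present in the trie.
    q = deque(trie[0]["next"].values())
    while q:
        u = q.popleft()
        for ch, v in trie[u]["next"].items():
            f = trie[u]["fail"]
            while f and ch not in trie[f]["next"]:
                f = trie[f]["fail"]
            trie[v]["fail"] = trie[f]["next"].get(ch, 0)
            trie[v]["out"] += trie[trie[v]["fail"]]["out"]
            q.append(v)

    # 3. Pattern matching: follow goto edges, fall back along fail links.
    hits, node = [], 0
    for i, ch in enumerate(text):
        while node and ch not in trie[node]["next"]:
            node = trie[node]["fail"]
        node = trie[node]["next"].get(ch, 0)
        for p in trie[node]["out"]:
            hits.append((i - len(p) + 1, p))
    return hits

print(aho_corasick("ushers", ["he", "she", "his", "hers"]))
# [(1, 'she'), (2, 'he'), (2, 'hers')]
```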

2 Herding Text into a Suffix Trie:

Build a suffix trie from the text instead of from the patterns.

3 Suffix Trees

Compress non-branching paths of the suffix trie into single edges.

4 Burrows-Wheeler Transform

Used in data compression.

5 Suffix array

6 KMP algorithm
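Since the notes list KMP without detail, here is a minimal sketch in the prefix-function formulation:

```python
def kmp_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text."""
    # Prefix function: pi[i] = length of the longest proper prefix of
    # pattern[:i+1] that is also a suffix of it.
    pi = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = pi[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        pi[i] = k

    hits, k = [], 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = pi[k - 1]          # fall back: reuse the matched prefix
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):      # full match ending at position i
            hits.append(i - k + 1)
            k = pi[k - 1]
    return hits

print(kmp_search("abababca", "abab"))  # [0, 2]
```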
