Competitive Programming

Yung-Sheng Lu

FEB 09, 2015

@NCKU-CSIE

Lecture 4

Outline

Sets
Disjoint Sets
Mathematics Basics

Sets

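Set: only sets of integers are considered here. A set has the following properties.

A set may be empty (the empty set).
The elements of a set do not repeat.

[Figure: example collections labelled "Set", "Not Set", and "Set".]

Set: the following access methods are commonly used.

Sequential Access
Indexed Access
Hash Table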

Sets (cont.)

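Sequential Access

Array:
- Use a one-dimensional array to record all elements of the set, plus one variable that records the number of elements.

Data Structure of Sets

[Figure: an array holding the elements of a set; number of elements: 5.]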

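/* Using struct */
typedef struct set {
    int array[1000];  /* elements are stored in array[0 .. num - 1] */
    int num;          /* number of elements currently in the set */
} Set;
 
Set set1, set2;


/* Using vector */
#include <vector>

std::vector<int> set1(1000);
std::vector<int> set2(1000);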
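The array layout above can be paired with two basic operations. The following is a minimal sketch, assuming a growable std::vector instead of a fixed-size array; the names ArraySet, contains, and insert are hypothetical and do not appear on the slides.

```cpp
#include <vector>

// Hypothetical wrapper: the slides only describe the storage layout.
struct ArraySet {
    std::vector<int> items;  // elements in insertion order, no duplicates
};

// Sequential access: scan every element, O(n) per lookup.
bool contains(const ArraySet& s, int x) {
    for (std::vector<int>::size_type i = 0; i < s.items.size(); ++i)
        if (s.items[i] == x) return true;
    return false;
}

// Insert only when x is absent, so the elements never repeat.
// Returns true if x was actually added.
bool insert(ArraySet& s, int x) {
    if (contains(s, x)) return false;
    s.items.push_back(x);
    return true;
}
```

Every insertion pays for a full scan, which is the cost that the indexed and hashed representations later in the lecture try to avoid.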

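Sequential Access

Array:
- Operations such as union, intersection, and difference are rather cumbersome to implement by hand.
  - The STL functions set_union(), set_intersection(), set_difference(), and set_symmetric_difference() in <algorithm> can be used directly. They require sorted input ranges.

Data Structure of Sets (cont.)

[Figure: the union of arrays A and B, annotated "Check whether the element already exists?".]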

Sequential Access

Array:
- Example - set_union()

Data Structure of Sets (cont.)

/* Using vector */
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int main(void) {
    int first[] = {5,10,15,20,25};
    int second[] = {50,40,30,20,10};
    vector<int> v(10);
    vector<int>::iterator it;
    
    sort(first, first + 5);     //  5 10 15 20 25
    sort(second, second + 5);   // 10 20 30 40 50
    
    
    it = set_union(first, first + 5, second, second + 5, 
                   v.begin());    
    // 5 10 15 20 25 30 40 50  0  0

    v.resize(it - v.begin());                      
    // 5 10 15 20 25 30 40 50
    
    // Output result
    for (it = v.begin(); it != v.end(); ++it)
      cout << ' ' << *it;
    cout << endl;

    return 0;
}

Sequential Access

Array:
- Example - set_intersection()

Data Structure of Sets (cont.)

/* Using vector */
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int main(void) {
    int first[] = {5,10,15,20,25};
    int second[] = {50,40,30,20,10};
    vector<int> v(10);
    vector<int>::iterator it;
    
    sort(first, first + 5);     //  5 10 15 20 25
    sort(second, second + 5);   // 10 20 30 40 50
    
    
    it = set_intersection(first, first + 5, second, second + 5, 
                   v.begin());    
    // 10 20 0 0 0 0 0 0 0 0

    v.resize(it - v.begin());                      
    // 10 20
    
    // Output result
    for (it = v.begin(); it != v.end(); ++it)
      cout << ' ' << *it;
    cout << endl;

    return 0;
}

Sequential Access

Array:
- Example - set_difference()

Data Structure of Sets (cont.)

/* Using vector */
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int main(void) {
    int first[] = {5,10,15,20,25};
    int second[] = {50,40,30,20,10};
    vector<int> v(10);
    vector<int>::iterator it;
    
    sort(first, first + 5);     //  5 10 15 20 25
    sort(second, second + 5);   // 10 20 30 40 50
    
    it = set_difference(first, first + 5, second, second + 5, 
                   v.begin());    
    // 5 15 25 0 0 0 0 0 0 0

    v.resize(it - v.begin());                      
    // 5 15 25
    
    // Output result
    for (it = v.begin(); it != v.end(); ++it)
      cout << ' ' << *it;
    cout << endl;

    return 0;
}
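The slides list set_symmetric_difference() but give no example for it. The sketch below wraps it in a hypothetical helper, symmetric_difference, and uses std::back_inserter so the output vector does not need to be pre-sized and resized afterwards.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Returns the elements that belong to exactly one of the two inputs.
// The inputs are taken by value so they can be sorted locally,
// because the STL set algorithms require sorted ranges.
std::vector<int> symmetric_difference(std::vector<int> a, std::vector<int> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<int> out;
    std::set_symmetric_difference(a.begin(), a.end(),
                                  b.begin(), b.end(),
                                  std::back_inserter(out));
    return out;
}
```

With the same inputs as the examples above, {5, 10, 15, 20, 25} and {50, 40, 30, 20, 10}, the result is 5 15 25 30 40 50.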

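Sequential Access

Linked List:
- The principle is exactly the same as with an array.
- An array stores the numbers one after another in adjacent slots, while a linked list chains the numbers together node by node.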

Data Structure of Sets (cont.)

/* Using linked list */
typedef struct node {
    int data;
    struct node *next;
} Node;

Node *set = new Node();
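As a rough illustration of how the node type above can hold a set, the following sketch adds a lookup and a duplicate-free insertion. The function names contains and insert are hypothetical, and memory is never freed here to keep the example short.

```cpp
#include <cstddef>

struct Node {
    int data;
    Node* next;
};

// Walk the chain from the head: O(n) per lookup, as with the array.
bool contains(const Node* head, int x) {
    for (const Node* p = head; p != NULL; p = p->next)
        if (p->data == x) return true;
    return false;
}

// Prepend x if it is absent and return the (possibly new) head.
Node* insert(Node* head, int x) {
    if (contains(head, x)) return head;
    Node* n = new Node;
    n->data = x;
    n->next = head;
    return n;
}
```

Unlike the array version, inserting at the head never has to move or reallocate existing elements.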

Sequential Access

Binary Search Tree:
- Store all elements of the set in a binary search tree; the concept is similar to a linked list.

Data Structure of Sets (cont.)

/* Using binary search tree */
class BinarySearchTree {
private:
    typedef struct treeNode {
       struct treeNode *left;
       struct treeNode *right;
       int key;
    } TreeNode;
    TreeNode *root;

public:
    BinarySearchTree() {
       root = NULL;
    }
    bool isEmpty() const { 
        return root == NULL; 
    }
    // Traversals take the subtree root to start from
    void inorder(TreeNode*);
    void preorder(TreeNode*);
    void postorder(TreeNode*);
    void insert(int);
    void remove(int);
};
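The class above only declares insert(). A minimal sketch of insertion and lookup follows, written as free functions on a standalone TreeNode rather than as the class's member functions; that simplification and the function names are assumptions made for illustration. The tree is not rebalanced.

```cpp
#include <cstddef>

struct TreeNode {
    int key;
    TreeNode* left;
    TreeNode* right;
    explicit TreeNode(int k) : key(k), left(NULL), right(NULL) {}
};

// Insert key if absent and return the (possibly new) subtree root.
TreeNode* insert(TreeNode* root, int key) {
    if (root == NULL) return new TreeNode(key);
    if (key < root->key)
        root->left = insert(root->left, key);
    else if (key > root->key)
        root->right = insert(root->right, key);
    return root;  // an equal key is ignored, so the set has no duplicates
}

// Descend left or right by comparing keys: O(height) per lookup.
bool contains(const TreeNode* root, int key) {
    while (root != NULL) {
        if (key == root->key) return true;
        root = (key < root->key) ? root->left : root->right;
    }
    return false;
}
```

The STL container std::set offers the same interface on top of a balanced tree, which avoids the worst case of a degenerate, list-like tree.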

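Indexed Access

Array: use a bool array.
- If the set contains the element x, set array[x] to true; otherwise set it to false.
- The range of storable values is limited by the array size.
- Operations such as union, intersection, and difference are comparatively fast.
  - Time complexity: O(n), where n is the array size.

Data Structure of Sets (cont.)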

Indexed Access

Array: use a bool array.

Data Structure of Sets (cont.)

bool set[1000];

// Determine whether the set is empty 
bool empty(bool a[1000]) {
    for (int i = 0; i < 1000; ++i) {
        if (a[i])
            return false;
    }
    return true;
}
 
// Add element
void add_element(bool a[1000], int element) {
    a[element] = true;
}
 
// Remove element
void remove_element(bool a[1000], int element) {
    a[element] = false;
}

// "union" is a reserved word in C++, so the function needs another name
void union_sets(bool a[1000], bool b[1000], bool c[1000]) {
    for (int i = 0; i < 1000; ++i) 
        c[i] = a[i] || b[i];
}

void intersection(bool a[1000], bool b[1000], bool c[1000]) {
    for (int i = 0; i < 1000; ++i) 
        c[i] = a[i] && b[i];
}

void difference(bool a[1000], bool b[1000], bool c[1000]) {
    for (int i = 0; i < 1000; ++i) 
        c[i] = a[i] && !b[i];
}

void complement(bool a[1000], bool b[1000]) {
    for (int i = 0; i < 1000; ++i) 
        b[i] = !a[i];
}

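Indexed Access

Bit Array (Bitset): use bits instead of bool variables.
- Each bit takes only the values 0 and 1, which can indicate whether a set element is present.
- This saves storage space and computation time.
- An integer variable typically occupies 32 bits, so it can serve as a set of 32 numbers.

Data Structure of Sets (cont.)

[Figure: the integer 390 written in binary as 00000000000000000000000110000110, next to the bit pattern 00000000000000000000000000000101.]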

Indexed Access

Bit Array (Bitset): use bits instead of bool variables.
- The bitset class in the STL header <bitset> can be used directly.

Data Structure of Sets (cont.)

// Bitset holds the elements 0 to 3199 (100 words of 32 bits each)
typedef unsigned int Bitset[100];

// Get the index of the word that holds the element
int get_pos(int element) {
    return element >> 5;
}
 
// Get the mask that selects the element's bit within its word
unsigned int get_bit(int element) {
    return 1u << (element & 31);
}
 
// Add element
void add_element(Bitset a, int element) {
    a[get_pos(element)] |= get_bit(element);
}
 
// Remove element
void delete_element(Bitset a, int element) {
    a[get_pos(element)] &= ~get_bit(element);
}
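The hand-written bit array above has no membership test and no set operations. The sketch below adds both under the same layout (100 words of 32 bits, assuming unsigned int is 32 bits wide); the helper names word_index, bit_mask, contains, and union_sets are hypothetical.

```cpp
typedef unsigned int Bitset[100];  // elements 0..3199, 32 per word

inline int word_index(int element) { return element >> 5; }
inline unsigned int bit_mask(int element) { return 1u << (element & 31); }

void add_element(Bitset a, int e)    { a[word_index(e)] |= bit_mask(e); }
void delete_element(Bitset a, int e) { a[word_index(e)] &= ~bit_mask(e); }

// Membership test: check whether the element's bit is set.
bool contains(const Bitset a, int e) {
    return (a[word_index(e)] & bit_mask(e)) != 0;
}

// Union handles 32 elements per step instead of one bool at a time.
void union_sets(const Bitset a, const Bitset b, Bitset c) {
    for (int i = 0; i < 100; ++i)
        c[i] = a[i] | b[i];
}
```

Adding the elements 1, 2, 7, and 8 to an empty Bitset leaves the first word equal to 390, which is binary 110000110. Intersection and difference follow the same pattern with a[i] & b[i] and a[i] & ~b[i].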

Indexed Access

Bit Array (Bitset): use bits instead of bool variables.
- Example - bitset.

Data Structure of Sets (cont.)

#include <iostream>
#include <string>
#include <bitset>

using namespace std;

int main(void) {
    bitset<16> foo;
    bitset<16> bar(0xfa2);
    bitset<16> baz(string("0101111001"));

    cout << "foo: " << foo << '\n';
    cout << "bar: " << bar << '\n';
    cout << "baz: " << baz << '\n';

    return 0;
}
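The example above only constructs and prints bitsets. As a sketch of how std::bitset can act as a set, the helpers below map the usual set operations onto its bitwise operators; the names Set16, unite, intersect, subtract, and contains are hypothetical.

```cpp
#include <bitset>

typedef std::bitset<16> Set16;  // bit i set means element i is in the set

// Union, intersection, and difference are single bitwise operations.
Set16 unite(const Set16& a, const Set16& b)     { return a | b; }
Set16 intersect(const Set16& a, const Set16& b) { return a & b; }
Set16 subtract(const Set16& a, const Set16& b)  { return a & ~b; }

// test() throws std::out_of_range if element is not in 0..15.
bool contains(const Set16& s, int element)      { return s.test(element); }
```

For instance, Set16(0xfa2) holds the elements 1, 5, 7, 8, 9, 10, and 11, and its count() member returns the set size, 7.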

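Hash Table

Hash Function: not a data structure.
- Re-expresses a piece of data as a number, called the hash value.
- From the database point of view, data is indexed to ease management.
- From the cryptography point of view, data is encoded for concealment.
- Ideally, identical data have identical hash values and distinct data have distinct hash values, so the hash value alone can tell data apart.

Data Structure of Sets (cont.)

[Figure: a record (no: 1, name: H, m: 1.008) mapped by a hash function to the hash value 1210493772981.]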

Hash Table

Hash Function: not a data structure.
- The hash class template in the STL header <functional> can be used directly.

Data Structure of Sets (cont.)

#include <iostream>
#include <functional>
#include <string>

using namespace std;

int main(void) {
    char nts1[] = "Test";
    char nts2[] = "Test";
    string str1(nts1);
    string str2(nts2);

    hash<char*> ptr_hash;
    hash<string> str_hash;

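    cout << "same hashes:\n" << boolalpha;
    // hash<char*> hashes the pointer value, not the characters it points to
    cout << "nts1 and nts2: " << (ptr_hash(nts1) == ptr_hash(nts2)) << endl;
    // hash<string> hashes the contents, so equal strings give equal hash values
    cout << "str1 and str2: " << (str_hash(str1) == str_hash(str2)) << endl;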

    return 0;
}

Hash Table

Hashing: not a data structure.
- Apply a hash function to a piece of data to obtain its hash value, use that value as an array index, and store the data in the array.
- When designing the hash function, make sure the hash value never falls outside the array bounds.

Data Structure of Sets (cont.)

[Figure: four records (no: 1, name: H, m: 1.008; no: 3, name: Li, m: 6.941; no: 11, name: Na, m: 22.99; no: 19, name: K, m: 39.1) placed into array slots according to their hash values.]

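Hash Table

Hashing: not a data structure.
- Data with the same hash value are stored in the same array slot. There are three ways to handle this:
  - Turn every array element into a linked list and chain the data together.
  - Move to the next slot; if that slot is already in use, move on to the one after it.
  - Let the new data simply overwrite the old data.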
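The first option, one linked list per array slot, can be sketched as follows. The class name ChainedHashSet, the default of 10 buckets, and the modulo-based index function are assumptions made for illustration.

```cpp
#include <cstddef>
#include <list>
#include <vector>

class ChainedHashSet {
public:
    explicit ChainedHashSet(int buckets = 10) : table_(buckets) {}

    // Only the chain of one bucket has to be scanned.
    bool contains(int x) const {
        const std::list<int>& chain = table_[index(x)];
        for (std::list<int>::const_iterator it = chain.begin();
             it != chain.end(); ++it)
            if (*it == x) return true;
        return false;
    }

    // Colliding values are appended to the same chain.
    bool insert(int x) {
        if (contains(x)) return false;
        table_[index(x)].push_back(x);
        return true;
    }

    std::size_t chain_length(int bucket) const { return table_[bucket].size(); }

private:
    // Maps any int, including negatives, into [0, bucket count).
    int index(int x) const {
        int m = static_cast<int>(table_.size());
        return ((x % m) + m) % m;
    }

    std::vector<std::list<int> > table_;
};
```

With 10 buckets, the values 1 and 11 collide and end up in the same chain, yet both remain retrievable, which the overwrite strategy would not guarantee.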

Data Structure of Sets (cont.)


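Hash Table

Hash Table:
- When the value range of the elements is very large, or the elements are not even integers, a hash function can produce an index that stays within the array bounds.
  - When the value range is small, indexed access is recommended: it saves time but costs space.
  - When the value range is large, sequential access is recommended: it saves space but costs time.
  - A hash table combines both and sits in between.

Data Structure of Sets (cont.)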

int hash(int n) { return n % 10; }
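One caveat with n % 10: since C++11 the remainder takes the sign of n, so a negative element such as -5 yields -5, which is not a valid array index. A possible adjustment is sketched below; the name hash_index and the buckets parameter are hypothetical.

```cpp
// Maps n into [0, buckets), also for negative n.
int hash_index(int n, int buckets = 10) {
    int r = n % buckets;             // r has the sign of n (C++11 and later)
    return r < 0 ? r + buckets : r;  // shift negative remainders into range
}
```

With this version, hash_index(-5) is 5 and hash_index(27) is 7.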

Hash Table

Hash Table:
- The STL containers unordered_set and unordered_multiset can be used directly.

Data Structure of Sets (cont.)

#include <iostream>
#include <string>
#include <unordered_set>

using namespace std;

typedef unordered_set<string> StringSet;

int main(void) {
    StringSet myset;

    StringSet::hasher hash = myset.hash_function();

    cout << "that: " << hash("that") << endl;
    cout << "than: " << hash("than") << endl;

    return 0;
}

#include <iostream>
#include <string>
#include <unordered_set>

using namespace std;

int main(void) {
    unordered_set<string> myset;

    myset.rehash(12);
    myset.insert("office");
    myset.insert("gym");
    myset.insert("highway");

    cout << "current bucket_count: " 
         << myset.bucket_count() << endl;

    return 0;
}

Hash Table

Cuckoo Filter *
- Set up several hash functions.
- When an array slot already holds data, switch to another hash function to obtain a different hash value.

Data Structure of Sets (cont.)

int hash2(int n) { return (n + 2) % 10; }

int hash1(int n) { return (n * 2) % 10; }
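The idea of falling back to a second hash function can be sketched with the two functions above. This is a simplified illustration only: it assumes non-negative values, uses -1 as an "empty" marker, and gives up when both candidate slots are taken, whereas full cuckoo hashing would evict the occupant and re-insert it elsewhere. The names TwoChoiceTable, contains, and insert are hypothetical.

```cpp
#include <vector>

const int SLOTS = 10;
const int EMPTY = -1;  // assumption: stored values are non-negative

int hash1(int n) { return (n * 2) % SLOTS; }
int hash2(int n) { return (n + 2) % SLOTS; }

struct TwoChoiceTable {
    std::vector<int> slot;
    TwoChoiceTable() : slot(SLOTS, EMPTY) {}
};

// A value can only live in one of its two candidate slots.
bool contains(const TwoChoiceTable& t, int x) {
    return t.slot[hash1(x)] == x || t.slot[hash2(x)] == x;
}

// Try the slot given by hash1 first, then the slot given by hash2.
// Returns false when both candidate slots hold other values.
bool insert(TwoChoiceTable& t, int x) {
    if (contains(t, x)) return true;
    if (t.slot[hash1(x)] == EMPTY) { t.slot[hash1(x)] = x; return true; }
    if (t.slot[hash2(x)] == EMPTY) { t.slot[hash2(x)] = x; return true; }
    return false;
}
```

Inserting 3 places it in slot 6; inserting 8 finds slot 6 occupied and falls back to slot 0. Lookups stay cheap because at most two slots are ever inspected.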

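Hash Table

Bloom Filter *
- Apply several hash functions and record the element in several slots at once, spreading the risk.
- If all of the corresponding slots are 1, the element is presumed to be in the set.
- False positives are possible: an element that was never inserted may be reported as present.

Data Structure of Sets (cont.)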

int hash2(int n) { return (n + 2) % 10; }

int hash1(int n) { return n % 10; }
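A minimal sketch of a Bloom filter built on the two hash functions above follows. It assumes non-negative values and a 10-bit table; the struct name BloomFilter and the member names add and possibly_contains are hypothetical.

```cpp
#include <bitset>

const int BITS = 10;

int hash1(int n) { return n % BITS; }
int hash2(int n) { return (n + 2) % BITS; }

struct BloomFilter {
    std::bitset<BITS> bits;

    // Set the bit chosen by each hash function.
    void add(int n) {
        bits.set(hash1(n));
        bits.set(hash2(n));
    }

    // true means "possibly present", false means "definitely absent".
    bool possibly_contains(int n) const {
        return bits.test(hash1(n)) && bits.test(hash2(n));
    }
};
```

After adding 3 and 5, the bits 3, 5, and 7 are set. A query for 4 is correctly rejected, while a query for 13 reports "possibly present" even though 13 was never added: a false positive. The elements themselves are never stored, which is where the space saving comes from.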

Disjoint Sets

The sets share no elements; that is, no two of the sets intersect.
Example -

Disjoint Sets

A = \{1, 3, 7, 8\}

B = \{4, 5, 9\}

C = \{0, 2\}

A, B, C form disjoint sets.

D = \{1, 2, 3\}

A, B, C, D do not form disjoint sets (the elements 1, 2, 3 are repeated).

Basic Operations

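make_set(x):
Create a new set \{ x \} with element x as its representative.
find_set(x):
Return the representative of the set that contains element x.
union_set(x, y):
Merge the sets containing elements x and y into a new set and establish its new representative.

Basic Operations of Disjoint Sets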

Example -
- make_set ( 1 )
- make_set ( 2 )
- make_set ( 3 )
- make_set ( 4 )
- find_set ( 3 ): return 3
- find_set ( 2 ): return 2
- union_set ( 1, 2 ): representative 1
- union_set ( 3, 4 ): representative 3
- find_set ( 2 ): return 1
- find_set ( 4 ): return 3

Basic Operations of Disjoint Sets (cont.)

make_set(x): create a new set \{ x \} with element x as its representative.

Basic Operations of Disjoint Sets (cont.)

void make_set(int x) {
    // Set the parent of the node
    parent[x] = x;
    // Set height of the tree
    rank[x] = 0;
    // Set current number of nodes including itself
    size[x] = 1;
}

[Figure: a tree rooted at 1 with rank[1] = 1, parent[8] = 1, and parent[2] = 1.]

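find_set(x): return the representative of the set that contains element x.

Basic Operations of Disjoint Sets (cont.)

// Follow the parent links up to the root, which is the representative
int find_set(int x) {
    if (parent[x] == x)
        return x;
    return find_set(parent[x]);
}

// With path compression: point every visited node directly at the root,
// so later lookups do not have to climb a very deep tree again
int find_set(int x) {
    if (parent[x] == x)
        return x;
    return parent[x] = find_set(parent[x]);
}

[Figure: a very deep tree, the case that path compression is meant to flatten.]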

Basic Operations of Disjoint Sets (cont.)

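void union_set(int x, int y) {
    // Work on the two representatives, not on x and y themselves
    x = find_set(x);
    y = find_set(y);
    if (x == y)
        return;  // already in the same set
    if (rank[x] > rank[y]) {
        parent[y] = x;  // attach the shallower tree under the deeper one
        size[x] += size[y];
    }
    else {
        parent[x] = y;
        size[y] += size[x];
        // The height grows only when both trees had the same rank
        if (rank[x] == rank[y])
            ++rank[y];
    }
}

union_set(x, y): merge the sets containing elements x and y into a new set and establish its new representative.

[Figure: union of two trees, one with rank[1] = 2 and one with rank[5] = 1.]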
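The three operations can be bundled into one self-contained structure. The sketch below assumes elements numbered 0 to n - 1 and keeps the representative of the first argument when both trees have equal rank; the struct name DisjointSets and the member names rnk and sz are hypothetical (chosen to avoid clashing with std::rank and std::size).

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct DisjointSets {
    std::vector<int> parent, rnk, sz;

    // Performs make_set(i) for every i in 0..n-1.
    explicit DisjointSets(int n) : parent(n), rnk(n, 0), sz(n, 1) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }

    // Find with path compression.
    int find_set(int x) {
        if (parent[x] != x) parent[x] = find_set(parent[x]);
        return parent[x];
    }

    // Union by rank. Returns false if x and y were already together.
    bool union_set(int x, int y) {
        x = find_set(x);
        y = find_set(y);
        if (x == y) return false;
        if (rnk[x] < rnk[y]) std::swap(x, y);  // x is now the deeper root
        parent[y] = x;
        sz[x] += sz[y];
        if (rnk[x] == rnk[y]) ++rnk[x];
        return true;
    }
};
```

Replaying the earlier example on DisjointSets d(5): union_set(1, 2) makes 1 the representative, union_set(3, 4) makes 3 the representative, and afterwards find_set(2) returns 1 and find_set(4) returns 3.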

Mathematics Basics

Find out all prime numbers not greater than a specific number.

Simple solution
Choose the smallest remaining number at each iteration and delete all larger multiples of this number.

Prime Number

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ...

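#include <vector>
using namespace std;

#define MAX 1000000

// Start with every number marked as a candidate prime
vector<bool> isPrime(MAX, true);

void sieve() {
    isPrime[0] = false;
    isPrime[1] = false;  // 0 and 1 are not prime
    
    for (int i = 2; i < MAX; ++i) {
        if (isPrime[i]) {
            // Cross out the multiples 2i, 3i, ...
            for (int j = i + i; j < MAX; j += i)
                isPrime[j] = false;
        }
    }
}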

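#define MAX 1000000

// Start with every number marked as a candidate prime
vector<bool> isPrime(MAX, true);

void sieve() {
    isPrime[0] = false;
    isPrime[1] = false;  // 0 and 1 are not prime
    
    // Every composite below MAX has a prime factor i with i * i < MAX
    for (int i = 2; i * i < MAX; ++i) {
        if (isPrime[i]) {
            // Multiples below i * i were already crossed out by smaller primes
            for (int j = i * i; j < MAX; j += i)
                isPrime[j] = false;
        }
    }
}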
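For use inside a solution it can be convenient to pass the limit as a parameter instead of a global MAX. The sketch below does that and adds a small counting helper; the function names sieve and count_primes and the inclusive limit are assumptions, and limit is assumed to be non-negative.

```cpp
#include <vector>

// Returns a table where is_prime[k] is true exactly when k is prime,
// for 0 <= k <= limit.
std::vector<bool> sieve(int limit) {
    std::vector<bool> is_prime(limit + 1, true);
    is_prime[0] = false;
    if (limit >= 1) is_prime[1] = false;
    // long long keeps i * i from overflowing for large limits
    for (long long i = 2; i * i <= limit; ++i) {
        if (!is_prime[i]) continue;
        for (long long j = i * i; j <= limit; j += i)
            is_prime[j] = false;
    }
    return is_prime;
}

// Counts the primes not greater than limit.
int count_primes(int limit) {
    std::vector<bool> is_prime = sieve(limit);
    int count = 0;
    for (int k = 0; k <= limit; ++k)
        if (is_prime[k]) ++count;
    return count;
}
```

For the range 2 to 19 shown above, count_primes(19) is 8: the primes are 2, 3, 5, 7, 11, 13, 17, and 19.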

Find out all prime numbers not greater than a specific number.

Euler's Function (Euler's Totient Function)
- The totient \phi(n) of a positive integer n is defined to be
  - The number of positive integers \leq n that are co-prime to n.
  - e.g., 1, 2, 4, 5, 7, 8 are co-prime to 9, so \phi(9) = 6.
- To find the totient of n, we have to factorize n first.
  - e.g., 12 = 2 ^ 2 \times 3 ^ 1
    - How to factorize 12?
    - How to count the number of factors of 12?
    - How to compute \phi(12)?

Prime Number (cont.)

Find out all prime numbers not greater than a specific number.

Euler's Function (Euler's Totient Function)

Prime Number


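// Assumes isPrime[] was filled by sieve() and <cmath> and <cstdio> are included
int n = 12, pos = 0;
int tmp = n;
int prime[32] = {0};  // prime[k]: the k-th distinct prime factor of n
int a[32] = {0};      // a[k]: the exponent of prime[k]

for (int i = 2; i <= (int)sqrt((double)n); ++i) {
    if (!isPrime[i] || tmp % i) {
        continue;
    }
    prime[pos] = i;  // prime[pos] = prime i
    while (tmp % i == 0) {
        tmp /= i;
        ++a[pos];    // Increase the power a[pos]
    }
    ++pos;
}
// Whatever remains is a single prime factor larger than sqrt(n)
if (tmp > 1) {
    prime[pos] = tmp;
    a[pos] = 1;
    ++pos;
}

printf("%d =", n);
for (int i = 0; i < pos; ++i) {
    if (i)
        printf(" x");
    printf(" %d ^ %d", prime[i], a[i]);
}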
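Once the distinct prime factors p of n are known, the totient follows from the product formula \phi(n) = n \prod (1 - 1/p). The sketch below folds the factorization and the formula into one function using plain trial division, without the sieve table; the function name phi is an assumption, and n is assumed to be positive.

```cpp
// Computes Euler's totient of n by trial division.
int phi(int n) {
    int result = n;
    for (int p = 2; (long long)p * p <= n; ++p) {
        if (n % p != 0) continue;
        while (n % p == 0) n /= p;  // strip every copy of the prime p
        result -= result / p;       // multiply result by (1 - 1/p)
    }
    if (n > 1) result -= result / n;  // one prime factor above sqrt remains
    return result;
}
```

For n = 12 the distinct primes are 2 and 3, so \phi(12) = 12 \times (1 - 1/2) \times (1 - 1/3) = 4, and phi(9) returns 6, matching the six numbers listed earlier.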

Practices

Disjoint Sets
- UVA - 793, 879, 10158, 10583, 10608, 10685, 11503
- POJ - 1703, 2492
Prime Number
- UVA - 406, 543, 1210, 10539, 10924
- POJ - 2262, 2739, 3006
