Exploring the Types of Genetic Algorithms and Their Performance Differences

– A Case Study on Tetris


Jianguo High School (建國中學), 賴昭勳


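Project Overview

Introduction to Tetris

On a 10x20 board, "tetrominoes" (four-cell pieces) of various shapes keep falling. When an entire row is filled, it is cleared and the cells above it drop down. The player must clear rows to score points and keep the stack from reaching the top of the board.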

From https://en.wikipedia.org/wiki/Tetris
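The row-clearing rule can be sketched in a few lines. This is a minimal illustration, assuming each row is stored as a 10-bit integer and that index 0 is the bottom row; `clear_lines`, `WIDTH`, and `FULL_ROW` are hypothetical names and are not taken from the talk's code.

```python
WIDTH = 10
FULL_ROW = (1 << WIDTH) - 1  # all 10 cells of a row are filled


def clear_lines(grid):
    """Remove every full row and let the rows above drop down.

    grid is a list of row bitmasks, with index 0 as the bottom row.
    Returns the new grid and the number of rows that were cleared.
    """
    kept = [row for row in grid if row != FULL_ROW]
    cleared = len(grid) - len(kept)
    kept.extend([0] * cleared)  # empty rows are added back at the top
    return kept, cleared


grid = [FULL_ROW, 0b0000000011, FULL_ROW, 0b0000000001]
new_grid, cleared = clear_lines(grid)
print(cleared, [bin(row) for row in new_grid])
```

Storing a row as one integer makes the "is this row full?" check a single comparison, which matters when the AI simulates many placements per move.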

Tetris AI: hasn't this been done many times already?

NES Tetris... is hard

From https://www.youtube.com/watch?v=FWqb0G4WBKs

CTWC (Classic Tetris World Championship)

PART ONE: Where the Game Begins

Implementing the game!

class board:
    def __init__(self):
        self.pointcalc = [0, 40, 100, 300, 1200]
        self.weights = (5.06450279, -0.40483617,  1.14291331,  1.83894321,  0.18530122, 0.35860931,  0.93125075,  0.23497438)
        ###self.survweights = (1.72746616e+00,  1.18972449e-02,  4.54023544e-01,  4.19517578e-01,1.46628694e-03, -2.19499807e-01,  5.36884577e-01, -4.83240272e-01)
        #board data
        self.heights = [0] * 10
        self.ones = [0] * 10
        self.grid = [0] * 24 #each row is an integer
        self.rowTrans = [0] * 24
        self.alive = 1
        #heuristic variables
        self.holes = 0
        self.avgHeight = 0
        self.maxHeight = 0
        self.bumpiness = 0
        self.newl = 0 #new lines in this move
        self.wellCells = 0
        self.deepWells = 0
        self.blocks = 0
        #scoring variables
        self.lines = 0
        self.pts = 0
    # method bodies are omitted on the slide; "..." keeps the outline valid Python
    def col(self, x, y, piece): ...  # checks collision
    def countrowTrans(self, row): ...
    def place(self, x, y, piece, type): ...  # places piece
    def checkline(self): ...  # checks for filled lines and updates board position
    def put(self, x, piece): ...  # puts piece down according to rotation and position
    def printboard(self): ...  # prints current board for debugging
    def getstats(self): ...  # completely refreshes stats by checking entire board (only called when lines are cleared)
    def getval(self): ...  # the evaluation function

class piece:
    def __init__(self, p, num):
        self.tile = [] #stores representation of different tile rotations
        self.types = num
    def getbounds(self): ...  # body omitted on the slide
           
tet = [piece([[0, 0], [0, 1], [0, 2], [0, 3]], 2),
       piece([[0, 0], [0, 1], [1, 1], [1, 0]], 1),
       piece([[0, 0], [1, 0], [0, 1], [0, 2]], 4),
       piece([[0, 0], [1, 0], [1, 1], [1, 2]], 4),
       piece([[0, 0], [1, 0], [1, 1], [2, 1]], 2),
       piece([[0, 0], [0, 1], [1, 1], [1, 2]], 2),
       piece([[0, 0], [0, 1], [0, 2], [1, 1]], 4), ] #I, O, L, J, S, Z, T
for i in tet:
    i.getbounds()
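The second argument of each `piece(...)` call appears to be the number of distinct rotations (2 for I, 1 for O, 4 for L, and so on). The sketch below checks those counts by rotating the cell coordinates directly. It is only an illustration: `normalize`, `rotate`, and `distinct_rotations` are hypothetical helpers, and the talk's `piece` class may build its rotations differently.

```python
def normalize(cells):
    """Shift cells so the smallest coordinates are 0, and sort them for comparison."""
    min_a = min(a for a, b in cells)
    min_b = min(b for a, b in cells)
    return tuple(sorted((a - min_a, b - min_b) for a, b in cells))


def rotate(cells):
    """Rotate every cell by 90 degrees: (a, b) -> (b, -a), then re-normalize."""
    return normalize([(b, -a) for a, b in cells])


def distinct_rotations(cells):
    """Return the list of different shapes obtained from 0, 90, 180 and 270 degrees."""
    shapes = []
    current = normalize(cells)
    for _ in range(4):
        if current not in shapes:
            shapes.append(current)
        current = rotate(current)
    return shapes


# Same cell lists as in the tet table above
SHAPES = {
    "I": [[0, 0], [0, 1], [0, 2], [0, 3]],
    "O": [[0, 0], [0, 1], [1, 1], [1, 0]],
    "L": [[0, 0], [1, 0], [0, 1], [0, 2]],
    "J": [[0, 0], [1, 0], [1, 1], [1, 2]],
    "S": [[0, 0], [1, 0], [1, 1], [2, 1]],
    "Z": [[0, 0], [0, 1], [1, 1], [1, 2]],
    "T": [[0, 0], [0, 1], [0, 2], [1, 1]],
}

for name, cells in SHAPES.items():
    print(name, len(distinct_rotations(cells)))
```

Skipping duplicate rotations keeps the AI from evaluating the same placement twice.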

GUI: see the results for yourself

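PART TWO: The Birth of the AI

How the AI chooses a move

Enumerate every rotation and placement position

Compute the evaluation score after the drop

Update the best placement found so far

Make the final choice

The code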

cur = tet[x[cnt]]
best = float("inf")  # lower evaluation is better, so start above any real score
bestrot, bestpos = 0, 0
for i in range(cur.types):  # for each rotation
    for j in range(cur.minx[i], cur.maxx[i] + 1):  # possible placements
        result = board()
        result.copyboard(gameboard)
        result.put(j, cur.tile[i])
        resval = result.getval()
        if resval < best:
            best = resval
            bestrot, bestpos = i, j

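Wait, so how is the evaluation score computed?

Many different heuristics measured on the board serve as the basis for the decision

Average height: 5.6

Max height: 9

Holes: 2

Bumpiness: 31

Clearable lines: 2

...and more!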
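To make a few of these heuristics concrete, here is a minimal sketch that computes column heights, holes, and bumpiness from a small grid. The exact definitions used in the talk are not shown, so these are assumptions: a hole is an empty cell below the top filled cell of its column, and bumpiness is the sum of absolute height differences between neighboring columns. The grid is a list of rows (top row first) with 1 for filled cells, and all function names are hypothetical.

```python
def column_heights(grid):
    """Height of each column: distance from the floor to its topmost filled cell."""
    rows = len(grid)
    heights = []
    for c in range(len(grid[0])):
        height = 0
        for r in range(rows):
            if grid[r][c]:
                height = rows - r
                break
        heights.append(height)
    return heights


def count_holes(grid, heights):
    """Empty cells that lie below the top filled cell of their column."""
    rows = len(grid)
    holes = 0
    for c, height in enumerate(heights):
        for r in range(rows - height, rows):
            if not grid[r][c]:
                holes += 1
    return holes


def bumpiness(heights):
    """Sum of absolute height differences between adjacent columns."""
    return sum(abs(a - b) for a, b in zip(heights, heights[1:]))


grid = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
]
heights = column_heights(grid)
print(heights, max(heights), sum(heights) / len(heights))
print(count_holes(grid, heights), bumpiness(heights))
```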

self.weights = (5.06450279, -0.40483617, 1.14291331, 1.83894321, 0.18530122, 0.35860931, 0.93125075, 0.23497438)
def getval(self):
    # Weighted sum of the heuristics; lower is better (the move search keeps the minimum).
    # totalrowtrans is not defined on the slide; presumably the total of self.rowTrans.
    return (self.weights[0] * self.holes + self.weights[1] * self.bumpiness
            + self.weights[2] * (max(7, self.maxHeight) - 7)
            + self.weights[3] * (-self.newl) + self.weights[4] * self.wellCells
            + self.weights[5] * self.deepWells + self.weights[6] * totalrowtrans
            + self.weights[7] * -self.blocks + 0.1 * self.avgHeight)
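The evaluation function uses a row-transition term, but the slides do not define it. One common definition counts how often a row switches between filled and empty when read from one wall to the other. The sketch below assumes that definition, assumes both walls count as filled, and assumes rows are stored as bitmasks; `row_transitions` is a hypothetical stand-in and may differ from the talk's `countrowTrans`.

```python
WIDTH = 10


def row_transitions(row_bits, width=WIDTH):
    """Count filled/empty changes along one row, treating both walls as filled."""
    previous = 1  # the left wall counts as a filled cell
    count = 0
    for x in range(width):
        cell = (row_bits >> x) & 1
        if cell != previous:
            count += 1
        previous = cell
    if previous != 1:  # the right wall counts as a filled cell
        count += 1
    return count


print(row_transitions(0))                  # completely empty row
print(row_transitions((1 << WIDTH) - 1))   # completely full row
print(row_transitions(0b0000000101))       # two separate filled cells
```

Rows with many transitions are fragmented and hard to complete, which is why this count is penalized in the weighted sum.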

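So how do we find the self.weights above?

This is where the genetic algorithm comes in!

PART THREE: Survival of the Fittest

Hey senior, I know genetic algorithms now!

Gene

Individual

Population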

From https://towardsdatascience.com/introduction-to-genetic-algorithms-including-example-code-e396e98d8bf3

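Fitness: the average score over 20 games of Tetris

An individual's genes are the weights of the heuristics

\Crossover/

  • Pick the parents at random, weighted by fitness
  • For each gene, randomly choose which parent it is inherited from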

From https://www.tutorialspoint.com/genetic_algorithms/genetic_algorithms_parent_selection.htm
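The two crossover steps can be sketched as small functions. This is an illustrative version, assuming fitness values are non-negative; `roulette_pick` and `uniform_crossover` are hypothetical names, and the talk's own code performs the same steps inline.

```python
import random


def roulette_pick(scored, rng=random):
    """Pick one individual with probability proportional to its fitness.

    scored is a list of (fitness, individual) pairs with non-negative fitness.
    """
    total = sum(fitness for fitness, _ in scored)
    r = rng.uniform(0, total)
    running = 0.0
    for fitness, individual in scored:
        running += fitness
        if running > r:
            return individual
    return scored[-1][1]  # guard for r landing exactly on the total


def uniform_crossover(p1, p2, rng=random):
    """For each gene, flip a coin to decide which child gets which parent's value."""
    child1, child2 = [], []
    for g1, g2 in zip(p1, p2):
        if rng.random() < 0.5:
            child1.append(g1)
            child2.append(g2)
        else:
            child1.append(g2)
            child2.append(g1)
    return child1, child2


scored = [(120.0, [0.1, 0.2, 0.3]), (80.0, [0.9, 0.8, 0.7]), (10.0, [0.5, 0.5, 0.5])]
mother, father = roulette_pick(scored), roulette_pick(scored)
print(uniform_crossover(mother, father))
```

Fitness-weighted selection still gives weak individuals a small chance to reproduce, which keeps some diversity in the population.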

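Mu... mutation?!

  • Happens on any given gene with a fixed probability
  • Shifts the original value by an amount within a set range
  • The mutation range shrinks as the generations go by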
if random.random() < 0.07122: #mutation
    if random.randint(0, 1) == 0:
        child1[0][j] += random.uniform(-0.5/(gen + 1), 0.5/(gen + 1))
    else:
        child2[0][j] += random.uniform(-0.5/(gen + 1), 0.5/(gen + 1))
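The same mutation rule can be written as a standalone function, which makes the shrinking range easier to see. This is a simplified sketch: it mutates a single gene list, whereas the snippet above mutates one of the two children chosen at random. The name `mutate` and its parameters are hypothetical.

```python
import random


def mutate(genes, gen, rate=0.07122, base_range=0.5, rng=random):
    """Return a copy of genes where each gene is shifted with probability rate.

    The shift is drawn uniformly from [-limit, limit], where
    limit = base_range / (gen + 1) shrinks as the generation number grows.
    """
    limit = base_range / (gen + 1)
    return [g + rng.uniform(-limit, limit) if rng.random() < rate else g for g in genes]


weights = [0.0] * 8
print(mutate(weights, gen=0))   # shifts of up to 0.5
print(mutate(weights, gen=99))  # shifts of up to 0.005
```

Large early mutations explore widely, while small late mutations fine-tune weights that are already good.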

Implementation

import copy
import random

import numpy

pop = []
size = 50
tabugen = 3
clancnt = 0
for i in range(size): # randomly generate the population
    pop.append([numpy.array(list(random.uniform(-1, 1) for i in range(8))), clancnt, [], numpy.array(list(random.uniform(-1, 1) for i in range(8))), 1])
    # individual: weights forming a chromosome, the clan number, tabu list, memetic vector for local optimization, memetic scalar
    clancnt += 1

for gen in range(100):
    print("Generation ", gen + 1, sep='')
    print(pop)
    gettestdata(10)
    res = []
    c = 0
    for individual in pop: # evaluate fitness
        val = score(individual[0], 0)
        res.append((val, individual))
        print(val, end=" ")
        c += 1
    res = sorted(res, key= lambda x: x[0])
    res.reverse()
    print()
    print(res[0])
    s = sum(list(a[0] for a in res))
    pop = []
    valid = 0
    count = 0
    while valid < size - 1: # crossover
        count += 1
        p1, p2 = [], [] #selects p1 and p2 as parents
        s1, s2 = 0, 0
        r = random.uniform(0, s - 0.01)
        n = 0
        for j in range(len(res)):
            n += res[j][0]
            if n > r:
                p1 = res[j][1]
                s1 = res[j][0]
                break

        r = random.uniform(0, s - 0.01)
        n = 0
        for j in range(len(res)):
            n += res[j][0]
            if n > r:
                p2 = res[j][1]
                s2 = res[j][0]
                break
        #print(p1, p2)

        child1, child2 = [numpy.array([0.0] * 8), p1[1], copy.deepcopy(p1[2])], [numpy.array([0.0] * 8), p2[1], copy.deepcopy(p2[2])]

        for j in range(8):
            if random.random() < 0.5:
                child1[0][j] = p1[0][j]
                child2[0][j] = p2[0][j]
            else:
                child1[0][j] = p2[0][j]
                child2[0][j] = p1[0][j]
            if random.random() < 0.07122: # mutation
                if random.randint(0, 1) == 0:
                    child1[0][j] += random.uniform(-0.5/(gen + 1), 0.5/(gen + 1))
                else:
                    child2[0][j] += random.uniform(-0.5/(gen + 1), 0.5/(gen + 1))

        child1[2].append(child2[1]) #adds each other to tabu list
        child2[2].append(child1[1])

        valid += 2  # two more children produced; the loop stops once the population is full
        pop.append(child1)
        if valid < size - 1:
            pop.append(child2)
    pop.append(res[0][1])
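For readers who want to run the loop without the Tetris simulator, here is a compact, self-contained sketch of the same structure: evaluate, sort, select by roulette wheel, cross over, mutate, and carry the best individual over unchanged. `toy_fitness` is a hypothetical stand-in for the average Tetris score, and the population size and generation count are shrunk so it finishes instantly.

```python
import random


def toy_fitness(weights):
    """Hypothetical stand-in for the Tetris score: always positive, best at all 0.5."""
    return 1.0 / (1.0 + sum((w - 0.5) ** 2 for w in weights))


def pick(scored, total):
    """Roulette-wheel selection over (fitness, individual) pairs."""
    r = random.uniform(0, total)
    running = 0.0
    for fitness, individual in scored:
        running += fitness
        if running > r:
            return individual
    return scored[-1][1]


def evolve(generations=30, size=20, genes=8, seed=0):
    """Run the loop and return the best fitness seen in each generation."""
    random.seed(seed)
    pop = [[random.uniform(-1, 1) for _ in range(genes)] for _ in range(size)]
    history = []
    for gen in range(generations):
        scored = sorted(((toy_fitness(ind), ind) for ind in pop),
                        key=lambda pair: pair[0], reverse=True)
        history.append(scored[0][0])
        total = sum(fitness for fitness, _ in scored)
        new_pop = [list(scored[0][1])]  # elitism: the best individual survives unchanged
        while len(new_pop) < size:
            p1, p2 = pick(scored, total), pick(scored, total)
            child = [random.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            for j in range(genes):
                if random.random() < 0.07122:  # mutation, shrinking with the generation
                    child[j] += random.uniform(-0.5 / (gen + 1), 0.5 / (gen + 1))
            new_pop.append(child)
        pop = new_pop
    return history


history = evolve()
print(history[0], history[-1])
```

Because the best individual is always copied into the next generation and the toy fitness is deterministic, the best score in this sketch can never drop from one generation to the next.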

PART FOUR: The Tribe's Taboo

Wait... something seems off?

Won't this lead to... inbreeding?!

Then won't the children born this way all be...

The power of taboo

Tabu search is a search algorithm that keeps the search from getting stuck at a local extremum (a local minimum or maximum).

From https://bdtechtalks.com/2020/04/27/deep-learning-mode-connectivity-adversarial-attacks/gradient-descent-local-minima/

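Whose family are you from?

  • Clan number: the family this individual belongs to (e.g., a surname)
  • Tabu list: the families this individual's ancestors came from (e.g., relatives)

During crossover, if the two parents share a clan number, or either parent's clan number is on the other's tabu list, they produce no offspring!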
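The mating rule above can be sketched as a small predicate plus a helper that builds a child's tabu list. This is an illustration under assumptions: `Individual`, `is_tabu`, and `child_tabu` are hypothetical names, and the `keep` parameter stands for how many recent entries the tabu list retains.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    clan: int                                  # which family this individual belongs to
    tabu: list = field(default_factory=list)   # clans of recent ancestors' partners


def is_tabu(p1, p2):
    """True if the pair may not mate: same clan, or one clan is on the other's list."""
    return p1.clan == p2.clan or p2.clan in p1.tabu or p1.clan in p2.tabu


def child_tabu(parent, other, keep=3):
    """Tabu list for a child that takes parent's clan: add other's clan, keep the newest entries."""
    return (parent.tabu + [other.clan])[-keep:]


a = Individual(clan=1, tabu=[4, 5])
b = Individual(clan=2, tabu=[1])
c = Individual(clan=3)
print(is_tabu(a, b), is_tabu(a, c), child_tabu(a, c))
```

Trimming the list to the most recent entries means distant relatives are eventually allowed to pair up again.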

The program

pop = []
size = 50
tabugen = 3 # how many generations the tabu list remembers
clancnt = 0
for i in range(size):
    pop.append([numpy.array(list(random.uniform(-1, 1) for i in range(8))), clancnt, [], numpy.array(list(random.uniform(-1, 1) for i in range(8))), 1])
    # individual: weights forming a chromosome, the clan number, tabu list, memetic vector for local optimization, memetic scalar
    clancnt += 1
    
for gen in range(100):
    #...same as before
    while valid < size - 1:
        count += 1
        p1, p2 = [], [] #selects p1 and p2 as parents
        s1, s2 = 0, 0
        r = random.uniform(0, s - 0.01)
        n = 0
        for j in range(len(res)):
            n += res[j][0]
            if n > r:
                p1 = res[j][1]
                s1 = res[j][0]
                break

        r = random.uniform(0, s - 0.01)
        n = 0
        for j in range(len(res)):
            n += res[j][0]
            if n > r:
                p2 = res[j][1]
                s2 = res[j][0]
                break
        #print(p1, p2)
        
        # tabu check
        tabu = False # checks if the match is tabu
        if p1[1] == p2[1]:
            tabu = True
        for clan in p1[2]:
            if p2[1] == clan:
                tabu = True
        for clan in p2[2]:
            if p1[1] == clan:
                tabu = True
        #print(tabu)
        if tabu:
            continue
        else:
            valid += 2

        child1, child2 = [numpy.array([0.0] * 8), p1[1], copy.deepcopy(p1[2])], [numpy.array([0.0] * 8), p2[1], copy.deepcopy(p2[2])]

        for j in range(8):
            if random.random() < 0.5:
                child1[0][j] = p1[0][j]
                child2[0][j] = p2[0][j]
            else:
                child1[0][j] = p2[0][j]
                child2[0][j] = p1[0][j]
            if random.random() < 0.07122: #mutation
                if random.randint(0, 1) == 0:
                    child1[0][j] += random.uniform(-0.5/(gen + 1), 0.5/(gen + 1))
                else:
                    child2[0][j] += random.uniform(-0.5/(gen + 1), 0.5/(gen + 1))

        child1[2].append(child2[1]) #adds each other to tabu list
        child2[2].append(child1[1])
        while len(child1[2]) > tabugen:
            child1[2].pop(0)
        while len(child2[2]) > tabugen:
            child2[2].pop(0)
            
        pop.append(child1)
        if valid < size - 1:
            pop.append(child2)
    pop.append(res[0][1])
    clancnt += 1

PART FIVE: Culture Clash

However,

the genetic algorithm explores too little of the search space!

So, a new algorithm appeared...

The memetic algorithm!!!

 

From https://www.reddit.com/r/memes/comments/eq2pe6/google_image_meme_man_meme/

From https://www.youtube.com/watch?v=Amu-4_mH0no

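Meme: the unit of cultural inheritance

So what does that have to do with algorithms?

Memetic algorithm

  • Combines local search with a genetic algorithm

  • Each individual first tries to improve on its own before its fitness is evaluated

  • The choice of meme can be passed down through "inheritance"

Our "meme"

A randomly generated vector!

At each step, test whether adding the meme to the current weight vector raises the score; if it does, keep the update (similar to simulated annealing)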

pop = []
size = 50
tabugen = 3
clancnt = 0
for i in range(size):
    pop.append([numpy.array(list(random.uniform(-1, 1) for i in range(8))), clancnt, [], numpy.array(list(random.uniform(-1, 1) for i in range(8))), 1])
    # individual: weights forming a chromosome, the clan number, tabu list, memetic vector for local optimization, memetic scalar
    clancnt += 1

for individual in pop:
    val = score(individual[0], 0)
    # print(individual[0], individual[3], individual[4])
    for step in range(5):
        newparam = numpy.add(individual[0],
                             numpy.multiply(numpy.divide(individual[3], numpy.linalg.norm(individual[3])),
                                            individual[4]))
        localval = score(newparam, 0)
        if localval > val:
            val = localval
            individual[0] = newparam
            individual[4] *= 0.9
        else:
            if step == 0:
                individual[3] = numpy.array(list(random.random() for i in range(8)))
            break
    # print(individual[0])
    res.append((val, individual))
    print(val, end=" ")
    c += 1
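The local search step can be isolated into a function that works on any fitness function. This is a minimal sketch in plain Python rather than numpy: `local_search` and `toy_fitness` are hypothetical names, and the toy fitness merely stands in for the Tetris score. The meme must not be the zero vector, since it is normalized to unit length.

```python
import math


def local_search(weights, meme, step, fitness, max_steps=5, shrink=0.9):
    """Walk along the meme direction while the fitness keeps improving.

    Returns the (possibly improved) weights, their fitness, and the final step size.
    """
    norm = math.sqrt(sum(m * m for m in meme))
    direction = [m / norm for m in meme]  # unit vector along the meme
    best = fitness(weights)
    for _ in range(max_steps):
        candidate = [w + step * d for w, d in zip(weights, direction)]
        value = fitness(candidate)
        if value > best:
            weights, best = candidate, value
            step *= shrink  # take smaller steps after each success
        else:
            break  # stop at the first step that does not help
    return weights, best, step


def toy_fitness(weights):
    """Hypothetical stand-in for the Tetris score: best when every weight is 1."""
    return -sum((w - 1.0) ** 2 for w in weights)


start = [0.0, 0.0]
print(local_search(start, [1.0, 1.0], 0.5, toy_fitness))    # helpful meme
print(local_search(start, [-1.0, -1.0], 0.5, toy_fitness))  # unhelpful meme
```

When the very first step fails, the snippet above throws the meme away and draws a new random one, which corresponds to the second call here returning the weights unchanged.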

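Passing down the meme

Dad: Hey son, I think this meme is great. After seeing it I got so much stronger. Want to give it a try? XD

Mom: I have a meme too, but your dad is stronger, so just use his

You: (searches using that meme)

You: What kind of lame joke is this 88888

You: (makes a new meme from scratch)

That's roughly how it works (the speaker ran out of steam at this point)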
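The dialogue above suggests a simple inheritance rule: the child takes the meme of the fitter parent, and draws a fresh random one if that meme turns out to be useless. The slides do not show the code for this step, so the sketch below is an assumption based on the dialogue only; `inherit_meme` and `fresh_meme` are hypothetical names.

```python
import random


def inherit_meme(meme1, fitness1, meme2, fitness2):
    """Return a copy of the meme of the fitter parent (ties go to the first parent)."""
    return list(meme1) if fitness1 >= fitness2 else list(meme2)


def fresh_meme(genes=8, rng=random):
    """Draw a new random meme vector with every component in [-1, 1]."""
    return [rng.uniform(-1, 1) for _ in range(genes)]


dad_meme, dad_score = [0.3, -0.2, 0.9], 5200.0
mom_meme, mom_score = [-0.7, 0.1, 0.4], 4100.0

child_meme = inherit_meme(dad_meme, dad_score, mom_meme, mom_score)
print(child_meme)          # dad is stronger, so the child starts with his meme
print(fresh_meme(genes=3)) # used instead when the inherited meme does not help
```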

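PART SIX: The Ending

How well does the algorithm actually perform?

Final scores

  • Average lines: 230.4 (the cap is only around 230 lines anyway)
  • Average score (Lv. 0 start): 290,251.6 points
  • Average score (Lv. 18 start): 486,000 points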
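The gap between the two starting levels follows from how NES Tetris awards points: the base value for a clear is multiplied by the current level plus one. The sketch below reuses the table stored as `pointcalc` in the board class and assumes the simulator applies the NES rule, which the slides do not show explicitly; `clear_score` is a hypothetical name.

```python
POINTS = [0, 40, 100, 300, 1200]  # same table as pointcalc in the board class


def clear_score(lines, level):
    """Points for clearing `lines` rows at once (0 to 4) on the given level."""
    return POINTS[lines] * (level + 1)


print(clear_score(4, 0))   # a four-line clear on level 0
print(clear_score(4, 18))  # the same clear on level 18
```

With a similar number of lines cleared, starting on a higher level scores more because every clear is multiplied by a larger factor from the first piece onward.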

"OK, so how strong is that, really?"

Come try it out for yourselves!

Q&A Time

Thank you for listening

Tetris AI (Club)

By justinlai2003
