Lessons from Refactoring the Teambition Data Layer

ReactiveDB powered by RxJS & lovefield

  • Teambition's business characteristics

  • Comparison and analysis of several data-layer designs

  • Introducing ReactiveDB

Many kinds of data, with strong relationships between them

  1. Tasklist data

  2. Stage data

  3. Member data

  4. Subtask data

  5. Like data

  6. Activity data
{
  "_id": "58b42f74428e764903e86395",
  "_creatorId": "54cb6200d1b4c6af47abe570",
  "_executorId": "556466f750e9c0503d58a37b",
  "_projectId": "58b42ef1c655de4203cad186",
  "_tasklistId": "58b42ef1428e764903e8636e",
  "_stageId": "58b42ef1428e764903e8636f",
  "involveMembers": [
    "54cb6200d1b4c6af47abe570",
    "556466f750e9c0503d58a37b"
  ],
  "dueDate": "2017-02-01T10:00:00.000Z",
  "note": "I'm overdue",
  "content": "An overdue task in the demo",
  "project": {
    "_id": "58b42ef1c655de4203cad186",
    "name": "ReactiveDB Demo"
  },
  "creator": {
    "_id": "54cb6200d1b4c6af47abe570",
    "name": "龙逸楠",
    "avatarUrl": "https://striker.teambition.net/thumbnail/110f238c339c7dacefbc792723a067fbf188/w/200/h/200"
  },
  "stage": {
    "_id": "58b42ef1428e764903e8636f",
    "name": "To do"
  },
  "executor": {
    "_id": "556466f750e9c0503d58a37b",
    "name": "太狼",
    "avatarUrl": "https://striker.teambition.net/thumbnail/110q6f384916ce773e9352b8bf190b5dadca/w/200/h/200"
  }
}

High real-time requirements

Almost all data needs to be updated over WebSocket
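
To make the real-time requirement concrete, here is a minimal sketch of merging a pushed patch into locally cached data. The message shape (`{ _id, patch }`) and the helper `applyPatch` are assumptions made for illustration, not Teambition's actual protocol.

```javascript
// Hypothetical local cache keyed by primary key.
const cache = new Map()
cache.set('58b42f74428e764903e86395', {
  _id: '58b42f74428e764903e86395',
  content: 'An overdue task in the demo',
  isDone: false
})

// Merge a patch into the cached entity. Returns the new entity, or null when
// the entity is not cached. Assumed message shape: { _id, patch }.
function applyPatch(store, message) {
  const current = store.get(message._id)
  if (!current) return null
  const next = { ...current, ...message.patch }
  store.set(message._id, next)
  return next
}

// In a browser this would be wired to a socket, for example:
// socket.onmessage = event => applyPatch(cache, JSON.parse(event.data))
const updated = applyPatch(cache, {
  _id: '58b42f74428e764903e86395',
  patch: { isDone: true }
})
console.log(updated.isDone) // true
```

Every view that shows this task then has to learn about the new value, which is the problem the following designs try to solve.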

How did we design the data layer for this kind of business scenario?

History of Teambition's technical evolution

  • 2013~2015 Backbone + jQuery
  • 2016
    1. Redux + Normalizr + Reselect + React
    2. Key Value Based Cache + RxJS + Vue.js
    3. Key Value Based Cache + RxJS + Angular2
    4. Lovefield + RxJS (ReactiveDB) + redux-observable + redux + React
  • 2017 ...

Backbone + jQuery

class TaskModel extends Model {
    listen() {
        this.socket.on('change', patch => {
            this.set(patch)
        })
    }
}
class TaskView {
    listenTo() {
        this.taskModel.on('change', taskModel => {
            this.renderTask(taskModel)
        })

        this.subtaskCollection.on('add remove change:_taskId', () => {
            this.taskModel.set({
                subtasksCount: this.subtaskCollection.length
            })
        })
        
        this.stageModel.on('change:name', stageModel => {
            this.renderStage(stageModel)
        })
    
        this.tasklistModel.on('change', tasklistModel => {
            this.renderTasklist(tasklistModel)
        })

        this.projectModel.on('change', projectModel => {
            this.renderProject(projectModel)
        })
    }
}

[Diagram: View (TaskCardView, TaskDetailView) and Model (TaskModel, other Models); labels: "diff, emit change", "emit more changes", "???"]

Redux + Normalizr + Reselect + React

[Diagram: Backend, Http & Websocket, Normalizr, Redux Store, Reselect, Connected Container, Component]

import { Observable } from 'rxjs'
import { normalize } from 'normalizr'
import { createSelector } from 'reselect'
import { connect } from 'react-redux'

// redux-observable middleware
// Ajax$, someSchema and the action creators are placeholders from the slide.
const epic = action$ => action$
    .ofType(`${someAction}`)
    .switchMap(action => {
        const { param } = action.payload
        return Ajax$.get(param)
          .map(response => otherAction(normalize(response, someSchema)))
          .catch(err => Observable.of(errorAction(err)))
          // takeUntil expects an Observable, so listen for the unmount action
          .takeUntil(action$.ofType(`${componentUnmount}`))
    })


// reselect
const selector = createSelector(
    state => state.someModule.entities,
    state => state.someOtherModule.entities.something,
    (dataYouCareAbout, relatedData) => {
        // diff and reselect
    }
)


// container
const mapStateToProps = state => ({
    selector1: selector1(state),
    selector2: selector2(state),
    selector3: selector3(state),
    // ...
})

connect(mapStateToProps)(component)
// reselect
  
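/**
 * Stores tasks that have not started yet
 * Definition:
 * & not done
 * & not archived
 * & _projectId matches
 * &
 * (
 * | no due date
 *   | has a start date & the start date is after today
 *   | no start date
 * | has a due date & the due date is after today
 *   | has a start date & the start date is after today
 *   | no start date
 * )
 */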
// projectId and now (a timestamp in ms) come from the enclosing scope
const taskFilter = data => {
    return data._projectId === projectId &&
        !data.isArchived &&
        !data.isDone &&
        (
          !data.dueDate &&
          (
            !data.startDate ||
            (
              data.startDate &&
              new Date(data.startDate).valueOf() > now
            )
          ) ||
          (
            data.dueDate &&
            new Date(data.dueDate).valueOf() > now &&
            (
              (
                data.startDate &&
                new Date(data.startDate).valueOf() > now
              ) ||
              !data.startDate
            )
          )
        )
}
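
Both due-date branches in the definition share the same start-date condition, so the predicate can be factored into two independent checks. The sketch below follows the definition in the comment (due date after now); the factory name `createNotStartedFilter` and the sample tasks are hypothetical.

```javascript
// Hypothetical factory: projectId and now (ms timestamp) are passed in
// instead of being read from an outer scope.
function createNotStartedFilter(projectId, now) {
  const isAfterNow = value => new Date(value).valueOf() > now
  return task =>
    task._projectId === projectId &&
    !task.isArchived &&
    !task.isDone &&
    // no due date, or due date after now
    (!task.dueDate || isAfterNow(task.dueDate)) &&
    // no start date, or start date after now
    (!task.startDate || isAfterNow(task.startDate))
}

const now = Date.parse('2017-03-01T00:00:00.000Z')
const sampleTasks = [
  { _id: 'a', _projectId: 'p1' },
  { _id: 'b', _projectId: 'p1', dueDate: '2017-02-01T10:00:00.000Z' },
  { _id: 'c', _projectId: 'p1', dueDate: '2017-04-01T00:00:00.000Z', startDate: '2017-03-15T00:00:00.000Z' },
  { _id: 'd', _projectId: 'p1', startDate: '2017-02-15T00:00:00.000Z' },
  { _id: 'e', _projectId: 'p2' },
  { _id: 'f', _projectId: 'p1', isDone: true }
]

const notStartedIds = sampleTasks
  .filter(createNotStartedFilter('p1', now))
  .map(task => task._id)
console.log(notStartedIds) // [ 'a', 'c' ]
```

Even in this shorter form, every such list still needs its own hand-written predicate.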

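Pros:

  • All data is flattened, so there is no need to maintain multiple similar copies of the same data
  • Selectors can do very fine-grained diffing, so components get the best performance without extra shouldComponentUpdate (SCU) code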

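Cons:

  • It does not handle associations; they have to be specified by hand
  • Not smart enough: deciding whether a state change is one the container cares about requires hand-writing many selectors to filter
  • For list data, filtering and sorting are clumsy to write and not intuitive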
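
The first drawback can be illustrated without any library. The sketch below flattens a nested task by hand and then re-attaches one association by hand; `normalizeTask`, `selectTaskWithExecutor` and the table names are hypothetical and are not Normalizr's API.

```javascript
// Flatten a nested task into per-type tables keyed by _id.
function normalizeTask(task, tables = {}) {
  const { project, creator, executor, stage, ...flatTask } = task
  const nested = [
    ['Project', project],
    ['Member', creator],
    ['Member', executor],
    ['Stage', stage]
  ]
  for (const [table, entity] of nested) {
    if (!entity) continue
    tables[table] = tables[table] || {}
    tables[table][entity._id] = { ...tables[table][entity._id], ...entity }
  }
  tables.Task = tables.Task || {}
  tables.Task[flatTask._id] = flatTask
  return tables
}

// The association must be spelled out manually for every view that needs it.
function selectTaskWithExecutor(tables, taskId) {
  const task = (tables.Task || {})[taskId]
  if (!task) return undefined
  return { ...task, executor: (tables.Member || {})[task._executorId] }
}

const tables = normalizeTask({
  _id: 't1',
  _executorId: 'm2',
  content: 'An overdue task in the demo',
  creator: { _id: 'm1', name: 'Creator' },
  executor: { _id: 'm2', name: 'Executor' }
})

// A member is stored once, so one update is visible through every task.
tables.Member.m2 = { ...tables.Member.m2, name: 'Renamed' }
console.log(selectTaskWithExecutor(tables, 't1').executor.name) // Renamed
```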

Key Value Cache + RxJS

class Database {
    cache = new Map()

    save(data, primaryKey = '_id') {
        // normalize
    }

    update(primaryKey, patch) {
        // ...
    }

    get(primaryKey) {
        // reselect
        /**
         * updateStream$ = socket.getUpdate(primaryKey)
         *    .merge(http.getUpdate(primaryKey))
         **/
        return Observable.create(observer => {
            observer.next(this.cache.get(primaryKey))
        })
            .combineLatest(updateStream$)
            // merge the patch's fields into the cached data
            .map(([data, patch]) => ({ ...data, ...patch }))
            .takeUntil(deleteStream$)
    }
}
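
For readers without RxJS at hand, the same idea can be sketched with plain callbacks. `TinyCache` and its methods are hypothetical and only mirror the shape of the class above: each subscriber receives the current value and every later version until it unsubscribes.

```javascript
class TinyCache {
  constructor() {
    this.cache = new Map()
    this.listeners = new Map() // primaryKey -> Set of callbacks
  }

  save(data, primaryKey = '_id') {
    this.cache.set(data[primaryKey], data)
    this.notify(data[primaryKey])
  }

  update(key, patch) {
    if (!this.cache.has(key)) return
    this.cache.set(key, { ...this.cache.get(key), ...patch })
    this.notify(key)
  }

  // Emits the current value immediately (if cached), then every later
  // version. Returns an unsubscribe function.
  subscribe(key, callback) {
    if (!this.listeners.has(key)) this.listeners.set(key, new Set())
    this.listeners.get(key).add(callback)
    if (this.cache.has(key)) callback(this.cache.get(key))
    return () => this.listeners.get(key).delete(callback)
  }

  notify(key) {
    const callbacks = this.listeners.get(key)
    if (callbacks) callbacks.forEach(callback => callback(this.cache.get(key)))
  }
}

const db = new TinyCache()
db.save({ _id: 't1', content: 'write slides', isDone: false })

const seen = []
const unsubscribe = db.subscribe('t1', task => seen.push(task.isDone))
db.update('t1', { isDone: true })
unsubscribe()
db.update('t1', { isDone: false }) // no longer delivered
console.log(seen) // [ false, true ]
```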

[Diagram: View, Model]

Pros and cons: similar to Redux + Normalizr + Reselect

What would our ideal solution look like?

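  • All data is flattened; data with the same primaryKey is stored only once
  • When reading from the data layer, the required fields and associations can be specified precisely, and related changes are picked up automatically
  • When fetching list data from the data layer, query conditions, ordering, pagination and so on can be defined simply and intuitively
  • When there are no consumers, changes to the related data are no longer observed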
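
The last point is essentially reference counting. As a rough sketch under that assumption, the hypothetical helper `shareWhileObserved` starts the underlying observation when the first consumer arrives and stops it when the last one leaves; it is not part of any library mentioned here.

```javascript
// start(emit) is called for the first consumer and must return a stop function.
function shareWhileObserved(start) {
  const consumers = new Set()
  let stop = null
  return function subscribe(consumer) {
    consumers.add(consumer)
    if (consumers.size === 1) {
      stop = start(value => consumers.forEach(each => each(value)))
    }
    return function unsubscribe() {
      if (!consumers.delete(consumer)) return
      if (consumers.size === 0 && stop) {
        stop()
        stop = null
      }
    }
  }
}

let active = 0 // how many underlying observations are running
let push = null // stands in for a socket pushing patches
const observeTask = shareWhileObserved(emit => {
  active += 1
  push = emit
  return () => {
    active -= 1
    push = null
  }
})

const received = []
const offA = observeTask(value => received.push(['A', value]))
const offB = observeTask(value => received.push(['B', value]))
push('patch-1') // delivered to A and B through one shared observation
offA()
push('patch-2') // delivered to B only
offB() // last consumer left, so the observation stops
console.log(active) // 0
```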

[Diagram: Backend Database (Flatten Data), Backend Services (Nested Data), "????", Observed Nested Data; Frontend: Flatten Data, Components]

Lovefield

  • Relational
  • Supports Observe
  • Queries support Predicate, limit, skip and OrderBy
  • Transaction
  • Persistence strategies such as IndexedDB / LocalStorage / WebSQL
  • DOM-independent, so it can run inside various Workers


// Define Schema and Connect
const schemaBuilder = lf.schema.create('teambition', 1)

schemaBuilder.createTable('Task')
    .addColumn('_id', lf.Type.STRING)
    .addColumn('content', lf.Type.STRING)
    .addColumn('created', lf.Type.STRING)
    .addColumn('dueDate', lf.Type.STRING)
    .addColumn('priority', lf.Type.INTEGER)
    // ...
    .addPrimaryKey(['_id'])
    .addNullable(['content', 'created', 'dueDate', 'priority'])

schemaBuilder.createTable('Member')
    .addColumn('_id', lf.Type.STRING)
    .addColumn('name', lf.Type.STRING)
    .addColumn('avatarUrl', lf.Type.STRING)
    // ...
    .addPrimaryKey(['_id'])
    .addNullable([/** ... */])

//...

// Schema is defined, now connect to the database instance.
schemaBuilder
    .connect({ /** storeType: lf.DataStoreType.INDEXED_DB */ })
    .then(db => {
      // Schema is not mutable once the connection to DB has established.
    })
// Select
const taskTable = db.getSchema().table('Task')
const memberTable = db.getSchema().table('Member')
const query = db.select(
    taskTable._id, taskTable.content, memberTable._id,
    memberTable.name, memberTable.avatarUrl
)
    // join
    .from(taskTable, memberTable)
    // oh no....
    .where(lf.op.and(
        taskTable.created.lt(moment().startOf('day').valueOf()),
        taskTable._executorId.eq(memberTable._id)
    ))
    .orderBy(taskTable.priority, lf.Order.DESC)
    .orderBy(taskTable.dueDate, lf.Order.DESC)
    // Page 2, 20 rows per page
    // Sad story: the result is a Cartesian product, so paging like this does not work
    .limit(20)
    .skip(20)

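// Select Data
query.exec().then(rows => {
    // rows is a Cartesian product; regroup it into a graph here
})

// Observe
const handler = changes => {
    // effect
}
db.observe(query, handler)

// Unobserve: pass the same function reference that was observed
db.unobserve(query, handler)
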
const Tasks = [Task1, Task2, Task3 ...]
const Subtasks = [Subtask1, Subtask2, Subtask3 ...]
const Projects = [Project1]

const result = [
    {
        Task: Task1,
        Subtask: Subtask1,
        Project: Project1
    },
    {
        Task: Task1,
        Subtask: Subtask2,
        Project: Project1
    },
    {
        Task: Task2,
        Subtask: Subtask3,
        Project: Project1
    },
    {
        Task: Task2,
        Subtask: Subtask4,
        Project: Project1
    }
    // ....
]
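
The "graph it" step can be sketched as a fold over the joined rows. The function `foldRows` and the sample rows are hypothetical and assume the row shape shown above (one row per Task and Subtask pair).

```javascript
// Group joined rows back into nested tasks.
function foldRows(rows) {
  const tasks = new Map() // insertion order preserves the query's ordering
  for (const row of rows) {
    const id = row.Task._id
    if (!tasks.has(id)) {
      tasks.set(id, { ...row.Task, project: row.Project, subtasks: [] })
    }
    if (row.Subtask) tasks.get(id).subtasks.push(row.Subtask)
  }
  return [...tasks.values()]
}

const project = { _id: 'p1', name: 'ReactiveDB Demo' }
const rows = [
  { Task: { _id: 't1' }, Subtask: { _id: 's1' }, Project: project },
  { Task: { _id: 't1' }, Subtask: { _id: 's2' }, Project: project },
  { Task: { _id: 't2' }, Subtask: { _id: 's3' }, Project: project },
  { Task: { _id: 't2' }, Subtask: { _id: 's4' }, Project: project }
]

const graph = foldRows(rows)
console.log(graph.length) // 2
console.log(graph[0].subtasks.length) // 2

// A limit of 2 applied to the joined rows yields one task, not two:
console.log(foldRows(rows.slice(0, 2)).length) // 1
```

The last line shows why `limit` and `skip` on the joined query cannot express "20 tasks per page": they count rows of the product, not tasks.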

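Drawbacks

  • Table definitions and queries are not intuitive enough

  • Joins are very tedious to write, and query results often need complex post-processing

  • The stored data types do not map onto the backend's

  • Observe and Unobserve are not automatic enough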

ReactiveDB

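  • Built on Lovefield, inheriting all of its powerful features
  • Uses RxJS, a natural fit for Observe, with no need to worry about when to Unobserve
  • Intuitive table definitions, intuitive associations, intuitive queries, and automatic handling of data returned by Join queries
  • Extends Lovefield's data types to match the various types of the backend database
  • Intuitive paginated queries
  • Intuitive handling of Order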
// Define the schema
ReactiveDB.defineSchema('Task', {
  _creatorId: {
    type: RDBType.STRING
  },
  _executorId: {
    type: RDBType.STRING
  },
  _id: {
    type: RDBType.STRING,
    primaryKey: true
  },
  _projectId: {
    type: RDBType.STRING
  },
  _stageId: {
    type: RDBType.STRING
  },
  _tasklistId: {
    type: RDBType.STRING
  },
  accomplished: {
    type: RDBType.DATE_TIME
  },
  content: {
    type: RDBType.STRING
  },
  created: {
    type: RDBType.DATE_TIME
  },
  customfields: {
    type: RDBType.OBJECT
  },
  dueDate: {
    type: RDBType.DATE_TIME
  },
  executor: {
    type: Association.oneToOne,
    virtual: {
      name: 'Member',
      where: memberTable => ({
        _executorId: memberTable._id
      })
    }
  },
  involveMembers: {
    type: RDBType.LITERAL_ARRAY
  },
  isArchived: {
    type: RDBType.BOOLEAN
  },
  isDone: {
    type: RDBType.BOOLEAN
  },
  note: {
    type: RDBType.STRING
  },
  pos: {
    type: RDBType.NUMBER
  },
  priority: {
    type: RDBType.NUMBER
  },
  project: {
    type: Association.oneToOne,
    virtual: {
      name: 'Project',
      where: projectTable => ({
        _projectId: projectTable._id
      })
    }
  },
  stage: {
    type: Association.oneToOne,
    virtual: {
      name: 'Stage',
      where: stageTable => ({
        _stageId: stageTable._id
      })
    }
  },
  startDate: {
    type: RDBType.DATE_TIME
  },
  subtaskCount: {
    type: RDBType.NUMBER
  },
  subtaskIds: {
    type: RDBType.LITERAL_ARRAY
  },
  subtasks: {
    type: Association.oneToMany,
    virtual: {
      name: 'Subtask',
      where: subtaskTable => ({
        _id: subtaskTable._taskId
      })
    }
  },
  tagIds: {
    type: RDBType.LITERAL_ARRAY
  },
  tasklist: {
    type: Association.oneToOne,
    virtual: {
      name: 'Tasklist',
      where: tasklistTable => ({
        _tasklistId: tasklistTable._id
      })
    }
  }
})
// Query data
const QueryToken = ReactiveDB.get('Task', {
  where: {
    created: {
      $gt: moment().startOf('day').valueOf()
    }
  },
  fields: [
    '_id',
    'content',
    {
      executor: ['_id', 'name', 'avatarUrl']
    }
  ],
  // yes it works
  skip: 20,
  limit: 20,
  orderBy: [
    { fieldName: 'priority', orderBy: 'DESC' },
    { fieldName: 'dueDate', orderBy: 'DESC' }
  ]
})
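
To illustrate what `where`, `orderBy`, `skip` and `limit` mean when they apply to whole entities rather than joined rows, here is a small in-memory sketch. `runQuery` and `matches` are hypothetical helpers over plain arrays, supporting only equality, `$gt` and `$lt`; this is not how ReactiveDB is implemented.

```javascript
// Check one row against a where clause like { created: { $gt: 8 } }.
function matches(row, where = {}) {
  return Object.entries(where).every(([field, condition]) => {
    if (condition === null || typeof condition !== 'object') {
      return row[field] === condition
    }
    return Object.entries(condition).every(([operator, value]) => {
      if (operator === '$gt') return row[field] > value
      if (operator === '$lt') return row[field] < value
      throw new Error(`Unsupported operator: ${operator}`)
    })
  })
}

// Filter first, then sort, then page, so skip/limit count entities.
function runQuery(rows, { where, orderBy = [], skip = 0, limit = Infinity }) {
  const compare = (a, b) => {
    for (const { fieldName, orderBy: direction } of orderBy) {
      if (a[fieldName] === b[fieldName]) continue
      const result = a[fieldName] < b[fieldName] ? -1 : 1
      return direction === 'DESC' ? -result : result
    }
    return 0
  }
  return rows
    .filter(row => matches(row, where))
    .sort(compare)
    .slice(skip, skip + limit)
}

const tasks = [
  { _id: 'a', priority: 1, created: 10 },
  { _id: 'b', priority: 2, created: 20 },
  { _id: 'c', priority: 2, created: 5 },
  { _id: 'd', priority: 0, created: 30 }
]

const page = runQuery(tasks, {
  where: { created: { $gt: 8 } },
  orderBy: [
    { fieldName: 'priority', orderBy: 'DESC' },
    { fieldName: '_id', orderBy: 'ASC' }
  ],
  skip: 1,
  limit: 2
})
console.log(page.map(task => task._id)) // [ 'a', 'd' ]
```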

QueryToken

  • concat: handles pagination
  • combine: combines multiple queries
  • changes: produces an Observable that observes a query
  • values: gets the query's values directly