Kokuyou (黒曜) / Shunsuke Mori
Twitter or X: @kokuyouwind
Work at: Leaner Technologies, Inc.
Platinum / Drinkup Sponsor
Day 2: Leaner YAKINIKU Party
Procurement Domain BtoB Startup
We are actively hiring engineers!
Feel free to talk to us or DM us!
Lightning talk at RubyKaigi 2023
Matz's keynote at RubyKaigi 2023
Oh no, Matz presented on the same topic
I was planning to cover!
This year...
Large Language Models (LLMs)
Machine learning models trained on large amounts of text data.
Examples of LLMs
OpenAI: ChatGPT 4
Anthropic: Claude 3
Microsoft: GitHub Copilot
Evolution of LLMs since last year
Improvement | Change | Before → After | Comparison |
---|---|---|---|
Price reduction | x 1/8 | $2.0 → $0.25 / 1M tokens | OpenAI gpt-3.5-turbo-0301 to Anthropic Claude 3 Haiku |
Context length expansion | x 480 | 4096 → 2M tokens | OpenAI gpt-3.5-turbo-0301 to Google Gemini 1.5 Pro 2M |
Improvement of coding skills | + 23.2 pt | 67.0% → 90.2% | GPT-4 to GPT-4o |
Now that LLMs' capabilities have increased so much,
might we even be able to guess RBS types for an entire project?
I made RBS Goose
as a tool to guess RBS types.
Duck quacking like geese (Duck Goose Typing)
Table of Contents
Purpose of creating RBS Goose
How RBS Goose works
Tips for Development with LLM
Performance Evaluation of RBS Goose
Conclusion
Purpose of creating RBS Goose
What is RBS?
A language for defining the type structure of Ruby programs
person.rb
class Person
  attr_reader :name
  def initialize(name:)
    @name = name
  end
  def name=(name)
    @name = name
  end
end
person.rbs
class Person
  @name: String
  attr_reader name: String
  def initialize: (name: String) -> void
  def name=: (String name) -> void
end
Why RBS is needed
Safe development through type checking
  Detect invalid method calls and similar mistakes before the code runs
  Available via steep check, etc.
Better development experience, including completion
  More accurate completion based on types
  Available via steep-vscode, TypeProf for IDE, etc.
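As a small illustration of the first point, the sketch below reuses the Person class from the previous slide and adds a deliberately misspelled call (`nmae` is a hypothetical typo, not from the talk). Plain Ruby only notices the mistake when the line actually runs. With the person.rbs signature in place, a type checker such as steep check is expected to report the unknown method before execution.

```ruby
class Person
  attr_reader :name

  def initialize(name:)
    @name = name
  end
end

person = Person.new(name: 'Alice')

# Hypothetical typo: Person has no method `nmae`.
# Without type checking, this only fails once the line is executed.
begin
  person.nmae
  typo_detected = false
rescue NoMethodError => e
  typo_detected = true
  puts "Runtime failure: #{e.class}"
end
```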
Existing methods for generating RBS from Ruby
rbs prototype rb: static parsing
rbs prototype runtime: dynamic load
typeprof: type level execution
Each method has its pros and cons, and it is difficult to generate a perfect RBS in one shot
rbs prototype rb (static parsing)
config.rb
class Config
  def self.configure(&block)
    new.tap(&block)
  end
  %w[hoge fuga piyo].each do |v|
    attr_accessor v
  end
end
# config = Config.configure do |c|
#   c.hoge = 1
#   c.fuga = 'a'
#   c.piyo = :piyo
# end
config.rbs
class Config
  def self.configure: () { () -> untyped } -> untyped
end
# All untyped, and
# No attribute accessors
rbs prototype runtime (dynamic load)
config.rb
class Config
  def self.configure(&block)
    new.tap(&block)
  end
  %w[hoge fuga piyo].each do |v|
    attr_accessor v
  end
end
# config = Config.configure do |c|
#   c.hoge = 1
#   c.fuga = 'a'
#   c.piyo = :piyo
# end
config.rbs
class Config
  def self.configure: () { (*untyped) -> untyped } -> untyped
  public
  def fuga: () -> untyped
  def fuga=: (untyped) -> untyped
  def hoge: () -> untyped
  def hoge=: (untyped) -> untyped
  def piyo: () -> untyped
  def piyo=: (untyped) -> untyped
end
# All untyped
typeprof (type level execution)
config.rb
class Config
  def self.configure(&block)
    new.tap(&block)
  end
  %w[hoge fuga piyo].each do |v|
    attr_accessor v
  end
end
# Required for type level execution
config = Config.configure do |c|
  c.hoge = 1
  c.fuga = 'a'
  c.piyo = :piyo
end
config.rbs
class Config
  def self.configure: { (Config) -> :piyo } -> Config
end
# Typed, but
# No attribute accessors
# (and I want the return value to be void.)
Fixing untyped and filling in what is missing is quite a lot of work
What I want to do with RBS Goose
[Diagram: Ruby → Existing Tools → RBS → Guessing by LLM → Refined RBS]
In the refined RBS, untyped becomes a concrete type
and the missing methods are filled in
To be upfront: this has not yet reached a practical level
How RBS Goose works
Example used for the explanation
lib/person.rb
class Person
  attr_reader :name
  def initialize(name:)
    @name = name
  end
  def name=(name)
    @name = name
  end
end
sig/person.rbs
class Person
  @name: String
  attr_reader name: String
  def initialize: (name: String) -> void
  def name=: (String name) -> void
end
Structure of RBS Goose (type guessing)
[Diagram: Ruby → rbs prototype (or other tools) → RBS; Ruby + RBS + examples → Prompt → LLM (e.g. ChatGPT) → Refined RBS]
Base RBS generated by rbs prototype (or other tools):
sig/person.rbs
class Person
  @name: untyped
  attr_reader name: untyped
  def initialize: (name: untyped) -> void
  def name=: (untyped name) -> void
end
Structure of RBS Goose (type guessing): examples
lib/example1.rb
class Example1
  attr_reader :quantity
  def initialize(quantity:)
    @quantity = quantity
  end
  def quantity=(quantity)
    @quantity = quantity
  end
end
sig/example1.rbs
class Example1
  @quantity: untyped
  attr_reader quantity: untyped
  def initialize: (quantity: untyped) -> void
  def quantity=: (untyped quantity) -> void
end
refined/sig/example1.rbs
class Example1
  @quantity: Integer
  attr_reader quantity: Integer
  def initialize: (quantity: Integer) -> void
  def quantity=: (Integer quantity) -> void
end
Structure of RBS Goose (type guessing): prompt
When ruby source codes and
RBS type signatures are given,
refine each RBS type signatures.
======== Input ========
```lib/example1.rb
...
```
```sig/example1.rbs
...
```
======== Output ========
```sig/example1.rbs
...
```
======== Input ========
```lib/person.rb
...
```
```sig/person.rbs
...
```
======== Output ========
The first Input/Output pair is an example; the second Input holds the target Ruby code and its RBS prototype; the LLM infers the final Output.
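The prompt layout above can be assembled mechanically. The following is a minimal sketch of that idea, not RBS Goose's actual implementation: `Sample`, `fenced`, `input_section`, and `build_prompt` are hypothetical names introduced here, and the instruction text is copied from the slide.

```ruby
# Hypothetical value object: one Ruby file, its prototype RBS, and
# (for examples only) the hand-refined RBS.
Sample = Struct.new(:rb_path, :rb, :rbs_path, :rbs, :refined_rbs, keyword_init: true)

INSTRUCTION = <<~TEXT
  When ruby source codes and
  RBS type signatures are given,
  refine each RBS type signatures.
TEXT

FENCE = '`' * 3 # three backticks, built this way to keep this listing readable

# Wrap a file body in a fence whose info string is the file path.
def fenced(path, body)
  "#{FENCE}#{path}\n#{body.chomp}\n#{FENCE}\n"
end

def input_section(sample)
  "======== Input ========\n" +
    fenced(sample.rb_path, sample.rb) +
    fenced(sample.rbs_path, sample.rbs)
end

# Few-shot prompt: each example contributes an Input/Output pair, and the
# target contributes an Input followed by an empty Output for the LLM to fill.
def build_prompt(examples, target)
  shots = examples.map do |example|
    input_section(example) +
      "======== Output ========\n" +
      fenced(example.rbs_path, example.refined_rbs)
  end
  INSTRUCTION + shots.join + input_section(target) + "======== Output ========\n"
end

example = Sample.new(
  rb_path: 'lib/example1.rb', rb: "class Example1\n  attr_reader :quantity\nend",
  rbs_path: 'sig/example1.rbs', rbs: "class Example1\n  attr_reader quantity: untyped\nend",
  refined_rbs: "class Example1\n  attr_reader quantity: Integer\nend"
)
target = Sample.new(
  rb_path: 'lib/person.rb', rb: "class Person\n  attr_reader :name\nend",
  rbs_path: 'sig/person.rbs', rbs: "class Person\n  attr_reader name: untyped\nend"
)

prompt = build_prompt([example], target)
puts prompt
```

Ending the prompt with an empty Output header nudges the model to continue with the refined signatures in the same fenced format as the example.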
Structure of RBS Goose (type guessing): LLM output
```sig/person.rbs
class Person
  @name: String
  attr_reader name: String
  def initialize: (name: String) -> void
  def name=: (String name) -> void
end
```
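Because the LLM answers in the same fenced format, its response can be split back into files. This is a minimal sketch under the assumption that every block opens with three backticks followed by a path (optionally prefixed with a language tag such as `rbs:`); `parse_refined_rbs` and `BLOCK_PATTERN` are hypothetical names, not RBS Goose's API.

```ruby
TICKS = '`' * 3 # three backticks, built this way to keep this listing readable

# Matches a fenced block whose info string is a file path, e.g.
# "sig/person.rbs" or "rbs:sig/person.rbs" (the "lang:" prefix is dropped).
BLOCK_PATTERN = /^#{TICKS}(?:\w+:)?(?<path>\S+)\n(?<body>.*?)^#{TICKS}$/m

# Returns a Hash of { path => RBS source } extracted from the LLM response.
def parse_refined_rbs(response)
  response.scan(BLOCK_PATTERN).to_h
end

response = [
  "#{TICKS}sig/person.rbs",
  'class Person',
  '  @name: String',
  'end',
  TICKS
].join("\n")

files = parse_refined_rbs(response)
files.each { |path, body| puts "#{path}: #{body.lines.size} lines" }
```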
Handling multiple files
lib/person.rb
class Person
  attr_reader :name
end
lib/person_name.rb
class PersonName
  attr_reader :value
end
sig/person.rbs
class Person
  # Not a String
  @name: PersonName
end
Handling multiple files: strategies
Pass a list of all constants
Use RAG to combine and pass the files that are likely to be related
Infer all Ruby files at once
Handling multiple files: what I chose
Infer all Ruby files at once
  The AI can make a comprehensive decision by looking at all of the code
  A small project fits in 128K tokens
  This was unrealistic as of last year, when 4K tokens was the maximum
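To make the chosen strategy concrete, here is a minimal sketch that gathers every Ruby file with its signature and roughly checks that the whole set fits the context window. It assumes a 1:1 layout of lib/foo.rb and sig/foo.rbs, and it estimates tokens as characters divided by four, which is a rough heuristic and not a real tokenizer. `collect_pairs`, `estimated_tokens`, and `fits_in_context?` are hypothetical helper names.

```ruby
require 'tmpdir'
require 'fileutils'

TOKEN_LIMIT = 128_000 # context size mentioned on the slide
CHARS_PER_TOKEN = 4   # rough assumption, not a real tokenizer

# Pair every lib/**/*.rb with sig/**/*.rbs (assumed 1:1 layout).
# A missing signature file is treated as an empty string.
def collect_pairs(root)
  Dir.glob('lib/**/*.rb', base: root).sort.map do |rb_path|
    rbs_path = rb_path.sub(%r{\Alib/}, 'sig/').sub(/\.rb\z/, '.rbs')
    rbs_file = File.join(root, rbs_path)
    {
      rb_path: rb_path,
      rb: File.read(File.join(root, rb_path)),
      rbs_path: rbs_path,
      rbs: File.exist?(rbs_file) ? File.read(rbs_file) : ''
    }
  end
end

def estimated_tokens(pairs)
  pairs.sum { |pair| pair[:rb].size + pair[:rbs].size } / CHARS_PER_TOKEN
end

def fits_in_context?(pairs)
  estimated_tokens(pairs) <= TOKEN_LIMIT
end

# Usage on a throwaway project directory.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p([File.join(root, 'lib'), File.join(root, 'sig')])
  File.write(File.join(root, 'lib/person.rb'), "class Person\nend\n")
  File.write(File.join(root, 'sig/person.rbs'), "class Person\nend\n")

  pairs = collect_pairs(root)
  puts pairs.map { |pair| pair[:rb_path] }.inspect
  puts "fits in context: #{fits_in_context?(pairs)}"
end
```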
Actual prompt
Act as Ruby type inferrer.
When ruby source codes and RBS type signatures are given,
refine each RBS type signatures.
Each file should be split in markdown code format.
Use class names, variable names, etc., to infer type.
========Input========
```ruby:lib/email.rb
class Email
  # @dynamic address
  attr_reader :address
  def initialize(address:)
    @address = address
  end
  def ==(other)
    other.is_a?(self.class) && other.address == address
  end
  def hash
    self.class.hash ^ address.hash
  end
end
```
```rbs:sig/email.rbs
class Email
  @address: untyped
  attr_reader address: untyped
  def initialize: (address: untyped) -> void
  def ==: (untyped other) -> untyped
  def hash: () -> untyped
end
```
```ruby:lib/person.rb
class Person
  # @dynamic name, contacts
  attr_reader :name
  attr_reader :contacts
  def initialize(name:)
    @name = name
    @contacts = []
  end
  def name=(name)
    @name = name
  end
  def guess_country()
    contacts.map do |contact|
      case contact
      when Phone
        contact.country
      end
    end.compact.first
  end
end
```
```rbs:sig/person.rbs
class Person
  @name: untyped
  @contacts: untyped
  attr_reader name: untyped
  attr_reader contacts: untyped
  def initialize: (name: untyped) -> void
  def name=: (untyped name) -> void
  def guess_country: () -> untyped
end
```
```ruby:lib/phone.rb
class Phone
  # @dynamic country, number
  attr_reader :country, :number
  def initialize(country:, number:)
    @country = country
    @number = number
  end
  def ==(other)
    if other.is_a?(Phone)
      # @type var other: Phone
      other.country == country && other.number == number
    else
      false
    end
  end
  def hash
    self.class.hash ^ country.hash ^ number.hash
  end
end
```
```rbs:sig/phone.rbs
class Phone
  @country: untyped
  @number: untyped
  attr_reader country: untyped
  attr_reader number: untyped
  def initialize: (country: untyped, number: untyped) -> void
  def ==: (untyped other) -> (untyped | nil)
  def hash: () -> untyped
end
```
========Output========
```rbs:sig/email.rbs
class Email
  @address: String
  attr_reader address: String
  def initialize: (address: String) -> void
  def ==: (Object other) -> bool
  def hash: () -> Integer
end
```
```rbs:sig/person.rbs
class Person
  @name: String
  @contacts: Array[(Email | Phone)]
  attr_reader name: String
  attr_reader contacts: Array[(Email | Phone)]
  def initialize: (name: String) -> void
  def name=: (String name) -> void
  def guess_country: () -> (String | nil)
end
```
```rbs:sig/phone.rbs
class Phone
  @country: String
  @number: String
  attr_reader country: String
  attr_reader number: String
  def initialize: (country: String, number: String) -> void
  def ==: (Object other) -> (bool | nil)
  def hash: () -> Integer
end
```
========Input========
```ruby:lib/user.rb
class User
  def initialize(name:)
    @name = name
  end
  attr_reader :name
end
```
```rbs:sig/user.rbs
class User
  @name: untyped
  def initialize: (name: untyped) -> void
  attr_reader name: untyped
end
```
```ruby:lib/user_factory.rb
class UserFactory
  def name(name)
    @name = name
    self
  end
  def build
    User.new(name: @name)
  end
end
```
```rbs:sig/user_factory.rbs
class UserFactory
  @name: untyped
  def name: (untyped name) -> self
  def build: () -> untyped
end
```
========Output========
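Usually it does not work on the first try.
The RBS has to be fixed while looking at the type errors that occur.
Structure of RBS Goose (error fixing)
[Diagram: Ruby + RBS + Errors from Steep Check + examples → Prompt → LLM (e.g. ChatGPT) → Fixed RBS]
Still experimental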
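The error-fixing flow can be read as a bounded retry loop: check the RBS, and if errors remain, hand them back to the LLM together with the current RBS. The sketch below shows only that control flow. `fix_until_clean` is a hypothetical name, and the two lambdas are stand-ins for the real steep check run and LLM call, so nothing here reflects RBS Goose's actual interface.

```ruby
# check:  ->(rbs) { [error messages] }      stands in for `steep check`
# refine: ->(rbs, errors) { refined rbs }   stands in for the LLM call
# Stops as soon as the check is clean, or after max_rounds refinements.
def fix_until_clean(rbs, refine:, check:, max_rounds: 3)
  max_rounds.times do
    errors = check.call(rbs)
    return rbs if errors.empty?

    rbs = refine.call(rbs, errors)
  end
  rbs
end

# Stub checker: pretends that any remaining `untyped` is an error.
check = ->(rbs) { rbs.include?('untyped') ? ['untyped remains'] : [] }
# Stub refiner: replaces one `untyped` per round.
refine = ->(rbs, _errors) { rbs.sub('untyped', 'String') }

fixed = fix_until_clean('def name: () -> untyped', refine: refine, check: check)
puts fixed
```

Capping the number of rounds matters because each round is a paid, slow API call and the model is not guaranteed to converge.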
Tips for Development with LLM
Configuring RBS Goose
I want users to be able to choose which LLM to use
Use an LLM framework as the adapter
Use the Langchain.rb gem by @andreibondarev
RBS Goose configuration example
api_key = ENV.fetch('OPENAI_ACCESS_TOKEN')
RbsGoose.configure do |c|
  # Use the provided configuration methods
  c.use_open_ai(api_key)
  # or directly configure an instance of Langchain::LLM
  c.llm.client = ::Langchain::LLM::OpenAI.new(api_key:)
  # or a local server such as Ollama
  c.llm.client = ::Langchain::LLM::Ollama.new(
    url: "http://localhost:11434"
  )
end
Testing RBS Goose
LLM APIs are expensive and have high latency
  Critically unsuitable for CI
Web mocks such as the VCR gem can be used
  Match requests exactly, including the request body
  Set temperature to 0 for reproducibility
Testing RBS Goose - VCR setup
# spec/spec_helper.rb
require 'vcr'
VCR.configure do |config|
  config.cassette_library_dir = 'spec/fixtures/vcr_cassettes'
  config.hook_into :webmock
  config.default_cassette_options = {
    # Match on the request body too, so a changed prompt is never replayed
    match_requests_on: %i[method uri body],
    record: ENV.fetch('RECORD', :once).to_sym
  }
  # Keep the real API key out of the recorded cassettes
  config.filter_sensitive_data('<openai_access_token>') do
    ENV.fetch('OPENAI_ACCESS_TOKEN')
  end
end
Testing RBS Goose - using VCR
# spec/rbs_goose/type_inferrer_spec.rb
RSpec.describe RbsGoose::TypeInferrer, :configure do
  it 'returns refined rbs' do
    VCR.use_cassette('openai/infer') do
      expect(described_class.new.infer).to eq(refined_rbs_list)
    end
  end
end
Example of a recorded request
---
http_interactions:
- request:
    method: post
    uri: https://api.openai.com/v1/chat/completions
    body:
      encoding: UTF-8
      string:
        '{"messages":[
        {"role":"user","content":"Act as Ruby type inferrer..."}],
        "model":"gpt-3.5-turbo-1106","n":1,
        "temperature":0.0}'
    headers:
      Content-Type:
      - application/json
      Authorization:
      - Bearer <openai_access_token>
    ...
  response:
    ...
Performance Evaluation of RBS Goose
Evaluation 1: Config & Runner
lib/config.rb
class Config
  def self.configure(&block)
    new.tap(&block)
  end
  %w[client role prompt].each do
    attr_accessor _1.to_sym
  end
end
lib/runner.rb
class Runner
  def initialize(config)
    @config = config
  end
  def run
    config.client.chat(
      messages: [{
        role: config.role,
        content: config.prompt
      }]
    ).chat_completion
  end
  private
  attr_reader :config
end
Evaluation 1: Config & Runner
Had RBS Goose guess types for a small example involving metaprogramming
The base RBS was generated by each of the three methods explained earlier
Tried OpenAI and Anthropic models, plus CodeGemma (a local LLM)
Ran steep check, plus a quality check by reading the RBS
  Checked whether any untyped remained that could be made more specific, etc.
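The quality check in the talk was done by reading the RBS. As a rough aid for that kind of review, a sketch like the following could count how many `untyped` occurrences remain per file. `untyped_counts` is a hypothetical helper, and a raw count cannot tell whether a given `untyped` is justified, so it only points at where to look.

```ruby
# Count occurrences of the `untyped` keyword in each signature file.
# Input: { path => RBS source }, output: { path => count }.
def untyped_counts(rbs_by_path)
  rbs_by_path.transform_values { |rbs| rbs.scan(/\buntyped\b/).size }
end

counts = untyped_counts(
  'sig/config.rbs' => "class Config\n  def hoge: () -> untyped\n  def hoge=: (untyped) -> untyped\nend\n",
  'sig/person.rbs' => "class Person\n  attr_reader name: String\nend\n"
)

counts.each { |path, count| puts "#{path}: #{count} untyped" }
```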
Result 1: Quality of the generated RBS
Platform | Model | Size | prototype rb base | prototype runtime base | Typeprof base |
---|---|---|---|---|---|
OpenAI | GPT-3.5 Turbo | Small | Perfect | Perfect | Almost |
OpenAI | GPT-4 Turbo | Large | Perfect | Perfect | Perfect |
OpenAI | GPT-4 Omni | Large | Perfect | Perfect | Perfect |
Anthropic | Claude 3 Haiku | Small | Perfect | Almost | Almost |
Anthropic | Claude 3 Sonnet | Medium | Almost | Not Good | Perfect |
Anthropic | Claude 3 Opus | Large | Almost | Almost | Almost |
Ollama(Local) | CodeGemma | Small | Not Good | Not Good | Not Good |
Result 1: Attribute accessors
Regardless of the base RBS, the output was much the same.
Result 1: Example of an Almost result
In some cases the model fabricated the return type
of Langchain::LLM::OpenAI#chat
Result 1: An interesting case
In just one case, gpt-4-turbo added a comment
explaining why it had left a type as untyped
Result 1: Execution time [sec]
Platform | Model | Size | prototype rb base | prototype runtime base | Typeprof base |
---|---|---|---|---|---|
OpenAI | GPT-3.5 Turbo | Small | 2.2 | 2.2 | 4.8 |
OpenAI | GPT-4 Turbo | Large | 7.6 | 11.4 | 7.2 |
OpenAI | GPT-4 Omni | Large | 1.8 | 1.7 | 1.9 |
Anthropic | Claude 3 Haiku | Small | 3.3 | 3.3 | 2.9 |
Anthropic | Claude 3 Sonnet | Medium | 3.5 | 8.6 | 3.0 |
Anthropic | Claude 3 Opus | Large | 14.6 | 13.0 | 13.1 |
Ollama(Local) | CodeGemma | Small | 7.1 | 7.4 | 4.1 |
Result 1: Execution time in seconds (Perfect or Almost only)
Platform | Model | Size | prototype rb base | prototype runtime base | Typeprof base |
---|---|---|---|---|---|
OpenAI | GPT-3.5 Turbo | Small | Perfect (2.2) | Perfect (2.2) | Almost (4.8) |
OpenAI | GPT-4 Turbo | Large | Perfect (7.6) | Perfect (11.4) | Perfect (7.2) |
OpenAI | GPT-4 Omni | Large | Perfect (1.8) | Perfect (1.7) | Perfect (1.9) |
Anthropic | Claude 3 Haiku | Small | Perfect (3.3) | Almost (3.3) | Almost (2.9) |
Anthropic | Claude 3 Sonnet | Medium | Almost (3.5) | - | Perfect (3.0) |
Anthropic | Claude 3 Opus | Large | Almost (14.6) | Almost (13.0) | Almost (13.1) |
Experiment 1: Discussion
The choice of base RBS generation method made little difference
  It seems fine to focus on rbs prototype rb, which is easy and fast to run
As for the model, the GPT family clearly performed better
  GPT-4 Omni was the fastest and still produced ideal output
The rbs prototype rb + GPT-4 Omni combination looks good
Evaluation 2: RbsGoose
Infer RBS from the entire Ruby code of RbsGoose itself
Only rbs prototype rb was used as the base
# File Count
❯ find lib -type f | wc -l
17
# Line Count
❯ find lib -type f | xargs cat | wc -l
698
# Size Count
❯ du -sh lib
68K lib
Result 2: Quality of the generated RBS
Platform | Model | Size | Quality | Time [sec] | Cost [¢] |
---|---|---|---|---|---|
OpenAI | GPT-3.5 Turbo | Small | Poor | 4.3 | 0.44 |
OpenAI | GPT-4 Turbo | Large | Almost | 69.2 | 12.6 |
OpenAI | GPT-4 Omni | Large | Almost | 52.5 | 7.86 |
Anthropic | Claude 3 Haiku | Small | Poor | 33.4 | 0.65 |
Anthropic | Claude 3 Sonnet | Medium | Almost | 55.5 | 7.88 |
Anthropic | Claude 3 Opus | Large | Almost | 90.7 | 35.72 |
Ollama(Local) | CodeGemma | Small | Subtle | 95.9 | N/A |
Result 2: Overview of the Almost output
Overall, the types were guessed well, including generics.
sig/rbs_goose/io/example_group.rbs
class RbsGoose::IO::ExampleGroup < ::Array[RbsGoose::IO::Example]
  self.@default_examples: Hash[Symbol, RbsGoose::IO::ExampleGroup]
  attr_accessor error_messages: String?
  def self.load_from:
    (String base_path, ?code_dir: String, ?sig_dir: String, ?refined_dir: String)
    -> RbsGoose::IO::ExampleGroup
  def self.default_examples: () -> Hash[Symbol, RbsGoose::IO::ExampleGroup]
  private def self.load_example:
    (String base_path, String code_dir, String path, String refined_dir, String sig_dir)
    -> RbsGoose::IO::Example
  private def self.to_rbs_path: (String path, String sig_dir) -> String
  def to_target_group: () -> RbsGoose::IO::TargetGroup
  def to_refined_rbs_list: () -> Array[RbsGoose::IO::File]
end
Evaluation 2: Points of failure
Failure: syntax errors in the RBS generated for Struct and def_delegator
class RbsGoose::Configuration
  LLMConfig: Struct[client: ::Langchain::LLM::Base, ...
  TemplateConfig: Struct[instruction: String, ...
  def_delegator llm, :client, :llm_client
  def_delegator llm, :mode, :llm_mode
  ...
Evaluation 2: What happens once those are fixed
Experiment 2: Discussion
RBS for special cases such as Struct is not handled well
  These cases need to be included in the examples, or fine-tuning is required
Assuming a 1:1 correspondence between Ruby files and RBS files was not a good idea
  rbs_rails, typeprof, etc. generate RBS at the top level, so the files cannot be matched up
I still want automatic fixing of type errors
Conclusion
Introduced a case study of creating RBS Goose
Explained how the prompt is composed and the intent behind it
Presented some tips for development with LLMs
RBS Goose is still experimental
I hope this conveys that LLMs might let you do some interesting things
Things I'm curious about
Tomorrow's session schedule
Seems to go well with AI completion
I tried GitHub Copilot and it completes quite well.
Editing entire projects with AI could also work well,
as with Open Interpreter or Copilot Workspace
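Death notice for RBS Goose?
I don't yet know whether RBS Goose will turn out to be a dead duck (end in failure)
or a goose that lays golden eggs,
so before I cook my own goose (throw away my own chance of success),
I would like to keep going a little longer.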