Christopher Pitt
Writer and coder, working at ringier.co.za
User asks question
Robot thinks
Robot answers question
User asks question
(using voice)
Robot thinks
(using chatgpt?)
Robot answers question
(using audio)
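The three steps above can be chained roughly as follows. This is only a sketch: `answerVoiceQuestion` and its parameter names are hypothetical, and each step is passed in as a callable so the example does not assume any particular speech or AI provider.

```php
<?php

declare(strict_types=1);

/**
 * Run the three steps in order:
 * question audio -> question text -> answer text -> answer audio.
 *
 * All names here are illustrative assumptions, not taken from the slides.
 */
function answerVoiceQuestion(
    string $questionAudioFile,
    callable $speechToText,
    callable $askAi,
    callable $textToSpeech,
): array {
    // User asks question (using voice)
    $question = $speechToText($questionAudioFile);

    // Robot thinks
    $answer = $askAi($question);

    // Robot answers question (using audio)
    $answerAudioFile = $textToSpeech($answer);

    return [
        'question' => $question,
        'answer' => $answer,
        'audio_file_name' => $answerAudioFile,
    ];
}
```

Passing the steps in as callables keeps each provider swappable and makes the flow easy to exercise with stubs.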
$relativePath = fmt('%/%', messageAudioFolderPath(), $fileName);
$type = Storage::mimeType($relativePath);
$response = Http::acceptJson()
->withHeaders([
'Authorization' => 'Token '.config('services.deepgram.secret'),
])
->withBody(
Storage::get($relativePath), $type
)
->post('https://api.deepgram.com/v1/listen');
if ($alternatives = $response->json('results.channels.0.alternatives')) {
return $alternatives[0]['transcript'];
}
throw new ConvertQuestionAudioToTextException(
'Could not convert question audio to text'
);
<div
class="container mx-auto"
x-data="{
isRecording: false,
blob: @entangle('blob').defer,
async startRecording() {
this.isRecording = true;
return window.AudioRecorder.start();
},
async stopRecording() {
var blob = await window.AudioRecorder.stop();
var base64 = await this.blobToBase64(blob);
this.isRecording = false;
this.blob = base64;
},
blobToBase64(blob) {
return new Promise(resolve => {
const reader = new FileReader();
reader.onloadend = () => resolve(reader.result);
reader.readAsDataURL(blob);
});
},
}"
>
By Reema Alzohairi
if (! function_exists('writeAudioFile')) {
function writeAudioFile(string $blob): string
{
$tempName = tempnam(sys_get_temp_dir(), 'question');
file_put_contents($tempName, file_get_contents($blob));
$path = Storage::putFile(
messageAudioFolderPath(),
new File($tempName),
);
return str($path)->afterLast('/');
}
}
$tempName = tempnam(sys_get_temp_dir(), 'answer').'.mp3';
Http::sink($tempName)
->withOptions([
'curl' => [
CURLOPT_USERPWD => 'apikey:'.config('services.watson-tts.secret'),
],
])
->accept('audio/mpeg')
->post(config('services.watson-tts.url').'/v1/synthesize', [
'text' => $text,
])
->onError(function (Response $response) {
throw new ConvertAnswerTextToAudioException($response->body());
});
if ($path = Storage::putFile(messageAudioFolderPath(), new File($tempName))) {
return str($path)->afterLast('/');
}
throw new ConvertAnswerTextToAudioException('Could not write answer audio file');
@if ($message->audio_file_name)
<audio
controls
autoplay
>
<source
src="/storage/messages/{{ $message->audio_file_name }}"
type="audio/mpeg"
>
Your browser does not support the audio element.
</audio>
@endif
$client = OpenAI::client(config('services.openai.secret'));
$messages = $conversation
->messages()
->get()
->map(fn(Message $message) => [
'role' => $message->role,
'content' => $message->text,
]);
$response = $client->chat()->create([
'model' => 'gpt-3.5-turbo',
'messages' => $messages,
'max_tokens' => $maxTokens,
]);
if (isset($response['choices'][0]['message']['content'])) {
return trim($response['choices'][0]['message']['content']);
}
throw new AskAiException($response->toArray());
class Message extends Model
{
use HasFactory;
use HasStates;
protected $guarded = [];
protected $casts = [
'state' => MessageState::class,
];
public function conversation(): BelongsTo
{
return $this->belongsTo(Conversation::class);
}
}
abstract class MessageState extends State
{
public static function config(): StateConfig
{
return parent::config()
->registerState([
AskingAiMessageState::class,
ConvertingAnswerTextToAudioMessageState::class,
ConvertingQuestionAudioToTextMessageState::class,
DoneMessageState::class,
FailedMessageState::class,
PrepareQuestionAudioMessageState::class,
QueuedMessageState::class,
])
->default(
QueuedMessageState::class
)
// ...
// system setting the prompt
->allowTransition(
QueuedMessageState::class,
DoneMessageState::class,
)
->allowTransition(
QueuedMessageState::class,
FailedMessageState::class,
)
// user asking a question
->allowTransition(
QueuedMessageState::class,
PrepareQuestionAudioMessageState::class,
)
->allowTransition(
QueuedMessageState::class,
AskingAiMessageState::class,
)
try {
$this->message->state->transitionTo(
PrepareQuestionAudioMessageState::class
);
$this->prepareQuestionAudio();
$this->message->state->transitionTo(
ConvertingQuestionAudioToTextMessageState::class
);
$this->convertQuestionAudioToText();
// ...
} catch (ConvertQuestionAudioToTextException $e) {
$this->message->state->transitionTo(
FailedMessageState::class
);
$this->message->reason_failed = $e->getMessage();
$this->message->save();
}
try {
FFMpeg::open(fmt('%/%', messageAudioFolderPath(), $fileName))
->export()
->inFormat(new Mp3())
->save(fmt(
'%/%.%',
messageAudioFolderPath(),
str($fileName)->beforeLast('.'),
'mp3',
));
} catch (Throwable $e) {
throw new ConvertQuestionVideoToAudioException(
'Could not convert question video to audio'
);
}
public function mount(): void
{
if ($id = session()->get('conversation')) {
$this->conversation = Conversation::findOrFail($id);
return;
}
$conversation = Conversation::create([
'is_manager' => true,
]);
$message = $conversation
->messages()
->create([
'role' => 'system',
'text' => file_get_contents(config('prompts.manager')),
]);
$message->state->transitionTo(DoneMessageState::class);
session()->put('conversation', $conversation->id);
$this->conversation = $conversation;
}
Here is the list of commands:
- [list conversations]
- [create conversation]
- [forward message 'x' to conversation ID y] (where 'x' is the message you want to forward and 'y' is the ID of the conversation you are busy remembering)
- [reset]
These are the only things I want you to say. I do not want you to answer my questions directly because that is not your job. You are a conversation manager, acting as an LLM API between my database and nested conversations.
When I send you a message, and you do not already have an appropriate conversation ID in your memory, you look for an appropriate conversation using the '[list conversations]' command. I will give you back a list of conversations. You pick based on their summary, and you remember the ID of the conversation.
If I tell you there are no conversations, you issue the '[create conversation]' command so that I can create a new conversation. I will tell you an ID that you can remember.
If you have a conversation ID in memory then you can forward my original message to it with the '[forward message ...]' command.
For example:
I say: "I want to paint my house"
You say: "[list conversations]" (because you don't have a conversation ID in memory yet)
I say: "there are no conversations"
You say: "[create conversation]"
I say: "created conversation with ID 1"
You say: "[forward message 'I want to paint my house' to conversation 1]"
Another example:
I say: "I want to paint my house"
You say: "[list conversations]" (because you don't have a conversation ID in memory yet)
I say: "conversations: conversation about 'riding a bicycle' ID of 2"
You say: "[create conversation]" (because none of the conversations are relevant)
I say: "created conversation with ID 3"
You say: "[forward message 'I want to paint my house' to conversation 3]"
Another example:
I say: "I want to paint my house"
You say: "[list conversations]" (because you don't have a conversation ID in memory yet)
I say: "conversations: conversation about 'painting houses' ID of 3"
You say: "[forward message 'I want to paint my house' to conversation 3]" (because that is the most relevant conversation)
Another example:
I say: "I want to reset" or "reset" or "let's start over" or something similar
You say: "[reset]"
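On the application side, the manager's replies have to be turned back into actions. A minimal sketch of that parsing step, assuming the reply contains nothing but the command; `parseManagerCommand` and the shape of the returned array are hypothetical. The command list writes "conversation ID y" while the examples write "conversation 1", so the sketch accepts both.

```php
<?php

declare(strict_types=1);

/**
 * Parse one manager reply into a command array.
 * Returns null when the reply is not one of the four known commands.
 */
function parseManagerCommand(string $reply): ?array
{
    $reply = trim($reply);

    // The three commands that take no arguments.
    $simple = [
        '[list conversations]' => 'list',
        '[create conversation]' => 'create',
        '[reset]' => 'reset',
    ];

    if (isset($simple[$reply])) {
        return ['command' => $simple[$reply]];
    }

    // [forward message 'x' to conversation ID y], with "ID " optional.
    $pattern = "/^\[forward message '(.*)' to conversation (?:ID )?(\d+)\]$/s";

    if (preg_match($pattern, $reply, $matches) === 1) {
        return [
            'command' => 'forward',
            'message' => $matches[1],
            'conversation_id' => (int) $matches[2],
        ];
    }

    // Anything else means the manager went off script.
    return null;
}
```

A `null` result is a useful signal: the manager answered directly instead of issuing a command, so the message can be failed or retried.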
- splitting conversations when they get too big
- merging conversations
- updating conversation summaries
- saving/loading context to enable branching
- forgetting conversations
- define global (per user) preferences
- allow dynamic values
- prompt nested conversations with these
- allow changing them mid-conversation
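One way the preference ideas above could look, as a sketch only: `renderPreferences` is a hypothetical helper, and the preference names are made up. Closures stand in for "dynamic values" and are resolved each time the text is rendered, so a changed preference shows up in the next prompt.

```php
<?php

declare(strict_types=1);

/**
 * Build a block of prompt text from per-user preferences.
 * A value is either a plain string or a Closure that returns one.
 */
function renderPreferences(array $preferences): string
{
    $lines = [];

    foreach ($preferences as $name => $value) {
        // Resolve dynamic values at render time, not when they are defined.
        $resolved = $value instanceof Closure ? $value() : $value;
        $lines[] = sprintf('- %s: %s', $name, $resolved);
    }

    return "The user has these preferences:\n".implode("\n", $lines);
}
```

The rendered text could be sent as an extra system message when a nested conversation is created, and re-sent whenever a preference changes mid-conversation.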
By Christopher Pitt