Graphics Programming Virtual Meetup

Discord

https://discord.gg/6TTRA5h

Twitter

https://twitter.com/GraphicMeetup

YouTube

https://www.youtube.com/channel/UCbX05PBAE-582PYaRXdjRnw/

vk_mini_path_tracer

Chapter 4-5
Command Buffers and Writing an image

Link to the tutorial

https://nvpro-samples.github.io/vk_mini_path_tracer/

Source code

https://github.com/nvpro-samples/vk_mini_path_tracer

Chapter 4

Command Buffers

OpenGL vs Vulkan

"Execution of Work"

OpenGL:
Immediately executed
- As-if rule applies
Implicitly synchronized
Large State machine

Vulkan:
Deferred execution
- Manual Submission
Explicit Synchronization
Little State Machine
- Automatically resets

Command Buffers

Where we 'write' our GPU commands to
- ```
vkCmdDraw(command_buffer, ...);
```
Categories of commands:
- Binding - Pipelines, Shader resources, Buffers
- Drawing/Executing (raster & compute)
- Synchronization - Barriers
- Data movement - Copying data, transitioning images
Designed to be quick to write
- Recording a command only needs to write a few bytes and increment a pointer

Command Buffers cont'

Commands in a command buffer aren't guaranteed to execute in the order they were recorded
- Must manually define synchronization
Able to be recorded in parallel
- Each thread must record into its own command buffers, typically allocated from its own command pool
Explicitly submitted to a 'Queue'
- ```
vkQueueSubmit(...);
```

"Queue's" in Vulkan

The place where you submit work
Queues can support different capabilities & combinations of them
- Graphics
- Compute
- Transfer
Grouped into "Queue Families"
- Can be multiple "Queues" in a single family
Most hardware has an 'uber' queue that supports all three capability types (abbreviated as the GCT queue)
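
As a rough illustration of how a 'GCT' family could be picked, here is a minimal sketch. The constants are stand-ins that mirror the values of VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT and VK_QUEUE_TRANSFER_BIT so it compiles without the Vulkan headers, and `findQueueFamily` is a hypothetical helper, not part of Vulkan or NVVK. In real code the flags would come from `vkGetPhysicalDeviceQueueFamilyProperties`.

```cpp
#include <cstdint>
#include <vector>

// Stand-in constants with the same values as VK_QUEUE_GRAPHICS_BIT,
// VK_QUEUE_COMPUTE_BIT and VK_QUEUE_TRANSFER_BIT (assumption: used here
// only so the sketch builds without <vulkan/vulkan.h>).
constexpr std::uint32_t kGraphicsBit = 0x1;
constexpr std::uint32_t kComputeBit  = 0x2;
constexpr std::uint32_t kTransferBit = 0x4;

// Hypothetical helper: returns the index of the first queue family whose
// capability flags contain every bit in `required`, or -1 if none does.
inline int findQueueFamily(const std::vector<std::uint32_t>& familyFlags,
                           std::uint32_t required)
{
  for (std::size_t i = 0; i < familyFlags.size(); ++i)
  {
    if ((familyFlags[i] & required) == required)
      return static_cast<int>(i);
  }
  return -1;
}
```

Asking for `kGraphicsBit | kComputeBit | kTransferBit` returns the index of the first family that can do all three, which is what the tutorial's `m_queueGCT` name suggests.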

Command Pools

Holds the memory for Command Buffers
A command pool can only work with one Queue family
Multiple Command Buffers can be allocated from a single Command Pool

VkCommandPoolCreateInfo cmdPoolInfo = nvvk::make<VkCommandPoolCreateInfo>();
cmdPoolInfo.queueFamilyIndex        = context.m_queueGCT;
VkCommandPool cmdPool;
NVVK_CHECK(vkCreateCommandPool(context, &cmdPoolInfo, nullptr, &cmdPool));

Quick Vulkan & NVVK notes

NVVK_CHECK() macro is for checking return values of Vulkan functions

Most Vulkan functions that don't return void return `VkResult`, which is an enum

Returning `VK_SUCCESS` signals that the function didn't fail

VK_NULL_HANDLE is a macro for a null handle value (0). This is used when there is no 'valid' handle to use

However, in C++ `nullptr` can be used instead wherever the handle is a pointer type (as on 64-bit platforms)
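
To make the idea of a result-checking macro concrete, here is a minimal sketch. It is not how NVVK_CHECK is implemented (its real behavior may differ). `Result`, `checkResult` and `MY_VK_CHECK` are hypothetical names, and `Result` is a stand-in for `VkResult` so the sketch builds without the Vulkan headers. The numeric values match the corresponding VkResult codes: success is 0 and errors are negative.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Stand-in for VkResult (assumption: only a few codes are mirrored here).
enum class Result : std::int32_t
{
  Success              = 0,   // VK_SUCCESS
  NotReady             = 1,   // VK_NOT_READY (a status, not an error)
  ErrorOutOfHostMemory = -1,  // VK_ERROR_OUT_OF_HOST_MEMORY
  ErrorDeviceLost      = -4   // VK_ERROR_DEVICE_LOST
};

// Hypothetical helper: throws if the call did not return Success.
inline void checkResult(Result r, const char* expression)
{
  if (r != Result::Success)
  {
    throw std::runtime_error(std::string("Call failed: ") + expression + " returned "
                             + std::to_string(static_cast<std::int32_t>(r)));
  }
}

// Wraps a call and records its source text for the error message.
#define MY_VK_CHECK(call) checkResult((call), #call)
```

This sketch treats anything other than success as a failure, which is stricter than needed for calls that can legitimately return a status code such as 'not ready'.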

Allocating a Command Buffer

Command buffers can be 'primary' or 'secondary'
- Secondary is useful for multi threaded recording
- Secondary command buffers can't be submitted directly to a queue
  - Instead they are 'called' by primary buffers (vkCmdExecuteCommands)
We will only use primary command buffers

VkCommandBufferAllocateInfo cmdAllocInfo = nvvk::make<VkCommandBufferAllocateInfo>();
cmdAllocInfo.level                       = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
cmdAllocInfo.commandPool                 = cmdPool;
cmdAllocInfo.commandBufferCount          = 1;
VkCommandBuffer cmdBuffer;
NVVK_CHECK(vkAllocateCommandBuffers(context, &cmdAllocInfo, &cmdBuffer));

Command Buffer Lifecycle

Multiple 'phases' once allocated
- Initial - Call "Begin" on it to make it ready to record
- Recording - Must "End" it when done recording
  - Where we call `vkCmdYYY()` functions
- Executable - Ready to be submitted
- Pending - Has been submitted, currently running
  - Can't modify any resources the command buffer might reference
  - Returns to 'executable' once finished (or becomes 'invalid' if it was recorded with ONE_TIME_SUBMIT)
  - Once finished, can be 'reset' to start the cycle over again.
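
The phases above can be sketched as a small state machine. This is a toy model for illustration only: `ToyCommandBuffer` and `CmdState` are hypothetical names, not Vulkan types, and the model is simplified (for example, it ignores the implicit reset that `vkBeginCommandBuffer` can perform).

```cpp
#include <stdexcept>

enum class CmdState { Initial, Recording, Executable, Pending, Invalid };

// Toy model of the command buffer lifecycle (assumption: simplified, not Vulkan).
class ToyCommandBuffer
{
public:
  CmdState state() const { return m_state; }

  // Models vkBeginCommandBuffer: Initial -> Recording.
  void begin(bool oneTimeSubmit)
  {
    require(m_state == CmdState::Initial, "begin: must be in the initial state");
    m_oneTimeSubmit = oneTimeSubmit;
    m_state         = CmdState::Recording;
  }

  // Models vkEndCommandBuffer: Recording -> Executable.
  void end()
  {
    require(m_state == CmdState::Recording, "end: must be recording");
    m_state = CmdState::Executable;
  }

  // Models vkQueueSubmit: Executable -> Pending.
  void submit()
  {
    require(m_state == CmdState::Executable, "submit: must be executable");
    m_state = CmdState::Pending;
  }

  // Models the GPU finishing: Pending -> Executable, or Invalid for one-time submit.
  void finishExecution()
  {
    require(m_state == CmdState::Pending, "finishExecution: must be pending");
    m_state = m_oneTimeSubmit ? CmdState::Invalid : CmdState::Executable;
  }

  // Models vkResetCommandBuffer: any state except Pending -> Initial.
  void reset()
  {
    require(m_state != CmdState::Pending, "reset: must not be pending");
    m_state = CmdState::Initial;
  }

private:
  static void require(bool ok, const char* message)
  {
    if (!ok)
      throw std::logic_error(message);
  }

  CmdState m_state         = CmdState::Initial;
  bool     m_oneTimeSubmit = false;
};
```

Calling `reset()` while the buffer is pending throws, mirroring the rule that a command buffer still in use by the GPU must not be touched.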

Begin the Command Buffer

VkCommandBufferBeginInfo beginInfo = nvvk::make<VkCommandBufferBeginInfo>();
beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
NVVK_CHECK(vkBeginCommandBuffer(cmdBuffer, &beginInfo));

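ONE_TIME_SUBMIT - Each recording of this command buffer will only be submitted once.
This call will 'reset' the command buffer if it had been used previously (the pool must have been created with VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT).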
- This is just moving a pointer back to the start, nothing expensive

Record into the Command Buffer

We want to 'fill' the GPU buffer with the same value, 0.5f
This is to make sure we are actually modifying the data on the GPU
The reinterpret cast is due to the API only accepting 'uint32_t'.
- We want it to be filled with the bit pattern of a float, thus we must do dirty things

const float fillValue = 0.5f;
const uint32_t& fillValueU32 = reinterpret_cast<const uint32_t&>(fillValue);
vkCmdFillBuffer(cmdBuffer, buffer.buffer, 0, bufferSizeBytes, fillValueU32);
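
A note on the cast used for the fill value: reading a `float` through a `uint32_t` reference works on common compilers, but the C++ standard treats it as undefined behavior (strict aliasing). A minimal sketch of a portable way to get the same bit pattern is below. `floatBits` is a hypothetical helper name, and the sketch assumes 32-bit IEEE 754 floats. With C++20, `std::bit_cast<uint32_t>(value)` does the same job.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a float by copying its
// bytes, which avoids reading the float through a pointer of another type.
inline std::uint32_t floatBits(float value)
{
  static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
  std::uint32_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}
```

The result of `floatBits(0.5f)` could then be passed as the last argument of `vkCmdFillBuffer` in place of `fillValueU32`.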

Problem:
Vulkan doesn't guarantee when a command will be finished

Consecutive commands in a command buffer may execute in any order they like, so long as they follow manually defined 'synchronization points'

Meaning: Our 'Fill GPU Buffer' command might not finish before we start reading from it on the CPU

Solution:
Define the order things must happen in

Meaning: Use pipeline barriers to insert the desired order everything must happen in.

// Add a command that says "Make it so that memory writes by the vkCmdFillBuffer call
// are available to read from the CPU." (In other words, "Flush the GPU caches
// so the CPU can read the data.") To do this, we use a memory barrier.
VkMemoryBarrier memoryBarrier = nvvk::make<VkMemoryBarrier>();
memoryBarrier.srcAccessMask   = VK_ACCESS_TRANSFER_WRITE_BIT; // Make transfer writes
memoryBarrier.dstAccessMask   = VK_ACCESS_HOST_READ_BIT;      // Readable by the CPU
vkCmdPipelineBarrier(cmdBuffer,                      // The command buffer
                     VK_PIPELINE_STAGE_TRANSFER_BIT, // From the transfer stage
                     VK_PIPELINE_STAGE_HOST_BIT,     // To the CPU
                     0,                              // No special flags
                     1, &memoryBarrier,              // An array of memory barriers
                     0, nullptr, 0, nullptr);        // No other barriers

But what are Pipeline Barriers?

They synchronize memory
But done by specifying the 'stages' for the various memory operations
A stage is a discrete 'step' that the GPU has when it is doing work
- Examples: Vertex shader, Fragment shader, Transfer
  - Also includes Compute shaders
They are likely the most 'complex' part of learning Vulkan
- Necessary for optimal performance
- Not obvious how to use them from the get go

More about Pipeline Barriers

Can be imagined like a scheduling dependency
- You have to finish task A before you can start task B
Several 'Types' of barriers:
- Memory Barriers are what we just used
- Buffer Memory Barriers
  - Can apply to a specific range of a buffer
- Image Memory Barriers
  - Can apply to a specific image (& part of said image)
  - Can perform 'layout transitions'
More technical info is in the tutorial, what we have suits our needs currently

Ending a Command Buffer

Makes the command buffer ready to be submitted and executed

NVVK_CHECK(vkEndCommandBuffer(cmdBuffer));

Now to submit it!

Submitting a Command Buffer

VkSubmitInfo submitInfo       = nvvk::make<VkSubmitInfo>();
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers    = &cmdBuffer;
NVVK_CHECK(vkQueueSubmit(context.m_queueGCT, 1, &submitInfo, VK_NULL_HANDLE));

vkQueueSubmit performance note:
This call is expensive. When possible, put multiple command buffers into the same submit.

Now to read the data back! Right?

Not quite, we can't just start reading yet
We need to make sure the Command Buffer is finished executing
But we have Pipeline Barriers right?
Those only guarantee that the memory is ready to read once the `vkCmdFillBuffer` is finished
- They don't tell us when it is ready to be read from

"Easy Solution"

Just wait for the GPU to finish everything it is doing
```
vkQueueWaitIdle(context.m_queueGCT);
```
This will pause the running thread until the Queue we submitted the command buffer on finishes
A "Sledge Hammer" type solution.
- If other unrelated work was happening, we would wait for that work too
But: We aren't doing anything else so this is fine

"Better Solution"

Use VkFences to wait on an individual submission
Can pass a VkFence to vkQueueSubmit
- Now the fence can be waited upon
'vkWaitForFences' will sleep the thread only until the desired submission is finished
Can go further with 'vkGetFenceStatus' to poll the fence without putting the thread to sleep
Ultimately, using the simplest solution is best

Cleanup

Delete the Command Pool once we are done

vkDestroyCommandPool(context, cmdPool, nullptr);

Can free the Command Buffer individually (vkFreeCommandBuffers), but it's easier to destroy the pool, which frees all of its command buffers

Finally we are doing work on the GPU!

Output should now be:

First four elements: 0.500000, 0.500000, 0.500000, 0.500000

Chapter 5

Writing an image to the disk

MUCH easier than the previous Chapter

Steps:
- Create Image
- Write data
- Close Image
- Success!

Add one library to the list of 'includes'

#define STB_IMAGE_WRITE_IMPLEMENTATION
#include <fileformats/stb_image_write.h>

Change Printing code to:

float* fltData = reinterpret_cast<float*>(data);
stbi_write_hdr("out.hdr", render_width, render_height, 3, fltData);

3 is for 3 channels, for RGB
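
To show what '3 channels' means for the memory layout, here is a minimal sketch of a tightly packed, row-major RGB float image, which is the layout `stbi_write_hdr` reads. `pixelIndex` and `makeSolidImage` are hypothetical helper names used only for illustration. In the tutorial the data comes from the mapped GPU buffer rather than a `std::vector`.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kChannels = 3;  // R, G, B

// Index of one channel of pixel (x, y) in a tightly packed, row-major image:
// rows are stored one after another, and each pixel stores R, G, B in order.
inline std::size_t pixelIndex(std::size_t x, std::size_t y, std::size_t width, std::size_t channel)
{
  return (y * width + x) * kChannels + channel;
}

// Builds a width x height RGB image with every channel set to `value`.
inline std::vector<float> makeSolidImage(std::size_t width, std::size_t height, float value)
{
  return std::vector<float>(width * height * kChannels, value);
}
```

An image built with `makeSolidImage(render_width, render_height, 0.5f)` has the same contents as the buffer filled in Chapter 4, and its `data()` pointer could be handed to `stbi_write_hdr` in the same way.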

Voila!

Tech Note: sRGB & Linearity

We just wrote 0.5 to the entire image
So that should be literally 0.5 we are seeing!
Except no, while the image contains 0.5, we are seeing a slightly different color
- Most image editors will list an sRGB color of (188/255, 188/255, 188/255)
sRGB uses a 'curve' because humans do not perceive brightness linearly
The details are worth reading about but nuanced
- Don't want to ruin a perfectly good short chapter!
Generally, use Linear space for everything but the final render


Next week:

Compute Shaders

Thanks for listening!

Questions?

Graphics Programming Virtual Meetup

Vulkan Mini Path Tracer Chapter 4-5

More from Charles Giessen
