Unless otherwise specified via an official form of communication by Hayden, indicating a course retcon, we will not assess you on any material in this deck of lecture slides in any assignment or exam.
To help you distinguish between assessable and non-assessable lecture material, the two will look visibly different.
int main() {}
clang++-11 -std=c++20 -Wall -Wextra -pedantic -Werror -o simple simple.cpp
g++-10 -std=c++20 -Wall -Wextra -pedantic -Werror -o simple simple.cpp
Clang (LLVM project)
GCC (GNU)
cl.exe /std:c++latest /W4 /WX /EHsc /permissive- /Fe"simple.exe" simple.cpp
MSVC (Windows only)
int main() {}
Compiler
simple
Source files are program text stored in some file.
int main() {}
simple.cpp
A source file with all of its headers included is called a translation unit (or TU for short).
Lexer
int main() {}
token{kind::int_, {.line=1,.col=1}, {.line=1,.col=4}, "int"}
token{kind::id_, {.line=1,.col=5}, {.line=1,.col=9}, "main"}
token{kind::lparen, {.line=1,.col=9}, {.line=1,.col=10}, "("}
token{kind::rparen, {.line=1,.col=10}, {.line=1,.col=11}, ")"}
token{kind::lcurly, {.line=1,.col=12}, {.line=1,.col=13}, "{"}
token{kind::rcurly, {.line=1,.col=13}, {.line=1,.col=14}, "}"}
token{kind::eof, {.line=1,.col=14}, {.line=1,.col=14}, "$"}
simple.cpp
Tokens
Lexer
int main() {}
|translation_unit
-|declaration_seq
--|declaration
---|function_definition
----|return_type: "int" @ {1,1}..{1,4}
----|identifier: "main" @ {1,5}..{1,9}
----|parameters: none
----|function_body
-----|compound_statement: empty
-|eof
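A minimal sketch of a recursive-descent parser that accepts exactly this token sequence and builds a single tree node. The names `parser`, `accept`, and `function_definition` are hypothetical, and the grammar is cut down to one parameterless `int` function with an empty body.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

enum class kind { int_, id_, lparen, rparen, lcurly, rcurly, eof };

struct token {
    kind k;
    std::string text;
};

// Hypothetical tree node: just enough to describe `int main() {}`.
struct function_definition {
    std::string return_type;
    std::string identifier;
};

class parser {
public:
    explicit parser(std::vector<token> tokens)
    : tokens_(std::move(tokens)) {}

    // translation_unit: function_definition eof
    auto parse_translation_unit() -> std::optional<function_definition> {
        auto fn = parse_function_definition();
        if (not fn.has_value() or not accept(kind::eof)) {
            return std::nullopt;
        }
        return fn;
    }

private:
    // Consumes the next token if it has kind `k`; otherwise leaves it in place.
    auto accept(kind const k) -> bool {
        assert(pos_ <= tokens_.size());
        if (pos_ == tokens_.size() or tokens_[pos_].k != k) {
            return false;
        }
        ++pos_;
        return true;
    }

    // function_definition: "int" identifier "(" ")" "{" "}"
    auto parse_function_definition() -> std::optional<function_definition> {
        if (not accept(kind::int_) or not accept(kind::id_)) {
            return std::nullopt;
        }
        auto const name = tokens_[pos_ - 1].text; // the identifier just consumed
        auto const has_empty_parameters = accept(kind::lparen) and accept(kind::rparen);
        if (not has_empty_parameters) {
            return std::nullopt;
        }
        auto const has_empty_body = accept(kind::lcurly) and accept(kind::rcurly);
        if (not has_empty_body) {
            return std::nullopt;
        }
        return function_definition{"int", name};
    }

    std::vector<token> tokens_;
    std::size_t pos_ = 0;
};
```

Each grammar rule becomes one member function. A real parser would also record source positions and report where parsing failed instead of returning an empty `std::optional`.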
simple.cpp
main int()
Parser
Tokens
IR
Lexer
int main() {}
|translation_unit
-|declaration_seq
--|declaration
---|function_definition
----|return_type: "int" @ {1,1}..{1,4}
----|identifier: "main" @ {1,5}..{1,9}
----|parameters: none
----|function_body
-----|compound_statement
------|return_statement
-------|primary_expression: 0
-|eof
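The tree above gains a `return_statement` because flowing off the end of `main` is defined to behave like `return 0;`. A minimal sketch of a checker pass that makes this explicit, assuming a hypothetical and heavily simplified `function_definition` node whose body is a list of statement strings:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified IR node for a function definition.
struct function_definition {
    std::string return_type;
    std::string identifier;
    std::vector<std::string> body; // one string per statement, for illustration only
};

// Sketch of one rewrite a checker might perform: if `int main` does not end
// with a return statement, append an explicit `return 0;`.
auto check(function_definition fn) -> function_definition {
    assert(not fn.identifier.empty()); // a function definition always has a name
    auto const is_main = fn.identifier == "main" and fn.return_type == "int";
    // rfind(..., 0) == 0 tests whether the last statement starts with "return".
    auto const ends_with_return = not fn.body.empty() and fn.body.back().rfind("return", 0) == 0;
    if (is_main and not ends_with_return) {
        fn.body.push_back("return 0;");
    }
    return fn;
}
```

Only `main` gets this treatment; any other function is left unchanged by this pass.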
simple.cpp
Parser
Tokens
Checker
IR
IR'
Lexer
The code generator produces a target file that is equivalent to the source file.
The example shown is x86_64 assembly.
int main() {}
simple.cpp
Parser
Tokens
Checker
IR
IR'
main:                                   # @main
        push    rbp
        mov     rbp, rsp
        xor     eax, eax
        pop     rbp
        ret
CodeGen
Target
int main() {}
Compiler
simple.cpp
Target program
Linker
Compiler
hello.cpp
hello.o
#include <iostream>
int main() {
std::cout << "Hi\n";
}
Assembler
Target program
x86_64 assembly
C++ Standard Library code
Compiler
Target program
x86_64 assembly
Assembler
libc++.so
Compiler
hello.cpp
hello.o
Linker
#include <iostream>
int main() {
std::cout << "Hi\n";
}
libc++.so
Pre-compiled library
Assembler
Target program
hello program
x86_64 assembly
clang++-11 -std=c++20 -Wall -Wextra -pedantic -Werror -o simple -O3 simple.cpp
g++-10 -std=c++20 -Wall -Wextra -pedantic -Werror -o simple -O3 simple.cpp
Clang (LLVM project)
GCC (GNU)
cl.exe /std:c++latest /W4 /WX /EHsc /permissive- /Fe"simple.exe" /Ox simple.cpp
MSVC (Windows only)
Lexer
Tokens
hello.cpp
#include <iostream>
int main() {
std::cout << "Hi\n";
}
Parser
IR
IR'
Checker
IR''
Optimiser
Checker
Target
CodeGen
Example on Compiler Explorer
clang++-11 -std=c++20 -Wall -Wextra -pedantic -Werror -o simple -O3 -flto=thin simple.cpp
g++-10 -std=c++20 -Wall -Wextra -pedantic -Werror -o simple -O3 -flto simple.cpp
Clang (LLVM project)
GCC (GNU)
cl.exe /std:c++latest /W4 /WX /EHsc /permissive- /Fe"simple.exe" /Ox /GL simple.cpp
MSVC (Windows only)
The compiler's optimiser can't make optimisations across different object files.
If you compile first.cpp today, second.cpp tomorrow, and link them three days from now, how can the compiler reasonably optimise on that?
The linker has all the object files at the same time, so it's able to optimise across object files during linking.
clang++-11 -std=c++20 -Wall -Wextra -pedantic -Werror -o simple -O3 -flto=thin -fuse-ld=lld simple.cpp
g++-10 -std=c++20 -Wall -Wextra -pedantic -Werror -o simple -O3 -flto -fuse-ld=gold simple.cpp
Clang (LLVM project)
GCC (GNU)
cl.exe /std:c++latest /W4 /WX /EHsc /permissive- /Fe"simple.exe" /Ox /GL simple.cpp
MSVC (Windows only)
Set of programming tools used to build a project.
Compiler (clang++-11)
Linker (lld-11)
Linter (clang-tidy-11)
Package manager (vcpkg)
Debugger (lldb-11)
Libraries
Standard library (libc++-11, libc++abi-11)
// word_ladder.cpp
// implements word_ladder::generate
// lexicon.cpp
// implements word_ladder::read_lexicon
// word_ladder_test.cpp
// tests word_ladder::generate
#ifndef COMP6771_WORD_LADDER_HPP
#define COMP6771_WORD_LADDER_HPP
// headers...
namespace word_ladder {
[[nodiscard]] auto read_lexicon(std::string const& path) -> std::unordered_set<std::string>;
auto generate(std::string const&, std::string const&, std::unordered_set<std::string> const&)
-> std::vector<std::vector<std::string>>;
} // namespace word_ladder
#endif // COMP6771_WORD_LADDER_HPP
#ifndef COMP6771_WORD_LADDER_HPP
#define COMP6771_WORD_LADDER_HPP
// headers...
namespace word_ladder {
[[nodiscard]] auto read_lexicon(std::string const& path) -> absl::flat_hash_set<std::string>;
auto generate(std::string const&, std::string const&, absl::flat_hash_set<std::string> const&)
-> std::vector<std::vector<std::string>>;
} // namespace word_ladder
#endif // COMP6771_WORD_LADDER_HPP
Forgot to recompile
Recompiled
Recompiled
// word_ladder.cpp
// implements word_ladder::generate
// lexicon.cpp
// implements word_ladder::read_lexicon
// word_ladder_test.cpp
// tests word_ladder::generate
ld.lld: error: undefined symbol: word_ladder::generate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
>>> referenced by word_ladder_test1.cpp
>>> word_ladder_test1.o:(____C_A_T_C_H____T_E_S_T____0())
ld.lld: error: undefined symbol: word_ladder::generate(std::string const&, std::string const&, std::unordered_set<std::string> const&)
>>> referenced by word_ladder_test1.cpp
>>> word_ladder_test1.o:(____C_A_T_C_H____T_E_S_T____0())
Linker error!
Shell scripts aren't enough because they don't understand the notion of a dependency. They'll either compile everything every time (slow) or compile only what you explicitly ask for, which makes it easy to forget a file that needed recompiling.
A build system automates the process of compiling and linking the edited parts of a program so that you don't need to worry about the process more than once.
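A minimal sketch of the dependency idea, assuming a hypothetical project in which all three source files include a header named `word_ladder.hpp`. The names `dependency_graph`, `example_graph`, and `needs_rebuild` are illustrative and do not come from any real build system.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Maps each source file to the files it includes directly.
using dependency_graph = std::map<std::string, std::set<std::string>>;

// Hypothetical project layout: the header name is an assumption.
auto example_graph() -> dependency_graph {
    return {
        {"word_ladder.cpp", {"word_ladder.hpp"}},
        {"lexicon.cpp", {"word_ladder.hpp"}},
        {"word_ladder_test.cpp", {"word_ladder.hpp"}},
    };
}

// Returns the sources that must be recompiled after `changed` was edited:
// the file itself if it is a source, plus every source that includes it.
auto needs_rebuild(dependency_graph const& graph, std::string const& changed) -> std::set<std::string> {
    assert(not changed.empty());
    auto result = std::set<std::string>{};
    for (auto const& [source, includes] : graph) {
        if (source == changed or includes.count(changed) != 0) {
            result.insert(source);
        }
    }
    return result;
}
```

Editing `word_ladder.hpp` marks all three sources for recompilation, while editing `lexicon.cpp` marks only itself. This sketch follows direct includes only; a real build system also follows transitive includes and usually compares file timestamps or content hashes to detect what changed.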
Examples: make, ninja, Maven, Apache Ant, Cargo
What if we wanted to build for all three major operating systems?
Need to write three build scripts???
Yuck!
What if we wanted to build for all available major toolchains?
Need to write build scripts per OS, per toolchain???
Double yuck!!
CMake is a build system generator.
We state what we want; let CMake work out how to write the build script.
We'll now switch over to a live demo where we set up a project.
A toolchain file is a file that contains all the details about your toolchain.
You tell CMake where it is by defining CMAKE_TOOLCHAIN_FILE.
CMake then uses this toolchain file to generate all the toolchain-specific build rules.
Our toolchain files are located in config/cmake/toolchain.