COMP6771 Week 2.2
STL Algorithms
Common mistakes
- Bazel won't sync properly
- Is bazel version < 0.26?
- Using the old VM (old VM is running CLion 2018, new one is 2019)?
- Uninstall bazel, then reinstall the newest version using the custom apt repository method
- https://docs.bazel.build/versions/master/install-ubuntu.html#install-on-ubuntu
- Debugger not working
- Settings > build, execution > toolchains > debugger
- switch bundled to GDB
- chmod a+x ~/.CLion2019.*/config/plugins/clwb/gdb/gdbserver
Starting on your assignment
- git clone https://github.com/cs6771/comp6771.git
- Open up CLion
- If a project is already open, file > close project
- File > import bazel project > select comp6771 directory
- Default settings are fine
- VCS > enable version control integration
- Bazel > sync
Profiling
- If you want your code to go faster, we've replaced std::set with std::unordered_set
- vcs > update project to get the changes
- You may need to run the following if you're using the VM:
- git remote remove origin && git remote add origin https://github.com/cs6771/comp6771.git
- Profiling can be handy to work out where to optimise
- CLion has support for profiling (run > profile 'bazel run binary')
- We don't recommend using it for this assignment except for your own learning
- We won't show you how to use profiling here
- You can probably use this (but we haven't tried): https://www.jetbrains.com/help/clion/cpu-profiler.html
Git
- To use git with CLion, VCS > enable version control integration (if not enabled)
- VCS > update project to download updates
- Always select rebase, not merge
- If there are conflicts, you should get a nice UI to merge changes
- VCS > commit
- Has a nice UI to see what you're committing
- When you commit, you can also push
- Only relevant if you have your own repo (eg. a fork)
- "checkout" (switch between) commits using button in bottom right
- Version control tab
- Both local changes and log are very useful
Debugging
- See slide 2.1 for common mistakes and how to fix them
- Breakpoints to pause at certain points
- Right click on breakpoints for fine-grained control
- Look at variables in your debugger window while paused
- Play button = Continue
- Two-step arrow = Run until next line
- Down arrow = Go inside the next function call
- Up arrow = leave the function
- Calculator icon = Type an expression in to evaluate
Principles of testing
- Test API, not implementation
- Don't make tests brittle
- If your code changes, your tests should change minimally
- Make tests simple
- It should be obvious what went wrong
- Don't put if statements or loops in your tests
- Any complex code should be put in a well-named function
Testing - Build rules
- Works this way no matter what test framework you use
- You can't test anything in a file with a main function
- Why not?
cc_library(
    name = "factorial",
    srcs = ["factorial.cpp"],
    hdrs = ["factorial.h"],
)
cc_test(
    name = "factorial_test",
    srcs = ["factorial_test.cpp"],
    deps = [
        ":factorial",
        "//:catch",
    ],
)
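One way to see why: a test binary needs its own main (typically supplied by the test framework), and a program can only have one main. Below is a minimal sketch of how the code under test might be split out so both the real program and the test can link against it. The file names come from the build rules above; the `factorial` signature and body are assumptions, not taken from the slides.

```cpp
// factorial.h (assumed contents): declaration only, no main().
int factorial(int n);

// factorial.cpp (assumed contents): definition only, no main().
// Because this file has no main(), it can be linked into the test
// binary and into the real program without a duplicate-symbol error.
int factorial(int n) {
  int result = 1;
  for (int i = 2; i <= n; ++i) {
    result *= i;
  }
  return result;
}
```

The program's own main would then live in a separate file that depends on ":factorial", just as the test does.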
Testing in general
- Testing almost always has a form that looks something similar to this
- Do some setup (eg. initialise variables)
- Run some code that should be tested (call your function to be tested)
- Check that things are as expected
- Sometimes these things are hard to distinguish, but usually not
Or in other words
SCENARIO("scenario") {
  GIVEN("Some starting condition") {
    // Initialise the variables
    WHEN("My function is called") {
      // Call my function
      THEN("something should have happened") {
        // Check that the thing happened as expected
      }
    }
  }
}
Catch2 testing
- A scenario is a named group of tests
- GIVEN, WHEN, and THEN work the exact same way
- GIVEN should be labelled with the initialisation performed
- WHEN should be labelled with the code that you ran
- THEN should be labelled with the expectation you have of the result
- They just give us really nice errors
- REQUIRE is the thing that actually runs your tests
// Feel free to remove the string if there's
// only one GIVEN in this scenario.
SCENARIO("vectors can be sized and resized") {
  GIVEN("A vector with some items") {
    std::vector<int> v(5);
    REQUIRE(v.size() == 5);
    REQUIRE(v.capacity() >= 5);
    WHEN("the size is increased") {
      v.resize(10);
      THEN("the size and capacity change") {
        REQUIRE(v.size() == 10);
        REQUIRE(v.capacity() >= 10);
      }
    }
    WHEN("the size is reduced") {
      v.resize(0);
      THEN("the size changes but not capacity") {
        REQUIRE(v.size() == 0);
        REQUIRE(v.capacity() >= 5);
      }
    }
  }
}
More advanced Catch2 testing
- You can chain together GIVEN/WHEN/THEN
- Chain them as you would in English
- You can write these before writing the test code
- To run actual tests, use CHECK, CHECK_THAT, REQUIRE, or REQUIRE_THAT
- REQUIRE aborts the test if it fails, CHECK records the failure and keeps on going
- REQUIRE and CHECK take a boolean expression
- REQUIRE_THAT and CHECK_THAT take a value and a matcher (https://github.com/catchorg/Catch2/blob/master/docs/matchers.md)
const auto hasAbc = Catch::Matchers::Contains(
    "aBC", Catch::CaseSensitive::No);
SCENARIO("Do that thing with the thing", "[Tags]") {
  GIVEN("This stuff exists") {
    // make stuff exist
    AND_GIVEN("And some assumption") {
      // Validate assumption
      WHEN("I do this") {
        // do this
        THEN("it should do this") {
          REQUIRE(itDoesThis());
          AND_THEN("do that") {
            REQUIRE(itDoesThat());
            REQUIRE_THAT(
                getResultOfThat(), hasAbc);
          }
        }
      }
    }
  }
}
Common algorithms
- What was the writer of this code trying to do?
- Does it do what it should?
- How long does it take you to work that out?
- How easy is it to read?
std::vector<int> nums;
int sum = 0;
for (int i = 0; i <= nums.size(); ++i) {
sum += i;
}
What about this?
std::vector<int> nums;
int sum = 0;
for (auto it = nums.begin(); it != nums.end(); ++it) {
  sum += *it;
}
- What was the writer of this code trying to do?
- Does it do what it should?
- How long does it take you to work that out?
- How easy is it to read?
C++ range-for loops
std::vector<int> nums;
int sum = 0;
// Internally, this uses begin and end,
// but it abstracts it away.
for (const auto& i : nums) {
sum += i;
}
- What was the writer of this code trying to do?
- Does it do what it should?
- How long does it take you to work that out?
- How easy is it to read?
Algorithms
- Surely we can write a function that looks like this?
- We can (but it doesn't quite look like this)
- But we don't need to (the STL has us covered)
// Pseudocode: iterable<T> is not a real type.
template <typename T>
T sum(iterable<T> cont) {
  T total{};
  for (auto it = std::begin(cont); it != std::end(cont); ++it) {
    total += *it;
  }
  return total;
}
// What type of iterator is required here?
// Pseudocode again: iterator_t<Container> stands for "the container's iterator type".
template <typename T, typename Container>
T sum(iterator_t<Container> first, iterator_t<Container> last) {
  T total{};
  for (; first != last; ++first) {
    total += *first;
  }
  return total;
}
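For comparison, here is a minimal sketch of how such a function could be written in real C++. It is a hypothetical hand-rolled version for illustration only: the element type is read from the iterator via std::iterator_traits rather than passed as a separate template parameter, and only input-iterator operations (!=, ++, *) are used.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Hypothetical hand-written sum over an iterator range.
// The result type is the iterator's value_type, so callers never name it.
template <typename InputIt>
typename std::iterator_traits<InputIt>::value_type sum(InputIt first, InputIt last) {
  // Value-initialised, so arithmetic types start at zero.
  typename std::iterator_traits<InputIt>::value_type total{};
  for (; first != last; ++first) {
    total += *first;
  }
  return total;
}
```

Because it only needs input iterators, the same function accepts ranges from std::vector, std::list, and most other containers.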
Standard Algorithms
- Surely we can write a function that looks like this?
- Turns out we can (but it doesn't quite look like this)
- But we don't need to (the STL has us covered)
std::vector<int> v{1, 2, 3};
int sum = std::accumulate(v.begin(), v.end(), 0);
Very powerful
What if we want the product instead of the sum?
What if we want to only sum up the first half of the numbers?
// What is the type of std::multiplies<int>()
int product = std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());
auto midpoint = v.begin() + (v.size() / 2);
// This looks a lot harder to read. Why might it be better?
auto midpoint = std::next(v.begin(), std::distance(v.begin(), v.end()) / 2);
int sum = std::accumulate(v.begin(), midpoint, 0);
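The snippets above can be wrapped into small functions to try them out. This is a sketch with hypothetical names (`product`, `sum_first_half`); it also shows one practical difference between the two midpoint calculations: `begin() + n` needs random-access iterators, while std::next with std::distance also accepts a std::list.

```cpp
#include <functional>
#include <iterator>
#include <list>
#include <numeric>
#include <vector>

// Product of all elements: start from 1 and combine with multiplication.
int product(const std::vector<int>& v) {
  return std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());
}

// Sum of the first half of any container of ints with forward iterators.
// For an odd number of elements the middle element is excluded.
template <typename Container>
int sum_first_half(const Container& c) {
  auto midpoint = std::next(c.begin(), std::distance(c.begin(), c.end()) / 2);
  return std::accumulate(c.begin(), midpoint, 0);
}
```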
Performance and portability
- Consider:
- Number of comparisons for binary search on a vector is O(log N)
- Number of comparisons for binary search on a linked list is also O(log N), but walking to each midpoint costs O(N) iterator increments in total
- The two implementations are completely different
- We can call the same function on both of them
- It will end up calling a function that has two different overloads, one for a forward iterator, and one for a random access iterator
- Trivial to read
- Trivial to change the type of a container
// Lower bound does a binary search, and returns an iterator to the first element >= the argument.
std::vector<int> sortedVec{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::lower_bound(sortedVec.begin(), sortedVec.end(), 5);
std::list<int> sortedLinkedList{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::lower_bound(sortedLinkedList.begin(), sortedLinkedList.end(), 5);
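As a sketch of the portability point, the hypothetical helper below (`first_not_less` is not from the slides) makes the same std::lower_bound call for a vector and a list, and turns the returned iterator into an index so the result is easy to inspect.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// Returns the index of the first element >= value, or the container's
// size if every element is smaller. The container must already be sorted.
template <typename Container>
long first_not_less(const Container& sorted, int value) {
  auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
  return static_cast<long>(std::distance(sorted.begin(), it));
}
```

Swapping the container type at the call site requires no change to the helper, which is the "trivial to change" property the slide describes.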
Algorithms with output sequences
Why doesn't the second one work?
char to_upper(char value) {
  return std::toupper(static_cast<unsigned char>(value));
}
// std::for_each ignores return values, so a function that
// modifies in place must take its argument by reference.
void make_upper(char& value) {
  value = to_upper(value);
}
std::string s = "hello world";
// std::for_each modifies each element
std::for_each(s.begin(), s.end(), make_upper);
std::string upper;
// Algorithms like transform, which have output iterators,
// use the other iterator as an output.
std::transform(s.begin(), s.end(), upper.end(), to_upper);
Back inserter
Gives you an output iterator for a container that adds to the end of it
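char to_upper(char value) {
  return std::toupper(static_cast<unsigned char>(value));
}
// std::for_each ignores return values, so a function that
// modifies in place must take its argument by reference.
void make_upper(char& value) {
  value = to_upper(value);
}
std::string s = "hello world";
// std::for_each modifies each element
std::for_each(s.begin(), s.end(), make_upper);
std::string upper;
// std::transform adds to third iterator.
std::transform(s.begin(), s.end(), std::back_inserter(upper), to_upper);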
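The back-inserter pattern can be packaged as a small self-contained function. This is a sketch; `to_upper_copy` is a hypothetical name, and the input string is left unmodified.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

char to_upper(char value) {
  // Cast first: std::toupper expects a value representable as unsigned char.
  return static_cast<char>(std::toupper(static_cast<unsigned char>(value)));
}

// Returns an upper-cased copy of s.
std::string to_upper_copy(const std::string& s) {
  std::string upper;
  // back_inserter calls upper.push_back for every character written,
  // so upper grows as needed instead of being written past its end.
  std::transform(s.begin(), s.end(), std::back_inserter(upper), to_upper);
  return upper;
}
```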
Lambda functions
- A function that can be defined inside other functions
- Can be used with std::function<ReturnType(Arg1, Arg2)> (or auto)
- It can be used as a parameter or variable
- No need to use function pointers anymore
std::string s = "hello world";
// std::for_each modifies each element
std::for_each(s.begin(), s.end(), [] (char& value) { value = std::toupper(static_cast<unsigned char>(value)); });
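To illustrate the bullets about std::function and auto, here is a small sketch with hypothetical names (`count_matching`, `count_even`): one lambda is stored in a local variable with auto, and a function parameter of type std::function accepts any callable with a matching signature.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Counts the elements for which pred returns true.
// The std::function parameter accepts lambdas, function pointers, etc.
int count_matching(const std::vector<int>& v,
                   const std::function<bool(int)>& pred) {
  return static_cast<int>(std::count_if(v.begin(), v.end(), pred));
}

int count_even(const std::vector<int>& v) {
  // A lambda stored in a local variable; auto deduces its unique type.
  auto is_even = [](int x) { return x % 2 == 0; };
  return count_matching(v, is_even);
}
```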
Lambda captures
- This doesn't compile
- The lambda function can get access to the scope, but does not by default
void AddN(std::vector<int>& v, int n) {
std::for_each(v.begin(), v.end(), [] (int& val) { val = val + n; });
}
Lambda captures - By Value
- Copies the value contained when the function was created
- Doesn't update when the original updates
- Safe
- Potentially slow
- May not work for non-copyable types (eg. std::ostream, std::unique_ptr)
// Works great.
void AddN(std::vector<int>& vec, int n) {
  std::for_each(vec.begin(), vec.end(),
                [=] (int& item) { item += n; });
}
// Not so great. Fails to compile.
void PrintList(const std::vector<int>& nums,
               std::ostream& os) {
  auto printer = [=] (int value) { os << value << '\n'; };
  std::for_each(nums.begin(), nums.end(), printer);
}
// Even worse. This compiles successfully, but only changes the lambda's copy of m.
// (mutable is required to modify a by-value capture at all.)
std::map<std::string, int> m;
auto emplace = [=] (const auto& key, const auto& value) mutable { m.emplace(key, value); };
emplace("hello", 5);
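The "doesn't update when the original updates" point can be checked with a short sketch. Both functions are illustrative: `captured_copy_after_update` is a hypothetical name, and this `AddN` names the captured variable explicitly with `[n]` instead of using `[=]`.

```cpp
#include <algorithm>
#include <vector>

// Returns what the lambda sees after the original variable has changed.
int captured_copy_after_update() {
  int n = 1;
  auto get_n = [n]() { return n; };  // n is copied when the lambda is created
  n = 42;                            // the copy inside the lambda is unaffected
  return get_n();
}

// Adds n to every element; only n is captured, and only by value.
void AddN(std::vector<int>& v, int n) {
  std::for_each(v.begin(), v.end(), [n](int& val) { val += n; });
}
```

Listing captures by name makes it clearer what the lambda depends on than a blanket `[=]`.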
Lambda captures - By reference
- Creates a reference to the original object
- Remains up to date with the value of the object
- Potentially very dangerous
- Undefined behavior if you attempt to access it after the original goes out of scope
- Especially prone to bugs when you do multithreading (out of scope of this course)
- Fast
- Works with non-copyable types
std::map<std::string, int> m;
auto emplace = [&] (
const auto& key,
const auto& value) {
m.emplace(key, value);
};
// What happens here?
emplace("hello", 5);
auto GetGenerator() {
int upto = 0;
return [&] () { return upto++; };
}
// What happens here?
auto fn = GetGenerator();
std::cout << fn() << fn() << '\n';
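For contrast, here is one possible way to write a generator whose state cannot dangle. This is a sketch rather than the course's solution, and `MakeCounter` is a hypothetical name: the counter is captured by value, so it lives inside the lambda object, and `mutable` allows the lambda to modify its own copy.

```cpp
// Each returned lambda owns its own counter, so nothing refers to a
// local variable that has gone out of scope.
auto MakeCounter() {
  int upto = 0;
  return [upto]() mutable { return upto++; };
}
```

Calling the result repeatedly yields 0, 1, 2, and so on, and separate calls to MakeCounter produce independent counters.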
Lambda captures - Generic
- Can use any expression
- Most frequently used for move captures, however
std::vector<int> vec{1, 2, 3};
int n = 10;
auto fn = [vec{std::move(vec)}, y=n + 1] () {
std::cout << vec.size() << '\n' << y;
};
// Should be 0
std::cout << vec.size() << '\n';
fn();
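Move captures matter most for types that cannot be copied at all. The sketch below uses std::unique_ptr as an assumed example (the function name is hypothetical): the pointer is moved into the lambda, which then owns it, and the original is left null.

```cpp
#include <memory>
#include <utility>

// Returns the value read through the lambda, or -1 if the move capture
// unexpectedly left the original pointer non-null.
int read_via_move_capture() {
  auto p = std::make_unique<int>(7);
  // owned is a new member of the lambda, initialised by moving from p.
  auto fn = [owned = std::move(p)]() { return *owned; };
  if (p != nullptr) {
    return -1;
  }
  return fn();
}
```

A plain `[=]` or `[p]` capture would not compile here, because std::unique_ptr has no copy constructor.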
COMP6771 19T2 - 2.2 - STL Algorithms
By cs6771