COMP6771 Week 2.2

STL Algorithms

Common mistakes

Bazel won't sync properly
- Is bazel version < 0.26?
- Using the old VM (old VM is running CLion 2018, new one is 2019)?
- Uninstall bazel, then reinstall the newest version using the custom apt repository method
- https://docs.bazel.build/versions/master/install-ubuntu.html#install-on-ubuntu
Debugger not working
- Settings > build, execution > toolchains > debugger
  - switch bundled to GDB
- chmod a+x ~/.CLion2019.*/config/plugins/clwb/gdb/gdbserver

Starting on your assignment

git clone https://github.com/cs6771/comp6771.git
Open up CLion
- If a project is already open, file > close project
- File > import bazel project > select comp6771 directory
  - Default settings are fine
- VCS > enable version control integration
- Bazel > sync

Profiling

To make your code run faster, we've replaced std::set with std::unordered_set in the repository
- vcs > update project to get the changes
- You may need to run the following if you're using the VM:
  - ```
    git remote remove origin && git remote add origin https://github.com/cs6771/comp6771.git
    ```
Profiling can be handy to work out where to optimise
CLion has support for profiling (run > profile 'bazel run binary')
- We don't recommend using it for this assignment except for your own learning
- We won't show you how to use profiling here
  - You can probably use this (but we haven't tried)
    https://www.jetbrains.com/help/clion/cpu-profiler.html

Git

To use git with CLion, VCS > enable version control integration (if not enabled)
VCS > update project to download updates
- Always select rebase, not merge
- If there are conflicts, you should get a nice UI to merge changes
VCS > commit
- Has a nice UI to see what you're committing
- When you commit, you can also push
  - Only relevant if you have your own repo (eg. a fork)
"checkout" (switch between) commits using button in bottom right
Version control tab
- Both local changes and log are very useful

Debugging

See slide 2.1 for common mistakes and how to fix them
Breakpoints to pause at certain points
- Right click on breakpoints for fine-grained control
Look at variables in your debugger window while paused
Play button = Continue
Two-step arrow = Run until next line
Down arrow = Go inside the next function call
Up arrow = leave the function
Calculator icon = Type an expression in to evaluate

Principles of testing

Test API, not implementation
Don't make tests brittle
- If your code changes, your tests should change minimally
Make tests simple
- It should be obvious what went wrong
- Don't put if statements or loops in your tests
- Any complex code should be put in a well-named function

Testing - Build rules

Works this way no matter what test framework you use
You can't test anything in a file with a main function
- Why not?

cc_library(
    name = "factorial",
    srcs = ["factorial.cpp"],
    hdrs = ["factorial.h"],
)

cc_test(
    name = "factorial_test",
    srcs = ["factorial_test.cpp"],
    deps = [
        ":factorial",
        "//:catch",
    ],
)
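One way to see why the code under test sits in its own cc_library: the test binary needs its own main function (typically supplied by the test framework), and a program can only define main once. Below is a minimal sketch of what the two library files might contain. The function name `factorial` and its signature are assumptions, since the slides only give the file names.

```cpp
#include <cstdint>

// factorial.h (hypothetical contents): the declaration that both the
// real program and factorial_test.cpp would include.
std::uint64_t factorial(unsigned n);

// factorial.cpp (hypothetical contents): the definition. There is no
// main function here, so it can be linked into a program or into a test.
std::uint64_t factorial(unsigned n) {
  std::uint64_t result = 1;
  for (unsigned i = 2; i <= n; ++i) {
    result *= i;
  }
  return result;
}
```

The program's own main would then live in a separate file and target that depends on ":factorial", just like the cc_test does.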

Testing in general

Tests almost always follow the same three steps
1. Do some setup (eg. initialise variables)
2. Run the code that should be tested (call your function to be tested)
3. Check that things are as expected
Sometimes these steps are hard to tell apart, but usually not

Or in other words

SCENARIO("scenario") {
  GIVEN("Some starting condition") {
    // Initialise the variables
    WHEN("My function is called") {
      // Call my function
      THEN("something should have happened") {
        // Check that the thing happened as expected
      }
    }
  }
}
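The same three steps can also be sketched without any test framework. The function `count_even` and the test name below are hypothetical, made up purely for illustration.

```cpp
#include <vector>

// Hypothetical function under test (not from the slides).
int count_even(const std::vector<int>& nums) {
  int count = 0;
  for (const auto& n : nums) {
    if (n % 2 == 0) {
      ++count;
    }
  }
  return count;
}

// Returns true if the test passes.
bool test_count_even_mixed_input() {
  // 1. Setup: initialise the input.
  const std::vector<int> nums{1, 2, 3, 4, 6};
  // 2. Run the code under test.
  const int result = count_even(nums);
  // 3. Check that the result is as expected (2, 4, and 6 are even).
  return result == 3;
}
```

A framework such as Catch2 adds labels and much better failure messages on top of this structure, but the shape of the test stays the same.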

Catch2 testing

A scenario is a named group of tests
GIVEN, WHEN, and THEN all work the exact same way
- GIVEN should be labelled with the initialisation performed
- WHEN should be labelled with the code that you ran
- THEN should be labelled with the expectation you have of the result
- The labels just give us really nice error messages
REQUIRE is the thing that actually checks your expectations

// Feel free to remove the string if there's
// only one GIVEN in this scenario.
SCENARIO("vectors can be sized and resized") {
  GIVEN("A vector with some items") {
    std::vector<int> v(5);

    REQUIRE(v.size() == 5);
    REQUIRE(v.capacity() >= 5);

    WHEN("the size is increased") {
      v.resize(10);

      THEN("the size and capacity change") {
        REQUIRE(v.size() == 10);
        REQUIRE(v.capacity() >= 10);
      }
    }
    WHEN("the size is reduced") {
      v.resize(0);

      THEN("the size changes but not capacity") {
        REQUIRE(v.size() == 0);
        REQUIRE(v.capacity() >= 5);
      }
    }
  }
}

More advanced Catch2 testing

You can chain together GIVEN/WHEN/THEN
- Do it like you would in English
- You can write these before writing the test code
To run actual tests, use CHECK, CHECK_THAT, REQUIRE, or REQUIRE_THAT
- REQUIRE aborts the test case if it fails, CHECK records the failure and keeps on going
- REQUIRE and CHECK take a boolean expression
- REQUIRE_THAT and CHECK_THAT take a value, and a matcher (https://github.com/catchorg/Catch2/blob/master/docs/matchers.md)

const auto hasAbc = Catch::Matchers::Contains(
    "aBC", Catch::CaseSensitive::No);

SCENARIO("Do that thing with the thing", "[Tags]") {
  GIVEN("This stuff exists") {
    // make stuff exist
    AND_GIVEN("And some assumption") {
      // Validate assumption
      WHEN("I do this") {
        // do this
        THEN("it should do this") {
          REQUIRE(itDoesThis());
          AND_THEN("do that") {
            REQUIRE(itDoesThat());
            REQUIRE_THAT(
                getResultOfThat(), hasAbc);
          }
        }
      }
    }
  }
}

Common algorithms

What was the writer of this code trying to do?
Does it do what it should?
How long does it take you to work that out?
How easy is it to read?

std::vector<int> nums;

int sum = 0;
for (int i = 0; i <= nums.size(); ++i) {
  sum += i;
}

What about this?

std::vector<int> nums;

int sum = 0;
for (auto it = nums.begin(); it != nums.end(); ++it) {
  sum += *it;
}

What was the writer of this code trying to do?
Does it do what it should?
How long does it take you to work that out?
How easy is it to read?

C++ range-for loops

std::vector<int> nums;

int sum = 0;

// Internally, this uses begin and end,
// but it abstracts it away.
for (const auto& i : nums) {
  sum += i;
}

What was the writer of this code trying to do?
Does it do what it should?
How long does it take you to work that out?
How easy is it to read?
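To make the "internally, this uses begin and end" comment concrete, here is a rough sketch of the range-for loop next to an approximately equivalent iterator loop. The function names are made up for this example, and the second function is only an approximation of what the compiler generates.

```cpp
#include <vector>

// The range-for version from the slide, wrapped in a function.
int sum_with_range_for(const std::vector<int>& nums) {
  int sum = 0;
  for (const auto& i : nums) {
    sum += i;
  }
  return sum;
}

// Roughly what the loop above expands to: begin and end are
// evaluated once, then the loop variable is bound to *first.
int sum_with_explicit_iterators(const std::vector<int>& nums) {
  int sum = 0;
  auto first = nums.begin();
  auto last = nums.end();
  for (; first != last; ++first) {
    const auto& i = *first;
    sum += i;
  }
  return sum;
}
```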

Algorithms

Surely we can write a function that looks like this?
- We can (but it doesn't quite look like this)
- But we don't need to (the STL has us covered)

template <typename T>
T sum(iterable<T> cont) {
  T total{};
  for (auto it = std::begin(cont); it != std::end(cont); ++it) {
    total += *it;
  }
  return total;
}

// What type of iterator is required here?
template <typename T, typename Container>
T sum(iterator_t<Container> first, iterator_t<Container> last) {
  T total{};
  for (; first != last; ++first) {
    total += *first;
  }
  return total;
}
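For comparison, here is one way the iterator version could be written so that it actually compiles. This is a sketch, not the STL's implementation: the name `sum_range` is made up, and the element type is taken from std::iterator_traits rather than from a separate template parameter.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Sums the range [first, last). It only uses !=, ++, and * on the
// iterators, and makes a single pass, so an input iterator is enough.
template <typename InputIt>
typename std::iterator_traits<InputIt>::value_type sum_range(InputIt first, InputIt last) {
  typename std::iterator_traits<InputIt>::value_type total{};  // value-initialised, so 0 for numbers
  for (; first != last; ++first) {
    total += *first;
  }
  return total;
}

// The same function works on containers with different iterator categories.
int example_vector_sum() {
  const std::vector<int> v{1, 2, 3};
  return sum_range(v.begin(), v.end());
}

double example_list_sum() {
  const std::list<double> l{0.5, 1.5};
  return sum_range(l.begin(), l.end());
}
```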

Standard Algorithms

Surely we can write a function that looks like this?
- Turns out we can (but it doesn't quite look like this)
- But we don't need to (the STL has us covered)

std::vector<int> v{1, 2, 3};
int sum = std::accumulate(v.begin(), v.end(), 0);

Very powerful

What if we want the product instead of the sum?

What if we want to only sum up the first half of the numbers?

// What is the type of std::multiplies<int>()
int product = std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());

auto midpoint = v.begin() + (v.size() / 2);
// Alternative to the line above (use one or the other).
// This looks a lot harder to read. Why might it be better?
auto midpoint = std::next(v.begin(), std::distance(v.begin(), v.end()) / 2);

int sum = std::accumulate(v.begin(), midpoint, 0);
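One possible answer to the question in the comment: std::next and std::distance work on any forward iterator, whereas `v.begin() + n` needs a random access iterator. The sketch below uses a made-up helper, `sum_first_half`, to show the same code running on both a vector and a list.

```cpp
#include <iterator>
#include <list>
#include <numeric>
#include <vector>

// Sums the first half of any container of int whose iterators
// are at least forward iterators.
template <typename Container>
int sum_first_half(const Container& c) {
  const auto half = std::distance(c.begin(), c.end()) / 2;
  const auto midpoint = std::next(c.begin(), half);
  return std::accumulate(c.begin(), midpoint, 0);
}

int first_half_of_vector() {
  const std::vector<int> v{1, 2, 3, 4};
  return sum_first_half(v);  // 1 + 2
}

int first_half_of_list() {
  const std::list<int> l{10, 20, 30, 40, 50};
  // l.begin() + 2 would not compile for std::list; std::next does.
  return sum_first_half(l);  // 10 + 20
}
```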

Performance and portability

Consider:
- Binary search on a vector takes O(log N) comparisons and O(log N) iterator steps
- Binary search on a linked list also takes O(log N) comparisons, but O(N) iterator steps, because a list iterator can only move one element at a time
- The two implementations are completely different
We can call the same function on both of them
- It will end up calling a function that has two different overloads, one for a forward iterator, and one for a random access iterator
Trivial to read
Trivial to change the type of a container

// Lower bound does a binary search, and returns an iterator to the first element >= the argument.
std::vector<int> sortedVec{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::lower_bound(sortedVec.begin(), sortedVec.end(), 5);

std::list<int> sortedLinkedList{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::lower_bound(sortedLinkedList.begin(), sortedLinkedList.end(), 5);
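The snippet above discards the iterator that std::lower_bound returns. As a small sketch of how the result might be used, the hypothetical helper `index_of_lower_bound` below converts it to a position, and the same template works unchanged for both container types.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// Returns the index of the first element >= value in a sorted container,
// or the container's size if there is no such element.
template <typename Container>
long index_of_lower_bound(const Container& sorted, int value) {
  const auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
  return static_cast<long>(std::distance(sorted.begin(), it));
}

long example_vector_index() {
  const std::vector<int> sortedVec{1, 3, 5, 7};
  return index_of_lower_bound(sortedVec, 4);  // 5 (index 2) is the first element >= 4
}

long example_list_index() {
  const std::list<int> sortedLinkedList{1, 3, 5, 7};
  return index_of_lower_bound(sortedLinkedList, 8);  // nothing is >= 8, so end()
}
```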

Algorithms with output sequences

Why doesn't the second one work?

char to_upper(char value) {
  return std::toupper(static_cast<unsigned char>(value));
}

std::string s = "hello world";
// std::for_each calls to_upper on each element, but discards
// the return value, so s itself is left unchanged.
std::for_each(s.begin(), s.end(), to_upper);

std::string upper;
// Algorithms like transform, which have output iterators,
// use the other iterator as an output.
std::transform(s.begin(), s.end(), upper.end(), to_upper);

Back inserter

Gives you an output iterator for a container that adds to the end of it

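char to_upper(char value) {
  return std::toupper(static_cast<unsigned char>(value));
}

std::string s = "hello world";
// std::for_each calls to_upper on each element, but discards
// the return value, so s itself is left unchanged.
std::for_each(s.begin(), s.end(), to_upper);

std::string upper;
// std::transform writes each result through the third iterator.
// std::back_inserter turns each write into a push_back on upper.
std::transform(s.begin(), s.end(), std::back_inserter(upper), to_upper);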
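std::transform never grows its output: it just writes through the output iterator and increments it. Writing through `upper.end()` of an empty string therefore has nowhere to go. The sketch below shows two ways to give it somewhere to write. The function names are made up for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

char to_upper(char value) {
  return static_cast<char>(std::toupper(static_cast<unsigned char>(value)));
}

// Option 1: start empty and let std::back_inserter grow the output.
std::string upper_with_back_inserter(const std::string& s) {
  std::string upper;
  std::transform(s.begin(), s.end(), std::back_inserter(upper), to_upper);
  return upper;
}

// Option 2: size the output first, then overwrite it from begin().
std::string upper_with_presized_output(const std::string& s) {
  std::string upper(s.size(), '\0');
  std::transform(s.begin(), s.end(), upper.begin(), to_upper);
  return upper;
}
```

Option 1 is usually the easier one to get right, since the output can never be too small.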

Lambda functions

A function that can be defined inside other functions
Can be stored in a std::function<ReturnType(Arg1, Arg2)> (or an auto variable)
- It can be used as a parameter or variable
- No need to use function pointers anymore

std::string s = "hello world";
// std::for_each modifies each element
std::for_each(s.begin(), s.end(), [] (char& value) { value = std::toupper(value); });
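The bullet points above mention using a lambda as a variable or as a parameter. Here is a small sketch of both. The helper `count_matching` and the example functions are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// A function that accepts any callable matching bool(int),
// including a lambda, through std::function.
int count_matching(const std::vector<int>& nums, const std::function<bool(int)>& pred) {
  return static_cast<int>(std::count_if(nums.begin(), nums.end(), pred));
}

int count_even_example() {
  const std::vector<int> nums{1, 2, 3, 4, 5, 6};
  // A lambda stored in a variable using auto.
  const auto is_even = [] (int value) { return value % 2 == 0; };
  return count_matching(nums, is_even);
}

int count_large_example() {
  const std::vector<int> nums{1, 2, 3, 4, 5, 6};
  // A lambda written inline at the call site.
  return count_matching(nums, [] (int value) { return value > 4; });
}
```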

Lambda captures

This doesn't compile, because the lambda uses n without capturing it
A lambda can get access to variables in the enclosing scope, but it does not by default

void AddN(std::vector<int>& v, int n) {
  std::for_each(v.begin(), v.end(), [] (int& val) { val = val + n; });
}

Lambda captures - By Value

Copies the value contained when the function was created
- Doesn't update when the original updates
- Safe
- Potentially slow
- May not work for non-copyable types (eg. ostream, unique pointer)

// Works great.
void AddN(std::vector<int>& vec, int n) {
  std::for_each(vec.begin(), vec.end(),
      [=] (int& item) { item += n; });
}

// Even worse. With mutable, this compiles successfully,
// but it inserts into the lambda's own copy of m, not into m.
std::map<std::string, int> m;
auto emplace = [=] (const auto& key, const auto& value) mutable { m.emplace(key, value); };
emplace("hello", 5);

// Not so great. Fails to compile.
void PrintList(const std::vector<int>& nums,
                std::ostream& os) {
  auto printer = [=] (int value) { os << value << '\n'; };
  std::for_each(nums.begin(), nums.end(), printer);
}
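The "doesn't update when the original updates" point can be sketched as follows. The function names are made up for this example. The second function assumes the lambda is declared mutable, which is what allows it to modify its own copy.

```cpp
// A by-value capture is a snapshot taken when the lambda is created.
int captured_copy_ignores_later_changes() {
  int n = 1;
  const auto add_n = [=] (int value) { return value + n; };  // copies n == 1
  n = 100;           // does not affect the lambda's copy
  return add_n(10);  // 10 + 1
}

// With mutable, the lambda may modify its own copy,
// but the original variable is untouched.
int mutable_copy_leaves_original_alone() {
  int counter = 0;
  auto bump = [=] () mutable { return ++counter; };
  bump();
  bump();
  return counter;  // still 0: only the lambda's copy was incremented
}
```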

Lambda captures - By reference

Creates a reference to the original object
- Remains up to date with the value of the object
- Potentially very dangerous
  - Undefined behavior if you attempt to access it after the original goes out of scope
  - Especially prone to bugs when you do multithreading (out of scope of this course)
- Fast
- Works with non-copyable types

std::map<std::string, int> m;
auto emplace = [&] (
    const auto& key,
    const auto& value) {
  m.emplace(key, value);
};
// What happens here?
emplace("hello", 5);

auto GetGenerator() {
  int upto = 0;
  return [&] () { return upto++; };
}

// What happens here?
auto fn = GetGenerator();
std::cout << fn() << fn() << '\n';
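For contrast with GetGenerator, the sketch below shows a reference capture that is safe because the lambda never outlives the variable it refers to, plus one possible way to write a generator that does not dangle. `GetSafeGenerator` and the other names are hypothetical. It uses an init capture so the lambda owns its own counter.

```cpp
#include <algorithm>
#include <vector>

// Safe: total outlives the lambda, which is only used inside this function.
int sum_with_reference_capture(const std::vector<int>& nums) {
  int total = 0;
  std::for_each(nums.begin(), nums.end(), [&] (int value) { total += value; });
  return total;
}

// One way to avoid the dangling reference: give the lambda its own
// counter, so nothing refers to a local after the function returns.
auto GetSafeGenerator() {
  return [upto = 0] () mutable { return upto++; };
}

int third_generated_value() {
  auto fn = GetSafeGenerator();
  fn();         // returns 0
  fn();         // returns 1
  return fn();  // returns 2
}
```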

Lambda captures - Generic

Can use any expression
Most frequently used for move captures, however

std::vector<int> vec{1, 2, 3};
int n = 10;
auto fn = [vec{std::move(vec)}, y=n + 1] () {
  std::cout << vec.size() << '\n' << y;
};

// Typically 0: vec has been moved from, so it is valid but its contents are unspecified
std::cout << vec.size() << '\n';

fn();
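Move captures matter most for types that cannot be copied at all. Below is a minimal sketch using std::unique_ptr, with function names invented for this example. Unlike a moved-from vector, a moved-from unique_ptr is guaranteed to be null.

```cpp
#include <memory>
#include <utility>

// std::unique_ptr cannot be copied, so capturing ptr with [=] would not
// compile. An init capture with std::move transfers ownership instead.
int read_through_moved_pointer() {
  auto ptr = std::make_unique<int>(42);
  auto reader = [p = std::move(ptr)] () { return *p; };
  // ptr is now null; the lambda owns the int.
  return reader();
}

// Checks both sides of the move: the source is empty, the lambda's copy is not.
bool source_is_null_after_move() {
  auto ptr = std::make_unique<int>(42);
  auto holder = [p = std::move(ptr)] () { return p != nullptr; };
  return ptr == nullptr && holder();
}
```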
