COMP6771 Week 2.2

STL Algorithms

Common mistakes

  • Bazel won't sync properly
  • Debugger not working
    • Settings > build, execution > toolchains > debugger
      • switch bundled to GDB
    • chmod a+x ~/.CLion2019.*/config/plugins/clwb/gdb/gdbserver

Starting on your assignment

  • git clone https://github.com/cs6771/comp6771.git
  • Open up CLion
    • If a project is already open, file > close project
    • File > import bazel project > select comp6771 directory
      • Default settings are fine
    • VCS > enable version control integration
    • Bazel > sync

Profiling

  • If you want your code to go faster, we've replaced std::set with std::unordered_set
    • vcs > update project to get the changes
    • You may need to run the following if you're using the VM:
      • git remote remove origin && git remote add origin https://github.com/cs6771/comp6771.git
  • Profiling can be handy to work out where to optimise
  • CLion has support for profiling (run > profile 'bazel run binary')

Git

  • To use git with CLion, VCS > enable version control integration (if not already enabled)
  • VCS > update project to download updates
    • Always select rebase, not merge
    • If there are conflicts, you should get a nice UI to merge changes
  • VCS > commit
    • Has a nice UI to see what you're committing
    • When you commit, you can also push
      • Only relevant if you have your own repo (eg. a fork)
  • "checkout" (switch between) commits using button in bottom right
  • Version control tab
    • Both local changes and log are very useful

Debugging

  • See slide 2.1 for common mistakes and how to fix them
  • Breakpoints to pause at certain points
    • Right click on breakpoints for fine-grained control
  • Look at variables in your debugger window while paused
  • Play button = Continue
  • Two-step arrow = Run until the next line
  • Down arrow = Go inside the next function call
  • Up arrow = Leave the current function
  • Calculator icon = Type in an expression to evaluate

Principles of testing

  • Test API, not implementation
  • Don't make tests brittle
    • If your code changes, your tests should change minimally
  • Make tests simple
    • It should be obvious what went wrong
    • Don't put if statements or loops in your tests
    • Any complex code should be put in a well-named function

Testing - Build rules

  • Works this way no matter what test framework you use
  • You can't test anything in a file with a main function
    • Why not?
cc_library(
    name = "factorial",
    srcs = ["factorial.cpp"],
    hdrs = ["factorial.h"],
)

cc_test(
    name = "factorial_test",
    srcs = ["factorial_test.cpp"],
    deps = [
        ":factorial",
        "//:catch",
    ],
)

Testing in general

  • A test almost always has a form similar to this:
    1. Do some setup (eg. initialise variables)
    2. Run some code that should be tested (call your function to be tested)
    3. Check that things are as expected
  • Sometimes these steps are hard to tell apart, but usually they are not

Or in other words

SCENARIO("scenario") {
  GIVEN("Some starting condition") {
    // Initialise the variables
    WHEN("My function is called") {
      // Call my function
      THEN("something should have happened") {
        // Check that the thing happened as expected
      }
    }
  }
}
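
The same setup / run / check shape can be sketched without any framework. This is only a minimal illustration using plain assert: factorial and test_factorial are hypothetical names chosen to match the build rules earlier, not code taken from the course repository.

```cpp
#include <cassert>

// Hypothetical function under test (assumed for illustration only).
int factorial(int n) {
  int result = 1;
  for (int i = 2; i <= n; ++i) {
    result *= i;
  }
  return result;
}

// A framework-free test that follows the three steps.
void test_factorial() {
  // 1. Setup: initialise the input.
  const int input = 5;
  // 2. Run the code under test.
  const int actual = factorial(input);
  // 3. Check that the result is as expected.
  assert(actual == 120);
}
```

Note that the test itself contains no loops or if statements; the only loop lives in the function being tested.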

Catch2 testing

  • A scenario is a named group of tests
  • GIVEN, WHEN, and THEN work the exact same way
    • GIVEN should be labelled with the initialisation performed
    • WHEN should be labelled with the code that you ran
    • THEN should be labelled with the expectation you have of the result
    • The labels exist only to give us readable failure messages
  • REQUIRE is what actually checks a condition
// Feel free to remove the string if there's
// only one GIVEN in this scenario.
SCENARIO("vectors can be sized and resized") {
  GIVEN("A vector with some items") {
    std::vector<int> v(5);

    REQUIRE(v.size() == 5);
    REQUIRE(v.capacity() >= 5);

    WHEN("the size is increased") {
      v.resize(10);

      THEN("the size and capacity change") {
        REQUIRE(v.size() == 10);
        REQUIRE(v.capacity() >= 10);
      }
    }
    WHEN("the size is reduced") {
      v.resize(0);

      THEN("the size changes but not capacity") {
        REQUIRE(v.size() == 0);
        REQUIRE(v.capacity() >= 5);
      }
    }
  }
}

More advanced Catch2 testing

  • You can chain together GIVEN/WHEN/THEN
    • Chain them as you would clauses in an English sentence
    • You can write these before writing the test code
  • To run actual tests, use CHECK, CHECK_THAT, REQUIRE, or REQUIRE_THAT
const auto hasAbc = Catch::Matchers::Contains(
    "aBC", Catch::CaseSensitive::No);

SCENARIO("Do that thing with the thing", "[Tags]") {
  GIVEN("This stuff exists") {
    // make stuff exist
    AND_GIVEN("And some assumption") {
      // Validate assumption
      WHEN("I do this") {
        // do this
        THEN("it should do this") {
          REQUIRE(itDoesThis());
          AND_THEN("do that") {
            REQUIRE(itDoesThat());
            REQUIRE_THAT(
                getResultOfThat(), hasAbc);
          }
        }
      }
    }
  }
}

Common algorithms

  • What was the writer of this code trying to do?
  • Does it do what it should?
  • How long does it take you to work that out?
  • How easy is it to read?
std::vector<int> nums;

int sum = 0;
for (int i = 0; i <= nums.size(); ++i) {
  sum += i;
}

What about this?

std::vector<int> nums;

int sum = 0;
for (auto i = nums.begin(); i != nums.end(); ++i) {
  sum += *i;
}
  • What was the writer of this code trying to do?
  • Does it do what it should?
  • How long does it take you to work that out?
  • How easy is it to read?

C++ range-for loops

std::vector<int> nums;

int sum = 0;

// Internally, this uses begin and end,
// but it abstracts it away.
for (const auto& i : nums) {
  sum += i;
}
  • What was the writer of this code trying to do?
  • Does it do what it should?
  • How long does it take you to work that out?
  • How easy is it to read?
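
For comparison, here is a sketch of the three loop styles side by side, assuming the writer's intent in each case was to sum the elements of nums. The function names are made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Index loop: note `<` rather than `<=`, and nums[i] rather than i.
int sum_by_index(const std::vector<int>& nums) {
  int sum = 0;
  for (std::size_t i = 0; i < nums.size(); ++i) {
    sum += nums[i];
  }
  return sum;
}

// Iterator loop: the same name is used for the iterator throughout.
int sum_by_iterator(const std::vector<int>& nums) {
  int sum = 0;
  for (auto it = nums.begin(); it != nums.end(); ++it) {
    sum += *it;
  }
  return sum;
}

// Range-for loop: begin and end are handled for us.
int sum_by_range_for(const std::vector<int>& nums) {
  int sum = 0;
  for (const auto& n : nums) {
    sum += n;
  }
  return sum;
}
```

All three return the same result for the same input; the difference is how many places there are to make a mistake.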

Algorithms

  • Surely we can write a function that looks like this?
    • We can (but it doesn't quite look like this)
    • But we don't need to (the STL has us covered)
// Pseudocode: iterable<T> is not a real type.
template <typename T>
T sum(iterable<T> cont) {
  T total{};
  for (auto it = std::begin(cont); it != std::end(cont); ++it) {
    total += *it;
  }
  return total;
}
// What type of iterator is required here?
template <typename T, typename Container>
T sum(iterator_t<Container> first, iterator_t<Container> last) {
  T total{};
  for (; first != last; ++first) {
    total += *first;
  }
  return total;
}
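
One way such a function could look in compilable C++ is sketched below. The names sum_range and example_total are invented for this example, and deducing the element type through std::iterator_traits is just one possible design. Because the loop only reads each element once while moving forward, an input iterator is enough.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Sums the elements in [first, last).
template <typename InputIt>
auto sum_range(InputIt first, InputIt last) {
  // Deduce the element type from the iterator and value-initialise the total.
  typename std::iterator_traits<InputIt>::value_type total{};
  for (; first != last; ++first) {
    total += *first;
  }
  return total;
}

// Usage: the same template works for different containers.
inline int example_total() {
  const std::vector<int> v{1, 2, 3};
  const std::list<int> l{4, 5};
  return sum_range(v.begin(), v.end()) + sum_range(l.begin(), l.end());
}
```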

Standard Algorithms

  • Surely we can write a function that looks like this?
    • Turns out we can (but it doesn't quite look like this)
    • But we don't need to (the STL has us covered)
std::vector<int> v{1, 2, 3};
int sum = std::accumulate(v.begin(), v.end(), 0);

Very powerful

What if we want the product instead of the sum?

// What is the type of std::multiplies<int>()
int product = std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());

What if we want to only sum up the first half of the numbers?

auto midpoint = v.begin() + (v.size() / 2);
// Alternative to the line above.
// This looks a lot harder to read. Why might it be better?
// auto midpoint = std::next(v.begin(), std::distance(v.begin(), v.end()) / 2);

int sum = std::accumulate(v.begin(), midpoint, 0);
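
The two questions above can be sketched as complete functions. The names product_of and sum_first_half are hypothetical. The second is written as a template to show one reason the std::next / std::distance form may be preferable: it also compiles for containers such as std::list, whose iterators do not support `begin() + n`.

```cpp
#include <functional>
#include <iterator>
#include <list>
#include <numeric>
#include <vector>

// Product of all elements: start from 1 and combine with multiplication.
int product_of(const std::vector<int>& v) {
  return std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());
}

// Sum of the first half (rounded down) of any container of ints.
template <typename Container>
int sum_first_half(const Container& c) {
  auto midpoint = std::next(c.begin(), std::distance(c.begin(), c.end()) / 2);
  return std::accumulate(c.begin(), midpoint, 0);
}
```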

Performance and portability

  • Consider:
    • Binary search on a vector takes O(log N) comparisons and O(log N) iterator steps
    • Binary search on a linked list still takes O(log N) comparisons, but O(N) iterator steps, since a list iterator can only move one element at a time
    • The two implementations are completely different
  • We can call the same function on both of them
    • It dispatches to one of two different implementations, one for forward iterators and one for random access iterators
  • Trivial to read
  • Trivial to change the type of a container
// std::lower_bound does a binary search, and returns an iterator to the
// first element >= the argument (or end() if there is none).
std::vector<int> sortedVec{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::lower_bound(sortedVec.begin(), sortedVec.end(), 5);

std::list<int> sortedLinkedList{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::lower_bound(sortedLinkedList.begin(), sortedLinkedList.end(), 5);
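
As a rough sketch of how the returned iterator might be used, the helper below (a hypothetical name, not part of the STL) turns the result of std::lower_bound into a zero-based position. The same code compiles for both std::vector and std::list.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// Returns the zero-based position of the first element >= value,
// or the container's size if every element is smaller.
// Assumes the container is already sorted in ascending order.
template <typename Container>
auto position_of_lower_bound(const Container& sorted, int value) {
  auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
  return std::distance(sorted.begin(), it);
}
```

Changing the container type at the call site requires no change to the helper.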

Algorithms with output sequences

Why doesn't the second one work?

char to_upper(char value) {
  return std::toupper(static_cast<unsigned char>(value));
}

std::string s = "hello world";
// std::for_each calls to_upper on each element, but the return value
// is discarded, so s itself is unchanged.
std::for_each(s.begin(), s.end(), to_upper);

std::string upper;
// Algorithms like transform, which have output iterators,
// use the other iterator as an output.
std::transform(s.begin(), s.end(), upper.end(), to_upper);

Back inserter

std::back_inserter gives you an output iterator that calls push_back on the container each time something is written through it, so the container grows as needed

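char to_upper(char value) {
  return std::toupper(static_cast<unsigned char>(value));
}

std::string s = "hello world";
// std::for_each calls to_upper on each element, but the return value
// is discarded, so s itself is unchanged.
std::for_each(s.begin(), s.end(), to_upper);

std::string upper;
// std::transform writes each result through the third (output) iterator.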
std::transform(s.begin(), s.end(), std::back_inserter(upper), to_upper);
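
Two ways of giving std::transform somewhere valid to write are sketched below. The function names are made up for this example. The second option, sizing the output string up front, is an alternative to std::back_inserter that the slides do not mention.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

char to_upper(char value) {
  return static_cast<char>(std::toupper(static_cast<unsigned char>(value)));
}

// Option 1: start with an empty string and let back_inserter grow it.
std::string upper_with_back_inserter(const std::string& s) {
  std::string upper;
  std::transform(s.begin(), s.end(), std::back_inserter(upper), to_upper);
  return upper;
}

// Option 2: give the output the right size first, then overwrite it.
std::string upper_with_presized_output(const std::string& s) {
  std::string upper(s.size(), '\0');
  std::transform(s.begin(), s.end(), upper.begin(), to_upper);
  return upper;
}
```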

Lambda functions

  • A function that can be defined inside other functions
  • Can be used with std::function<ReturnType(Arg1, Arg2)> (or auto)
    • It can be used as a parameter or variable
    • No need to use function pointers anymore
std::string s = "hello world";
// The lambda takes each element by reference, so it can modify it in place
std::for_each(s.begin(), s.end(), [] (char& value) {
  value = std::toupper(static_cast<unsigned char>(value));
});
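
The bullet points about auto and std::function can be illustrated roughly as follows. Both apply_twice and count_even are hypothetical helpers written for this sketch.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Accepts any callable that matches int(int) as a parameter.
int apply_twice(const std::function<int(int)>& fn, int x) {
  return fn(fn(x));
}

int count_even(const std::vector<int>& v) {
  // A lambda stored in a local variable using auto.
  auto is_even = [](int n) { return n % 2 == 0; };
  return static_cast<int>(std::count_if(v.begin(), v.end(), is_even));
}
```

A caller could write, for instance, `apply_twice([](int n) { return n + 3; }, 1)` to pass a lambda directly as an argument.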

Lambda captures

  • This doesn't compile
  • A lambda can access variables from the enclosing scope, but only if it captures them; by default it captures nothing
void AddN(std::vector<int>& v, int n) {
  std::for_each(v.begin(), v.end(), [] (int& val) { val = val + n; });
}

Lambda captures - By Value

  • Copies the value contained when the function was created
    • Doesn't update when the original updates
    • Safe
    • Potentially slow
    • May not work for non-copyable types (eg. ostream, unique pointer)
// Works great.
void AddN(std::vector<int>& vec, int n) {
  std::for_each(vec.begin(), vec.end(),
      [=] (int& item) { item += n; });
}
// Not so great. Fails to compile, because std::ostream cannot be copied.
void PrintList(const std::vector<int>& nums,
                std::ostream& os) {
  auto printer = [=] (int value) { os << value << '\n'; };
  std::for_each(nums.begin(), nums.end(), printer);
}
// Even worse. With mutable, this compiles successfully, but it only
// modifies the lambda's own copy of m. (Without mutable, the copy is
// const and the call to emplace fails to compile.)
std::map<std::string, int> m;
auto emplace = [=] (const auto& key, const auto& value) mutable { m.emplace(key, value); };
emplace("hello", 5);
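
The point that a by-value capture "doesn't update when the original updates" can be seen in a small sketch. The function name value_capture_is_a_snapshot is invented here, and AddN is repeated with an explicit capture of n (rather than [=]) and a vector parameter taken by reference so the caller sees the change.

```cpp
#include <algorithm>
#include <vector>

int value_capture_is_a_snapshot() {
  int n = 1;
  auto get_n = [n]() { return n; };  // n is copied here
  n = 42;                            // the lambda's copy is unaffected
  return get_n();                    // still the value at capture time
}

void AddN(std::vector<int>& vec, int n) {
  // n is captured by value; each element arrives by reference.
  std::for_each(vec.begin(), vec.end(), [n](int& item) { item += n; });
}
```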

Lambda captures - By reference

  • Creates a reference to the original object
    • Remains up to date with the value of the object
    • Potentially very dangerous
      • Undefined behavior if you attempt to access it after the original goes out of scope
      • Especially prone to bugs when you do multithreading (out of scope of this course)
    • Fast
    • Works with non-copyable types
std::map<std::string, int> m;
auto emplace = [&] (
    const auto& key,
    const auto& value) {
  m.emplace(key, value);
};
// What happens here?
emplace("hello", 5);
auto GetGenerator() {
  int upto = 0;
  return [&] () { return upto++; };
}

// What happens here?
auto fn = GetGenerator();
std::cout << fn() << fn() << '\n';
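
A sketch contrasting a safe use of reference capture with one possible way to avoid the dangling reference in GetGenerator. The names count_calls_by_reference and MakeGenerator are hypothetical. The second function stores the counter inside the lambda itself, using a capture with an initialiser and the mutable keyword, so nothing refers to a local variable that has gone out of scope.

```cpp
// Reference capture is fine while the captured variable is still alive.
int count_calls_by_reference() {
  int calls = 0;
  auto bump = [&calls]() { ++calls; };
  bump();
  bump();
  return calls;  // the lambda updated the original variable
}

// The lambda owns its counter, so returning it is safe.
// mutable lets the lambda modify its own captured copy.
auto MakeGenerator() {
  return [upto = 0]() mutable { return upto++; };
}
```

Each call to MakeGenerator produces an independent counter starting from 0.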

Lambda captures - Generic

  • The capture can be initialised with any expression
  • In practice, it is most often used to move an object into the lambda
std::vector<int> vec{1, 2, 3};
int n = 10;
auto fn = [vec{std::move(vec)}, y=n + 1] () {
  std::cout << vec.size() << '\n' << y;
};

// Should be 0
std::cout << vec.size() << '\n';

fn();
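
Move captures matter most for types that cannot be copied at all. The following sketch, with hypothetical function names, moves a std::unique_ptr and a std::vector into lambdas that can then safely outlive the function that created them.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// std::unique_ptr cannot be copied, so a plain by-value capture would not
// compile. A capture with an initialiser moves it into the lambda instead.
auto MakeReader(std::unique_ptr<int> p) {
  return [owned = std::move(p)]() { return *owned; };
}

// Moving a vector in avoids copying its elements.
auto MakeSizeGetter(std::vector<int> v) {
  return [v = std::move(v)]() -> std::size_t { return v.size(); };
}
```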