Data Structure Deep Dive

Recap

  Linear Data Structures: A linear data structure traverses the data elements sequentially, in which only one data element can directly be reached.

Non-Linear Data Structures:

Every data item is attached to several other data items in a way that is specific for reflecting relationships. The data items are not arranged in a sequential structure.

 

Objectives

  • Complexity
  • Array

Complexity

  • A way of comparing how different algorithms perform in an objective manner that doesn’t rely on hardware.
  • We look at the number of steps that it takes to execute an algorithm and we call this the time complexity.
  • Time Complexity : the number of steps involved to run an algorithm.
  • Memory Complexity : the amount of memory it takes to run an algorithm.

Big-O Notation

  • Big O notation is used in Computer Science to describe the performance or complexity of an algorithm.
  • Big O specifically describes the worst-case scenario and the execution time required or the space used by an algorithm.

The use of Big O

  It can measure how the performance of an algorithm changes, based on the size of the input set of data.

   The reason the letter O (in Big-O) is used is because the growth rate of an algorithm’s time complexity is also referred to as the order of operation.

   It usually describes the worst case scenario of how long it takes to perform a given operation. And it’s important to note that many different algorithms and data structures have more than one Big-O value.

Examples of Big O notations

  • O(1) (best case): Given the page that a business's name is on and the business name, find the phone number.

  • O(1) (average case): Given the page that a person's name is on and their name, find the phone number.

  • O(log n): Given a person's name, find the phone number by picking a random point about halfway through the part of the book you haven't searched yet, then checking to see whether the person's name is at that point. Then repeat the process about halfway through the part of the book where the person's name lies.

The phone book as an example to compare other kinds of operations and their running time.

  • O(n): Find all people whose phone numbers contain the digit "5".

  • O(n log n): There was a mix-up at the printer's office, and our phone book had all its pages inserted in a random order. Fix the ordering so that it's correct by looking at the first name on each page and then putting that page in the appropriate spot in a new, empty phone book.

How to Determine Complexities

  1. Sequence of statements: If each statement is "simple" then the time for each statement is constant and the total time is also constant: O(1)
  2. if-then-else statements: if sequence 1 is O(N) and sequence 2 is O(1) the worst-case time for the whole if-then-else statement would be O(N).
  3. for loops:  The loop executes N times, so the sequence of statements also executes N times. Since we assume the statements are O(1), the total time for the for loop is N * O(1), which is O(N) overall.
  4. Nested loops:  the total complexity for the two loops is O(N^2).

Array

  • Homogeneity: all elements are of the same data type, hence homogeneous data
  • Array elements are stored at contiguous memory locations

Arrays in Memory

  • Contiguouts block in memory.
  • Every element occupies the same amount of space in memory.
  • If an array starts at memory address x, and the size of each element in the array is y, we can calculate the memory address of the ith element by using the following expression: x + i * y.
  • If we know the index of an element, the time to retrieve the element will be the same, no matter where it is in the array.
  • They’re also memory efficient because we don’t have to store any extra information with each element in the array.

Type of Array

  1.  One dimensional (just a linear list): only one subscript is required to access an individual element

2. Two dimensional (matrix/table): two indices/subscripts are required (row, column)

 Search for a target (key) in the search space  Search space examples are:

   - All students in the class

   - All numbers in a given list

 One of the two possible outcomes

 Target is found (success)

 Target is not found (failure)

 

Searching Algorithms

 Search in a sequential order

 Termination condition

   - Target is found (success)

   - List of elements is exhausted (failure)

 

Sequential Search

Given: A list of N elements, and the target

1. index <-- 0

2. Repeat steps 3 to 5

3. Compare target with list[index]

4. if target = list[index] then

                 return index // success

     else if index >= N - 1

                 return -1 // failure

5. index <-- index + 1

 

Sequential Search Algorithm

Complexity: O(n): it contains single loop

Binary Search

  Search through a sorted list of items

 Sorted list is a pre-condition for Binary Search!

 Repeatedly divides the search space (list) into two

 Divide-and-conquer approach

    1. Divide: Break the given problem into subproblems of same type.
  2. Conquer: Recursively solve these subproblems
   3. Combine: Appropriately combine the answers

Binary Search Algorithm

 Middle <- (LI + HI) / 2

 One of the three possibilities

     Key is equal to List[Middle]

        o success and stop

     Key is less than List[Middle]

       o Key should be in the left half of List, or it does not exist

     Key is greater than List[Middle]

        o Key should be in the right half of List, or it does not exist

 Termination Condition

      List[Middle] is equal to Key (success) OR LI > HI (Failure)

Complexity: O(log n)

Insertion Algorithm

     Insert operation is to insert one or more data elements into an array. Based on the requirement, a new element can be added at the beginning, end, or any given index of array.

Deletion Operation

In delete operation, the element to be deleted is searched using binary search and then delete operation is performed followed by shifting the elements.

Sort Algorithm

— Selection sort: Orders a list of values by repeatedly putting the smallest or largest unplaced value into its final position.

The algorithm:

  — Look through the list to find the smallest value.

  — Swap it so that it is at index 0.

  — Look through the list to find the second-smallest value.

  — Swap it so that it is at index 1.

  — Repeat until all values are in their proper places.

Sort Algorithm

— Merge sort: Repeatedly divides the data in half, sorts each half, and combines the sorted halves into a sorted whole.

 

The algorithm:

  — Divide the list into two roughly equal halves.

  — Sort the left half.

  — Sort the right half.

  — Merge the two sorted halves into one sorted list.

Merge sort runtime

  • Bubble sort is a simple sorting algorithm. This sorting algorithm is comparison-based algorithm in which each pair of adjacent elements is compared and the elements are swapped if they are not in order.
  • This algorithm is not suitable for large data sets as its average and worst case complexity are of Ο(n^2) where n is the number of items.

Sort Algorithm

Bubble Sort Algorithm

—   Initialize the size of the list to be sorted to be the actual size of the list.

—  Loop through the list until no element needs to be exchanged with another to reach its correct position.

     —   Loop (i) from 0 to size of the list to be sorted - 2.
           — Compare the ith and (i + 1)st elements in the unsorted list.
           — Swap the ith and (i + 1)st elements if not in order ( ascending or descending as desired).

    —   Decrease the size of the list to be sorted by 1.

2) Bubble sort starts with very first two elements, comparing them to check which one is greater.

1) We take an unsorted array for our example. Bubble sort takes Ο(n^2) time so we're keeping it short and precise.

3) Next, we compare 33 with 27. we find that 27 is smaller than 33 and these two values must be swapped.

5) We know then that 10 is smaller 35. Hence they are not sorted then swap

4) Next we compare 33 and 35. We find that both are in already sorted positions.

6) We swap these values. We find that we have reached the end of the array. Repeat the same step until it is fully sorted

Indexing: inspect or update an element using its index. Performance is very fast O(1)

randomNumber = numbers[5];

numbers[20000] = 100;

 

Insertion: add an element at certain index

– Start: very slow O(n) because of shift

– End : very fast O(1) because no need to shift

 

Removal: remove an element at certain index

– Start: very slow O(n) because of shift

– End : very fast O(1) because no need to shift

 

Array Operations and Big O notatitons

Array Operations and Big O notatitons

Search: performance depends on algorithm

    1) Linear: slow O(n)

    2) Binary : O(log n)

 

Sort: performance depends on algorithm

   1) Bubble: slow O(n2)

   2) Selection: slow O(n2)

   3) Merge : O(n log n)

 

Thank you

Data Structure Deep Dive

By Sofwan Abdulwahab

Data Structure Deep Dive

  • 224