Homework 7:

Due by 11:59pm on Tuesday, 12/29

Instructions

Download hw07.zip.

Note that this homework will require you to code in both Scheme and Python. Your solutions to questions 1-2 should be placed in hw07.scm, and your solutions to question 3-6 should be placed in hw07.sql.

To complete this homework assignment, you will need to use SQLite version 3.8.3 or greater. See Lab 13 for setup and usage instructions.

To check your progress, you can run sqlite3 directly by running:

./sqlite3 --init hw07.sql

Readings: You might find the following references useful:

Questions

Streams

Q1: Multiples of 3

Define implicitly an infinite stream multiples-of-three that contains the multiples of 3.

You may use the map-stream function defined below. map-stream takes in a one-argument function f and a stream s and returns a new stream containing the elements of s with f applied.

(define (map-stream f s)
	(if (null? s)
		nil
		(cons-stream (f (car s)) (map-stream f (cdr-stream s)))))

Do not define any other helper functions.

(define (map-stream f s)
    (if (null? s)
    	nil
    	(cons-stream (f (car s)) (map-stream f (cdr-stream s)))))

(define multiples-of-three
  'YOUR-CODE-HERE
)

Q2: Run-Length Encoding

Run-length encoding is a very simple data compression technique, whereby runs of data are compressed and stored as a single value. A run is defined to be a contiguous sequence of the same number. For example, in the (finite) sequence

1, 1, 1, 1, 1, 6, 6, 6, 6, 2, 5, 5, 5

there are four runs: one each of 1, 6, 2, and 5. We can represent the same sequence as a sequence of two-element lists:

(1 5), (6 4), (2 1), (5 3)

Notice that the first element of each list is the number in a run, and the second element is the number of times that number appears in the run.

We will extend this idea to streams. Write a function called rle that takes in a stream of data, and returns a corresponding stream of two-element lists, which represents the run-length encoded version of the stream. You do not have to consider compressing infinite streams - the stream passed in will eventually terminate with nil.

scm> (define s (cons-stream 1 (cons-stream 1 (cons-stream 2 nil))))
s
scm> (define encoding (rle s))
encoding
scm> (car encoding)  ; Run of number 1 of length 2
(1 2)
scm> (car (cdr-stream encoding))  ; Run of number 2 of length 1
(2 1)
scm> (define s (list-to-stream '(1 1 2 2 2 3))) ; Makes a stream with the same elements as the list passed in
scm> (stream-to-list (rle s)) 
((1 2) (2 3) (3 1))
(define (rle s)
  'YOUR-CODE-HERE
)

SQL

Dog Data

In each question below, you will define a new table based on the following tables.

CREATE TABLE parents AS
  SELECT "abraham" AS parent, "barack" AS child UNION
  SELECT "abraham"          , "clinton"         UNION
  SELECT "delano"           , "herbert"         UNION
  SELECT "fillmore"         , "abraham"         UNION
  SELECT "fillmore"         , "delano"          UNION
  SELECT "fillmore"         , "grover"          UNION
  SELECT "eisenhower"       , "fillmore";

CREATE TABLE dogs AS
  SELECT "abraham" AS name, "long" AS fur, 26 AS height UNION
  SELECT "barack"         , "short"      , 52           UNION
  SELECT "clinton"        , "long"       , 47           UNION
  SELECT "delano"         , "long"       , 46           UNION
  SELECT "eisenhower"     , "short"      , 35           UNION
  SELECT "fillmore"       , "curly"      , 32           UNION
  SELECT "grover"         , "short"      , 28           UNION
  SELECT "herbert"        , "curly"      , 31;

CREATE TABLE sizes AS
  SELECT "toy" AS size, 24 AS min, 28 AS max UNION
  SELECT "mini"       , 28       , 35        UNION
  SELECT "medium"     , 35       , 45        UNION
  SELECT "standard"   , 45       , 60;

Your tables should still perform correctly even if the values in these tables change. For example, if you are asked to list all dogs with a name that starts with h, you should write:

SELECT name FROM dogs WHERE "h" <= name AND name < "i";

Instead of assuming that the dogs table has only the data above and writing

SELECT "herbert";

The former query would still be correct if the name grover were changed to hoover or a row was added with the name harry.

Q3: Size of Dogs

The Fédération Cynologique Internationale classifies a standard poodle as over 45 cm and up to 60 cm. The sizes table describes this and other such classifications, where a dog must be over the min and less than or equal to the max in height to qualify as a size.

Create a size_of_dogs table with two columns, one for each dog's name and another for its size.

-- The size of each dog
CREATE TABLE size_of_dogs AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

The output should look like the following:

sqlite> select * from size_of_dogs;
abraham|toy
barack|standard
clinton|standard
delano|standard
eisenhower|mini
fillmore|mini
grover|toy
herbert|mini

Q4: By Parent Height

Create a table by_parent_height that has a column of the names of all dogs that have a parent, ordered by the height of the parent from tallest parent to shortest parent.
-- All dogs with parents ordered by decreasing height of their parent
CREATE TABLE by_parent_height AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

For example, fillmore has a parent (eisenhower) with height 35, and so should appear before grover who has a parent (fillmore) with height 32. The names of dogs with parents of the same height should appear together in any order. For example, barack and clinton should both appear at the end, but either one can come before the other.

sqlite> select * from by_parent_height;
herbert
fillmore
abraham
delano
grover
barack
clinton

Q5: Sentences

There are two pairs of siblings that have the same size. Create a table that contains a row with a string for each of these pairs. Each string should be a sentence describing the siblings by their size.
-- Filling out this helper table is optional
CREATE TABLE siblings AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

-- Sentences about siblings that are the same size
CREATE TABLE sentences AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Each sibling pair should appear only once in the output, and siblings should be listed in alphabetical order (e.g. "barack and clinton..." instead of "clinton and barack..."), as follows:

sqlite> select * from sentences;
abraham and grover are toy siblings
barack and clinton are standard siblings

Hint: First, create a helper table containing each pair of siblings. This will make comparing the sizes of siblings when constructing the main table easier.

Q6: Low Variance

We want to create a table which contains the total height of all dogs that share a fur type, but only for the fur types where the heights are not spread out too much. To approximate this, we'll only consider the fur types where each dog with that fur is within 30% of the average height of all dogs with that fur.

For example, if the average height for short-haired dogs is 10, then in order to be included in our output, all dogs with short hair must have a height of at most 13 and at least 7.

Your output should first include the fur type and then the total height for the fur types that meet this criteria

-- Total size for each fur type where all of the heights differ by no more than 30% from the average height
CREATE TABLE low_variance AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

-- Example:
SELECT * FROM low_variance;
-- Expected output:
-- curly|63

Explanation: There is at least one long haired dog with a height that is not within 30% of the average height of long haired dogs, and at least one short haired dog with a height not within 30% of the average height of short haired dogs, so neither short nor long haired dogs are included in the output. There are two curly haired dogs: one with height 32, and one with height 31, for a total height of 63.