Table Exercises

We work with the schools table from before.

from datascience import *
import numpy as np

schools = Table.read_table('data/cal_unis.csv')
schools.show(3)
Name Institution City County Enrollment Founded
California State Polytechnic University, Humboldt CSU Arcata Humboldt 6025 1913
California State University, Bakersfield CSU Bakersfield Kern 9613 1965
University of California, Berkeley UC Berkeley Alameda 45307 1869

... (29 rows omitted)

Pull up the Python Reference and try these exercises yourself! They are roughly in order of increasing difficulty.

Exercise 1: Variable names

How do we get all the column labels of schools?

schools.labels
('Name', 'Institution', 'City', 'County', 'Enrollment', 'Founded')

Exercise 2: Reorder columns

How do we reorder the columns, as below?

Name Founded Institution City County Enrollment

Hint: use one of select, drop, or with_columns.

schools.select(
  "Name", "Founded",
  "Institution", "City",
  "County", "Enrollment"
)
Name Founded Institution City County Enrollment
California State Polytechnic University, Humboldt 1913 CSU Arcata Humboldt 6025
California State University, Bakersfield 1965 CSU Bakersfield Kern 9613
University of California, Berkeley 1869 UC Berkeley Alameda 45307
California State University Channel Islands 2002 CSU Camarillo Ventura 6128
California State University, Dominguez Hills 1960 CSU Carson Los Angeles 16426
California State University, Chico 1887 CSU Chico Butte 14183
University of California, Davis 1905 UC Davis Yolo 39679
California State University, Fresno 1911 CSU Fresno Fresno 23999
California State University, Fullerton 1957 CSU Fullerton Orange 40386
California State University, East Bay 1957 CSU Hayward Alameda 13673

... (22 rows omitted)

Exercise 3: Filtering

schools.show(3)
Name Institution City County Enrollment Founded
California State Polytechnic University, Humboldt CSU Arcata Humboldt 6025 1913
California State University, Bakersfield CSU Bakersfield Kern 9613 1965
University of California, Berkeley UC Berkeley Alameda 45307 1869

... (29 rows omitted)

  1. How do we get a table with only UC schools?
schools.where("Institution", "UC")
Name Institution City County Enrollment Founded
University of California, Berkeley UC Berkeley Alameda 45307 1869
University of California, Davis UC Davis Yolo 39679 1905
University of California, Irvine UC Irvine Orange 35937 1965
University of California, Los Angeles UC Los Angeles Los Angeles 46430 1882
University of California, Merced UC Merced Merced 9110 2005
University of California, Riverside UC Riverside Riverside 26809 1954
University of California, San Diego UC San Diego San Diego 42006 1960
University of California, Santa Barbara UC Santa Barbara Santa Barbara 26420 1891
University of California, Santa Cruz UC Santa Cruz Santa Cruz 19478 1965
  1. How do we get a table with all the schools in Los Angeles?
schools.where("City", "Los Angeles")
Name Institution City County Enrollment Founded
University of California, Los Angeles UC Los Angeles Los Angeles 46430 1882
California State University, Los Angeles CSU Los Angeles Los Angeles 26460 1947

Exercise 4: Rename Columns

How do we update schools such that the column Name is renamed University? Hint: Check out the method relabeled.

schools = schools.relabeled("Name", "University")
schools.show(3)
University Institution City County Enrollment Founded
California State Polytechnic University, Humboldt CSU Arcata Humboldt 6025 1913
California State University, Bakersfield CSU Bakersfield Kern 9613 1965
University of California, Berkeley UC Berkeley Alameda 45307 1869

... (29 rows omitted)

There are many ways to approach a problem. Suppose you didn’t know the method relabeled existed:

schools = schools.with_column("University", schools.column("Name")).drop("Name")
schools.show(3)
Institution City County Enrollment Founded University
CSU Arcata Humboldt 6025 1913 California State Polytechnic University, Humboldt
CSU Bakersfield Kern 9613 1965 California State University, Bakersfield
UC Berkeley Alameda 45307 1869 University of California, Berkeley

... (29 rows omitted)

Questions for you as you read the above code:

  1. What methods are being chained together here? Which is executed first, with_column or drop?
  2. When is the column method being called?
  3. What would happen if you switched the two methods? (try it out)
  4. Why is University now the last column?

Exercise 5: Debugging show

The following cells exhibit a tricky, but potentially common bug. First, check out the Python Reference to understand what show does.

schools = Table.read_table('data/cal_unis.csv')
schools = schools.show(3)
Name Institution City County Enrollment Founded
California State Polytechnic University, Humboldt CSU Arcata Humboldt 6025 1913
California State University, Bakersfield CSU Bakersfield Kern 9613 1965
University of California, Berkeley UC Berkeley Alameda 45307 1869

... (29 rows omitted)

schools = schools.show(4)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[12], line 1
----> 1 schools = schools.show(4)

AttributeError: 'NoneType' object has no attribute 'show'

Like print, show is for display purposes. It does not return anything—i.e., it returns None. The second cell above therefore assigns schools to None!

Despite this, show—just like print—is useful when you want to display intermediate output for debugging purposes.

schools = Table.read_table('data/cal_unis.csv')
schools.show(3) # just the first three rows
schools = schools.relabeled("Name", "University")
schools.show(3) # and again
schools = schools.with_columns("Name", "City")
schools         # the last evaluated value in a cell
                # is displayed by default
Name Institution City County Enrollment Founded
California State Polytechnic University, Humboldt CSU Arcata Humboldt 6025 1913
California State University, Bakersfield CSU Bakersfield Kern 9613 1965
University of California, Berkeley UC Berkeley Alameda 45307 1869

... (29 rows omitted)

University Institution City County Enrollment Founded
California State Polytechnic University, Humboldt CSU Arcata Humboldt 6025 1913
California State University, Bakersfield CSU Bakersfield Kern 9613 1965
University of California, Berkeley UC Berkeley Alameda 45307 1869

... (29 rows omitted)

University Institution City County Enrollment Founded Name
California State Polytechnic University, Humboldt CSU Arcata Humboldt 6025 1913 City
California State University, Bakersfield CSU Bakersfield Kern 9613 1965 City
University of California, Berkeley UC Berkeley Alameda 45307 1869 City
California State University Channel Islands CSU Camarillo Ventura 6128 2002 City
California State University, Dominguez Hills CSU Carson Los Angeles 16426 1960 City
California State University, Chico CSU Chico Butte 14183 1887 City
University of California, Davis UC Davis Yolo 39679 1905 City
California State University, Fresno CSU Fresno Fresno 23999 1911 City
California State University, Fullerton CSU Fullerton Orange 40386 1957 City
California State University, East Bay CSU Hayward Alameda 13673 1957 City

... (22 rows omitted)