Histograms and Ranges
Distributions and the area principle
The Inferential Thinking textbook already has a chapter that fully describes histograms. We highly recommend that you read the textbook first, then review the Data 6 lecture notebook.
Ranges
np.arange is a NumPy function useful for producing sequences of equally spaced numbers. Read the chapter for more details.
Read Inferential Thinking
Read Ch 5.2, which describes the np.arange function.
Before continuing, make sure that you:
- Know that the full function signature is np.arange(start, end, step).
- Know what the default arguments for start and/or step are when you pass in one or two arguments.
- Remember that a range always includes its start value, but does not include its end value. It counts up by step, and it stops before it gets to the end.
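As a quick self-check on the points above, here is a minimal sketch of the three ways to call np.arange. The variable names are only illustrative; the values in the comments follow from the rule that the start is included and the end is not.

```python
import numpy as np

# One argument: it is treated as end; start defaults to 0 and step to 1.
one_arg = np.arange(5)             # 0, 1, 2, 3, 4

# Two arguments: start and end; step defaults to 1.
two_args = np.arange(2, 6)         # 2, 3, 4, 5

# Three arguments: count up by step and stop before reaching end.
three_args = np.arange(0, 20, 5)   # 0, 5, 10, 15 (20 is excluded)

# The end is still excluded when the steps do not land on it exactly.
uneven = np.arange(1, 10, 4)       # 1, 5, 9

print(one_arg, two_args, three_args, uneven)
```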
Histograms
Read Inferential Thinking
Read Ch 7.2, which describes histograms in detail.
Before continuing, make sure that you:
- Understand terminology related to histograms:
  - Bins (lower bound, upper bound)
  - Density, area, proportion.
- Can use the area principle to explain histogram shape and bar density, area, and dimensions.
- Can compute area and proportion from bar dimensions.
- Can determine use cases for using bar charts over histograms, and vice versa.
- Can use the hist method and specify the optional parameter density as True or False.
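To practice computing area and proportion from bar dimensions, the sketch below works through one bar by hand. The bin bounds, bar height, and table size are made-up numbers chosen for illustration, and the variable names are hypothetical. It assumes the height is measured in percent per unit, so that area = height × width gives the percent of values in the bin.

```python
# Hypothetical bar: the bin runs from 20 to 30 and the bar is
# 2.5 percent per unit tall.
lower_bound, upper_bound = 20, 30
height = 2.5                                   # percent per unit

# Area principle: the area of a bar is the percent of values in its bin.
width = upper_bound - lower_bound              # 10 units
area_percent = height * width                  # 25.0 percent of all values
proportion = area_percent / 100                # 0.25

# With a hypothetical table of 200 rows, that proportion is 50 rows.
total_rows = 200
count_in_bin = proportion * total_rows         # 50.0

# Going the other way: recover the height from the count in the bin.
recovered_height = 100 * count_in_bin / total_rows / width   # 2.5

print(area_percent, proportion, count_in_bin, recovered_height)
```

Note that a wider bin holding the same percent of values would get a shorter bar, because the same area is spread over a larger width.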
Lecture Notebook
This notebook mostly covers the hist Table method in the datascience package. See the Data 6 Python Reference for full information.
tbl.hist(column): This table method has many optional arguments, but we highlight the most important ones here:
- bins: Specify the bounds of the bins as an array. All but the last element of the array specify bin lower bounds; the last element specifies the upper bound of the rightmost bin. If not specified, the default produces 10 equally spaced bins.
- density: Boolean value (True or False). True by default, which calculates height as percent per unit. If False, calculates height as count in bin.
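To see how the bins and density arguments relate, the sketch below reproduces the two kinds of bar height with plain NumPy. It is not the datascience implementation: it uses np.histogram as a stand-in, the data values are made up, and the variable names are hypothetical. The bins array is built with np.arange, which is a common way to specify equally spaced bins.

```python
import numpy as np

# Hypothetical column of values.
values = np.array([1, 2, 2, 3, 5, 7, 8, 12, 15, 19])

# Bin bounds 0, 5, 10, 15, 20: four bins, each 5 units wide.
# The last element (20) is the upper bound of the rightmost bin.
bins = np.arange(0, 25, 5)

# Heights as counts, as described for density=False.
# np.histogram includes the final upper bound in the last bin.
counts, edges = np.histogram(values, bins=bins)

# Heights as percent per unit, as described for density=True:
# percent of values in the bin, divided by the bin width.
widths = np.diff(edges)
percent_per_unit = 100 * counts / counts.sum() / widths

# Area principle check: the bar areas add up to 100 percent.
total_area = (percent_per_unit * widths).sum()

print(counts, percent_per_unit, total_area)
```

With these values the counts are 4, 3, 1, and 2, and the corresponding heights are 8, 6, 2, and 4 percent per unit. Because every bin has the same width here, the two versions of the histogram have the same shape and differ only in the vertical scale.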
