Hands-on with the Cars dataset

Plan

  1. Data Organization
    1. Changing Data types
      1. Len -> Number; Removing the geographic role
      2. Width -> Number
    2. Convert to Measures and to Dimensions
      1. (Len, Width) -> [Dimension -> Measure]
      2. (AWD, Minivan, …) -> [Measure -> Dimension]
    3. Create additional Variables
      1. Class
      2. Brand
      3. Model
      4. Engine Type
      5. Drive Train
    4. Organize in folders and hide unused variables
    5. Create Hierarchies
      1. Brand -> Model -> Name
      2. Engine Type -> Cylinders
  2. Explore Single variables
    1. Frequency distribution - Discrete
      1. Absolute versus Relative
      2. Sort
      3. BarH, BarV, Pie
      4. Using hierarchies
      5. Labels
    2. Frequency distribution - Continuous
      1. Histogram; Controlling the bins
        1. Crop
        2. LogScale
      2. Boxplot
  3. Joint distribution of 2 variables
    1. 2 categorical variables
      1. Absolute versus Relative
    2. 2 continuous variables
    3. Scatter Plot Matrix
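
For readers who want to check the numbers outside the visualization tool, steps 2 and 3 of the plan can be roughly sketched in Python with pandas. This is only an illustrative sketch, not the session's own method: the rows below are hypothetical toy values (not taken from the Cars dataset), and the column names Class, DriveTrain and Len are assumed from the plan above.

```python
import pandas as pd

# Hypothetical toy rows for illustration only; NOT taken from the Cars dataset.
# Column names are assumptions based on the plan (Class, Drive Train, Len).
cars = pd.DataFrame({
    "Class": ["Sedan", "SUV", "Sedan", "Minivan", "SUV", "Sedan"],
    "DriveTrain": ["FWD", "AWD", "FWD", "FWD", "AWD", "RWD"],
    "Len": [180, 190, 175, 200, 195, 185],
})

# Step 2.1: frequency distribution of a discrete variable,
# absolute counts versus relative shares (sorted by frequency by default).
absolute = cars["Class"].value_counts()
relative = cars["Class"].value_counts(normalize=True)

# Step 2.2: frequency distribution of a continuous variable;
# the number of bins controls the histogram's resolution.
binned = pd.cut(cars["Len"], bins=3).value_counts(sort=False)

# Step 3.1: joint distribution of 2 categorical variables,
# absolute counts versus shares of the grand total.
joint_abs = pd.crosstab(cars["Class"], cars["DriveTrain"])
joint_rel = pd.crosstab(cars["Class"], cars["DriveTrain"], normalize="all")

print(absolute, relative, binned, joint_abs, joint_rel, sep="\n\n")
```

Changing `normalize="all"` to `"index"` or `"columns"` gives row-wise or column-wise relative frequencies instead, which is often the more useful comparison when the groups differ in size.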