In the Prep Data section of our notebook, we are performing two operations:
- Selecting the significant independent variables (columns) that we will use as features for our regression models and only including these columns in our data frame.
- Encoding categorical variables so that they can be used by our models.
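As a rough illustration of these two operations, here is a minimal sketch using pandas. The data and the column names (sqft, neighborhood, listing_id, price) are hypothetical and are not taken from our notebook, and one-hot encoding with pd.get_dummies is only one possible encoding; the notebook may use a different one.

```python
import pandas as pd

# Hypothetical data: these column names are illustrative only
# and do not come from the notebook.
raw = pd.DataFrame({
    "sqft": [850, 1200, 1500],
    "neighborhood": ["north", "south", "north"],
    "listing_id": [101, 102, 103],
    "price": [200000, 260000, 310000],
})

# Operation 1: keep only the columns we treat as significant
# (the features plus the target), dropping everything else.
selected_columns = ["sqft", "neighborhood", "price"]
df = raw[selected_columns].copy()

# Operation 2: encode the categorical column as numeric indicator
# columns so that a regression model can consume it.
df = pd.get_dummies(df, columns=["neighborhood"])

print(df.columns.tolist())
```

After both operations, the data frame holds only numeric or indicator columns, which is the form a regression model expects.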
To incorporate the Prep Data section as a step in our Kubeflow pipeline, please modify your copy of our notebook to meet the following requirements:
- Create a new pipeline step.
- Set the step name to prep_data.
- Specify the correct step on which prep_data depends as the Depends on parameter.
- As part of this annotation, include all cells that contain code that is core to the step.
- Exclude cells in the Prep Data section that are not core to the functionality of this step.
When you are finished, compare your notebook to the solution and make any necessary changes so that your notebook matches the solution.