Solution - Lab: Create
In the Build Models subsection of the Modeling section, we split data for training and evaluation and then build and evaluate three regression models. In this lab, we’ll define a pipeline step for the data splitting portion of our workflow.
Annotate one or more cells in the Build Models section of our notebook to meet the following requirements:
- Annotate one or more cells in our notebook to create a pipeline step named split_data that splits our dataset for use in later training and evaluation steps.
- Specify the correct dependency relationship for this step.
Requirement 1: We accomplish splitting the data in just one cell. Annotate this cell as depicted below to create the split_data pipeline step.
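The annotation itself is not reproduced in code here. As a rough illustration only of the kind of logic a single data-splitting cell might hold, here is a minimal standard-library sketch of a seeded train/test split. The function name split_rows, its parameters, and the toy rows list are all hypothetical; the notebook's own cell may use a different helper and different variable names.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and return (train_rows, test_rows).

    Hypothetical helper for illustration; not taken from the lab notebook.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy stand-in for the dataset: ten row identifiers.
rows = list(range(10))
train_rows, test_rows = split_rows(rows, test_fraction=0.2, seed=42)
print(len(train_rows), len(test_rows))  # 8 2
```

Keeping all of the splitting logic in one cell is what lets a single annotation turn it into one pipeline step whose outputs (here, train_rows and test_rows) are available to later training and evaluation steps.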
Requirement 2: The split_data step depends on the output of the step that selects just the significant columns in our dataset and transforms categorical columns for training.
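To make the dependency relationship concrete, the sketch below models pipeline steps as a small dependency graph and derives a valid execution order with Python's standard graphlib module. This is a generic illustration, not the notebook tooling's own mechanism, and the upstream step name prepare_features is a hypothetical placeholder for the step that selects and transforms columns.

```python
from graphlib import TopologicalSorter

# Map each step to the set of steps whose output it depends on.
# "prepare_features" is a hypothetical placeholder name for the upstream step.
dependencies = {
    "prepare_features": set(),
    "split_data": {"prepare_features"},
}


def execution_order(deps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)  # ['prepare_features', 'split_data']
```

Declaring the dependency explicitly is what guarantees that split_data only runs once the transformed dataset it consumes has been produced.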
Our pipeline can now be depicted as: