How to set up and reproduce data science experiments

In this learning path, you will learn how to set up data science projects. You will also learn how to reproduce and run Jupyter notebooks consistently within those projects, and how to serve the resulting models as a web service on Red Hat OpenShift.

Explore data, notebooks, and models

After starting your server, three sections appear in JupyterLab's launcher:

  • Notebook
  • Console
  • Other

On the left side of the navigation pane, locate the Name explorer panel (Figure 4). This panel is where you can create and manage your project directories.

Figure 4: The Name explorer panel in a JupyterLab workspace shows available options.

Clone a GitHub repository

Now it's time to populate your JupyterLab workspace with the contents of a GitHub repository. From the Git menu, select Clone a Repository. A dialog box appears (Figure 5).

Figure 5: Enter the URL for the Git repository you want to clone.

Enter the repository URL, https://github.com/rh-aiservices-bu/peer-review.git, and select Clone. Once the clone completes, the contents of the peer-review repository appear as shown in Figure 6.

Figure 6: The contents of the cloned peer-review repository are shown in the console.
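
If you prefer a terminal to the Git menu, the same clone can be sketched with the git command-line client. This is a minimal sketch that assumes git is installed in the notebook image and that the server can reach GitHub. The helper name clone_if_missing is hypothetical and is not part of the peer-review repository.

```shell
# Hypothetical helper (an assumption, not part of the tutorial's repository):
# clone a repository only if its target directory does not exist yet.
clone_if_missing() {
    url="$1"
    # git names the target directory after the last URL segment, minus ".git"
    dir="$(basename "$url" .git)"
    if [ -d "$dir" ]; then
        echo "$dir already exists, skipping clone"
    else
        git clone "$url"
    fi
}
```

Running clone_if_missing https://github.com/rh-aiservices-bu/peer-review.git from your workspace directory should leave a peer-review directory there. Running it a second time skips the clone instead of failing.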

Scroll down the Launcher panel on the right to view the Terminal under Other (Figure 7).

Figure 7: A terminal is available in the Other section of the Launcher page.

Click Terminal and type the following commands into the resulting terminal:

 cd peer-review

 pip install -r requirements.txt

The pip command will install the packages you need, as shown in Figure 8.

Figure 8: Install packages required for notebooks (from requirements.txt).
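
To confirm that the install left a consistent environment, you can follow it with pip check, which reports installed packages whose dependencies are missing or incompatible. The sketch below wraps both steps in a hypothetical helper named install_requirements. It assumes pip is on the PATH, as in the commands above, and that you run it from the directory containing the requirements file.

```shell
# Hypothetical helper (an assumption, not part of the tutorial's repository):
# install from a requirements file, then verify dependency consistency.
install_requirements() {
    req_file="${1:-requirements.txt}"
    # Stop early with a clear message if the file is not where we expect it
    if [ ! -f "$req_file" ]; then
        echo "missing $req_file" >&2
        return 1
    fi
    # pip check runs only if the install itself succeeded
    pip install -r "$req_file" && pip check
}
```

From inside the peer-review directory, calling install_requirements with no argument installs from requirements.txt. A nonzero exit status means either the install or the dependency check reported a problem.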