How to access, download, and analyze data for S3 usage

In this learning path, you will start your Jupyter notebook server and select preferences for S3 usage. You will also learn how to access and download the data you create, and how to analyze it using a variety of tools.

Getting ready to run analysis on your new CSV file

Double-click the 'newtruckdata.csv' file. The file contents should appear as shown in Figure 10.

Figure 10. The user interface shows the contents of the newtruckdata.csv file.

 

Since you now have data, you can open the next Jupyter notebook, simpleCalc.ipynb, and perform the following operations:

  • Create a dataframe.
  • Perform simple total and average calculations.
  • Print the calculation results.
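The three operations above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of simpleCalc.ipynb: it assumes pandas is available, and the column names (`vehicle`, `mileage`) and the inline sample rows are hypothetical placeholders rather than the real contents of newtruckdata.csv.

```python
import io

import pandas as pd

# Placeholder rows standing in for newtruckdata.csv; the real column
# names and values in the tutorial's file may differ.
sample_csv = io.StringIO(
    "vehicle,mileage\n"
    "truck-1,100\n"
    "truck-2,200\n"
    "truck-3,150\n"
    "truck-4,250\n"
)

# Create a dataframe (in the notebook this would read the CSV file from disk).
df = pd.read_csv(sample_csv)

# Perform simple total and average calculations.
total_mileage = df["mileage"].sum()
vehicle_count = len(df)
average_mileage = total_mileage / vehicle_count

# Print the calculation results.
print(f"Total mileage: {total_mileage}")
print(f"Vehicles: {vehicle_count}")
print(f"Average mileage: {average_mileage}")
```

To try this against your own file, replace `sample_csv` with the path to the CSV file and adjust the column name to match its header.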

Double-click the simpleCalc.ipynb file. When you execute the cells in the notebook, the results should resemble those shown in Figure 11.

Figure 11. The simpleCalc.ipynb notebook shows the results of executing its cells.

 

The cells in Figure 11 show the mileage of four vehicles. The next cell calculates the total mileage, the total number of rows (that is, the number of vehicles), and the average mileage across all vehicles. Execute the “Perform Calculations” cell to see these basic calculations performed on the data (Figure 12).

Figure 12. Calculations show the total mileage as 742, for four vehicles, and an average mileage of 185.5.
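As a quick sanity check on the figures reported in Figure 12, the average is simply the total mileage divided by the number of rows. The short sketch below uses only the two values stated in the caption; the variable names are illustrative.

```python
# Values taken from the Figure 12 caption.
total_mileage = 742
vehicle_count = 4

# Average mileage = total mileage / number of vehicles.
average_mileage = total_mileage / vehicle_count
print(average_mileage)  # 185.5
```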

 

Success! You have analyzed your run results using Red Hat OpenShift Data Science.
