Python is a programming language that can be used for data analysis. Compared with R, Python is broader in scope: it is a general-purpose language, not one built mainly for statistics. There is a lot of discussion about which of the two is better; see for example https://www.r-bloggers.com/2018/12/why-r-for-data-science-and-not-python/
However, the Python libraries for data analysis can do more or less the same things as the R packages. The commands, of course, are different.
Install
The setup is as follows:
1. Install Python on your machine: https://www.python.org/downloads/
2. Install Anaconda, a Python distribution that comes with a graphical interface (Anaconda Navigator): https://docs.anaconda.com/anaconda/
3. Start anaconda-navigator.
From here you see different programs and interfaces. For working with Python, Jupyter Notebook is a good choice. However, you can also work directly from the Python prompt.
For data analysis, some libraries need to be installed (use conda install from the prompt):
- numpy
- pandas
- matplotlib
- scipy
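After installing, a quick check from the Python prompt can confirm that the four libraries are visible to the interpreter. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for this illustration.

```python
import importlib.util

# The four libraries listed above.
REQUIRED = ["numpy", "pandas", "matplotlib", "scipy"]


def missing_packages(names):
    """Return the names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed yet:", ", ".join(missing))
    print("Try: conda install " + " ".join(missing))
else:
    print("All libraries are available.")
```

Because `find_spec` only looks the packages up and does not import them, the check is fast and does not fail when a library is absent.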
Download the attached scripts and data to work with the examples.
Jupyter in Anaconda
Now we are ready to start a first analysis. Download the notebook gem2.ipynb together with the dataset gemeentedata.xls, open the notebook in Jupyter, and run the commands in each cell. This is just a short introduction to the basics of statistics using Python.
RStudio in Anaconda
As you can see in the Navigator, it is also possible to start R (RStudio) and run R Markdown scripts (.Rmd) for data analysis. Download and run the following script together with the data: rforjournalists_stats2.Rmd and gemeentedata.xls.
R kernel in Jupyter
It is also possible to use Jupyter Notebook for both Python and R. Once the R kernel is installed, it is available for Jupyter notebooks. I had some trouble getting R in Jupyter up and running; what I did was based on the following: https://www.thetopsites.net/article/50566743.shtml
Open a Jupyter notebook and load gem_r_injupyter.ipynb together with the dataset.
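If the R kernel does not show up in Jupyter, it can help to ask Jupyter which kernels it knows about. The sketch below calls `jupyter kernelspec list --json` and looks for an R kernel. It assumes the kernel was registered by IRkernel under its default name `ir`; the function names are made up for this illustration.

```python
import json
import shutil
import subprocess


def kernel_names(kernelspec_json):
    """Extract kernel names from the output of `jupyter kernelspec list --json`."""
    return sorted(json.loads(kernelspec_json).get("kernelspecs", {}))


def installed_kernels():
    """Ask Jupyter for its kernels; return [] if that is not possible."""
    if shutil.which("jupyter") is None:
        return []  # jupyter is not on the PATH
    try:
        result = subprocess.run(
            ["jupyter", "kernelspec", "list", "--json"],
            capture_output=True, text=True, check=False, timeout=60,
        )
        if result.returncode != 0:
            return []
        return kernel_names(result.stdout)
    except (OSError, subprocess.TimeoutExpired, json.JSONDecodeError):
        return []


kernels = installed_kernels()
print("Kernels found:", kernels)
# Assumption: the R kernel is registered under the name "ir".
print("R kernel available:", "ir" in kernels)
```

When the last line prints `False`, the R kernel still has to be registered before a notebook such as gem_r_injupyter.ipynb can run.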
Convert .Rmd into .ipynb
Finally, how to translate .Rmd into .ipynb, so that you can use the R script as a Jupyter notebook with the R kernel. Install the following package: https://rdrr.io/github/mkearney/rmd2jupyter/f/README.md
Run in R:
library(rmd2jupyter)
rmd2jupyter('rforjournalists-stats.Rmd')
# prints: file saved as rforjournalists-stats.ipynb
Now you can run the .ipynb file in Jupyter Notebook with an R kernel.
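To see what such a conversion involves, here is a simplified Python sketch of the idea. It is not how rmd2jupyter is implemented: it only splits the text into R chunks (code cells) and everything else (markdown cells), and it treats a YAML header as ordinary text. The function name `rmd_to_notebook` and the kernel name `ir` are assumptions for this illustration.

```python
import json
import re

FENCE = "`" * 3  # three backticks, the delimiter of an R Markdown chunk

# An R chunk starts with a line like FENCE{r name, options} and ends with FENCE.
CHUNK = re.compile(
    r"^" + FENCE + r"\{r\b[^}]*\}[ \t]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def markdown_cell(text):
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(text):
    return {"cell_type": "code", "metadata": {}, "execution_count": None,
            "outputs": [], "source": text}


def rmd_to_notebook(rmd_text):
    """Split R Markdown text into notebook cells (simplified)."""
    cells = []
    position = 0
    for match in CHUNK.finditer(rmd_text):
        prose = rmd_text[position:match.start()].strip()
        if prose:
            cells.append(markdown_cell(prose))
        cells.append(code_cell(match.group(1).rstrip("\n")))
        position = match.end()
    tail = rmd_text[position:].strip()
    if tail:
        cells.append(markdown_cell(tail))
    return {
        "cells": cells,
        # Assumption: the R kernel is registered under the name "ir".
        "metadata": {"kernelspec": {"name": "ir", "display_name": "R",
                                    "language": "R"}},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# A tiny made-up document: a title, one R chunk, and a closing line.
sample = "\n".join(["# Title", "", FENCE + "{r}", "x <- c(1, 2, 3)",
                    "mean(x)", FENCE, "", "Done."])
notebook = rmd_to_notebook(sample)
notebook_json = json.dumps(notebook, indent=1)
print([cell["cell_type"] for cell in notebook["cells"]])
```

Writing `notebook_json` to a file with the extension .ipynb gives a notebook that Jupyter can open. For real scripts the rmd2jupyter package remains the more complete route.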