Data sciences: Where shall I begin?


From my experience reading posts in many data sciences-related groups on Facebook, this question is asked very frequently.

The answers usually provided vary drastically, from starting with Python or R programming courses to taking data sciences courses on YouTube, Coursera, etc.
Little attention is paid to the background of the student. Generally, the responses tend to position Data Sciences as a programming or algorithmic field.
In addition, I have seen all kinds of questions about visualizations and modeling results produced in Python or R, and many of them show a lack of understanding of basic statistics.

It is important to point out that all of these areas of specialization (Data Sciences, Data Analytics, Data Engineering) have the word "Data" in their names. By definition, «Statistics is the science of conducting studies to collect, organize, summarize, analyze and draw conclusions from data.» It is the core foundation of all these new fields. It is therefore not surprising that in the US, most undergraduate students are required to take a basic statistics course in order to graduate.

My suggestion for a beginner is to start by taking a sound basic statistics course. You can take such a course at any university or from qualified instructors, preferably ones with a background in statistics.

Some of the topics you will learn are:

  • Statistics, data and statistical thinking
  • Types of data
  • Basic notions of samples and populations
  • Methods for describing quantitative data and qualitative data
  • Counting techniques (Permutations and Combinations)
  • Probability
  • Discrete random variables
  • Continuous random variables (Normal distribution)
  • Sampling Distributions
  • Inferences based on a single sample (Confidence intervals and tests of hypotheses)
  • Inferences based on two samples 
  • ANOVA (Analysis of Variance)
  • Correlations and Simple Linear Regression
  • Multiple regression
  • Basic categorical data analysis
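
To give a feel for two of the topics above (describing quantitative data, and inference based on a single sample), here is a minimal Python sketch using only the standard library. The exam scores are made-up numbers, and the function name `mean_confidence_interval` is just an illustration. It uses a normal approximation; for a sample this small, a statistics course would normally teach you to use the t distribution instead.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) for the population mean, using a normal approximation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    center = mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin


# Hypothetical sample: exam scores for 10 students (made-up numbers).
scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 77]
low, high = mean_confidence_interval(scores)
print(f"sample mean = {mean(scores):.1f}")
print(f"approximate 95% CI = ({low:.1f}, {high:.1f})")
```

Knowing why the interval gets wider when you ask for more confidence, or narrower when the sample grows, is exactly the kind of understanding a basic statistics course provides.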

A basic statistics course will provide the necessary foundation to start learning machine learning topics.
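
One way to see the link: simple linear regression, a standard topic in a basic statistics course, is also often the first model introduced in machine learning courses. The sketch below fits a least-squares line and computes the Pearson correlation by hand in plain Python. The data (hours studied versus exam score) and the function names are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Hypothetical data: hours studied and exam score (made-up numbers).
hours = [1, 2, 3, 4, 5, 6]
score = [52, 58, 65, 70, 74, 83]
slope, intercept = fit_line(hours, score)
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
print(f"correlation r = {pearson_r(hours, score):.3f}")
```

A library will do this in one call, but interpreting the slope, the intercept and the correlation is a statistics skill, not a programming one.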

It is also recommended to have some basic math skills such as college algebra, calculus and linear algebra.

While taking the statistics course, the next courses to have in mind are SQL and Spark SQL. You need to develop strong SQL skills in order to extract and analyze large datasets. Python and R are the programming languages you will need. You can start with Python. Over time, there may be cases where you need to learn R, because it’s the most complete statistical programming language.
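
As a small taste of how SQL and Python work together, here is a sketch that uses Python's built-in `sqlite3` module with an in-memory database. The `orders` table, its columns and the rows are all hypothetical. A real project would query an existing database or a Spark cluster, but the `GROUP BY` pattern is the same.

```python
import sqlite3


def average_amount_by_region(rows):
    """Load (region, amount) rows into SQLite and summarize them with SQL."""
    conn = sqlite3.connect(":memory:")  # temporary database, nothing is saved
    try:
        conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        query = """
            SELECT region, COUNT(*) AS n_orders, AVG(amount) AS avg_amount
            FROM orders
            GROUP BY region
            ORDER BY region
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Hypothetical order data (made-up numbers).
rows = [
    ("East", 120.0),
    ("East", 80.0),
    ("West", 200.0),
    ("West", 100.0),
    ("West", 60.0),
]
for region, n_orders, avg_amount in average_amount_by_region(rows):
    print(f"{region}: {n_orders} orders, average {avg_amount:.2f}")
```

The database does the counting and averaging here, which is why strong SQL skills matter when the data is too large to load into memory.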

I have over twenty years of work experience in the field of statistics as an Applied Statistician and a Data Scientist. For the last twelve years, I have also been teaching undergraduate college-level statistics courses at St. Petersburg College. As a Data Scientist, I have developed a strong interest in computational statistics over the years. I am also interested in teaching data sciences techniques to students and to anyone who wants to use them in their own field. My programming languages of choice are Python and R. I am also interested in Spark, Databricks and PySpark.

