Data Science
One of the buzziest areas in technical skills is Data Science, which includes the fields of Data Analytics and Machine Learning. The terms often get used interchangeably or mixed up because they sound so similar. The Venn Diagram shows the skills needed for each role.
Let's define Data Science first and then examine the skills needed and how the fields fit together.
What is Data Science?
Data Science is a term used to encompass how you gain insights into future behavior from big data. It includes:
- Cleansing the data, preparing it through manipulation, and analyzing the results.
- The ability to capture data that often comes from multiple sources including web scraping, APIs, and other databases.
- Using skills such as machine learning and analysis to glean insights from the data and to find patterns.
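To make the cleansing, preparation, and analysis steps concrete, here is a minimal sketch using only Python's standard library. The sample data, the column names, and the helper functions `clean_rows` and `average_age_by_city` are all made up for illustration; real projects usually pull data from files, APIs, or databases and often lean on a library for this work.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data, standing in for something scraped or pulled from an API.
# Note the stray spaces, the missing age, the inconsistent city case, and the duplicate row.
RAW_CSV = """name,age,city
Ana, 34 ,Austin
Ben,,Boston
Cara,29,austin
Ana, 34 ,Austin
"""

def clean_rows(raw_text):
    """Strip whitespace, drop rows with a missing age, and remove exact duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        name = row["name"].strip()
        age = row["age"].strip()
        city = row["city"].strip().title()  # normalize "austin" to "Austin"
        if not age:
            continue  # skip incomplete records
        record = (name, int(age), city)
        if record in seen:
            continue  # skip duplicates
        seen.add(record)
        cleaned.append({"name": name, "age": int(age), "city": city})
    return cleaned

def average_age_by_city(rows):
    """Group the cleaned rows by city and average the ages in each group."""
    ages = {}
    for row in rows:
        ages.setdefault(row["city"], []).append(row["age"])
    return {city: mean(values) for city, values in ages.items()}

rows = clean_rows(RAW_CSV)
print(average_age_by_city(rows))
```

Four messy rows go in and two clean ones come out, which is typical: a lot of the work happens before any analysis starts.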
What is Data Analytics?
Data Analytics is a field where the user looks at existing data - querying it and manipulating it - to answer questions. This also involves being able to present the data in a useful and easy-to-understand way.
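As a rough illustration of querying existing data to answer a question, the sketch below uses SQL through Python's built-in `sqlite3` module. The `watch_history` table, its rows, and the `top_genres` function are invented for this example; the question being asked is "which genres get watched the most?"

```python
import sqlite3

def top_genres(conn):
    """Count views per genre, most-watched first (ties broken alphabetically)."""
    query = """
        SELECT genre, COUNT(*) AS views
        FROM watch_history
        GROUP BY genre
        ORDER BY views DESC, genre ASC
    """
    return conn.execute(query).fetchall()

# Build a small in-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE watch_history (viewer TEXT, genre TEXT)")
conn.executemany(
    "INSERT INTO watch_history VALUES (?, ?)",
    [
        ("ana", "comedy"),
        ("ana", "drama"),
        ("ben", "comedy"),
        ("ben", "drama"),
        ("cara", "comedy"),
        ("cara", "sci-fi"),
    ],
)

# Present the answer as a simple text bar chart.
for genre, views in top_genres(conn):
    print(f"{genre:<8} {'#' * views}")
```

The last loop is the "present it in a useful way" part. In practice that job often goes to a tool like Tableau or Excel, but the idea is the same.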
What is Machine Learning?
Let's not forget about this part. Machine Learning is the practice of using algorithms to grab data, learn from it, and forecast trends by using statistical analysis and predictive analysis. Facebook is a great example of this, as are the recommendations on Amazon and Netflix.
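One very small example of "learn from data, then forecast" is fitting a straight line to past numbers and extending it. The sketch below does this with ordinary least squares in plain Python. The monthly signup numbers and the `fit_line` helper are assumptions made up for illustration, and real recommendation systems like the ones mentioned above are far more involved than this.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to the points using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: signups for months 1 through 5.
months = [1, 2, 3, 4, 5]
signups = [110, 120, 130, 140, 150]

slope, intercept = fit_line(months, signups)

# Use the learned line to forecast month 6.
forecast = slope * 6 + intercept
print(f"Forecast for month 6: {forecast:.0f} signups")
```

The program was never told "add 10 each month"; it worked that out from the data, which is the core idea behind learning from examples.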
Algorithms
Algorithms are essentially the steps a program should take, but they are created using words or diagrams, not code. Most of my students learn how to create pseudocode or flowcharts in either Advanced Python or Make Your Own Web App.
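Here is a small sketch of how an algorithm written in words can turn into code. The pseudocode sits in the comments and the Python underneath follows it step by step. The task (finding the largest number in a list) and the function name `find_largest` are just an illustration, not taken from either course.

```python
# Pseudocode (the plan, written in words):
#   1. Start with the first number as the largest so far.
#   2. Look at each of the remaining numbers.
#   3. If a number is bigger than the largest so far, remember it instead.
#   4. When there are no numbers left, report the largest.

def find_largest(numbers):
    """Return the largest value in a non-empty list of numbers."""
    largest = numbers[0]            # step 1
    for number in numbers[1:]:      # step 2
        if number > largest:        # step 3
            largest = number
    return largest                  # step 4

print(find_largest([4, 17, 9, 2]))
```

Python already has a built-in `max()` that does this, but writing it out shows how each line of the plan maps to a line of code.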
Data Science | Data Analyst | Machine Learning |
---|---|---|
Python or R programming languages | SQL | In-depth programming skills in Python, R or Scala |
SQL and NoSQL databases like MongoDB | Excel | Knowledge of probability and statistics |
Algorithms/modeling | Tableau or other similar visualization tools | Data modeling skills |
Understand multiple analytical functions | Knowledge of mathematical statistics |
Why Learn Data Science?
Well, it's fun and useful. But the primary reason is just how critical it is to the future.
- Data Science offers one of the most promising long-term and rapidly growing career paths, based on technological advances in how data is mined to make it useful for every industry. Plus, you can build cool apps like the Netflix recommendation tool.
- A 2019 report by the National Association of Manufacturers and Deloitte Consulting predicted the US will create 3.5 million STEM jobs by 2025, but about 2 million will go unfilled due to a lack of skilled workers.
- Employment for data scientists is anticipated to grow by 16% by 2028.
Based on this, some already argue that we should shift our emphasis from teaching Calculus to teaching data science for high school students.
Another huge reason is simply all the advantages that come from learning to think this way. Coding teaches you how to solve a problem, how to lay out a logical solution, how to synthesize information, and how to go out and find the information you need. You need to learn all those skills to be a data scientist, plus a few more:
- Statistics
- Algorithms
- Communication Skills
- Ability to Collaborate
Data Science is a mix of math, programming, communication, and my favorite - critical thinking.
- How do you figure out the answer to that question?
- How do you structure a program to get the answer you need?
Anyone can write code - but can you solve a problem?