How to Become a Data Scientist
Familiarity with the basics of programming will be a big advantage.
You can make it a little easier for yourself: start with one language and focus on learning the nuances of its syntax.
When choosing a language, pay attention to Python. First, it is ideal for beginners because its syntax is relatively simple. Second, Python is versatile and in demand in the labor market.
What to read
Automating Routine Tasks with Python: A Practical Beginner's Guide. A practical guide for those learning from scratch. It is enough to read the chapter "Manipulating strings" and complete the practical tasks from it.
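As a rough idea of what practicing string manipulation looks like, here is a minimal sketch that cleans up one messy line of text. The sample line and the variable names are made up for illustration and do not come from the book.

```python
# Hypothetical sample data: a messy line like one you might find in a raw export.
raw = "  alice SMITH ; 34 ; new york  "

# Split on the separator and strip the whitespace around each field.
fields = [part.strip() for part in raw.split(";")]

name = fields[0].title()   # normalize capitalization of the name
age = int(fields[1])       # convert the numeric field from text to an integer
city = fields[2].title()   # normalize capitalization of the city

summary = f"{name} ({age}) lives in {city}"
print(summary)  # Alice Smith (34) lives in New York
```

Only built-in string methods (split, strip, title) and an f-string are used, which is the kind of material a beginner chapter on strings usually covers.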
With machine learning, computers learn to act on their own, so we no longer need to write detailed instructions for every task. This makes machine learning important in almost any field, and above all wherever Data Science is applied.
The first step in learning machine learning is to become familiar with its three main forms.
1) Supervised learning is the most developed form of machine learning. The idea is to build a function that predicts target labels for new data based on historical data for which we know the "correct" values. The historical data is labeled: labeling (assignment to a class) means that you have a specific output value for each row of data. This is the essence of the algorithm.
2) Unsupervised learning. Here we have no labeled variables, only a lot of raw data. The algorithm identifies patterns in the historical input data and lets you draw conclusions about its overall structure. There is no output value, only the patterns visible in the unlabeled input. Because unsupervised learning can surface many different combinations of patterns, these algorithms tend to be more complex.
3) Reinforcement learning applies when the examples are unlabeled, as in unsupervised learning, but each solution the algorithm proposes can be followed by positive or negative feedback. It suits applications in which an algorithm must make decisions that have consequences. It is like learning through trial and error. An interesting example of reinforcement learning is computers learning to play video games on their own.
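The three forms can be compared side by side in a small sketch. Everything here is an assumption made for illustration: the toy data, the function names, and the choice of techniques (a nearest-centroid classifier, a two-cluster k-means on one feature, and an epsilon-greedy agent). Real projects would normally use a library rather than hand-written code like this.

```python
import random

# 1) Supervised: labeled examples produce a rule that predicts labels for new data.
# Hypothetical data: (hours_studied, label). The labels are the "correct" answers.
labeled = [(1.0, "fail"), (2.0, "fail"), (3.0, "fail"),
           (7.0, "pass"), (8.0, "pass"), (9.0, "pass")]

def fit_centroids(rows):
    """Compute the mean feature value per label (nearest-centroid classifier)."""
    sums, counts = {}, {}
    for x, label in rows:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

centroids = fit_centroids(labeled)
print(predict(centroids, 2.5), predict(centroids, 6.5))  # fail pass

# 2) Unsupervised: no labels, only raw values; look for groups (k-means with k=2).
unlabeled = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]

def two_means(values, steps=10):
    """Split values into two clusters; assumes at least two distinct values."""
    low, high = min(values), max(values)  # simple deterministic starting centers
    for _ in range(steps):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)

clusters = two_means(unlabeled)
print(clusters)  # ([1.0, 1.5, 2.0], [8.0, 8.5, 9.0])

# 3) Reinforcement: no labels, only a reward after each decision (trial and error).
def run_bandit(win_rates, rounds=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent choosing between machines with unknown win rates."""
    rng = random.Random(seed)
    pulls = [0] * len(win_rates)
    value = [0.0] * len(win_rates)  # running estimate of each machine's reward
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(len(win_rates))                       # explore
        else:
            arm = max(range(len(win_rates)), key=lambda a: value[a])  # exploit
        reward = 1.0 if rng.random() < win_rates[arm] else 0.0
        pulls[arm] += 1
        value[arm] += (reward - value[arm]) / pulls[arm]              # incremental mean
    return pulls, value

pulls, value = run_bandit([0.2, 0.8])
print(pulls, value)
```

The supervised part needs the labels to compute its centroids, the unsupervised part finds the two groups without any labels, and the reinforcement part never sees the true win rates: it only observes rewards and gradually shifts toward the machine that pays more often.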
Data Mining and Data Visualization
Data Mining is an important research process. It involves analyzing hidden patterns in data from different perspectives and turning them into useful information. That information is collected and stored in data warehouses to support business decisions aimed at reducing costs and increasing income.
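One classic way to look for hidden patterns is to count which items tend to occur together. The sketch below is a minimal illustration under assumed data: the purchase records, the function name frequent_pairs, and the min_support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records, standing in for rows pulled from a data warehouse.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "chips"},
    {"bread", "butter"},
    {"beer", "chips", "milk"},
]

def frequent_pairs(baskets, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives each pair a stable order, so every pair has one key.
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(transactions, min_support=2)
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

With this toy data, three pairs reach the threshold: beer with chips, bread with butter, and bread with milk, each appearing in two baskets. A business could use a pattern like this to decide which products to place or promote together.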
What to read and watch
How data analysis works. A great video with an easy-to-understand explanation of data analysis.
An interesting article that takes a closer look at the importance of data analysis in the field of Data Science.
Studying theory alone is not very interesting, so it is important to try your hand at practice. Here are some good options for doing this.
Use Kaggle. It hosts data analysis competitions. There are many open datasets that you can analyze, and you can publish your results. You can also read scripts posted by other participants and learn from their successful experience.
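A first pass over an open dataset often starts with loading a CSV file and summarizing it by group. The sketch below uses only the standard library. The inline data and the column names city and price are made up, standing in for a file you would download yourself.

```python
import csv
import io
import statistics

# Hypothetical stand-in for a downloaded CSV file. With a real dataset you would
# use open("some_file.csv", newline="") instead of io.StringIO.
sample_csv = io.StringIO(
    "city,price\n"
    "Austin,310\n"
    "Austin,290\n"
    "Boston,520\n"
    "Boston,480\n"
    "Boston,500\n"
)

# Group the price column by city.
prices_by_city = {}
for row in csv.DictReader(sample_csv):
    prices_by_city.setdefault(row["city"], []).append(float(row["price"]))

# For each city: number of rows, mean price, median price.
summary = {
    city: (len(prices), statistics.mean(prices), statistics.median(prices))
    for city, prices in prices_by_city.items()
}

for city, (count, mean, median) in summary.items():
    print(f"{city}: n={count}, mean={mean:.1f}, median={median:.1f}")
```

Many people do this kind of work with a dedicated data analysis library instead, but the steps are the same: read the rows, group them, and compute summary statistics.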
Confirmation of qualifications
After you have learned everything you need to analyze data and have tried your hand at open competitions, start looking for a job. Independent confirmation of your qualifications will be an advantage.