Issue #24 - Can ChatGPT replace a Data Scientist?
The newsletter of MLPills.dev
💊 Article of the week
Welcome to a special issue! In this article, we will try to build a model that predicts whether the price of Bitcoin will increase or decrease the next day. However, we will take a different approach today: we will ask ChatGPT. We will pretend we know nothing (or very little) about Python, Time Series and Machine Learning. We will then see whether Data Scientists can be replaced in the near future. Read it here!
✍️ Test your knowledge!
We will ask a question that is related to the article but cannot be answered directly by reading it. This time, we just want to know your opinion. Let’s play! I’ll also be very happy to read any comments!
The correct answer will be revealed next week!
📢 What’s everyone talking about?
Check out this week's best AI-related news:
Code Interpreter was released by OpenAI in Beta to ChatGPT Plus users. This plugin of ChatGPT can execute code and interact with the files we upload. Currently available for Python, this powerful tool takes AI to the next level, enabling seamless collaboration between humans and machines. However, it doesn’t have access to the Internet.
GPT-4 API is now available to all paying OpenAI API customers.
PikaLabs joins the text-to-video battle with impressive quality compared to its competitors.
💡 We also recommend…
You can also check out some of my articles published on Medium:
Refine Time Series Data by Eliminating Noise through Data Smoothing
Uncovering Patterns in Time Series Data by Eliminating Outliers
❓ Get ready for your interview!
What are the steps of any Data Science project?
Define the problem or question to be answered: Clearly articulate the problem you aim to solve or the question you want to address through data analysis.
Gather and understand the data: Collect relevant data from various sources and gain a thorough understanding of its structure, quality, and potential limitations.
Prepare and clean the data: Cleanse the data by handling missing values, duplicates, and outliers, ensuring its reliability and quality for further analysis.
Perform exploratory data analysis (EDA): Explore and visualize the data to gain insights, identify patterns, and uncover relationships between variables.
Engineer relevant features from the data: Transform and create new features from the existing data to enhance the predictive power and improve the performance of the models.
Select appropriate modeling techniques: Choose suitable algorithms and modeling approaches based on the problem requirements and the available data.
Train the models using the prepared data: Use the prepared data to train the selected models, adjusting their parameters to optimize their performance.
Evaluate the models' performance: Assess the models' performance by comparing their predictions against known outcomes using appropriate evaluation metrics.
Refine and tune the models for better results: Fine-tune the models by adjusting hyperparameters, trying different algorithms, or applying regularization techniques to improve their performance.
Deploy the finalized model into a production environment: Integrate the chosen model into a production system or application, ensuring scalability, efficiency, and reliability for real-world use.
Communicate the findings and results to stakeholders: Present and communicate the outcomes of the data analysis, using visualizations and reports to effectively convey insights and provide recommendations.
Monitor and maintain the deployed model: Continuously monitor the model's performance, collect feedback, and periodically update the model to ensure accuracy and relevance in the production environment.
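To make the core of these steps concrete, here is a minimal sketch in plain Python. It is only an illustration under loud assumptions: the prices are a synthetic random walk (not real Bitcoin data), the two rule-based "models" (`momentum` and `contrarian`) are hypothetical stand-ins for real algorithms, and deployment, communication and monitoring are left out.

```python
import random

# 1. Gather: synthetic daily prices (a random walk) standing in for real data.
random.seed(0)
prices = [100.0]
for _ in range(299):
    prices.append(prices[-1] * (1 + random.gauss(0, 0.02)))
prices[50] = None  # simulate one missing value

# 2. Clean: drop missing values (the return across the gap then spans two days).
clean = [p for p in prices if p is not None]

# 3. Engineer features: today's return is the feature,
#    and the label says whether the next return is positive.
returns = [clean[t] / clean[t - 1] - 1 for t in range(1, len(clean))]
features = returns[:-1]
labels = [r > 0 for r in returns[1:]]

# 4. Split chronologically (no shuffling, to avoid leaking the future).
split = int(0.8 * len(features))
X_train, y_train = features[:split], labels[:split]
X_test, y_test = features[split:], labels[split:]


# 5. Select and "train": pick the better of two hypothetical rule-based models.
def momentum(r):
    """Predict 'up' tomorrow if the price went up today."""
    return r > 0


def contrarian(r):
    """Predict 'up' tomorrow if the price did not go up today."""
    return r <= 0


def accuracy(model, X, y):
    """Fraction of observations where the model's prediction matches the label."""
    return sum(model(x) == target for x, target in zip(X, y)) / len(y)


best_model = max((momentum, contrarian), key=lambda m: accuracy(m, X_train, y_train))

# 6. Evaluate on held-out data the model has never seen.
test_accuracy = accuracy(best_model, X_test, y_test)
print(f"Selected model: {best_model.__name__}, test accuracy: {test_accuracy:.2%}")
```

Because the data is a random walk, the test accuracy should hover around 50%, which is a useful reminder that a sound evaluation step matters more than the model itself.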
Don't miss out on additional questions on the website!
📝 Check if you were right!
This was the result of last week’s poll. You did great this week!
Forecast Bias is indeed expressed in the same units as the original variable. Forecast Interval Coverage and Prediction Direction Accuracy, however, are both expressed as percentages.
You can read more about this in our previous article: Performance Metrics for Time Series Forecasting.
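The difference in units is easy to see in a small sketch. The definitions below are common ones and are assumptions here, not necessarily the exact formulas used in the article: bias as the mean of forecast minus actual (the sign convention varies), coverage as the share of actual values inside the prediction interval, and direction accuracy as the share of steps where the forecast moves in the same direction as the actual series relative to the previous actual value. All numbers are made up for illustration.

```python
def forecast_bias(actual, forecast):
    """Mean of (forecast - actual); same units as the original series."""
    return sum(f - a for a, f in zip(actual, forecast)) / len(actual)


def interval_coverage(actual, lower, upper):
    """Percentage of actual values that fall inside [lower, upper]."""
    inside = sum(lo <= a <= hi for a, lo, hi in zip(actual, lower, upper))
    return 100.0 * inside / len(actual)


def direction_accuracy(actual, forecast):
    """Percentage of steps where predicted and actual moves share a direction.

    Both moves are measured from the previous actual value.
    """
    hits = 0
    for t in range(1, len(actual)):
        actual_up = actual[t] - actual[t - 1] > 0
        predicted_up = forecast[t] - actual[t - 1] > 0
        hits += actual_up == predicted_up
    return 100.0 * hits / (len(actual) - 1)


# Made-up values, purely for illustration.
actual = [100.0, 102.0, 101.0, 105.0]
forecast = [101.0, 103.0, 101.5, 100.0]
lower = [98.0, 100.0, 101.5, 100.0]
upper = [103.0, 104.0, 104.0, 106.0]

bias = forecast_bias(actual, forecast)            # -0.625, in price units
coverage = interval_coverage(actual, lower, upper)  # 75.0, in percent
direction = direction_accuracy(actual, forecast)    # about 66.7, in percent
print(bias, coverage, direction)
```

If the series were in dollars, the bias would read as "on average 0.625 dollars too low", while the other two metrics stay unit-free percentages.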




