Analyzing Formula 1 Team Performance with SQL and Python: Part 3

Key Findings

The key findings of this analysis are as follows:

  • From 2000 to 2024, Ferrari, Red Bull and Mercedes emerged as the most dominant constructors by wins and total points

  • Red Bull has maintained dominance in the last 4 years

  • Lewis Hamilton has emerged as the most successful driver in terms of wins, points and podiums

  • Max Verstappen has emerged as the most dominant driver in the past 5 years

  • Lewis Hamilton has been the most dominant driver at individual circuits

  • Constructor performance tends to vary with circuit type

Limitations of the Data

The dataset covers only the 2000 season onwards. Since Formula 1 began in 1950, this is roughly one-third of the sport's history. It also lacks fastest-lap and sprint race data, both of which can contribute championship points and therefore affect who wins the drivers' and constructors' championships. Finally, it does not include the final championship standings for drivers and constructors, which would have enabled further analysis and comparison.

Future Studies

This study could be extended with predictive models for the upcoming Formula 1 season. Candidate machine learning approaches include:

  • Regression models (for predicting race positions or points): Linear Regression, Random Forest, XGBoost, etc.

  • Classification models (for predicting outcomes such as race win/no win): Logistic Regression, SVM, Random Forest, etc.

  • Time-series models (to account for performance trends over time): ARIMA, or LSTM for deep learning-based predictions
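As a rough illustration of the classification idea (race win/no win), here is a minimal sketch of logistic regression fitted by gradient descent using only the Python standard library. Everything in it is an assumption made for illustration: the data is synthetic, the single feature (grid position) and the relation used to generate wins are invented, and the function names are hypothetical. It is not a model trained on the dataset from this series.

```python
import math
import random

random.seed(42)  # fixed seed so the synthetic data is reproducible


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def scale_grid(grid):
    """Map grid position 1..20 onto 0..1 so gradient descent behaves well."""
    return (grid - 1) / 19


def make_synthetic_rows(n_rows):
    """Return (grid_position, won_race) pairs from an invented relation:
    the closer to pole a driver starts, the more likely a win."""
    rows = []
    for _ in range(n_rows):
        grid = random.randint(1, 20)
        p_win = sigmoid(1.5 - 0.6 * grid)  # assumption, not measured from real data
        rows.append((grid, 1 if random.random() < p_win else 0))
    return rows


def fit_logistic(rows, lr=0.5, epochs=2000):
    """Fit P(win) = sigmoid(w * scaled_grid + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for grid, won in rows:
            x = scale_grid(grid)
            err = sigmoid(w * x + b) - won  # gradient of the log loss w.r.t. z
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_win_probability(grid, w, b):
    return sigmoid(w * scale_grid(grid) + b)


rows = make_synthetic_rows(500)
w, b = fit_logistic(rows)
p_pole = predict_win_probability(1, w, b)
p_tenth = predict_win_probability(10, w, b)
print(f"P(win | pole) = {p_pole:.2f}, P(win | P10) = {p_tenth:.2f}")
```

In practice, a library implementation such as scikit-learn's LogisticRegression, with more features (for example constructor, circuit type, and recent form) and a proper train/test split by season, would be the more sensible starting point.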

Conclusion

This study has highlighted the role of data in understanding Formula 1 performance. By examining key metrics such as race wins, podium finishes, and points, we have gained insight into how top teams like Ferrari, Red Bull, and Mercedes have shaped the sport since 2000. The analysis also underscores the significance of circuit-specific trends, with constructor performance varying by circuit type. Although the dataset has limitations, such as the absence of fastest-lap and sprint race data, it provides a solid foundation for further research. Future studies using machine learning models could add predictive capability and deepen our understanding of the evolving dynamics of Formula 1 racing.