Data Science: from academia to the job market

June 10, 2025

8 MIN READ

The intersection between academia and the job market in data science and machine learning is wonderful, and complex. In personal projects and competitions we are encouraged to focus on developing innovative algorithms and achieving impressive scores; in the business environment, the challenges go far beyond simply building models.

Some recent reading prompted me to reflect on the main differences between building algorithms in academic contexts or personal portfolios and building them inside large corporations, and on the crucial role software engineering can play in the practical implementation of ML systems.

To better understand these differences, I turned to the excellent book “Designing Machine Learning Systems”, by Chip Huyen. It offers valuable insights into the complexities involved in implementing machine learning systems in production environments.

 

Scalability and maintenance versus metrics

One of the key differences is the emphasis the job market places on the scalability and maintenance of systems. The academic, competition, and portfolio scenario, by contrast, is usually dominated by a constant search for better metrics.

While challenges in competitions like Kaggle often involve honing a single model to maximize accuracy or another performance metric, professionals in the data science market are faced with integrating these models into larger systems, dealing with concerns such as scalability, computational efficiency, explainability, and long-term maintainability. This means that, yes, a few percentage points of accuracy are often given up, and sometimes even the preferred model itself, in exchange for smoother integration with the system as a whole.

 

Individual versus team

The book also underscores the importance of interdisciplinary collaboration and effective communication in the workplace. Data scientists often need to collaborate with software engineers, business analysts, and other professionals to develop solutions that meet the needs and constraints of the business.

This collaboration contrasts sharply with the more ‘individualistic’ nature of the academic or competitive environment, where participants often work independently to develop their solutions.

 

Data Maintenance

However, one of the most significant differences between the academic approach and professional practice lies in the stage of preparing and maintaining the data. In personal projects or challenges, the focus is often mainly on the modeling itself: much of the time and resources go to experimenting with algorithms, feature engineering techniques, and hyperparameter optimization. As a result, this approach tends to underestimate the importance of data quality and all the time spent on preparing, labeling, and validating the data with the business.

It is worth remembering that ensuring data quality and consistency is not just an initial obstacle; it persists throughout the entire lifecycle of the ML system. A particularly relevant concept is “model drift”: the deterioration of model performance over time due to changes in the input data. These changes can be caused by a variety of factors, such as shifts in user behavior patterns or in operating conditions. Dealing with model drift requires continuous vigilance and effective monitoring strategies, as well as proactive maintenance of models in production.
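
As an illustration of what monitoring for changes in the input data can look like, here is a minimal sketch that compares one numeric feature between a reference sample (for example, the training data) and a recent production sample using the Population Stability Index (PSI). PSI is only one of several possible drift measures and is my choice for this example, not something prescribed by the book; the function name, the number of bins, and the 0.25 alert threshold (a common rule of thumb) are assumptions.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,      # assumption: equal-width bins over the reference range
    eps: float = 1e-6,     # floor for empty bins, avoids log(0)
) -> float:
    """PSI of one numeric feature: 0 means identical binned distributions."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread; cannot build bins")
    width = (hi - lo) / n_bins

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Hypothetical data: a feature whose production values shifted upward.
training_sample = [i / 100 for i in range(1000)]
production_sample = [x + 5 for x in training_sample]

psi = population_stability_index(training_sample, production_sample)
DRIFT_THRESHOLD = 0.25  # assumption: rule-of-thumb alert level, tune per use case
if psi > DRIFT_THRESHOLD:
    print(f"Input drift suspected (PSI = {psi:.2f}); review or retrain the model")
```

In practice a check like this would run on a schedule for each important feature, alongside monitoring of the model's own performance metrics whenever labels become available.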

 

Software Engineering

In summary, it was thinking about all these differences, and after spending the entire last year on a project focused on maintaining and monitoring models, that I chose to pursue a specialization in Software Engineering. The ability to understand and apply software engineering principles in machine learning projects makes it possible to look beyond simple model building and address the broader challenges of implementing AI solutions in real-world environments.

One of the main advantages of delving deeper into the Ops universe is the ability to develop more robust and scalable ML systems. Deeper knowledge of software development practices broadens your perspective and enables a more structured, modular approach to designing and implementing data pipelines and models. This makes systems easier to maintain and scale as business demands evolve.

 

Data Science Skills

Today we have the role of the Machine Learning Engineer. Deeply immersed in data and software engineering, this professional brings that perspective to implementation processes and MLOps much more actively than the Data Scientist does.

However, I stress that these are skills all Data Science professionals would benefit from having. They provide a more comprehensive understanding of best practices in testing, continuous integration, and continuous delivery (CI/CD), which is essential to ensure the quality and reliability of the model. They also broaden your view of trade-offs such as cost versus scalability and ease of lifecycle maintenance versus complexity, and even of how well users understand how the solution works. Depending on the business, this may be the most important metric.

This is the knowledge that makes you better able to develop models that are useful and genuinely feasible for the company to implement. And at the end of the day, that is what it is all about.
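
To make the testing and CI/CD point concrete, here is a minimal sketch of an automated check that a pipeline could run before a new model version is released. It encodes the kind of trade-off discussed above: a small loss of accuracy is acceptable, but only if the model stays within a latency budget. Every name and threshold here (`ModelReport`, `release_gate_failures`, the 1-point accuracy tolerance, the 200 ms budget) is hypothetical and would be agreed with the business in a real project.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelReport:
    """Evaluation summary for one model version (hypothetical structure)."""
    accuracy: float
    p95_latency_ms: float


def release_gate_failures(
    candidate: ModelReport,
    baseline: ModelReport,
    max_accuracy_drop: float = 0.01,   # assumption: tolerate up to 1 point lost
    latency_budget_ms: float = 200.0,  # assumption: budget set by the serving system
) -> List[str]:
    """Return the reasons the candidate should not ship; empty list means it passes."""
    failures = []
    if baseline.accuracy - candidate.accuracy > max_accuracy_drop:
        failures.append("accuracy dropped more than the allowed tolerance")
    if candidate.p95_latency_ms > latency_budget_ms:
        failures.append("p95 latency exceeds the serving budget")
    return failures


# Example: a simpler, faster model that gives up half a point of accuracy.
baseline = ModelReport(accuracy=0.905, p95_latency_ms=180.0)
candidate = ModelReport(accuracy=0.900, p95_latency_ms=120.0)

problems = release_gate_failures(candidate, baseline)
print("release approved" if not problems else f"release blocked: {problems}")
```

A check like this can run as an ordinary unit test in the CI pipeline, so a model that breaks the agreed limits never reaches production unnoticed.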

 

Leticia Gerola is a Data Science specialist at Programmers. Her main responsibility is to lead and implement Machine Learning projects for a wide range of clients from various market sectors using cloud technology. Having made a career transition from journalism to the data area, she has a passion for new projects involving Artificial Intelligence and MLOps. In her leisure time, Leticia surfs on the beaches of the coast of São Paulo and participates in capoeira circles in the capital.
