Introduction
The stock of a data scientist is at an all-time high right now. There aren’t too many professions out there that can rival the specter, luster and respect a data scientist commands as we head into 2020.
I have seen non-data science folks (or non-technical folks) look at a data scientist as someone with superpowers. There are plenty of reasons for this (media hype being one of them) but there’s no doubt that the job of a data scientist is a highly valued one.
Check out Gartner’s publishes Hype Cycle for Artificial Intelligence in 2019 below:
To support this, here is Linkedin’s 2019 report on the most promising jobs and I am sure you would have guessed the profession that tops the list:
Those numbers are eye-popping. From Fortune 500 companies to retail stores, organizations around the world want to build a team of top data science professionals. 2019 broke all previous records of investment in data science and AI.
But despite all of these positive trends, there is an underlying feeling of discomfort. Data scientists are quitting or changing their jobs at a rapid pace. Why is this happening? Is there something we aren’t being told?
Let’s analyze 5 key reasons why data scientists are leaving their seemingly dream jobs. If you have faced any of them yourself, or want to share your own experience, share your thoughts with the community in the comments section after this article!
1. Expectation vs. Reality – Here Lies A Wide, Wide Gap!
This is one of the most prevalent issues in the data science field. There is an ever-widening gap between what data scientists expect and what they actually work on in the industry.
There are multiple reasons for this and they can vary from one data scientist to another. The level of experience also plays a part in this expectation chasm.
Let’s take the example of aspiring data scientists. They are typically self-learned and have gathered their knowledge from books and online courses. They don’t have a lot of exposure to real-world projects and datasets. I’ve also come across plenty of aspiring data scientists who had no idea about:
- How a machine learning pipeline works
- The role software engineering plays in the overall data scientist skillset
- What putting a model into production/deploying a model means, etc.
- The importance of data cleaning and that it takes up the majority of your time
As I mentioned in the introduction above, the chance to play around with swanky machine learning tools and state-of-the-art frameworks is too tempting for freshers (and everyone else, honestly!).
Here’s the reality – the industry doesn’t work like that. There are too many factors at play to make a data science project anything close to what we experience in online data science competitions.
How do you collect and store data, how to properly perform version control, how to deploy your model into production – these are just some of the key aspects the organization expects you to know.
This mismatch in expectations is a major roadblock and leads to data scientists quitting their jobs. I always advise freshers and amateur data scientists to constantly talk to their seniors and organization alumni to bridge this gap between expectation and reality.
2. Mapping a Data Scientists’ Role to Business Goals
Here’s another (un)popular expectation problem. This is primarily attributable to the hype around data science and artificial intelligence (AI) in recent years.
Executives, CxOs, C-Suite folks, investors – all of these people in the higher echelons of businesses want to showcase that their organization or project is at the forefront of the latest technological advances. AI right now is THE field to invest in.
Here’s the problem – we’ve seen a ton of these senior folks believe that AI is a silver bullet for their business problems. If they invest in AI and the right experts, they’ll find the solution in double-quick time.
Unfortunately, that’s not how it works. Data science projects typically involve a lot of experiments, trial and error methods and iterations of the same process before they reach the final outcome. It takes months and months to arrive at the desired outcome.
Data Warehouse and artificial intelligence infrastructure require a heavy investment (depending on the size of the company) but the discoveries in the work may take time as forming actionable insights from vast swathes of data is often very time-consuming. This is the reason why data scientists demand a flexible approach – one where they are given their time and space to work on data.
This does not go down well with business leaders in a lot of spheres. I have seen this lead to a mass exodus in projects when data scientists eventually get too frustrated with their senior leadership and their unrealistic expectations.
How Data Scientists and Business Leaders can work together effectively:
- Establish solid communication between data science and business teams. They must be cohesive and co-ordinated
- Harness business intuition and knowledge from business leaders. This can work wonders for data scientists
- Co-develop a measurable performance matrix for the business to measure the performance progress of data scientists
- Agility plays a great role in extracting the best from a data scientist
I would highly encourage all data science professionals and business leaders to go through the below series of articles by Dr. Om Deshmukh. He lays out the framework for running a successful data science project in a detailed fashion:
- A Data Science Leader’s Guide to Managing Stakeholders
- How can you Convert a Business Problem into a Data Problem? A Successful Data Science Leader’s Guide
- 4 Key Aspects of a Data Science Project Every Data Scientist and Leader Should Know
- Deployed your Machine Learning Model? Here’s What you Need to Know About Post-Production Monitoring
3. Lack of Upskilling for Data Science Professionals
Who doesn’t love new challenges? I would argue that the data science field is ripe for these challenges given the rapid pace at which advancements happen. Take the Natural Language Processing (NLP) domain for instance. The number of developments that have happened in the last two years is mind-boggling.
Almost every data scientist would love to work on these new techniques and frameworks. I mean, who would enjoy building and then iterating on the same logistic regression model for years?
The data scientists’ role is not immune to the stagnancy factor. There is a brick wall you hit after a certain point of time and the feeling of wanting a new challenge is always around the corner.
Add to this the above two factors we mentioned about managing expectations. It’s a heady mix of things, right? It’s inevitable that any employee would suffer from a lack of motivation after a certain point.
This is especially true in bigger companies where flexibility is low. I’m sure a lot of you must have experienced this if you’ve worked in any blue-chip firm. Startups and medium-sized businesses are still better in this regard (but they present a different set of challenges too).
Here are three key reasons I’ve encountered that lead to employee attrition:
- Lack of Infrastructure: That’s the case with most businesses, they lack the infrastructure like computing systems, accessibility to tools, etc. to support the role of a data scientist
- Scope of Business: The operational capacity of the business might be restricted and narrow. Beyond a point, it might get difficult for a data scientist to infer more insights from data
- Lack of Research & Development: As a data scientist you would love to explore the field beyond your scope of work. For example, if you are a Computer Vision expert and want to learn about NLP then an R&D zone would be the best place for you. Most companies lack that and this results in attrition
4. No Clear Benchmarking in Salary Pay-Outs
Ah – I can see your eyes light up at the above heading. Salary is one of the primary reasons people want to break into data science and make it a full-time career.
We regularly see reports from McKinsey, Glassdoor and the like where they showcase exorbitant average salaries for data scientists. Most folks would have their head turned by the numbers quoted in those reports.
The sky is the benchmark for the salary of a data scientist. I’m sure you’ve read the news this year when we saw top data scientists being poached by companies like Google and Apple (Ian Goodfellow comes to mind).
This is becoming a regular occurrence. Data scientists who are doing exceptional work in their respective fields are usually poached by top Fortune 500 companies who offer highly inflated salaries while the mid-cap and small firms cannot offer so much (usually).
I feel there is a need for some standardization/benchmark when it comes to compensation. Even in the mid-cap firms, there needs to be a clear demarcation in the salary of a fresher with high skills versus an experienced data scientist with the same level of skills. Not benchmarking salaries leads to:
- Dissatisfactory work performance even by a high potential employee
- A major trigger for employees to influence each other in the office to consider opportunities elsewhere
Again – this aspect is not too different from any other job, is it?
5. Great Exposure to Different Data Science Projects on Different Platforms
What would you wish for the most between these two options:
- Option 1: A 9-5 job where you have to align your skills and results to achieve companies objectives, or
- Option 2: A highly flexible work life where you can work from anywhere and achieve high self-growth?
Most of you might choose Option 2. Who doesn’t love flexibility at work and a free hand at choosing what you want to work on?
Today there are a plethora of options for a data scientist to choose from:
- They can try their luck at competing on platforms like Kaggle, Analytics Vidhya, etc, and win exciting prices and tremendous fame in the community
- Freelancers are in much demand as companies today have exciting short term projects to offer
- Freelance data scientists know their way around Spark, Hadoop, Hive, Pig, SQL, Neo4J, MySQL, Python, R, Scala, TensorFlow, NLP, Computer Vision or anything machine learning because they get to jump into the problem and discover how to solve that
- Writing blogs and personal branding is the hot pick this season for a lot of data scientists. Just like Grant Sanderson – he is my favorite!
Organizations cannot offer most of these to resident data science professionals for obvious logistical and project-related reasons. This is an inevitable cost of any project, honestly.
How can Companies Retain their Champion Data Scientists?
Here are a few tried and tested ways in which a business can retain its most talented data scientists:
- Create a power-packed learning environment: This is essential for the personal and professional growth of an individual. This field is booming with something new to explore every day and at this pace its critical to give a progressive learning environment for data scientists
- Formulate a strong Research and Development team: Creating an R&D team can enable quality research that can be undertaken in the field. Enabling employees to conduct research on deep topics is a recipe for excellence
- Benchmark their compensation: Benchmarking compensation will instill trust and give data scientists an assurance that they are being paid according to the best industry standards (understandably, this is significantly harder to do)
End Notes
Everything about the data science field is ultra-dynamic. We are still understanding a lot of things so settling on one aspect or process or structure is proving difficult for businesses.
With time, I am sure we will have robust systems and processes in place and data scientists will have a fulfilling working environment. This point needs work, both from a business point-of-view as well as a data scientists’ point-of-view.
I want to hear your views on this. Are you working on data science problems? Have you experienced any of the above issues? Are there any other issues you want to share? Let us know in the comments section below!