Introduction
I am deeply passionate about two fields: data science and start-ups.
I feel data science is the only way to enable logical decisions in this world and to constantly improve yourself through self-exploration. On the other hand, I am also a start-up guy! Before joining Analytics Vidhya, I did multiple internships at various start-ups.
I have an inherent fascination with and deep-rooted respect for entrepreneurs. I consume any kind of content I can get that helps me understand them better. I hope that one day I’ll be able to create significant value for this world we live in through my own pursuit! So, it was only a matter of time before I put out this list: a list of datapreneurs across the globe.
Who is a datapreneur?
A datapreneur is basically an entrepreneur focused on data science. In the interest of avoiding a conflict over definitions, I am using data science in a broad sense: any effort made to extract information from data. So this would include Big Data, Business Intelligence, Business Analytics, Predictive Modeling, Machine Learning, etc.
Please note that I am excluding entrepreneurs who use data science to solve other problems. Hence, you will not see Larry Page & Sergey Brin in this list! Nor will you see Airbnb or Uber. Similarly, I have not included the likes of Doug Cutting (creator of Hadoop & Lucene). I hope this clarifies the purpose of this list.
Further, in order to present this list in a meaningful manner, I have divided the datapreneurs by their focus areas, namely:
- Data Products
- Data Science Services
- Data Science Trainings
- Data Science Communities
A few more things to note before we look at the list:
- The list is not in any particular order; each of these contributions is immense and unique!
- Some of these companies have an overlapping presence. For example, SAS creates data products and offers trainings as well. I have classified each company in the area that I thought was its primary focus.
Data Products
Jim Goodnight & John Sall (SAS)

Jim Goodnight holds a doctorate in statistics from North Carolina State University, where he was a faculty member from 1972 to 1976. Harvard Business School named him a “Great American Business Leader” for his leadership of a business that has changed the way Americans have lived, worked and interacted. Jim is currently the CEO of SAS.

Christian Chabot (Tableau)


Michael F. Koehler (Teradata)

Michael has been associated with Teradata for the last 37 years. He became CEO of Teradata in 2010, and 2011 became the best year in the company’s history, with more revenue growth and new customers added than in any previous year. Teradata is one of the largest companies in the big data space today. Koehler holds a bachelor’s degree in business administration from the University of Delaware.
Arun C Murthy (Hortonworks)


Michael Saylor (MicroStrategy)
MicroStrategy was founded in 1989 in Wilmington, DE, by fellow MIT alumni Michael J. Saylor and Sanju Bansal. MicroStrategy’s early focus was on data mining software for businesses which later evolved into providing the most flexible, powerful, scalable, and user-friendly analytics and identity management platforms, offered either on premises or in the cloud.
MicroStrategy is positioned by Gartner, Inc. in the “Leaders” quadrant in Gartner’s 2013 “Magic Quadrant for Business Intelligence and Analytics Platforms” report, and in the “Challengers” quadrant in Gartner’s 2013 “Magic Quadrant for Mobile Application Development Platforms” report.
Mr. Saylor has served as Chairman of the Board of Directors and Chief Executive Officer since founding MicroStrategy in November 1989. Mr. Saylor holds a B.S. in Aeronautics and Astronautics and a B.S. in Science, Technology and Society from the Massachusetts Institute of Technology. Mr. Saylor is the author of the bestselling book ‘The Mobile Wave’.
Roman Stanek (GoodData)


Lars Björk (Qlik)

Currently, Lars Björk is the CEO of QlikTech. He holds an MBA from the University of Lund, Sweden, and a degree in engineering from the Technical College in Helsingborg. Before Qlik, Mr. Björk held several positions as CFO at companies such as ScandStick and Resurs Finance. Under Mr. Björk’s leadership, Qlik has grown 3x in revenues.
Alex Karp (Palantir)
Started in 2004, Palantir helps companies find answers to the most complex questions by making products for human-driven analysis of real-world data. Palantir has received more than $215 million in U.S. government contract work since 2009, while Forbes estimated that the company took in about $450 million in revenue in 2013.
Palantir was founded by Alex Karp, Peter Thiel, Joe Lonsdale and Stephen Cohen. Alex is the current CEO of Palantir, a Palo Alto-based software firm worth an estimated $20 billion. Alex holds a bachelor’s degree from Haverford College, a Doctor of Jurisprudence degree from Stanford University, and a doctorate in neoclassical social theory from Frankfurt University.
Christophe Bisciglia (Cloudera, WibiData)

Previously, Christophe worked as a senior engineer at Google, where he founded and led Google’s Academic Cloud Computing Initiative, which provides Google-hosted computational resources to universities around the world to facilitate education and research. He completed his education at the University of Washington.
Josh James (Domo)

Prior to Domo, Josh served as CEO of Omniture, a SaaS-based web analytics company that he co-founded in 1996 and took public in 2006. Omniture was the number one returning venture investment out of 1,008 venture capital investments in 2004, as well as the number two performing technology IPO of 2006. He was named the 2006 Ernst & Young Entrepreneur of the Year and Brigham Young University’s Technology Entrepreneur of the Decade. In 2009, he facilitated Omniture’s sale to Adobe for $1.8 billion.
Dwight Merriman (MongoDB)
Recently, MongoDB was included in a list of the 17 best startups to work for in America. To help people learn MongoDB, the company also runs MongoDB University, where you’ll find training courses for every kind of audience.
MongoDB was founded by Dwight and Eliot Horowitz in 2007. Dwight holds a computer science degree from Miami University. In 1995, he co-founded DoubleClick (acquired by Google for $3.1 billion) and served as its Chief Technology Officer for 10 years. He was also Co-Founder, Chairman, and the original architect of Panther Express (merged with CDNetworks), a content distribution network (CDN) technology company. Dwight is also a Co-Founder of, and investor in, Business Insider and Gilt Groupe.
John Schroeder (MapR)
MapR builds on Apache Hadoop and claims to be the largest Hadoop distribution player, selling Hadoop projects and support services. Its core product is the MapR software, which runs on clusters of commodity servers. The software is available in three editions.
MapR was founded in 2009 by its current CEO, John Schroeder. John holds a bachelor’s degree in computer science from SIU. Prior to MapR, John held executive positions in a number of software companies such as Calista Technologies, Rainfinity and Brio Technologies. Nearly 90% of MapR’s revenues are derived from subscriptions to its software. The company is expected to close 2015 with revenue of $200 million.
Jonathan Ellis (DataStax)
DataStax develops solutions based on commercially supported, enterprise-ready Apache Cassandra, the open source NoSQL database technology widely acknowledged as the best foundation for tackling the most challenging big data problems.

Data Science Services
Gurjeet Singh (Ayasdi)
Ayasdi is an advanced analytics company that provides machine learning software to Fortune 500 companies to solve their complex data challenges. Ayasdi pioneered the use of Topological Data Analysis (TDA) to simplify and accelerate complex data analysis.
Ayasdi was founded in 2008 at Stanford. Gurjeet Singh is Ayasdi’s CEO and Co-Founder. Gurjeet holds a B.Tech. from Delhi University, and a Ph.D. in Computational Mathematics from Stanford University. Before starting Ayasdi, he worked at Google and Texas Instruments. Gurjeet was named by Silicon Valley Business Journal as one of their 40 Under 40 in 2015. Ayasdi has shown promising growth in the past few years and raised $55 million in March this year.
Carlos Guestrin (Dato)


Srikanth Velamakanni (Fractal)

Srikanth has a BS in Electrical Engineering from IIT-Delhi and an MBA from IIM Ahmedabad. A former investment banker, he co-founded Fractal more than 14 years ago. Prior to Fractal, he worked on structured debt transactions and collateralized bond obligations at ANZ Investment Bank and ICICI.
Dhiraj C. Rajaram (Mu-Sigma)


Arnab Gupta (Opera Solutions)

Arnab founded Opera Solutions in 2004 and has guided the company in becoming a premier center of Big Data science and practice. He holds an MBA from Harvard Business School. Prior to Opera Solutions, he also founded Mitchell Madison Group, a business consulting firm, and Zeborg, a business intelligence company. Opera is estimated to have annual revenue of $100 million.
Anil Kaul (AbsolutData)

Absolutdata intends to empower companies to make better decisions through optimal use of data. In 2008, Absolutdata was ranked among the fastest-growing companies in India and Asia by the ‘Deloitte Technology Fast 50 India’ and the ‘Deloitte Technology Fast 500 Asia Pacific’ programs.
Anil has over twenty years of experience in marketing, strategic consulting and quantitative modeling. He has a PhD in quantitative marketing from Cornell University. He is a recognized thought leader in the industry, having published articles in leading management and academic journals such as the McKinsey Quarterly, Marketing Science, Journal of Marketing Research and International Journal of Research in Marketing.
Data Science Trainings
Andrew Ng (Coursera)
Coursera, one of the largest open online course platforms available on the internet today, was founded by Andrew Ng and Daphne Koller. Andrew also serves as Chief Scientist of Baidu, a Chinese-language search engine.

Sebastian Thrun (Udacity)


Vik Paruchuri (Dataquest)
Vik founded Dataquest in November 2014 with the aim of helping people learn real-world data science skills. Dataquest lets you gain hands-on experience with R, Python, Linear Algebra and other essential modules of data science interactively.

Gaurav Vohra and Sarita Digumarti (Jigsaw Academy)
Jigsaw “aims to meet the growing demand for talent in the field of analytics by providing industry-relevant training and education to develop business-ready professionals.”

Lovleen Bhatia (Edureka)
Edureka aims to make learning easy, interesting, affordable and accessible to millions of learners across the globe. Through the use of technology, excellent instruction and flexible schedules, it aims to become the largest and most engaging learning platform on earth.

Data Science Communities
Anthony Goldbloom (Kaggle)

Gregory Piatetsky-Shapiro (KDnuggets)
Gregory is a role model for us at Analytics Vidhya. He started KDnuggets back in 1997, when people had no idea what data science was. Just imagine: this was before Google started! The best way to understand KDnuggets is to think of it as the Craigslist of analytics. If you need anything in data science or analytics, KDnuggets is probably your best bet.
Gregory Piatetsky-Shapiro, Ph.D., is the Founder of KDnuggets, which provides consulting in the areas of business analytics, data mining, data science, and knowledge discovery. He has extensive experience developing CRM, customer attrition, cross-sell, segmentation and other models for some of the leading banks, insurance companies, and telcos. He has also worked on the analysis of clinical trial, microarray, and proteomic data for several leading biotech and pharmaceutical companies. He is also the co-founder of ACM SIGKDD, the leading professional organization for Knowledge Discovery and Data Mining.
Rohit Sivaprasad (DataTau)

Sivaprasad himself is a data evangelist and has built up much of his data science knowledge on his own, through online courses. He has contributed to the scikit-learn machine learning toolkit for Python.
P.S. I truly believe Kunal should be part of this list. He started Analytics Vidhya in April 2013 with a vision to remove the silos of data science knowledge across the world. Today, Analytics Vidhya is the world’s largest and fastest growing analytics community (as per Alexa ranking). However, our publishing guidelines prohibit me from putting him in this list.
End Notes
I think we owe a lot to these datapreneurs for what they have done to create these products, services, trainings and communities. It is difficult to imagine the data science ecosystem without them. I also think that this is just the start of a revolution in the making and that we will see a lot of action in this area in the coming years.
What do you think about these contributions? Do you think there are others who should be added to this list? Or is there some contribution I have missed? Do let me know your thoughts in the comments below.
