Date: 2024-06-17 Page is: DBtxt003.php txt00001078

People
People who are data scientists

Tim O'Reilly selects some important people who are data scientists

COMMENTARY

Peter Burgess

I wrote a short piece for Forbes last week, which went out under the title Tim O'Reilly Picks the World's Most Powerful Data Scientists. It's one of those slide shows that abbreviates the short paragraph underneath each picture, so I figured I'd republish the full text here.

But more than that, I wanted to add a bit of color on the rationale I used for the somewhat unconventional list I chose. I also plan to write a separate post that gives you the 'outtakes' - the much longer list that I wrote up before settling on the seven (really ten, since I chose three pairs) that I handed in to Forbes.

That final list is very different from the one I started with. The original list skewed much more towards 'in the trenches' data science practitioners and inventors. I wasn't happy with it, mainly because I was thinking about the Forbes business audience. Names like 'Doug Cutting, the creator of Hadoop' might be too much inside baseball for people outside of tech.

I then did a list that represented business leaders behind big data plays - everyone from Larry Page and Mark Zuckerberg and Jeff Bezos to Jim Simons of Renaissance Technologies. But I wasn't happy with that list either, since it seemed too obvious. The final list was a combination of a few big names, a few practitioners, and a few people who are outside the 'envelope' of the original assignment because they aren't strictly data scientists, but who tell an important story about industries that are being affected by big data. With that preamble, here's the final list I submitted to Forbes:

Tim O'Reilly Picks the World's Most Powerful Data Scientists

'The success of companies like Google, Facebook, Amazon, and Netflix, not to mention Wall Street firms and industries from manufacturing and retail to healthcare, is increasingly driven by better tools for extracting meaning from very large quantities of data. 'Data Scientist' is now the hottest job title in Silicon Valley.

Larry Page
Google has done more than any other company to push the boundaries of what is possible with big data. Along with Sergey Brin, Page built the search engine that tamed the web, solved the problem posed by John Wanamaker a century ago ('Half the money I spend on advertising is wasted; the trouble is I don't know which half.'), and has proceeded to accumulate the largest database on the planet in his quest to provide access to all the world's information. Anyone who recognizes the data science behind Google's self-driving car can understand the scope of Page's vision.

Jeff Hammerbacher and DJ Patil
Hammerbacher and Patil coined the term 'data scientist', now Silicon Valley's hottest job title, and built the first formal data science teams, at Facebook and LinkedIn respectively. Now at Cloudera, Hammerbacher has been key to driving the success of Hadoop as a standard tool for processing large, unstructured data sets with a network of commodity computers. As Data Scientist in Residence at Greylock Partners, Patil is seeking out the next generation of hot data-driven startups.

Sebastian Thrun and Peter Norvig
When Thrun and Norvig decided to teach their Stanford course, Introduction to Artificial Intelligence, over the internet, they quickly signed up over 140,000 students [actually 160,000 - the number was still going up as I went to press], proving that this is no longer just an academic subject.
Norvig is Google's chief scientist; Stanford professor Thrun is the leader of Google's effort to build a self-driving car that relies not just on AI algorithms, but on the memory of hundreds of thousands of miles driven by Google's Street View vehicles, recording and measuring everything they saw.

Elizabeth Warren
The banking system excesses that led to the economic crash of 2008 are an example of big data gone wrong. As the provisional head of the Consumer Financial Protection Bureau, Elizabeth Warren began the job of building the algorithmic checks and balances needed to counter the sorcerer's apprentices of Wall Street. In her campaign for the US Senate, she promises to continue that fight.

Todd Park
The Chief Technology Officer of the Department of Health and Human Services is leading the charge to transform American healthcare into a data-driven business. From medical diagnostics to insurance reimbursement to community health statistics, Park is finding ways to use data to make healthcare more effective and affordable.

Sandy Pentland
Sandy is not only a wide-ranging polymath, he's providing the intellectual leadership on how sensors, the internet of things, geolocation and promiscuous connectivity can be used to uncover insights regarding human behavior. Sandy is also looking at privacy - an important adjunct to the data space - and helping develop the conversation regarding the trade-offs between privacy and the value of personal data.

Hod Lipson and Michael Schmidt
When Cornell computer scientists Hod Lipson and Michael Schmidt created an AI program that could distill the laws of motion merely by observing data about the swinging of a pendulum, they kicked off the field of robotic science, in which AIs try to derive meaning from datasets too large or complex for humans to study.'

Why I chose the people I did

Larry was obvious.
He stood in for all the Silicon Valley companies that are built on big data, as one of the founders, and now the CEO, of what is arguably the most important of those companies. I chose Jeff and DJ rather than Mark Zuckerberg to stand in for the importance of social networking, both to give a nod to the practitioners at the heart of these great companies and because they coined the term 'data scientist.'

Sebastian and Peter were also a shoo-in. In addition to the things I wrote about in Forbes, Norvig is the author of the leading AI textbook, and Thrun a key player in the development of the Google autonomous vehicle. They are out on the front lines in a variety of ways, but the astonishing success of their course stands in for the way that data science is becoming mainstream.

From there, things were driven a bit by advocacy. I knew the financial industry had to be in the story. I toyed with including Nobel prize winner Harry Markowitz, the father of modern quantitative portfolio theory (and thus arguably the guy who started Wall Street down the path that led to all the trouble) or Jim Simons, of whom my friend Peter Bloom says 'no one in history has made as much money based on math as he has.' But I finally settled on Elizabeth Warren because I wanted to emphasize the need to rein in the speculation on Wall Street rather than to celebrate it. And I wanted to highlight that when she was working on the Consumer Financial Protection Bureau, she was thinking hard about what role technology could play in building a truly 21st-century regulatory agency, and in my book, that will have to mean what I've been calling 'algorithmic regulation.'

To highlight the Obama administration's attempts to make health care more accountable and numbers-driven, I could have chosen Federal CTO Aneesh Chopra, Don Berwick of the Centers for Medicare and Medicaid Services, or Farzad Mostashari, the National Coordinator for Health Information Technology.
I chose Todd for his entrepreneurial background, his tireless enthusiasm, and his inspirational role in making public health something that is measurable.

I wanted someone to highlight the role that sensors are increasingly playing in generating data throughout our society. I decided to highlight Sebastian Thrun's AI course rather than his self-driving car, so he was already covered. I might have chosen Pascale Witz, who heads up GE's diagnostics business, or Deborah Estrin, who has been a pioneer in distributed sensing and is now working on a platform for sensors in mobile health. But I ended up settling on Sandy Pentland because he also brought in an opportunity to highlight the issues of privacy that are so important in the big data world.

Finally, I threw in Hod Lipson and Michael Schmidt, despite the fact that their work has not yet had wide impact outside fairly narrow scientific circles, precisely because of the opportunity to get the Forbes readership thinking about truly disruptive innovation. It's easy for articles like this to focus on what's hot now. But for me, the right thing to focus on is what's happening now that will become hot over time. If, as William Gibson says, 'the future is here, it's just not evenly distributed yet,' telling people about things that are already well distributed misses the point. Part of what I try to do in my work is to help people see what's already happening that will have an outsized impact in years to come.

This was a tough assignment. I enjoyed doing it, though even with all the explanations above, I'm not truly satisfied with my final list. There is so much fascinating work going on in this field, so many people to recognize, and I just can't do justice to them all.

P.S. Last night I was chatting with DJ Patil, and he suggested someone I hadn't thought of, who might well have made the final cut if he'd occurred to me: Jim Hansen, the NASA climate scientist who has been at the forefront of the global warming debate.
Climate change is a big data problem par excellence, modeling systems that are chaotic and unbelievably complex, and where the stakes may be higher than anything else we face. And it's an area where, if anywhere, data science rather than politics ought to be driving our decisions.

Copyright © 2005-2021 Peter Burgess. All rights reserved. This material may only be used for limited low profit purposes: e.g. socio-enviro-economic performance analysis, education and training.