Definitions

Data scientists are

the information and computer scientists, database and software engineers and programmers, disciplinary experts, curators and expert annotators, and others, who are crucial to the successful management of a digital data collection.[1]

A data scientist is

a practitioner who has sufficient knowledge in the overlapping regimes of business needs, domain knowledge, analytical skills, and software and systems engineering to manage the end-to-end data processes in the data life cycle.[2]

Overview

"In the early days of experiments, experts in a particular domain would perform the data analysis. With the advent of computers for analysis, additional skills in statistics or machine learning were needed for more sophisticated analysis, or domain experts would work with software and system engineers to build customized analytical applications. With the increase in complexity of compute-intensive simulations across parallel processors, computational science techniques were needed to implement the algorithms on these architectures. For data-intensive applications, all of these skill groups are needed to distribute both the data and the computation across systems of resources working in parallel. While data scientists seldom have strong skills in all these areas, they need to have an understanding of all areas to deliver value from data-intensive applications and work in a team that spans all of these areas."[3]

A data scientist has strong business acumen, coupled with the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge. Good data scientists will not just address business problems; they will pick the right problems that have the most value to the organization.

Whereas a traditional data analyst may look only at data from a single source, a data scientist will most likely explore and examine data from multiple disparate sources. The data scientist sifts through incoming data with the goal of discovering a previously hidden insight, which in turn can provide a competitive advantage or address a pressing business problem. A data scientist does not simply collect and report on data, but also looks at it from many angles, determines what it means, and then recommends ways to apply it.

"The intellectual contributions of data scientists are key drivers for progress in the information sciences/data collections field. The career path for data scientists is not yet mature. The mechanisms to recognize their contributions are not fully in place."[4]

References

Source

  • "Overview" section: IBM, "What is data scientist" (full-text).