Definitions

Data science

is the extraction of actionable knowledge directly from data through a process of discovery, or hypothesis formulation and hypothesis testing.[1]
is the empirical synthesis of actionable knowledge from raw data through the complete data lifecycle process.[2]
focus[es] on data quality to make a decision successfully as real and actionable data. It is inherently cross-functional to categorize data work flows such as getting data and analysing data.[3]

Overview

"Data science combines various technologies, techniques, and theories from various fields, mostly related to computer science and statistics, to obtain actionable knowledge from data."[4]

These techniques and theories come from many fields, including signal processing, mathematics, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization, uncertainty modeling, data warehousing, and high-performance computing, with the goal of extracting meaning from data and creating data products. The subject is not restricted to big data, although the growing scale of data makes big data an important aspect of data science.[5]

"The high quality of data makes sure of applications to do predictive analysis. There are a lot of technologies on data science such as storing and query of structured/unstructured database, sorting and filtering of data with flexibility, scalability, and speed, etc. For social applications, tangible impact and sentiment analysis on data is used to demonstrate the value of data."

"Data science across the entire data life cycle incorporates principles, techniques, and methods from many disciplines and domains including data cleansing, data management, analytics, visualization, engineering, and in the context of Big Data, now also includes Big Data Engineering."[6]

"In its purest form, data science is the fourth paradigm of science, following experiment, theory, and computational sciences. The fourth paradigm is a term coined by Dr. Jim Gray in 2007."[7]

References

See also

This page uses Creative Commons licensed content from Wikipedia.