Difference between Hadoop, HDFS, Big Data, MapReduce, Data Science | Who is a data scientist
Data Science
Data Science is the study of data and of how to convert it into an actionable form. A number of frameworks are available for this. Data Science typically deals with huge amounts of data, which are analyzed using these frameworks.
Terminology of Data Science
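Data Science is the art of converting big data into an actionable form. Big Data, meaning data sets too large to store or process comfortably on a single machine, is the input for this processing. Hadoop and Spark are frameworks for working with Big Data and performing Data Science on it. HDFS (Hadoop Distributed File System) is Hadoop's storage layer: it splits files into blocks, distributes those blocks over multiple nodes, and keeps replicated copies so that data survives a node failure. MapReduce is Hadoop's programming model for processing that data (it is not a machine learning concept, although machine learning jobs can be built on it). In the Map step, each input record is turned into key-value pairs; the pairs are then grouped by key, and in the Reduce step the values for each key are combined into a summarized result.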
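The Map and Reduce steps can be sketched with a small word count. This is a minimal single-machine illustration in plain Python, not Hadoop's actual API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are assumptions made up for this example. On a real cluster, the framework would run the same three stages in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the list of values for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records):
    """Run the three stages in order (hypothetical helper for this sketch)."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    sample = ["big data is big", "data science uses data"]
    print(word_count(sample))
```

Here the key is the word and the value is a count of 1; the Reduce step sums the counts for each word. Other jobs follow the same shape with a different key and a different way of combining values.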
Data Scientist
A data scientist is someone who makes new discoveries with data. They look for meaning and knowledge in data, and for the patterns it contains. Knowledge of mathematics, statistics, and computing is essential for any data scientist. A good data scientist must also be able to choose and implement the most suitable algorithm for the problem at hand.
A data scientist is given big data and a question to answer. Different algorithms are applied to the input data to analyze its patterns, and their results are compared to find the best answer.
A data scientist's knowledge lies at the intersection of:
- Programming
- Business Intelligence
- Algorithms
This is an unusual skill set. It involves not only programming and implementation but also discovering new algorithms and extracting predictive knowledge.