A concise introduction to the emerging field of data science, explaining its evolution, relation to machine learning, current uses, data infrastructure issues, and ethical challenges.

The goal of data science is to improve decision making through the analysis of data. Today data science determines the ads we see online, the books and movies that are recommended to us online, which emails are filtered into our spam folders, and even how much we pay for health insurance. This volume in the MIT Press Essential Knowledge series offers a concise introduction to the emerging field of data science, explaining its evolution, current uses, data infrastructure issues, and ethical challenges.

It has never been easier for organizations to gather, store, and process data. Use of data science is driven by the rise of big data and social media, the development of high-performance computing, and the emergence of such powerful methods for data analysis and modeling as deep learning. Data science encompasses a set of principles, problem definitions, algorithms, and processes for extracting non-obvious and useful patterns from large datasets. It is closely related to the fields of data mining and machine learning, but broader in scope. This book offers a brief history of the field, introduces fundamental data concepts, and describes the stages in a data science project. It considers data infrastructure and the challenges posed by integrating data from multiple sources, introduces the basics of machine learning, and discusses how to link machine learning expertise with real-world problems. The book also reviews ethical and legal issues, developments in data regulation, and computational approaches to preserving privacy. Finally, it considers the future impact of data science and offers principles for success in data science projects.

John D. Kelleher is Academic Leader of the Information, Communication, and Entertainment Research Institute at Technological University Dublin. He is the coauthor of Data Science and the author of Deep Learning, both in the MIT Press Essential Knowledge series.

About

A concise introduction to the emerging field of data science, explaining its evolution, relation to machine learning, current uses, data infrastructure issues, and ethical challenges.

The goal of data science is to improve decision making through the analysis of data. Today data science determines the ads we see online, the books and movies that are recommended to us online, which emails are filtered into our spam folders, and even how much we pay for health insurance. This volume in the MIT Press Essential Knowledge series offers a concise introduction to the emerging field of data science, explaining its evolution, current uses, data infrastructure issues, and ethical challenges.

It has never been easier for organizations to gather, store, and process data. Use of data science is driven by the rise of big data and social media, the development of high-performance computing, and the emergence of such powerful methods for data analysis and modeling as deep learning. Data science encompasses a set of principles, problem definitions, algorithms, and processes for extracting non-obvious and useful patterns from large datasets. It is closely related to the fields of data mining and machine learning, but broader in scope. This book offers a brief history of the field, introduces fundamental data concepts, and describes the stages in a data science project. It considers data infrastructure and the challenges posed by integrating data from multiple sources, introduces the basics of machine learning, and discusses how to link machine learning expertise with real-world problems. The book also reviews ethical and legal issues, developments in data regulation, and computational approaches to preserving privacy. Finally, it considers the future impact of data science and offers principles for success in data science projects.

Author

John D. Kelleher is Academic Leader of the Information, Communication, and Entertainment Research Institute at Technological University Dublin. He is the coauthor of Data Science and the author of Deep Learning, both in the MIT Press Essential Knowledge series.

Books for National Depression Education and Awareness Month

For National Depression Education and Awareness Month in October, we are sharing a collection of titles that educates and informs on depression, including personal stories from those who have experienced depression and topics that range from causes and symptoms of depression to how to develop coping mechanisms to battle depression.

Read more

Horror Titles for the Halloween Season

In celebration of the Halloween season, we are sharing horror books that are aligned with the themes of the holiday: the sometimes unknown and scary creatures and witches. From classic ghost stories and popular novels that are celebrated today, in literature courses and beyond, to contemporary stories about the monsters that hide in the dark, our list

Read more

Books for LGBTQIA+ History Month

For LGBTQIA+ History Month in October, we’re celebrating the shared history of individuals within the community and the importance of the activists who have fought for their rights and the rights of others. We acknowledge the varying and diverse experiences within the LGBTQIA+ community that have shaped history and have led the way for those

Read more