Sharing Big Data Safely: Managing Data Security

Rating: 3 (3 reviews)
ISBN: 9781491953648

Ted Dunning, Ellen Friedman

Many big data-driven companies today are moving to protect certain types of data against intrusion, leaks, or unauthorized eyes. But how do you lock down data while granting access to people who need to see it? In this practical book, authors Ted Dunning and Ellen Friedman offer two novel and practical solutions that you can implement right away.

Ideal for both technical and non-technical decision makers, group leaders, developers, and data scientists, this book shows you how to:

- Share original data in a controlled way so that different groups within your organization only see part of the whole. You'll learn how to do this with the new open source SQL query engine Apache Drill.
- Provide synthetic data that emulates the behavior of sensitive data. This approach enables external advisors to work with you on projects involving data that you can't show them.

If you're intrigued by the synthetic data solution, explore the log-synth program that Ted Dunning developed as open source code (available on GitHub), along with how-to instructions and tips for best practice. You'll also get a collection of use cases.

Providing lock-down security while safely sharing data is a significant challenge for a growing number of organizations. With this book, you'll discover new options to share data safely without sacrificing security.

96 pages, ebook

Published September 15, 2015

About the author

Ted Dunning

14 books, 4 followers

Community Reviews

5 stars: 0 (0%)
4 stars: 1 (20%)
3 stars: 3 (60%)
2 stars: 1 (20%)
1 star: 0 (0%)

Displaying 1 - 3 of 3 reviews

Stefan C

4 reviews

January 10, 2017

I think that the title of this book is misleading. Safely sharing data implies confidentiality, integrity, authenticity, etc. On the contrary, this book only talks about obfuscating values by using a tool written by the authors. Even in its limited context it fails to explain its most important challenge: how to design KPIs to assess the goodness of fake data.

Bill Metangmo

5 reviews, 1 follower

November 27, 2017

- Sharing data safely also means providing data outside the company (e.g., on Kaggle), etc.

- Apache Drill supports security through chained views

- Synthetically generated data can be useful for resolving issues when you can't access a client's data, stack traces, etc.

Mike Fowler

208 reviews, 10 followers

January 29, 2021

Interesting pointers as to why anonymising data is hard. Introduces log-synth, a tool for generating random data that can be used to share data models without the real data.

data technology
