Privacy-Preserving Data Lakes for Cross-Organizational Insights
Abstract:
Organizations such as hospitals, banks, e-commerce and retail companies, and supply chain operators generate enormous amounts of data through digital technology. Machines add to this volume through closed-circuit television streams, website logs, and similar sources. Social media and smartphones generate large volumes of data every minute. The data from all these sources can be processed and analyzed to support decision-making. However, data analytics is vulnerable to privacy violations.
Introduction:
The volume and variety of data are growing exponentially as computers are applied across every domain. This growth has been made possible by affordable computing hardware, storage, and network connectivity. Such large-scale data includes person-specific private and sensitive attributes such as gender, zip code, disease, caste, shopping cart, and religion, much of which is stored in the public domain. The data holder can release this data to a third-party data analyst to gain deeper insights and identify hidden patterns. These patterns support important decisions that may help improve businesses, provide value-added services to customers [1], and enable prediction, forecasting, and recommendation [2]. One of the most visible applications of data analytics is the recommendation system that e-commerce sites such as Amazon and Flipkart use to suggest products to customers based on their buying habits.
Privacy threats in data analytics:
Privacy is a person's ability to decide with whom to share their data and who is granted access to it. If the data is in the public domain, it poses a threat to personal privacy, because the data is maintained by the data holder rather than by the person it describes. A data holder can be a social networking application, a website, a mobile app, an e-commerce site, a bank, a hospital, and so on. It is the data holder's duty to ensure the privacy of users' data.
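To make the threat concrete, the sketch below counts how many records share the same combination of gender and zip code, two of the attributes named in the introduction. A combination that occurs only once points at a single person, even though no name is stored. The records, the field layout, and the function name `unique_combinations` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy records: (gender, zip_code, disease). All values are invented.
records = [
    ("F", "560001", "diabetes"),
    ("F", "560001", "asthma"),
    ("M", "560001", "flu"),
    ("M", "560002", "diabetes"),
    ("F", "560003", "flu"),
]


def unique_combinations(rows):
    """Return the (gender, zip_code) pairs that occur exactly once."""
    counts = Counter((gender, zip_code) for gender, zip_code, _ in rows)
    return sorted(pair for pair, n in counts.items() if n == 1)


# Each pair listed here identifies exactly one record, and so one disease.
at_risk = unique_combinations(records)
print(at_risk)
```

In this toy data set, anyone who knows a person's gender and zip code could look up the disease of every person whose pair appears in `at_risk`.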
Conclusion:
Organizations are very sensitive about data privacy, yet new techniques allow data to be shared safely while protecting the privacy of individual records. Such techniques may allow encrypted searches on data lakes and in the cloud while preserving the analytical quality of the data. Commonly implemented solutions do not provide adequate protection against data theft and the disclosure of private information. Encrypting data at rest alone has not been enough to prevent data breaches.
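As a rough illustration of what searching over protected data can look like, the sketch below builds a keyed-hash (HMAC) index so that a store can answer exact-match queries without holding the plaintext values. This is only one simple approach and is not necessarily the technique referred to above. It supports equality lookups only, and it still reveals which rows share the same value. The key, the sample rows, and the function names are assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical key for the demo only; a real key would come from a key manager.
SECRET_KEY = b"demo-key-not-for-production"


def search_token(value, key=SECRET_KEY):
    """Derive a deterministic token from a value using HMAC-SHA256."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def build_index(rows):
    """Map each token to the ids of the rows holding that value."""
    index = {}
    for row_id, value in rows:
        index.setdefault(search_token(value), []).append(row_id)
    return index


def lookup(index, query):
    """Return the row ids whose value matches the query exactly."""
    return index.get(search_token(query), [])


# Invented sample rows: (row_id, sensitive value).
rows = [(1, "Diabetes"), (2, "asthma"), (3, "diabetes ")]
index = build_index(rows)
matches = lookup(index, "diabetes")
print(matches)
```

Only a party holding the key can turn a search term into a token, so the index on its own does not expose the stored values directly.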