Mobile Sensor Data Anonymization

Paper type: 
Workshop paper
Mohammad Malekzadeh, Richard G. Clegg, Andrea Cavallaro, and Hamed Haddadi
ACM/IEEE International Conference on Internet-of-Things Design and Implementation
Year: 
2019
Abstract: 
Motion sensors such as accelerometers and gyroscopes measure the instant acceleration and rotation of a device, in three dimensions. Raw data streams from motion sensors embedded in portable and wearable devices may reveal private information about users without their awareness. For example, motion data might disclose the weight or gender of a user, or enable their re-identification. To address this problem, we propose an on-device transformation of sensor data to be shared for specific applications, such as monitoring selected daily activities, without revealing information that enables user identification. We formulate the anonymization problem using an information-theoretic approach and propose a new multi-objective loss function for training deep autoencoders. This loss function helps minimize user-identity information as well as data distortion to preserve the application-specific utility. The training process regulates the encoder to disregard user-identifiable patterns and tunes the decoder to shape the output independently of users in the training set. The trained autoencoder can be deployed on a mobile or wearable device to anonymize sensor data even for users who are not included in the training dataset. Data from 24 users transformed by the proposed anonymizing autoencoder lead to a promising trade-off between utility and privacy, with an accuracy for activity recognition above 92% and an accuracy for user identification below 7%.
Description: 
This paper looks at the problem of releasing time-series data when privacy is a concern. It uses information theory to examine what extra information could "leak" when a device sends motion data. For example, can users be re-identified, or can features such as height and weight be determined? A machine learning framework is given that trades off utility against privacy: it lets the data needed by the application pass through while distorting the signal as little as possible to disguise the information we wish to keep private.
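To make the idea of a multi-objective loss concrete, here is a minimal sketch in Python with NumPy. It is not the loss function from the paper: the function names, the three weights, and the choice of each term (cross-entropy for activity, mean probability assigned to the true user for identity, mean squared error for distortion) are all assumptions made for illustration. It only shows how utility, privacy, and distortion terms might be combined into one scalar for training.

```python
import numpy as np


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true labels."""
    eps = 1e-12  # avoids log(0)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + eps)))


def anonymization_loss(x, x_anon, act_probs, act_labels, id_probs, id_labels,
                       w_act=1.0, w_id=1.0, w_dist=1.0):
    """Hypothetical weighted sum of three objectives (illustrative only).

    x, x_anon  : raw and transformed sensor windows, same shape
    act_probs  : activity classifier output on x_anon, shape (n, n_activities)
    id_probs   : user classifier output on x_anon, shape (n, n_users)
    w_*        : assumed trade-off weights, not values from the paper
    """
    # Utility: the activity should still be recognisable after the transform.
    activity_term = cross_entropy(act_probs, act_labels)
    # Privacy: penalise probability mass placed on the true user.
    identity_term = float(np.mean(id_probs[np.arange(len(id_labels)), id_labels]))
    # Distortion: keep the transformed signal close to the original.
    distortion_term = float(np.mean((x - x_anon) ** 2))
    return w_act * activity_term + w_id * identity_term + w_dist * distortion_term
```

In this sketch, raising `w_id` favours privacy at the cost of more distortion or lower activity accuracy, while raising `w_act` or `w_dist` does the opposite.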
Preprint: 
bibtex: 
@inproceedings{moh_mobile_2019, booktitle={ACM/IEEE International Conference on Internet-of-Things Design and Implementation}, title={Mobile Sensor Data Anonymization}, author={Mohammad Malekzadeh and Richard G. Clegg and Andrea Cavallaro and Hamed Haddadi}, year={2019} }