PDS 2017 Program

Wednesday August 16

10.00 Bus from downtown Tromsø (from bus terminal at Prostneset; close to hotel Edge) to Sommarøy; short stop at the airport to pick up participants arriving with morning flights from Oslo

11.30 Arrival Sommarøy hotel and lunch

13.00 Opening, Dag Johansen, Department of Computer Science, UiT – The Arctic University of Norway

Keynote: “Security, Privacy, and the Free Rider Problem: The Dark Side of the Internet of Things”, Stephen B. Wicker, School of Electrical and Computer Engineering, Cornell University

Abstract: Health care, sports, and other devices associated with the “Internet of Things” (IoT) often have communication capabilities that are provided through the addition of simple, modular components for which security is an afterthought. As the DDoS attacks of October 2016 have shown, many IoT devices are not only a threat to themselves, but also a danger to the functionality of the Internet. In this talk we will explore the extent to which manufacturers and consumers are exhibiting “free rider” behavior, exploiting Internet connectivity while being unable or unwilling to pursue security solutions that prevent IoT devices from being infected with malware. The structure of the free-rider problem (FRP) will be introduced, and it will be shown that policy and technology solutions can be connected to the deeper structure of the manufacturer and consumer IoT FRPs. We review results from our preliminary investigation of IoT malware, showing that even basic security measures will have substantial impact.

13.45-14.00 Break

“Biometric Key Generation from Body Impedance Data”, Kasper Bonne Rasmussen, Department of Computer Science, University of Oxford

Abstract: The huge number of ubiquitous electronic devices and services deployed today motivates the need for effortless user authentication and identification. While biometrics are a natural and convenient means of achieving user authentication, their use poses privacy risks, due mainly to the difficulty of preventing theft and abuse of biometric data. One way to minimize information leakage is to derive biometric keys from users’ raw biometric measurements. Such keys can be used in subsequent security protocols and ensure that no sensitive biometric data needs to be transmitted or permanently stored.

In this talk we explore the use of human body impedance as a biometric trait for deriving secret keys. Building upon Randomized Biometric Templates as a key generation scheme, we devise a mechanism that supports consistent regeneration of unique keys from users’ impedance measurements. The underlying set of biometric features is found using a feature learning technique based on Siamese networks. Compared to prior feature extraction methods, the proposed technique offers significantly improved recognition rates in the context of key generation. Besides computing experimental error rates, we tailor a known key guessing approach specifically to the key generation scheme used and assess the security provided by the resulting keys. We give a very conservative estimate of the number of guesses an adversary must make to find a correct key. The data confirms that our key generation approach produces high quality cryptographic keys.

“WiFi Scanning, Crowd Monitoring, Privacy: An Experience Report”, Maarten van Steen, University of Twente, The Netherlands

Abstract: For a number of years we have been involved in large-scale monitoring of pedestrian movements by scanning hand-held devices such as smartphones. Dutch law states that the MAC address of a personal device is to be considered private information, rendering WiFi scanning practically problematic.

In this talk I will zoom in on a number of issues, including misunderstandings about the quality of WiFi scans, simple schemes for replacing MAC addresses with pseudonyms, and the difficulty of setting up a secure and privacy-preserving system for tracking pedestrians.
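As an illustration of what a "simple scheme" for replacing MAC addresses with pseudonyms might look like, here is a minimal Python sketch. It is not taken from the talk: the keyed-hash (HMAC-SHA256) approach, the idea of rotating the key per measurement epoch, and all function names are assumptions made for this example.

```python
import hashlib
import hmac
import secrets


def new_epoch_key() -> bytes:
    """Fresh random key; rotating it (e.g. daily) unlinks pseudonyms across epochs."""
    return secrets.token_bytes(32)


def normalize_mac(mac: str) -> bytes:
    """Turn 'AA:BB:CC:DD:EE:FF' or 'aa-bb-cc-dd-ee-ff' into 6 raw bytes."""
    hex_digits = mac.replace(":", "").replace("-", "").lower()
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return bytes.fromhex(hex_digits)


def pseudonym(mac: str, epoch_key: bytes, length: int = 16) -> str:
    """Keyed hash of the MAC address, truncated to `length` hex characters.

    Without the key, the small MAC address space cannot simply be
    enumerated to reverse the mapping, as it could with a plain hash.
    """
    digest = hmac.new(epoch_key, normalize_mac(mac), hashlib.sha256).hexdigest()
    return digest[:length]


if __name__ == "__main__":
    key = new_epoch_key()
    print(pseudonym("AA:BB:CC:DD:EE:FF", key))
```

Within one epoch the same device always maps to the same pseudonym, so movements can still be counted. Once the key is discarded, pseudonyms from different epochs can no longer be linked. Whether such a scheme satisfies a given legal definition of private information is a separate question that the sketch does not address.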

15.00-15.30 Break, refreshment/snacks, check-in hotel

“SGX Enforcement of Use-Based Privacy”, Eleanor Birrell, Department of Computer Science, Cornell University

Abstract: Modern applications collect and generate vast quantities of data, and much of this data is private, sensitive, or subject to legal restrictions. However, the current approach to handling data privacy (notice and consent) is inadequate for today’s privacy requirements. Use-based privacy is an alternative philosophy that stems from the observation that most privacy concerns are about uses, not accesses. In this talk, I will describe our work developing a policy language for use-based privacy, and I will describe our new system that uses Intel’s Software Guard Extensions (SGX) as a root of trust for enforcing use-based privacy in decentralized, distributed systems.

16.30 Social event (2-2.5 hours): guided tour to Hillesøytoppen (weather permitting); local (UiT) guides Lars Brenna and Svein Arne Pettersen

19.00 Poster session (and software demos); refreshment

20.15 Dinner and social

Thursday August 17

09.00 Workshop resumes

“Privacy in the Cloud, Hard-won Lessons from Shipping Information Retrieval and Discovery Experiences at Scale to Microsoft Office 365 Users”, Bjørn Olstad and Troels Walsted Hansen, Microsoft Development Center Norway

Abstract: Office 365 is Microsoft’s enterprise cloud productivity suite, used by a massive set of organizations and information workers. These users put their trust in O365 by storing their confidential documents and emails in the service. At the same time they depend on the system to provide quick and effortless access to relevant information across all their devices. Users want to leverage the collective explicit and implicit insights across their people network to achieve more, but without compromising perceived or actual privacy expectations. Privacy and security requirements from users, organizations and legislation are increasing quickly. Machine learning is at the same time dramatically increasing the potential effectiveness gains that can be made by reasoning over large amounts of data and activity streams. The talk will share practical lessons learned connected to these megatrends around security and machine learning. Concrete experiences such as enterprise search and Delve will be discussed. These experiences rely on Office Graph to capture relevant signals about interactions between users and content and process them in a secure, compliant and privacy-preserving manner. Office 365 search, Office Graph and Delve are all developed by Microsoft Development Center Norway (MDCN), with offices in Oslo, Tromsø and Trondheim. In this talk you will learn about the hard-won privacy lessons from shipping these experiences while maintaining the trust of users and organizations.

09.45-10.00 Break

“Building and Measuring Privacy-Preserving Mobility Analytics”, Emiliano De Cristofaro, Department of Computer Science, University College London

Abstract: Location data can be extremely useful to study commuting patterns and disruptions, as well as to predict real-time traffic volumes. At the same time, however, the fine-grained collection of user locations raises serious privacy concerns, as this can reveal sensitive information about the users, such as lifestyle, political and religious inclinations, or even identities. In our paper, we study the feasibility of crowd-sourced mobility analytics over aggregate location information: users periodically report their location, using a provably secure, privacy-preserving aggregation protocol, so that the server can only recover aggregates, i.e., how many, but not which, users are in a region at a given time. We experiment with real-world mobility datasets obtained from the Transport for London authority and the San Francisco Cabs network, and present a novel methodology based on time series modeling that is geared to forecast traffic volumes in regions of interest and to detect mobility anomalies in them. In the presence of anomalies, we also make enhanced traffic volume predictions by feeding our model with additional information from correlated regions. Finally, we discuss challenges related to the possible privacy leakage from the aggregates themselves, as well as other applications of privacy-friendly analytics from aggregate statistics.
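To make the "how many, but not which" property concrete, the following Python sketch shows one textbook way to aggregate location reports: pairwise additive masking. It is not the protocol from the talk. The masking approach, the modulus, and all names are assumptions for this example, and the masks are generated in one place purely for brevity, whereas a real deployment would have each pair of users derive its shared mask through a key agreement.

```python
import secrets

MODULUS = 2**32  # assumed to be far larger than any realistic user count


def pairwise_masks(num_users: int, num_regions: int) -> list[list[int]]:
    """For every pair (i, j) draw a shared random value per region;
    user i adds it and user j subtracts it, so all masks sum to zero."""
    masks = [[0] * num_regions for _ in range(num_users)]
    for i in range(num_users):
        for j in range(i + 1, num_users):
            for r in range(num_regions):
                shared = secrets.randbelow(MODULUS)
                masks[i][r] = (masks[i][r] + shared) % MODULUS
                masks[j][r] = (masks[j][r] - shared) % MODULUS
    return masks


def masked_report(region: int, mask: list[int]) -> list[int]:
    """One-hot encoding of the user's region, blinded by the user's mask."""
    return [((1 if r == region else 0) + m) % MODULUS for r, m in enumerate(mask)]


def aggregate(reports: list[list[int]]) -> list[int]:
    """Server side: summing all reports cancels the masks, leaving counts."""
    num_regions = len(reports[0])
    return [sum(rep[r] for rep in reports) % MODULUS for r in range(num_regions)]


if __name__ == "__main__":
    user_regions = [0, 2, 2, 1, 2]  # hypothetical: region index of each user
    masks = pairwise_masks(len(user_regions), num_regions=3)
    reports = [masked_report(reg, m) for reg, m in zip(user_regions, masks)]
    print(aggregate(reports))
```

Each individual report looks like random numbers to the server; only the sum over all users reveals the per-region counts. The sketch ignores user dropouts and, as the abstract notes, says nothing about what the aggregates themselves may leak.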

“User-Centric Personal Data Analytics on the Edge”, Hamed Haddadi, School of Electronic Engineering and Computer Science, Queen Mary University of London

Abstract: In this talk, I discuss the ways in which we can utilize edge computing to improve the scalability and privacy of user-centered analytics in the context of the Databox project. I present a hybrid framework where edge devices and resources centered around the user, collectively referred to as fog, can complement the cloud for providing privacy-aware, yet accurate and efficient analytics. I present the evaluations of the proposed framework on a number of exemplar applications, and discuss the broader implications of such approaches for future systems.

11.00-11.30 Break, hotel check-out

“Efficient Machine Learning for Disease Detection in the Human Digestive System”, Michael Riegler, Simula Research Laboratory, Oslo

Abstract: Health care has a long history of adopting technology to save lives and improve the quality of living. Visual information is frequently applied for disease detection and assessment, and the established fields of computer vision and medical imaging provide essential tools. It is, however, a misconception that disease detection and assessment are provided exclusively by these fields and that they provide the solution for all challenges. Integration and analysis of data from several sources, real-time processing, and the assessment of usefulness for end-users are core competences of the multimedia community and are required for the successful improvement of health care systems.

We will present our investigations into using machine learning for disease detection in the gastrointestinal (GI) tract, where the detection of abnormalities provides the largest chance of successful treatment if the initial observation of disease indicators occurs before the patient notices any symptoms. Although such detection is typically provided visually by applying an endoscope, we are facing a multitude of new multimedia challenges for different scenarios. In real-time assistance for colonoscopy, we combine sensor information about camera position and direction to aid in detection, investigate means for providing support to doctors in unobtrusive ways, and assist in reporting. In the area of large-scale capsular endoscopy, we investigate questions of scalability, performance and energy efficiency for the recording phase, and combine video summarization and retrieval questions for analysis.

12.00-13.15 Lunch

“META-pipe: marine metagenomics data analysis service”, Lars Ailo Bongo, Department of Computer Science, UiT – The Arctic University of Norway

Abstract: The ELIXIR (https://www.elixir-europe.org/) distributed infrastructure for life science data provides data analysis services for the 500,000 life science researchers in Europe. In this talk we will describe the ELIXIR marine metagenomics use case, and show how we use the ELIXIR platforms for compute and data interoperability. In particular, we will describe the META-pipe marine metagenomics data analysis pipeline and its backend, which is designed for distributed execution on ELIXIR compute resources. We will describe how it uses the ELIXIR Authentication and Authorization Infrastructure (AAI) for user management, and ELIXIR compute clouds for job execution. We show how our backend enables scalable distributed data analyses, and how we have simplified the management of the META-pipe analysis service.

“Safeguarding Analytics on Privacy Sensitive Data”, Anders Tungeland Gjerdrum, Department of Computer Science, UiT – The Arctic University of Norway

Abstract: Cloud providers offering Software as a Service (SaaS) are increasingly being trusted by customers to store their data and provide means for analysis. This data is often accompanied by strict privacy and security policies requiring rigid enforcement mechanisms, making curation and analysis non-trivial. Moreover, to offset the cost of hosting the potentially large amounts of data privately, SaaS companies even employ Infrastructure as a Service (IaaS) cloud providers not under the direct supervision of the administrative entity responsible for the data.

Intel Software Guard Extensions (SGX) is a trusted computing infrastructure potentially guaranteeing confidentiality, integrity and attestation of code and data running on third-party services. Fundamentally, SGX provides primitives to create secure segments of code and data residing in process memory, effectively reducing the trusted computing base to only include the CPU, the trusted code segments and Intel. This talk evaluates the feasibility of using trusted computing primitives to safeguard analysis of privacy-sensitive data on third-party cloud providers. Moreover, based on the performance characteristics of Intel SGX, we will present a prototype analytic framework targeting confidentiality, integrity and correctness of code and data running on an untrusted infrastructure.

Workshop closing

Bus transportation back to Tromsø (~15.00; arrival downtown before 17.00)