Technology Final Report
Secure/Resilient Systems and
Data Dissemination/Provenance
September 2017
Prepared for The Northrop Grumman Cyber Research Consortium
As part of IS Sector Investment Program
Prepared by Bharat Bhargava, CERIAS, Purdue University
Table of Contents
1 Executive Summary
1.1 Statement of Problem
1.2 Current State of Technology
1.3 Proposed Solution
1.4 Technical Activities, Progress, Findings and Accomplishments
1.5 Distinctive Attributes, Advantages and Discriminators
1.6 Tangible Assets Created by Project
1.7 Outreach Activities and Conferences
1.8 Intellectual Property Accomplishments
2 General Comments and Suggestions for Next Year
List of Figures

Figure 1 High-level view of proposed resiliency framework
Figure 2 Service acceptance test
Figure 3 View of space and time of MTD-based resiliency solution
Figure 4 Moving target defense application example
Figure 5 High-level resiliency framework architecture
Figure 6 System states of the framework
Figure 7 Data d1 leakage from Service X to Service Y
Figure 8 Data Sensitivity Probability Functions
Figure 9 Encrypted search over database of active bundles (by Leon Li, NG "WAXEDPRUNE" project)
Figure 10 Experiment Setup for Moving Target Defense (MTD)
Figure 11 EHR dissemination in cloud (created by Dr. Leon Li, NGC)
Figure 12 AB performance overhead with browser's crypto capabilities on/off
Figure 13 Encrypted Search over Encrypted Database
List of Tables

Table 1 Executive Summary
Table 2 Operations supported by different crypto systems
Table 3 Moving Target Defense (MTD) Measurements
Table 4 Encrypted Database of Active Bundles, Table 'EHR_DB'
1 Executive Summary
Title: Secure/Resilient Systems and Data Dissemination/Provenance
Author(s): Bharat Bhargava
Principal Investigator: Bharat Bhargava
1.1 Statement of Problem

The volume of information and real-time requirements have increased due to the advent of multiple input points such as emails, texts, voice, and tweets. These all flow into government agencies such as the US State Department for dissemination to many stakeholders. The security of classified information (cyber data, user data, attack event data) must be ensured so that it can be identified as classified (secret) and disseminated, based on access privileges, to the right user in a specific location on a specific device. For forensics/provenance, the identity of all who have accessed, updated, or disseminated the sensitive cyber data, including attack event data, must be determined. There is a need to build systems capable of collecting, analyzing, and reacting to dynamic cyber events across all domains, while also ensuring that cyber threats do not propagate across security domain boundaries and compromise the operation of the system.
Solutions that develop a science of cyber security applicable to all systems, infrastructure, and applications are needed. Current resilience schemes based on replication increase the number of ways an attacker can exploit or penetrate the systems. It is critical to design a vertical resiliency solution from the application layer down to the physical infrastructure, in which protection against attacks is integrated across all layers of the system (i.e., application, runtime, network) at all times, allowing the system to start secure, stay secure, and return secure+ (i.e., return with greater security than before) [13] after performing its function.
1.2 Current State of Technology
Current industry-standard cloud systems such as Amazon EC2 provide coarse-grain monitoring capabilities (e.g., CloudWatch) for various performance parameters of services deployed in the cloud. Although such monitors are useful for handling issues such as load distribution and elasticity, they do not provide information regarding potentially malicious activity in the domain. Log management and analysis tools such as Splunk [1], Graylog [2], and Kibana [3] provide capabilities to store, search, and analyze big data gathered from various types of logs on enterprise systems, enabling organizations to detect security threats through examination by system administrators. Such tools mostly require human intelligence for the detection of threats, and need to be complemented with automated analysis and accurate threat detection capabilities to respond quickly to possibly malicious activity in the enterprise and to provide increased resiliency through automation of response actions. In addition, Splunk is expensive.
There are well-established moving target defense (MTD) solutions designed to combat specific threats, but they are limited when exploits go beyond their boundaries. For instance, application-level redundancy and replication schemes prevent exploits that target the application code base, but fail against code injection attacks that target runtime execution (e.g., buffer and heap overflows) and the control flow of the application.
Instruction set randomization [51], address space randomization [4], runtime randomization [5], and system call randomization [6] have been used to effectively combat system-level (i.e., return-oriented/code injection) attacks. System-level diversification and randomization are considered mature and are tightly integrated into some operating systems. Most of these defensive security mechanisms (i.e., instruction/memory address randomizations) are effective for their targets; however, modern sophisticated attacks require defensive solutions that are deeply integrated into the architecture, from the application level down to the infrastructure, simultaneously and at all times.
Several general approaches have been proposed for controlling access to shared data and protecting its privacy. DataSafe is a software-hardware architecture that supports data confidentiality throughout the data lifecycle [7]. It is based on additional hardware and uses a trusted hypervisor to enforce policies, track data flow, and prevent data leakage. Applications running on the host are not required to be aware of DataSafe and can operate unmodified and access data transparently. Hosts without DataSafe can only access encrypted data, and the system is unable to track data disclosed to non-DataSafe hosts. The use of a special architecture limits the solution to well-known hosts that already have the required setup. It is not practical to assume that all hosts will have the required hardware and software components in a cross-domain service environment.

A privacy-preserving information brokering (PPIB) system has been proposed for secure information access and sharing via an overlay network of brokers, coordinators, and a central authority (CA) [8]. The approach does not consider the heterogeneity of components, such as different security levels of clients' browsers, different user authentication schemes, and trust levels of services. The use of a trusted third party (TTP) creates a single point of trust and failure.
Other solutions address secure data dissemination in untrusted environments. Pearson et al. present a case study of the EnCoRe project, which uses sticky policies to manage the privacy of shared data across different domains [9]. In the EnCoRe project, the sticky policies are enforced by a TTP and allow tracking of data dissemination, which makes the approach prone to TTP-related issues. The sticky policies are also vulnerable to attacks from malicious recipients.
1.3 Proposed Solution
We propose an approach for enterprise system and data resiliency that is capable of dynamically adapting to attack and failure conditions through performance/cost-aware process and data replication, data provenance tracking, and automated software-based monitoring and reconfiguration of cloud processes (see Figure 1). The main components of the proposed solution and the challenges involved in their implementation are described below.
Figure 1 High-level view of proposed resiliency framework
1.3.1 Software-Defined Agility & Adaptability
Adaptability to adverse situations and restoration of services are significant for high performance and security in a distributed environment. Changes in both service context and user context can affect service compositions, requiring dynamic reconfiguration. Changes in user context can result in updated priorities, such as trading accuracy for shorter response time in an emergency, as well as updated constraints, such as requiring the trust levels of all services in a composition to be higher than a particular threshold in a critical mission; changes in service context can result in failures requiring the restart of a whole service composition. Advances in virtualization have enabled rapid provisioning of resources, tools, and techniques to build agile systems that provide adaptability to changing runtime conditions. In this project, we will build upon our previous work in adaptive network computing [10] and end-to-end security in SOA [11], and on advances in software-defined networking (SDN), to create a dynamically reconfigurable processing environment that can incorporate a variety of cyber defense tools and techniques. Our enterprise resiliency solution is based on two main industry-standard components: Nova, the OpenStack [12] cloud management component, which provides virtual machines on demand; and Neutron, the Software Defined Networks (SDN) solution, which provides networking as a service and runs on top of OpenStack.
The solution that we developed for monitoring cloud processes and dynamic reconfiguration of service compositions, described in [10], involved a distributed set of monitors in every service domain for tracking service/domain-level performance and security parameters, and a central monitor to keep track of the health of various cloud services. Even though the solution enables dynamic reconfiguration of entire service compositions in the cloud, it requires replication, registration, and tracking of services at multiple sites, which could have performance and cost implications for the enterprise. To overcome these challenges, the proposed framework utilizes live monitoring of cloud resources to dynamically detect deviations from normal service behavior and integrity violations, and self-heals by reconfiguring service compositions through software-defined networking of automatically migrated service instances. A component of this software-defined agility and adaptability solution is live monitoring of services, as described below.
1.3.1.1 Live Monitoring
Cyber-resiliency is the ability of a system to continue degraded operations, self-heal, or deal with the present situation when attacked [13]. We may need to shut down less critical computations and communications and allow for weaker consistency, as long as the mission requirements are satisfied. For this we need to measure the assurance level (integrity/accuracy/trust) of the system from Quality of Service (QoS) parameters such as response time, throughput, packet loss, delays, consistency, acceptance test success, etc.
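As a minimal sketch of deriving such an assurance level, the snippet below normalizes each measured QoS parameter against its SLA target and combines them with mission-specific weights. The function name, weights, and targets are illustrative assumptions, not part of the report's framework.

```python
# Illustrative sketch (names and numbers are ours, not from the report):
# derive a single 0..1 assurance score from measured QoS parameters by
# normalizing each "lower is better" metric against its SLA target and
# combining the ratios with mission-specific weights.

def assurance_level(metrics, targets, weights):
    """Return a 0..1 assurance score; 1.0 means every QoS target is met."""
    score = 0.0
    for name, weight in weights.items():
        observed, target = metrics[name], targets[name]
        # Full credit when observed <= target, degrading proportionally after.
        ratio = min(1.0, target / observed) if observed > 0 else 1.0
        score += weight * ratio
    return score / sum(weights.values())

metrics = {"response_time_ms": 250, "packet_loss_pct": 0.5}
targets = {"response_time_ms": 200, "packet_loss_pct": 1.0}
weights = {"response_time_ms": 0.7, "packet_loss_pct": 0.3}
level = assurance_level(metrics, targets, weights)  # 0.7*0.8 + 0.3*1.0 = 0.86
```

A real deployment would also fold in security observations (acceptance test success, trust levels) and treat "higher is better" metrics such as throughput separately.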
To ensure the enforcement of SLAs and provide high security assurance in enterprise cloud computing, a generic monitoring framework needs to be developed. The challenges involved in effective monitoring and analysis of service/domain behavior include the following:

 Identification of significant metrics, such as response time, CPU usage, memory usage, etc., for service performance and behavior evaluation
 Development of models for identifying deviations from performance goals (e.g., achieving a total response time below a specific threshold) and security goals (e.g., having service trust levels above a certain threshold)
 Design and development of adaptable service configurations and live migration solutions for increased resilience and availability
Development of effective models for detection of anomalies in a service domain relies on careful selection of the performance and security parameters to be integrated into the models. Model parameters should be easy to obtain and representative of the performance and security characteristics of various services running on different platforms. We plan to investigate and utilize the following monitoring tools, which provide integration with OpenStack, in order to gather system usage/resiliency parameters in real time [14]:
1. Ceilometer [15]: Provides a framework to meter and collect infrastructure metrics such as CPU, network, and storage utilization. Alarms can be set to fire when a metric crosses a predefined threshold, and alarm information can be sent to external servers.
2. Monasca [16]: Provides a large framework for various aspects of monitoring, including alarms, statistics, and measurements for all OpenStack components. Tenants can define what to measure, what statistics to collect, how to trigger alarms, and the notification method.
3. Heat [17]: Provides an orchestration engine to launch multiple composite cloud applications based on templates in the form of text files that can be treated like code, enabling actions such as autoscaling based on alarms received from Ceilometer.
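The threshold-alarm pattern these tools share can be sketched as follows. The class and callback names are illustrative and not the actual Ceilometer or Monasca API; the point is that an alarm fires once on a threshold crossing and can notify an external reconfiguration component.

```python
# Minimal sketch of the Ceilometer-style alarm pattern described above:
# a metric stream is compared against a predefined threshold, and crossing
# it fires a callback (e.g., notifying an external server). Names here are
# illustrative, not the actual Ceilometer/Monasca API.

class ThresholdAlarm:
    def __init__(self, metric, threshold, on_alarm):
        self.metric, self.threshold, self.on_alarm = metric, threshold, on_alarm
        self.active = False

    def observe(self, value):
        crossed = value > self.threshold
        if crossed and not self.active:      # fire only on the transition
            self.active = True
            self.on_alarm(self.metric, value)
        elif not crossed:
            self.active = False              # re-arm once metric recovers

events = []
alarm = ThresholdAlarm("cpu_util", 80.0, lambda m, v: events.append((m, v)))
for sample in (40.0, 85.0, 90.0, 60.0):
    alarm.observe(sample)
# the alarm fires once, at the 85.0 sample, not again at 90.0
```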
As a further improvement for dynamic service orchestration and self-healing, we plan to investigate models based on a graceful degradation approach to service composition, which replace services that do not pass acceptance tests (see Figure 2), based on user-specified or context-based policies, with ones that are more likely to pass the tests at the expense of decreased performance.
Figure 2 Service acceptance test
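The graceful-degradation selection above can be sketched as trying candidate services in order of decreasing performance and keeping the first whose output passes the acceptance test. All names below are illustrative, not part of the actual framework.

```python
# Sketch of graceful degradation via acceptance tests: candidates are
# ordered fastest-first; the first result that passes the acceptance test
# wins, trading performance for a higher chance of passing. Illustrative
# names, not the report's implementation.

def compose(candidates, acceptance_test, request):
    """candidates: list of callables ordered by decreasing performance."""
    for service in candidates:
        result = service(request)
        if acceptance_test(result):
            return result
    raise RuntimeError("no candidate passed the acceptance test")

fast_but_flaky = lambda req: None                 # fails the acceptance test
slow_but_reliable = lambda req: {"answer": req * 2}
result = compose([fast_but_flaky, slow_but_reliable],
                 lambda r: r is not None, 21)     # falls back to the slow one
```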
1.3.2 Moving Target Defense for Resiliency/Self-healing
The traditional defensive security strategy for distributed systems is to prevent attackers from gaining control of the system using known techniques such as firewalls, redundancy, replication, and encryption. However, given sufficient time and resources, all these methods can be defeated, especially when dealing with sophisticated attacks from advanced adversaries that leverage zero-day exploits. This highlights the need for more resilient, agile, and adaptable solutions to protect systems. MTD is a component of the NGC project Cyber Resilient System [13]. Sunil Lingayat of NGC has taken interest and connected us with other NGC researchers working in Dayton.
Our proposed Moving Target Defense (MTD) [18, 19] attack-resilient virtualization-based framework is a defensive strategy that aims to reduce the need to continuously fight attacks by worsening the gain-loss balance perceived by attackers. The framework narrows a node's window of exposure to such attacks, which increases the cost of attacks on a system and lowers both the likelihood of success and the perceived benefit of compromising it. The reduction in the vulnerability window of nodes is mainly achieved through three steps:

1. Partitioning the runtime execution of nodes into time intervals
2. Allowing nodes to run only for a predefined lifespan (as low as a minute) on heterogeneous platforms (i.e., different OSs)
3. Proactively monitoring their runtime below the OS
The main idea of this approach is to allow nodes to run on a given computing platform (i.e., hardware, hypervisor, and OS) for a controlled period of time, chosen in such a manner that successful ongoing attacks become ineffective, as suggested in [20, 21, 22, 23]. We accomplish such control by allowing nodes to run only for a short period of time to complete n client requests on a given underlying computing platform, then vanish and appear on a different platform with different characteristics (i.e., guest OS, host OS, hypervisor, hardware, etc.). We refer to this randomization and diversification technique of vanishing a node so that it appears on another platform as reincarnation.
The proposed framework introduces resiliency and adaptability to systems. Resilience has two main components: (1) continuing operation and (2) fighting through compromise [13]. The MTD framework takes both components into consideration, since it transforms systems so that they can adapt and self-heal when ongoing attacks are detected, which guarantees operational continuity. The initial target of the framework is to prevent successful attacks by establishing short lifespans for nodes/services, reducing the probability of attackers taking control. In case an attack occurs within the lifespan of a node, the proactive monitoring system triggers a reincarnation of the node.
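The lifespan-plus-introspection trigger can be sketched as a small scheduler: a node runs on one platform until its lifespan expires or introspection flags it, then reappears on a different platform. All names are ours; the actual framework sits on OpenStack and libvmi.

```python
# Illustrative reincarnation scheduler for the MTD scheme above: a node runs
# on a platform for a bounded lifespan (or until introspection marks it
# dirty), then "vanishes" and reappears on a different platform. Platform
# names, tick size, and the bounded demo loop are illustrative assumptions.

import random

PLATFORMS = ["kvm/linux", "xen/linux", "kvm/freebsd"]

def next_platform(current):
    """Diversify: never reincarnate onto the same platform."""
    return random.choice([p for p in PLATFORMS if p != current])

def run_node(lifespan_s, introspect, clock):
    platform = random.choice(PLATFORMS)
    started = clock()
    history = [platform]
    while len(history) < 4:                    # bounded demo loop
        now = clock()
        if now - started >= lifespan_s or introspect() == "dirty":
            platform = next_platform(platform) # vanish, reappear elsewhere
            started = clock()
            history.append(platform)
    return history

ticks = iter(range(1000))
clock = lambda: next(ticks) * 30               # one observation every 30 s
history = run_node(60, lambda: "clean", clock)
# with clean introspection, reincarnation happens purely on lifespan expiry
```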
The attack model considers an adversary taking control of a node undetected by traditional defensive mechanisms, a valid assumption in the face of novel attacks. The adversary gains high privileges on the system and is able to alter all aspects of the applications. Traditionally, the advantage of the adversary in this case is the unbounded time and space available to compromise and disrupt the reliability of the system, especially when it is replicated (i.e., colluding). The fundamental premise of the proposed framework is to eliminate the time and space advantage of adversaries and create agility to avoid attacks that can defeat system objectives by extending the cloud framework. We assume the cloud management software stack (i.e., the framework) and the virtual introspection libraries are secure.
1.3.2.1 Resiliency Framework Design
The criticality of diversity as a defensive strategy, in addition to replication/redundancy, was first proposed in [24]. Diversity and randomization allow the system defender to deceive adversaries by continuously shifting the attack surface of the system. We introduce a unified, generic MTD framework designed to move simultaneously in space (i.e., across platforms) and in time (i.e., in time intervals as low as a minute). Unlike the state-of-the-art singular MTD solution approaches [25, 26, 27, 28, 29, 30], we view the system space as a multidimensional space where we apply MTD on all layers of the space (application, OS, network) in short time intervals, while remaining aware of the status of the rest of the nodes in the system.

Figure 3 illustrates how the MTD framework works. The y-axis depicts the view of the space (e.g., application, OS, network) and the x-axis the runtime (i.e., elapsed time). The figure compares traditional replicated systems without any diversification and randomization technique, the state-of-the-art systems [25, 26, 27, 28, 29, 30] with diversification and randomization techniques applied to certain layers of the infrastructure (application, OS, or network), and the proposed solution, which applies MTD to all layers.
Figure 3 View of space and time of MTD-based resiliency solution
As illustrated in Figure 3.c, nodes/services that are not reincarnated in a particular time interval are marked with the result of an observation (e.g., introspection) of either Clean (C) or Dirty (D) (i.e., not compromised/compromised). To illustrate, in the third reincarnation round with replica n, we detect replica 1 to be clean (marked with C) and replica 2 to be dirty, as shown by the D in that time-interval entry. We reincarnate the node whose entry shows D ahead of the node scheduled for the next time interval.
Two important factors need to be considered in the design of this framework: the lifespan of nodes or virtual machines, and the migration technique used in the reincarnation. Figure 4 shows a possible scenario in which virtual machines running on a platform become IDLE when an attack occurs and is detected. When and how to reincarnate nodes are our main research questions.

Figure 4 Moving target defense application example
Long lifespans increase the probability of success of an ongoing attack, while overly short ones impact the performance of the system. Novel ways to determine when to vanish a node and run the replica in a new VM need to be developed. In [23], VMs are reincarnated at fixed periods chosen using round-robin or a randomized selection mechanism. We propose the implementation of a more adaptable solution, which uses Virtual Machine Introspection (VMI) to persistently monitor the communication between virtual requests and available physical resources, and switches the VM when anomalous behaviors are observed.
The other crucial factor in our design is the live migration technique used for virtual machine reincarnation. Migrating operating system instances across different platforms has traditionally been used to facilitate fault management, load balancing, and low-level system maintenance [31]. Several techniques have been proposed to carry out the majority of the migration while the OS continues to run, achieving acceptable performance with minimal service downtime. We propose to integrate some of these techniques [31, 32, 33] in a clustered environment into our MTD solution to guarantee adaptability and agility in our system. When virtual machines are running live services, it is important that the reincarnation occurs in such a manner that both downtime and total transfer time are minimal. The downtime refers to the time when no service is available during the transition. The total transfer time refers to the time it takes to complete the transition [31]. Our main idea is to continue running the service in the source VM until the destination VM is ready to offer the service independently. In this process, there will be some time during which part of the state (the least changeable state information) is copied to the destination VM while the source VM is still running. At some point, the source VM is stopped to copy the rest of the information (the most changeable state information) to the destination VM, which takes control after the information is copied. No service is available while the source VM is stopped and the copying process has not yet completed; this period of time defines the downtime.
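The downtime/transfer-time tradeoff of this two-phase copy can be sketched with a back-of-the-envelope model: bulk-copy memory while the source keeps running, then stop and copy only the pages dirtied in the meantime. The page counts and rates below are made-up numbers for illustration.

```python
# Sketch of one pre-copy round as described above: phase 1 copies all memory
# while the source VM still serves requests (no downtime); phase 2 stops the
# source and copies only the pages dirtied during phase 1 (this is the
# downtime). Rates are pages per second; all numbers are illustrative.

def precopy_migration(total_pages, dirty_rate, copy_rate):
    """Return (total_transfer_time, downtime) in seconds for one round."""
    phase1 = total_pages / copy_rate           # source still running
    dirtied = dirty_rate * phase1              # pages touched during phase 1
    downtime = dirtied / copy_rate             # stop-and-copy of dirty pages
    return phase1 + downtime, downtime

transfer, downtime = precopy_migration(
    total_pages=100_000, dirty_rate=500, copy_rate=10_000)
# downtime covers only the dirtied pages, a small fraction of the total
```

Real pre-copy migration iterates the dirty-page phase several times until the dirty set is small enough, then does the final stop-and-copy [31].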
1.3.2.2 Resiliency Framework Infrastructure
Our framework will be built on top of the OpenStack cloud framework [12], a widely adopted open-source cloud management software stack. Figure 5 shows the high-level architecture of our framework on the top right and the OpenStack cloud framework on the bottom left.
Figure 5 High-level resiliency framework architecture
In the cloud framework, starting from the infrastructure, the bottom layer is the hardware. Each hardware node has a host OS, a hypervisor (KVM/Xen) to virtualize the hardware for the guest VMs on top of it, and the cloud software stack framework, OpenStack [12] in our case. The vertical bars are some of the OpenStack framework implementation components: Nova, Neutron, Horizon, and Glance. In addition, the libvmi library for virtual introspection interfaces with the libvirt library, which is used by the hypervisor for virtualization. This library allows us to intercept resource-mapping requests (i.e., memory) from the VM to the physically available resources in order to detect anomalous behavior even in the event of VM/OS compromise.
We introduce two abstraction layers: a high-level System State (top) and the Application Runtime (bottom), dubbed the time-interval runtime. To illustrate the system state, we consider Desired as the desired system state at all times, and Undesired as the state we would like to avoid (i.e., the turbulence, compromised, failed, or exit system states). The driving engine of these two high-level states is the set of indirect outputs of the time-interval runtime, depicted as the dotted arrows.
The Application Runtime defines the time an application runs in a VM. Our framework transforms traditional services, designed to be protected for their entire runtime (as shown in the guest VMs of the cloud framework), into services that deal with attacks in time intervals, as depicted in Figure 5 (as Time Interval Runtime). Such a transformation is achieved simply by allowing the applications to run on heterogeneous OSs and variable underlying computing platforms (i.e., hardware and hypervisors), thereby creating mechanically generated system instance(s) diversified in time and space. The Application Runtime can vary depending on the detection of anomalous behaviors.

The System State and the Application Runtime are two abstraction layers that operate in synchrony. At the application layer, we refresh and map one or more Apps/VMs (App1 ... Appn) to different platforms (Hardware1 ... HWn) in pre-specified time intervals, referred to as the time-interval runtime. To gain a holistic view of the high-level system state, we continuously re-evaluate the system state (with libvmi, depicted by the horizontal blue arrow) at the end of each interval to determine the current state of the system in that specific time interval.
The system state is the state of the system at any given time. State changes are dictated by the application runtime status. For instance, if the application fails or crashes, then the system is in the Failed state. Similarly, the system is in the Compromised state when the attacker succeeds and remains undetected. These two states, Failed and Compromised, fall under the Undesired category. Figure 6 shows the possible states, where TIRE (Time Interval Runtime Execution) represents the observation of the current state at the end of the runtime, D the Desired state, C the Compromised state, F the Failed state, and E the Exit state.
Figure 6 System states of the framework
The key objective of the framework is to start the system in the Desired state and stay in that state as much as possible. This technique implements resiliency in the three phases of the system lifecycle: "Start Secure, Stay Secure, Return Secure" [13]. In the event that the system transitions into an Undesired state, a valid assumption in cyberspace, the system bounces back seamlessly into the Desired state, even in the event of an OS compromise. For this, applications run in a specific VM for a pre-specified time and are then moved to a new one.
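The state machine of Figure 6 can be modeled compactly: at the end of each TIRE, the observation maps the system into D, C, F, or E, and any Undesired state bounces back to Desired via reincarnation. The observation labels ("clean", "dirty", "crash", "shutdown") are our own shorthand for the figure's transitions.

```python
# Minimal model of the Figure 6 state machine: at the end of each Time
# Interval Runtime Execution (TIRE) the observation maps the system into
# Desired (D), Compromised (C), Failed (F), or Exit (E); a clean observation
# after an Undesired state models the bounce back to Desired through
# reincarnation. Observation labels are our shorthand, not from the report.

DESIRED, COMPROMISED, FAILED, EXIT = "D", "C", "F", "E"
UNDESIRED = {COMPROMISED, FAILED}

def step(state, tire_observation):
    """One transition per time interval, driven by the TIRE observation."""
    if state == EXIT:                          # Exit is terminal
        return EXIT
    if tire_observation == "shutdown":
        return EXIT
    if tire_observation == "crash":
        return FAILED
    if tire_observation == "dirty":
        return COMPROMISED
    return DESIRED                             # clean, or recovered Undesired

trace = [DESIRED]
for obs in ("clean", "dirty", "clean", "crash", "clean"):
    trace.append(step(trace[-1], obs))
# trace: D, D, C, D, F, D  (each Undesired state bounces back to Desired)
```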
1.3.2.3 Live Reincarnation of Virtual Machines
Reincarnation is a technique for enhancing the resiliency of a system by terminating a running node and starting a fresh one in its place on (possibly) a different platform/OS, which continues to perform the tasks of its predecessor from the point at which it was dropped off the network and reconnected. One key question is determining when to reincarnate a machine. One approach is setting a fixed period of time for each machine and reincarnating it after that lifespan; in this approach, machines to be reincarnated are selected either round-robin or randomly. However, attacks can occur within the lifespan of each machine, which makes live monitoring mechanisms a crucial element. Whether an attack is in progress at the beginning of the reincarnation determines how soon the source VM must be stopped to keep the system resilient. When no threats are present, both the source VM and the destination VM can participate in the reincarnation process: the source VM can continue running until the destination VM is ready to take over. On the contrary, if an attack is detected, the source VM should be stopped immediately, and the reincarnation must consist simply of copying the state of the source to the destination VM, which continues the tasks after the copy. The latter case presents a higher downtime than the former. Our target is to add adaptability to our system so that we can distinguish between these two cases.
In a clustered environment, the reincarnation of a machine needs to concentrate on physical resources such as disk, memory, and network. Disk resources are assumed to be shared through Network-Attached Storage (NAS), so not much needs to be done in this regard. Memory and network are the targets of our study. There is a tradeoff we need to consider in managing these resources: stopping the source virtual machine at once to copy its entire address space consumes a lot of network resources, which negatively impacts performance. The migration decision should involve consideration of various parameters, including the number of modified pages that need to be migrated, the available network bandwidth, the hardware configuration of the source and destination, the load on the source and destination, etc. We previously proposed a model based on mobile agent-based services [52] that migrate to different platforms in the cloud to optimize the running time of mobile applications under varying contexts. By utilizing the statefulness of mobile agents, the model enables resuming processes after migration to a different platform. We plan to build on our knowledge of adaptable systems and performance optimization to design a model for live migration of services and restoration on the destination platform that enables high performance and continuous availability. We will focus on the following aspects:
 Memory Management: We are interested in developing an adaptable solution that copies the memory state to the destination VM while still running the source VM when no attacks are detected. A daemon will be in charge of this task. Initially, the entire memory space is copied to the destination; later, just the dirty pages are copied when the destination VM is ready to take over. This guarantees a much smaller downtime when machines are reincarnated because their lifespan expires (no attacks detected).
 Network Management: The new virtual machine must be set up with the same IP address as its predecessor. All other machines in the cluster must be made aware of this change, which can be achieved by sending ARP replies to the rest of the machines.
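The ARP announcement mentioned above is typically a gratuitous ARP reply broadcast by the new VM so that peers rebind the preserved IP to the new MAC. The sketch below only builds such a frame with the standard library; actually sending it requires a raw socket and privileges. The MAC/IP values are made up.

```python
# Sketch of the ARP announcement above: after reincarnation, the new VM
# broadcasts a gratuitous ARP reply (opcode 2, sender IP == target IP) so
# cluster peers update the MAC bound to the preserved IP. We only build the
# 42-byte Ethernet+ARP frame here; sending needs a raw socket and root.

import struct
import socket

def gratuitous_arp(mac, ip):
    """Build an Ethernet frame carrying a gratuitous ARP reply."""
    mac_b = bytes.fromhex(mac.replace(":", ""))
    ip_b = socket.inet_aton(ip)
    broadcast = b"\xff" * 6
    eth = broadcast + mac_b + struct.pack("!H", 0x0806)   # EtherType = ARP
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)       # Ethernet/IPv4, op=2
    arp += mac_b + ip_b + broadcast + ip_b                # sender IP == target IP
    return eth + arp

frame = gratuitous_arp("02:00:00:00:00:01", "10.0.0.5")
```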
1.3.3 Data Provenance and Leakage Detection in Untrusted Cloud
Monitoring data provenance and adaptability in the data dissemination process is crucial for resilience against data leakage and privacy violations in untrusted cloud environments. Our solution ensures that each service can only access data for which it is authorized. In addition, our approach detects several classes of data leakage from authorized services to unauthorized ones. Context-sensitive evaporation of data in proportion to the distance from the data owner can make illegitimate disclosures less probable, as distant data guardians tend to be less trusted.
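One way to picture distance-proportional evaporation is that the further (less trusted) the guardian, the fewer fields of a record survive disclosure. The record, sensitivity scores, and cutoff rule below are illustrative assumptions, not the report's actual policy model.

```python
# Sketch of context-sensitive evaporation: fields of a record carry
# sensitivity scores, and a field is disclosed only if its sensitivity plus
# the guardian's distance from the data owner stays under a cutoff. The
# record, scores, and cutoff are made-up illustrations.

RECORD = {"name": ("Alice", 3), "diagnosis": ("flu", 2), "zip": ("47907", 1)}

def evaporate(record, distance, cutoff=4):
    """Disclose only fields whose sensitivity survives at this distance."""
    return {field: value
            for field, (value, sensitivity) in record.items()
            if sensitivity + distance <= cutoff}

near = evaporate(RECORD, 1)   # trusted, close guardian: everything survives
far = evaporate(RECORD, 3)    # distant guardian: only the least sensitive field
```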
We extended our "WaxedPrune" prototype, demonstrated at Northrop Grumman Tech Fest 2016, to support automatic derivation of data provenance through interaction of the Active Bundle (AB) engine [34] with the central cloud monitor responsible for automated monitoring and reconfiguration of cloud enterprise processes. The extension [50] supports detection of several types of data leakage, in addition to enforcing fine-grained security policies. Some of the ideas were proposed by Leon Li (NG) during our weekly meetings.
1.3.3.1 Data Leakage Detection
In the current implementation of the secure data dissemination mechanism, data is transferred among services in encrypted form by means of Active Bundles (AB) [40, 41]. This privacy-preserving mechanism protects against tampering, eavesdropping, spoofing, and man-in-the-middle attacks. Active Bundles ensure that a party can access only those portions of data it is authorized for. However, an authorized service may leak sensitive data to an unauthorized one. A leakage scenario is illustrated in Figure 7: Service X, which is authorized to read data d1, can leak this data item behind the scenes to an unauthorized service, e.g., Service Y. This leakage needs to be detected and reported to the data owner. Let us denote the data D = {d1, d2, ..., dn} and the set of policies P = {p1, p2, ..., pk}. Data leakage can occur in two forms:

1) Encrypted data is leaked, i.e., the whole AB (the whole Electronic Health Record)

Active Bundle data can only be extracted by the Active Bundle's kernel after authentication is passed and the access control policies are evaluated by the Policy Enforcement Engine. If the AB is sent by Service X to an unauthorized Service Y behind the scenes, then Service Y will not be able to decrypt data it is not authorized for. When Y tries to decrypt d1, the AB kernel will first query the CM in order to check whether d1 is supposed to be at Y's side. If not, a data leakage alert is raised.
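The CM gate before decryption can be sketched as follows. The CM is modeled as an in-memory authorization table and the "decryption" is a stand-in; service and item names are illustrative, following the Figure 7 scenario.

```python
# Sketch of the leakage check described above: before the AB kernel decrypts
# item d1 for a requesting service, it asks the Central Monitor (CM) whether
# that item is supposed to reside at that service. The CM is modeled as an
# in-memory table, and string reversal stands in for real decryption.

AUTHORIZED = {("service_X", "d1"), ("service_X", "d2"), ("service_Y", "d2")}
alerts = []

def cm_check(service, item):
    """CM lookup: is this data item supposed to be at this service?"""
    return (service, item) in AUTHORIZED

def ab_decrypt(service, item, ciphertext):
    """AB kernel gate: query the CM first, raise a leakage alert on mismatch."""
    if not cm_check(service, item):
        alerts.append(f"leakage alert: {item} found at {service}")
        return None
    return ciphertext[::-1]                    # stand-in for real decryption

ab_decrypt("service_X", "d1", "txetrehpic")    # authorized: decrypts
ab_decrypt("service_Y", "d1", "txetrehpic")    # leaked AB: alert raised
```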
In addition to the CM enforcing data obligations, there is one more embedded protection measure, which relies on digital watermarks embedded into Active Bundles that can be verified by web crawlers. If an attacker uploads the illegal content to publicly available web hosting that can be scanned by web crawlers, then a web crawler can verify the watermark and detect the copyright violation. However, this approach is limited to cases where the attacker uploads unauthorized content to a publicly available folder in the network.
Figure 7 Data d1 leakage from Service X to Service Y
2) Plaintext data is leaked
If Service X, which is authorized to see d1, takes a picture of a screen and sends it via email to Service Y, which is not authorized to see d1, then our solution relies on visual watermarks embedded into the data. A visual watermark that remains visible on a captured image can be used in court to prove data ownership.
The key challenge here is that it is hard to come up with an approach that covers all possible cases of data leakage. Embedded watermarks can help detect leakage of a document, but if the watermarks are removed, the protection is gone. For instance, a service may get access to a credit card number, write it down on a piece of paper in order to remember it, and then leak it via email to an unauthorized party. In this case, there are several ways to mitigate the problem:
 Layered approach: don't give all the data to the requester at once
o First give part of the data (incomplete, less sensitive)
o Watch how the data is used and monitor the trust level of the using service
o If the trust level is sufficient, give the next portion of data
 Raise the level of data classification to prevent repetition of the leakage
 Leak data intentionally to create uncertainty and lower the data's value
 Monitor network messages
o Check whether they contain, e.g., a credit card number that satisfies a specific pattern and can be validated using regular expressions
 After a leakage is detected, make the system stronger against similar attacks
o Separate the compromised role into two: e.g., suspicious_role and benign_role
o Send new certificates for the benign role to all benign users
o Create a new AB with new policies restricting access for suspicious_role (e.g., all doctors from the same hospital as the malicious one)
o Increase the sensitivity level of leaked data items, e.g., the diagnosis
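The network-message check above can be sketched by combining a card-number pattern with the Luhn checksum, which cuts false positives from the regex alone. The pattern below covers 13-16 digit numbers with optional separators; a real monitor would need broader patterns and surrounding context.

```python
# Sketch of the network-message monitoring bullet above: flag strings that
# both match a 13-16 digit card-number pattern (optional space/dash
# separators) and pass the Luhn checksum. Pattern and sample message are
# illustrative; the card number is a well-known test value, not real.

import re

CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")

def luhn_ok(digits):
    """Standard Luhn checksum over a string of digits."""
    total, parity = 0, len(digits) % 2
    for i, ch in enumerate(digits):
        d = int(ch)
        if i % 2 == parity:                    # double every second digit
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(message):
    hits = []
    for match in CARD_RE.finditer(message):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):                    # drop regex false positives
            hits.append(digits)
    return hits

leaks = find_card_numbers("order ref 1234-5678, card 4539 1488 0343 6467")
# only the Luhn-valid 16-digit number is flagged; the order ref is ignored
```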
Data Leakage Damage Assessment

After a data leakage is detected, the damage is assessed based on:
• To whom the data was leaked (a service with a low trust level vs. a service with a high trust level)
• The sensitivity (classification) of the leaked data (classified vs. unclassified)
• When the leaked data was received
• Whether other sensitive data can be derived from the leaked data (e.g., a diagnosis can be derived from leaked