TrustedDB: A Trusted Hardware based Database with Privacy and Data Confidentiality
ABSTRACT:
Traditionally, as soon as confidentiality becomes a
concern, data is encrypted before outsourcing to a service provider. Any
software-based cryptographic constructs then deployed, for server-side query
processing on the encrypted data, inherently limit query expressiveness. Here,
we introduce TrustedDB, an outsourced database prototype that allows clients to
execute SQL queries with privacy and under regulatory compliance constraints by
leveraging server-hosted, tamper-proof trusted hardware in critical query processing
stages, thereby removing any limitations on the type of supported queries.
Despite the cost overhead and performance limitations of trusted hardware, we
show that the costs per query are orders of magnitude lower than any (existing
or) potential future software-only mechanisms. TrustedDB is built and runs on
actual hardware and its performance and costs are evaluated here.
EXISTING SYSTEM:
Existing research addresses several security aspects of outsourced data,
including access privacy and searches on encrypted data. In most of
these efforts data is encrypted before outsourcing. Once encrypted, however,
inherent limitations in the types of primitive operations that can be performed
on encrypted data lead to fundamental expressiveness and practicality
constraints. Recent theoretical cryptography results provide hope by proving
the existence of universal homomorphisms, i.e., encryption mechanisms that
allow computation of arbitrary functions without decrypting the inputs.
Unfortunately, actual instances of such mechanisms seem to be decades away
from being practical.
DISADVANTAGES OF EXISTING SYSTEM:
Trusted hardware is generally regarded as impractical because of its
performance limitations and higher acquisition costs. As a result, with very
few exceptions, existing efforts have stopped short of proposing or building
full-fledged database processing engines.
This view overlooks that computation inside secure processors is orders of
magnitude cheaper than any equivalent cryptographic operation performed on the
provider's unsecured server hardware, despite the overall greater acquisition
cost of secure hardware.
PROPOSED SYSTEM:
We posit that a full-fledged, privacy-enabling secure database leveraging
server-side trusted hardware can be built and run at a fraction of the cost of
any (existing or future) cryptography-enabled private data processing on common
server hardware. We validate this by designing and building TrustedDB, a SQL
database processing engine that makes use of tamper-proof cryptographic
coprocessors, such as the IBM 4764, in close proximity to the outsourced data.
Tamper-resistant designs, however, are significantly constrained in both
computational ability and memory capacity, which makes implementing fully
featured database solutions using secure coprocessors (SCPUs) very challenging.
TrustedDB overcomes this by utilizing common unsecured server resources to the
maximum extent possible. For example, TrustedDB enables the SCPU to
transparently access external storage while preserving data confidentiality
with on-the-fly encryption. This eliminates the limitations on the size of
databases that can be supported. Moreover, client queries are pre-processed to
identify sensitive components to be run inside the SCPU. Non-sensitive
operations are off-loaded to the untrusted host server. This greatly improves
performance and reduces the cost of transactions.
ADVANTAGES OF PROPOSED SYSTEM:
(i) The introduction of new cost models and insights that explain and quantify
the advantages of deploying trusted hardware for data processing,
(ii) The design, development, and evaluation of TrustedDB, a trusted hardware
based relational database with full data confidentiality, and
(iii) Detailed query optimization techniques in a trusted hardware-based query
execution model.
MODULES:
1. Query Parsing and Execution
2. Query Optimization Process
3. System Catalog
4. Analysis of Basic Query Operations
MODULES DESCRIPTION:
Query Parsing and Execution:
In the first stage, a client defines a database schema and partially populates
it. Sensitive attributes are marked using the SENSITIVE keyword, which the
client layer transparently processes by encrypting the corresponding attributes:
CREATE TABLE customer (ID integer primary key, Name char (72) SENSITIVE, Address char (120) SENSITIVE);
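As an illustration of how a client layer might recognize the marked attributes, here is a minimal Python sketch. It is not TrustedDB's implementation: the function name `sensitive_columns` and the simple parsing approach are assumptions made for this example, and the encryption step itself is left out.

```python
import re

def sensitive_columns(create_stmt: str) -> list[str]:
    """Return the names of columns marked SENSITIVE in a CREATE TABLE statement.

    Hypothetical helper for illustration only; a real client layer would use
    a proper SQL parser and would then encrypt the values of these columns.
    """
    # Take the column list between the first '(' and the last ')'.
    body = create_stmt[create_stmt.index("(") + 1 : create_stmt.rindex(")")]
    columns = []
    # Split on commas that are not inside parentheses such as "char (72)".
    for definition in re.split(r",(?![^()]*\))", body):
        tokens = definition.split()
        if tokens and tokens[-1].upper() == "SENSITIVE":
            columns.append(tokens[0])
    return columns

stmt = ("CREATE TABLE customer (ID integer primary key, "
        "Name char (72) SENSITIVE, Address char (120) SENSITIVE);")
print(sensitive_columns(stmt))  # ['Name', 'Address']
```

With the example schema, only Name and Address would be encrypted before leaving the client, while ID stays in plaintext on the host server.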
(1) Later, a client sends a query request to the host server through a standard
SQL interface. The query is transparently encrypted at the client site using
the public key of the SCPU, so the host server cannot decrypt it.
(2) The host server forwards the encrypted query to the Request Handler inside
the SCPU.
(3) The Request Handler decrypts the query and forwards it to the Query Parser.
The query is parsed, generating a set of plans. Each plan is constructed by
rewriting the original client query into a set of sub-queries, and each
sub-query in the plan is identified as either public or private according to
the classification of its target data set.
(4) The Query Optimizer then estimates the execution cost of each plan and
selects the best plan (the one with the least cost) for execution, forwarding
it to the Query Dispatcher.
(5) The Query Dispatcher forwards the public queries to the host server and the
private queries to the SCPU database engine while handling dependencies. The
net result is that the maximum possible work is run on the host server's cheap
cycles.
(6) The final query result is assembled, encrypted, digitally signed by the
SCPU Query Dispatcher, and sent to the client.
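The public/private classification and dispatch in steps (3) and (5) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `SubQuery` type, the `classify` and `dispatch` functions, and the sample sub-queries are hypothetical, and dependency handling between sub-queries is omitted.

```python
from dataclasses import dataclass

# Assumed set of sensitive attributes, taken from the example customer schema.
SENSITIVE = frozenset({"Name", "Address"})

@dataclass(frozen=True)
class SubQuery:
    text: str              # SQL text of the sub-query
    attributes: frozenset  # attributes the sub-query reads

def classify(sub: SubQuery) -> str:
    """A sub-query is private if it touches any sensitive attribute."""
    return "private" if sub.attributes & SENSITIVE else "public"

def dispatch(plan: list[SubQuery]) -> dict[str, list[str]]:
    """Route public sub-queries to the host and private ones to the SCPU."""
    routed = {"host": [], "scpu": []}
    for sub in plan:
        target = "scpu" if classify(sub) == "private" else "host"
        routed[target].append(sub.text)
    return routed

plan = [
    SubQuery("SELECT ID FROM customer WHERE ID < 100", frozenset({"ID"})),
    SubQuery("SELECT ID FROM customer WHERE Name = 'Alice'",
             frozenset({"ID", "Name"})),
]
print(dispatch(plan))
```

In this sketch the first sub-query reads only the plaintext ID attribute and goes to the host, while the second needs the decrypted Name attribute and therefore runs inside the SCPU.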
Query Optimization Process:
At a high level, query optimization in a database system works as follows.
(i) The Query Plan Generator constructs possibly multiple plans for the client
query.
(ii) For each constructed plan, the Query Cost Estimator computes an estimate
of the execution cost of that plan.
(iii) The best plan, i.e., the one with the least cost, is then selected and
passed on to the Query Plan Interpreter for execution.
The query optimization process in TrustedDB works similarly, with key
differences in the Query Cost Estimator due to the logical partitioning of data
into public and private attributes.
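The selection step above amounts to taking the minimum over estimated plan costs. The Python sketch below illustrates this, assuming a plan is a list of (location, tuple count) steps. The per-tuple cost figures are invented for the example and are not measurements from TrustedDB; they only reflect the point that SCPU cycles are more expensive than host cycles.

```python
# Hypothetical relative cost of processing one tuple at each location.
# The SCPU is assumed to be slower than the host; the values are illustrative.
COST_PER_TUPLE = {"host": 1.0, "scpu": 10.0}

def plan_cost(plan: list[tuple[str, int]]) -> float:
    """Estimated plan cost: the sum of the costs of its individual steps."""
    return sum(COST_PER_TUPLE[location] * tuples for location, tuples in plan)

def best_plan(plans: list[list[tuple[str, int]]]) -> list[tuple[str, int]]:
    """Select the plan with the least estimated cost."""
    return min(plans, key=plan_cost)

# Plan A processes every tuple inside the SCPU.
plan_a = [("scpu", 1000)]
# Plan B filters on the host first, then sends 50 tuples to the SCPU.
plan_b = [("host", 1000), ("scpu", 50)]

print(plan_cost(plan_a))            # 10000.0
print(plan_cost(plan_b))            # 1500.0
print(best_plan([plan_a, plan_b]))  # [('host', 1000), ('scpu', 50)]
```

Under these assumed costs, the plan that pushes the initial filtering to the host wins, which matches the goal of running as much work as possible on the host server.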
System Catalog:
Any query plan is composed of multiple individual execution steps. To estimate
the cost of the entire plan, it is essential to estimate the cost of the
individual steps and aggregate them. To estimate these costs, the Query Cost
Estimator needs access to some key information, for example the availability of
an index or the number of distinct values of an attribute. This information is
collected and stored in the System Catalog. Most DBMSs available today have
some form of periodically updated System Catalog.
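To make the role of the catalog concrete, here is a small Python sketch of the kind of statistics a cost estimator might consult. The structure (`ColumnStats`, `TableStats`), the sample numbers, and the uniform-distribution estimate are textbook-style assumptions for illustration, not the catalog layout used by TrustedDB.

```python
from dataclasses import dataclass, field

@dataclass
class ColumnStats:
    distinct_values: int     # number of distinct values of the attribute
    has_index: bool = False  # whether an index is available

@dataclass
class TableStats:
    row_count: int
    columns: dict[str, ColumnStats] = field(default_factory=dict)

# Hypothetical catalog contents for the example customer table.
catalog = {
    "customer": TableStats(
        row_count=10_000,
        columns={
            "ID": ColumnStats(distinct_values=10_000, has_index=True),
            "Name": ColumnStats(distinct_values=2_500),
        },
    )
}

def estimate_equality_rows(table: str, column: str) -> float:
    """Rows expected from `column = constant`, assuming uniform values."""
    stats = catalog[table]
    return stats.row_count / stats.columns[column].distinct_values

def access_method(table: str, column: str) -> str:
    """Pick an access method based on index availability."""
    return "index lookup" if catalog[table].columns[column].has_index else "full scan"

print(estimate_equality_rows("customer", "Name"))  # 4.0
print(access_method("customer", "ID"))             # index lookup
print(access_method("customer", "Name"))           # full scan
```

Estimates like these feed the per-step costs that the Query Cost Estimator aggregates for each plan.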
Analysis of Basic Query Operations:
The cost of a plan is
the aggregate of the cost of the steps that comprise it. In this section we
present how execution times for a certain set of basic query plan steps are
estimated.
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
· System : Pentium IV 2.4 GHz.
· Hard Disk : 40 GB.
· Monitor : 15 inch VGA Colour.
· Mouse : Logitech Mouse.
· RAM : 512 MB.
· Keyboard : Standard Keyboard.
SOFTWARE REQUIREMENTS:
· Operating System : Windows XP.
· Coding Language : ASP.NET, C#.Net.
· Database : SQL Server 2005.
REFERENCE:
Sumeet Bajaj, Radu Sion, "TrustedDB: A Trusted Hardware based Database with
Privacy and Data Confidentiality", IEEE Transactions on Knowledge and Data
Engineering, Vol. 26, No. 3, March 2014.