PRIVACY POLICY INFERENCE OF USER-UPLOADED IMAGES ON CONTENT SHARING SITES
Abstract: With the increasing volume of images users share through social
sites, maintaining privacy has become a major problem, as demonstrated by a
recent wave of publicized incidents where users inadvertently shared personal
information. In light of these incidents, the need for tools to help users
control access to their shared content is apparent. Toward addressing this
need, we propose an Adaptive
Privacy Policy Prediction (A3P) system to help users compose privacy settings
for their images. We examine the role of social context, image content, and
metadata as possible indicators of users’ privacy preferences. We propose a
two-level framework which, according to the user’s available history on the
site, determines the best available privacy policy for the images the user uploads.
Our solution relies on an image classification framework for image categories
which may be associated with similar policies, and on a policy prediction
algorithm to automatically generate a policy for each newly uploaded image,
also taking users’ social features into account. Over time, the generated
policies will follow the evolution of users’ privacy attitudes. We provide the results of
our extensive evaluation over 5,000 policies, which demonstrate the
effectiveness of our system, with prediction accuracies over 90 percent.
EXISTING SYSTEM:
Several recent works have studied how to automate the task of configuring
privacy settings. Bonneau et al. proposed the concept of privacy suites, which
recommend to users a suite of privacy settings that “expert” users or other
trusted friends have already set, so that normal users can either directly
choose a setting or only need to make minor modifications. Similarly, Danezis
proposed a machine-learning-based approach to automatically extract privacy
settings from the social context within which the data is produced. Parallel to
the work of Danezis, Adu-Oppong et al. developed privacy settings based on a
concept of “Social Circles”, which consist of clusters of friends formed by
partitioning users’ friend lists. Ravichandran et al. studied how to predict a
user’s privacy preferences for location-based data (i.e., whether to share her
location or not) based on location and time of day. Fang et al. proposed a
privacy wizard to help users grant privileges to their friends. The wizard asks
users to first assign privacy labels to selected friends, and then uses this as
input to construct a classifier which classifies friends based on their
profiles and automatically assigns privacy labels to the unlabeled friends.
More recently, Klemperer et al. studied whether the keywords and captions with
which users tag their photos can be used to help users more intuitively create
and maintain access-control policies. Their findings are in line with our
approach: tags created for organizational purposes can be repurposed to help
create reasonably accurate access-control rules.
PROPOSED SYSTEM:
We propose an Adaptive Privacy Policy Prediction (A3P) system which aims to
provide users a hassle-free privacy settings experience by automatically
generating personalized policies. The A3P system handles user-uploaded images
and factors in the following two criteria that influence one’s privacy settings
for images.
The impact of social environment and personal characteristics. Social context
of users, such as their profile information and relationships with others, may
provide useful information regarding users’ privacy preferences. For example,
users interested in photography may like to share their photos with other
amateur photographers. Users who have several family members among their social
contacts may share with them pictures related to family events. However, using
common policies across all users or across users with similar traits may be too
simplistic and not satisfy individual preferences. Users may have drastically
different opinions even on the same type of images. For example, a person who
is less concerned about privacy may be willing to share all his personal images,
while a more conservative person may just want to share personal images with
his family members. In light of these considerations, it is important to find
the balancing point between the impact of social environment and users’
individual characteristics in order to predict the policies that match each
individual’s needs. Moreover, individuals may change their overall attitude
toward privacy as time passes. In order to develop a personalized policy
recommendation system, such changes in privacy opinions should be carefully
considered.
The role of image content and metadata. In general, similar images often incur
similar privacy preferences, especially when people appear in the images. For
example, one may upload several photos of his kids and specify that only his
family members are allowed to see these photos. He may upload some other photos
of landscapes which he took as a hobby, and for these photos he may set a
privacy preference allowing anyone to view and comment on the photos. However,
analyzing the visual content alone may not be sufficient to capture users’
privacy preferences. Tags and other metadata are indicative of the social
context of the image, including where it was taken and why, and also provide a
synthetic description of images, complementing the information obtained from
visual content analysis.
Module 1
A3P-CORE
There are two major components in A3P-core: (i) image classification and (ii)
adaptive policy prediction. For each user, his/her images are first classified
based on content and metadata. Then, the privacy policies of each category of
images are analyzed for the policy prediction. Adopting a two-stage approach is
more suitable for policy recommendation than applying the common one-stage data
mining approaches to mine both image features and policies together. Recall
that when a user uploads a new image, the user is waiting for a recommended
policy. The two-stage approach allows the system to employ the first stage to
classify the new image and find the candidate sets of images for the subsequent
policy recommendation. A one-stage mining approach, by contrast, would not be
able to locate the right class of the new image, because its classification
criteria need both image features and policies, whereas the policies of the new
image are not available yet. Moreover, combining both image features and
policies into a single classifier would lead to a system which is very
dependent on the specific syntax of the policy. If a change in the supported
policies were to be introduced, the whole learning model would need to change.
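The two-stage idea can be illustrated with a minimal Python sketch. It is not the classifier or predictor used by A3P: the set-overlap similarity, the majority-vote recommendation, and all names and sample records below are assumptions chosen for illustration. The point it shows is that stage 1 needs only the features of the new image, so no policy has to exist yet.

```python
from collections import Counter


def classify(features, labeled_history):
    """Stage 1: return the category of the most similar past image.

    Similarity is plain overlap of feature sets, an illustrative stand-in
    for the content and metadata classifier (assumption).
    """
    best = max(labeled_history, key=lambda rec: len(features & rec["features"]))
    return best["category"]


def recommend(category, labeled_history):
    """Stage 2: return the most frequent policy among images of that category."""
    policies = [rec["policy"] for rec in labeled_history if rec["category"] == category]
    return Counter(policies).most_common(1)[0][0]


# Hypothetical upload history of one user.
history = [
    {"features": {"face", "child", "indoor"}, "category": "kid", "policy": "family:view"},
    {"features": {"face", "child", "park"}, "category": "kid", "policy": "family:view"},
    {"features": {"mountain", "sky"}, "category": "landscape", "policy": "everyone:view,comment"},
]

new_image = {"child", "park", "toy"}        # features only, no policy yet
category = classify(new_image, history)     # stage 1
policy = recommend(category, history)       # stage 2
print(category, policy)
```

Because the policy format appears only in stage 2, a change in the supported policy syntax would leave the stage 1 classifier untouched in this sketch.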
Module 2
Image Classification
To obtain groups of images that may be associated with similar privacy
preferences, we propose a hierarchical image classification which classifies
images first based on their contents and then refines each category into
subcategories based on their metadata. Images that do not have metadata are
grouped only by content. Such a hierarchical classification gives higher
priority to image content and minimizes the influence of missing tags. Note
that some images may be included in multiple categories as long as they contain
the typical content features or metadata of those categories. In the
illustrative example, where images are labeled by letter, the content-based
classification creates two categories: “landscape” and “kid”. Images C, D, E
and F are included in both categories because they show kids playing outdoors,
which satisfies both themes. These two categories are further divided into
subcategories based on the tags associated with the images. As a result, we
obtain two subcategories under each theme. Notice that image G is not shown in
any subcategory because it does not have any tag, while image A shows up in
both subcategories because it has tags indicating both “beach” and “wood”.
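A minimal Python sketch of this content-first, tags-second grouping follows. The function name, the data layout, and the category assigned to images A and G are assumptions made for illustration; the content categories are taken as given rather than computed from pixels.

```python
from collections import defaultdict


def hierarchical_groups(images):
    """Group images by content category first, then by tag within each category.

    `images` maps an image name to (content_categories, tags). An image may
    land in several categories and several subcategories; an image without
    tags stays at the content level only.
    """
    groups = defaultdict(lambda: {"all": set(), "by_tag": defaultdict(set)})
    for name, (categories, tags) in images.items():
        for category in categories:
            groups[category]["all"].add(name)
            for tag in tags:
                groups[category]["by_tag"][tag].add(name)
    return groups


# Hypothetical sample: which content category A and G belong to is assumed.
images = {
    "A": ({"landscape"}, {"beach", "wood"}),   # two tags, so two subcategories
    "C": ({"landscape", "kid"}, {"beach"}),    # kids playing outdoors
    "G": ({"landscape"}, set()),               # no tags, content level only
}

groups = hierarchical_groups(images)
print(sorted(groups["landscape"]["all"]))
print({tag: sorted(names) for tag, names in groups["landscape"]["by_tag"].items()})
```

Keeping the untagged image in the content-level set is what lets a later policy step still use it when tags are missing.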
Module 3
Policy Mining
We propose a hierarchical mining approach for policy mining. Our approach
leverages association rule mining techniques to discover popular patterns in
policies. Policy mining is carried out within the category of the new image,
because images in the same category are more likely to be under a similar level
of privacy protection. The basic idea of the hierarchical mining is to follow
the natural order in which a user defines a policy. Given an image, a user
usually first decides who can access the image, then thinks about what specific
access rights (e.g., view only or download) should be given, and finally
refines the access conditions, such as setting the expiration date.
Correspondingly, the hierarchical mining first looks for popular subjects
defined by the user, then looks for popular actions in the policies containing
the popular subjects, and finally looks for popular conditions in the policies
containing both popular subjects and popular actions.
Step 1: In the same category of the new image, conduct association rule mining
on the subject component of policies. Let $S_1, S_2, \ldots$ denote the
subjects occurring in policies. Each resultant rule is an implication of the
form $X \Rightarrow Y$, where $X, Y \subseteq \{S_1, S_2, \ldots\}$ and
$X \cap Y = \emptyset$. Among the obtained rules, we select the best rules
according to one of the interestingness measures, i.e., the generality of the
rule, defined using support and confidence as introduced in [16]. The selected
rules indicate the most popular subjects (i.e., single subject) or subject
combinations (i.e., multiple subjects) in policies. In the subsequent steps, we
consider policies which contain at least one subject in the selected rules. For
clarity, we denote the set of such policies as $G^{\mathrm{sub}}_i$,
corresponding to a selected rule $R^{\mathrm{sub}}_i$.
Step 2: In each policy set $G^{\mathrm{sub}}_i$, we now conduct association
rule mining on the action component. The result will be a set of association
rules in the form of $X \Rightarrow Y$, where
$X, Y \subseteq \{\mathit{open}, \mathit{comment}, \mathit{tag}, \mathit{download}\}$
and $X \cap Y = \emptyset$. Similar to the first step, we will select the best
rules according to the generality interestingness. This time, the selected
rules indicate the most popular combination of actions in policies with respect
to each particular subject or subject combination. Policies which do not
contain any action included in the selected rules will be removed. Given a
selected rule $R^{\mathrm{act}}_j$, we denote the set of remaining policies as
$G^{\mathrm{act}}_j$, and note that $G^{\mathrm{act}}_j \subseteq G^{\mathrm{sub}}_i$.
Step 3: We proceed to mine the condition component in each policy set
$G^{\mathrm{act}}_j$. Let $attr_1, attr_2, \ldots, attr_n$ denote the distinct
attributes in the condition component of the policies in $G^{\mathrm{act}}_j$.
The association rules are in the same format of $X \Rightarrow Y$ but with
$X, Y \subseteq \{attr_1, attr_2, \ldots, attr_n\}$. Once the rules are
obtained, we again select the best rules using the generality interestingness
measure. The selected rules give us a set of attributes which often appear in
policies. Similarly, we denote the policies containing at least one attribute
in the selected rule $R^{\mathrm{con}}_k$ as $G^{\mathrm{con}}_k$, and
$G^{\mathrm{con}}_k \subseteq G^{\mathrm{act}}_j$.
Step 4: This step is to generate candidate policies. Given
$G^{\mathrm{con}}_k \subseteq G^{\mathrm{act}}_j \subseteq G^{\mathrm{sub}}_i$,
we consider each corresponding series of best rules: $R^{\mathrm{con}}_{k_x}$,
$R^{\mathrm{act}}_{j_y}$ and $R^{\mathrm{sub}}_{i_z}$. Candidate policies are
required to possess all elements in $R^{\mathrm{con}}_{k_x}$,
$R^{\mathrm{act}}_{j_y}$ and $R^{\mathrm{sub}}_{i_z}$. Note that candidate
policies may be different from the policies as result of Step 3. This is
because Step 3 will keep policies as long as they have one of the attributes in
the selected rules.
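The narrowing from subjects to actions to conditions in the four steps above can be sketched in Python. This is a simplified stand-in, not the mining procedure of the paper: instead of full association rules ranked by the generality measure of [16], it picks the most frequent itemset whose support meets a threshold at each level. The names `best_itemset`, `mine_candidate`, and `min_support`, and the sample policies, are all assumptions.

```python
from collections import Counter
from itertools import combinations


def best_itemset(policies, component, min_support=0.5):
    """Return the most supported itemset of `component` values.

    Stand-in for selecting the best rule: ties on support go to the larger
    itemset. Returns an empty frozenset if nothing meets `min_support`.
    """
    counts = Counter()
    for policy in policies:
        items = sorted(policy[component])
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    frequent = {s: c for s, c in counts.items() if c / len(policies) >= min_support}
    if not frequent:
        return frozenset()
    return max(frequent, key=lambda s: (frequent[s], len(s), sorted(s)))


def mine_candidate(policies, min_support=0.5):
    """Follow Steps 1 to 4: subjects, then actions, then conditions."""
    subjects = best_itemset(policies, "subjects", min_support)
    g_sub = [p for p in policies if p["subjects"] & subjects]    # Step 1 filter
    actions = best_itemset(g_sub, "actions", min_support)
    g_act = [p for p in g_sub if p["actions"] & actions]         # Step 2 filter
    conditions = best_itemset(g_act, "conditions", min_support)  # Step 3
    # Step 4: the candidate must possess all selected elements.
    return {"subjects": subjects, "actions": actions, "conditions": conditions}


# Hypothetical policies of images in one category.
policies = [
    {"subjects": {"family"}, "actions": {"view", "comment"}, "conditions": {"expiry"}},
    {"subjects": {"family"}, "actions": {"view"}, "conditions": set()},
    {"subjects": {"family", "friends"}, "actions": {"view", "comment"}, "conditions": {"expiry"}},
    {"subjects": {"coworkers"}, "actions": {"view"}, "conditions": set()},
]

candidate = mine_candidate(policies)
print(candidate)
```

In this sample the second policy survives the filters without carrying the "expiry" attribute, which mirrors the note in Step 4 that a candidate policy can differ from the policies kept by Step 3.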
Module 4
A3P-SOCIAL
A3P-social employs a multi-criteria inference mechanism that generates
representative policies by leveraging key information related to the user’s
social context and his general attitude toward privacy. As mentioned earlier,
A3P-social is invoked by A3P-core in two scenarios. One is when the user is new
to a site and does not have enough images stored for A3P-core to infer
meaningful and customized policies. The other is when the system notices
significant changes in the privacy trend of the user’s social circle, which may
prompt the user to adjust his/her privacy settings accordingly. In what
follows, we first present the types of social context considered by A3P-social,
and then present the policy recommendation process.
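The two invocation scenarios can be sketched as a small dispatch function in Python. The thresholds `min_images` and `shift_threshold`, and the idea of summarizing the social trend change as a single number, are hypothetical; the source does not state how either condition is measured.

```python
def choose_predictor(num_user_images, social_trend_shift,
                     min_images=10, shift_threshold=0.3):
    """Decide which component produces the policy recommendation.

    num_user_images: images the user already has stored on the site.
    social_trend_shift: assumed score in [0, 1] describing how much privacy
        settings in the user's social circle have changed recently.
    Both thresholds are illustrative assumptions.
    """
    if num_user_images < min_images:
        return "A3P-social"   # scenario 1: too little history (cold start)
    if social_trend_shift > shift_threshold:
        return "A3P-social"   # scenario 2: notable change in the social circle
    return "A3P-core"         # enough history and a stable social trend


print(choose_predictor(num_user_images=2, social_trend_shift=0.0))
print(choose_predictor(num_user_images=200, social_trend_shift=0.6))
print(choose_predictor(num_user_images=200, social_trend_shift=0.1))
```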
CONCLUSION:
We have proposed an Adaptive Privacy Policy Prediction (A3P) system that helps
users automate the privacy policy settings for their uploaded images. The A3P
system provides a comprehensive framework to infer privacy preferences based on
the information available for a given user. We also effectively tackled the
issue of cold-start by leveraging social context information. Our experimental
study shows that A3P is a practical tool that offers significant improvements
over current approaches to privacy.
REFERENCES
[1] A. Acquisti and R.
Gross, “Imagined communities: Awareness, information sharing, and privacy on
the facebook,” in Proc. 6th Int. Conf. Privacy Enhancing Technol. Workshop,
2006, pp. 36–58.
[2] R. Agrawal and R.
Srikant, “Fast algorithms for mining association rules in large databases,” in
Proc. 20th Int. Conf. Very Large Data Bases, 1994, pp. 487–499.
[3] S. Ahern, D. Eckles,
N. S. Good, S. King, M. Naaman, and R. Nair, “Over-exposed?: Privacy patterns
and considerations in online and mobile photo sharing,” in Proc. Conf. Human
Factors Comput. Syst., 2007, pp. 357–366.
[4] M. Ames and M. Naaman,
“Why we tag: Motivations for annotation in mobile and online media,” in Proc.
Conf. Human Factors Comput. Syst., 2007, pp. 971–980.
[5] A. Besmer and H.
Lipford, “Tagged photos: Concerns, perceptions, and protections,” in Proc. 27th
Int. Conf. Extended Abstracts Human Factors Comput. Syst., 2009, pp. 4585–4590.
[6] D. G. Altman and J. M.
Bland, “Multiple significance tests: The bonferroni method,” Brit. Med. J.,
vol. 310, no. 6973, 1995.
[7] J. Bonneau, J.
Anderson, and L. Church, “Privacy suites: Shared privacy for social networks,”
in Proc. Symp. Usable Privacy Security, 2009.
[8] J. Bonneau, J.
Anderson, and G. Danezis, “Prying data out of a social network,” in Proc. Int.
Conf. Adv. Soc. Netw. Anal. Mining, 2009, pp. 249–254.
[9] H.-M. Chen, M.-H.
Chang, P.-C. Chang, M.-C. Tien, W. H. Hsu, and J.-L. Wu, “Sheepdog: Group and
tag recommendation for flickr photos by automatic search-based learning,” in
Proc. 16th ACM Int. Conf. Multimedia, 2008, pp. 737–740.
[10] M. D. Choudhury, H.
Sundaram, Y.-R. Lin, A. John, and D. D. Seligmann, “Connecting content to
community in social media via image content, user tags and user communication,”
in Proc. IEEE Int. Conf. Multimedia Expo, 2009, pp. 1238–1241.
[11] L. Church, J.
Anderson, J. Bonneau, and F. Stajano, “Privacy stories: Confidence on privacy
behaviors through end user programming,” in Proc. 5th Symp. Usable Privacy
Security, 2009.
[12] R. da Silva Torres
and A. Falcão, “Content-based image retrieval: Theory and applications,”
Revista de Informática Teórica e Aplicada, vol. 2, no. 13, pp. 161–185, 2006.