Comments on The Geomblog: Private Information Retrieval
Suresh Venkatasubramanian (http://www.blogger.com/profile/15898357513326041822)
Updated 2024-03-14T01:32:43.610-06:00; 9 comments

2006-03-13T18:35:00.000-07:00

If Google is forced to provide the information, they should print the whole mess in an 8-point calligraphic font with no line terminators, then ship the 2-ton printout to Congress via US mail (postage due).

Otherwise they might be able to move the Googleplex out of US territory and tell Congress to pound salt. Someplace more 'fitting' to the mission statement... Like the moon. :)

Posted by Mike Dougherty

2006-01-26T23:28:00.000-07:00

"But Piotr, why do you say that a PIR system must spend time linear in n"

Oops, sorry. That holds only if no preprocessing is allowed. I take it back.

Posted by piotr

2006-01-26T22:22:00.000-07:00

Piotr is too kind. Using PIR for search was his proposal :). But Piotr, why do you say that a PIR system must spend time linear in n? Isn't this by definition what you want to beat in a query?
A privacy/complexity tradeoff is quite reasonable: you could even imagine a business model based on this, where the more you are willing to pay, the more privacy you get.

Oooh. I just said "business model": I need to go take a shower.

Posted by Suresh

2006-01-26T20:40:00.000-07:00

"But privatising the search terms doesn't quite ensure the privacy of what you search for"

Well, they have to locate you in the first place. Google is a one-stop shop, while there are many ISPs out there to demand the records from.

Posted by Anonymous

2006-01-26T20:33:00.000-07:00

Hi,

In lieu of a commentary by a Google PR representative ;), here are a few thoughts on this issue.

Basically: Google's advertising model is characterized by extreme specificity - you can bid on any of a few hundred thousand (million?) different words. One could argue that this is one of the reasons why this model has been so popular - instead of blowing a few million $$$ on a Super Bowl ad, you can potentially spend much less if you discover the "right" keywords to advertise on. Google is happy as well - small revenue per user gets multiplied by millions if everyone and their dog starts to advertise.

For this to work, however, you have to keep and analyze all the information you can get. Once you switch to user profiles, you are in the same league as all other ad agencies, with "soccer moms", "Nascar dads", etc.

So I guess the challenge is: how to maintain all this information, and yet not compromise users' privacy?
As Suresh points out, PIR is one way to do that. But this is tricky, since (by definition) any PIR system must spend linear time in n (n=few billion) to answer any query. Time for privacy/complexity tradeoffs?

Posted by Piotr

2006-01-26T19:19:00.000-07:00

But privatising the search terms doesn't quite ensure the privacy of what you search for. The feds, if they wanted to know, would ask to see your ISP's logs.

Posted by Amit C

2006-01-26T13:36:00.000-07:00

So my sense is that for these things to be used, someone has to demonstrate that they work "at scale" and don't destroy the fabled 0.13 second response time for Google searches (which is now up to 2.43 seconds for "Private Information Retrieval" (measured on Fasterfox)). I am willing to believe that most theoretical solutions don't yet scale, and maybe what one needs is a couple of grad students with internships at Google doing the needful :)

Posted by Suresh

2006-01-26T12:56:00.000-07:00

Continuing on my last post, this is really an obvious approach to dealing with the issue.
So either there is an obvious reason for failure, or Google (and others) are too lazy to implement it. Can anyone clarify what the problem is?

Posted by Shuchi

2006-01-26T12:46:00.000-07:00

There are several reasons for Google to keep around search history and click-through information for individual IP addresses. But I believe this information can be summarized into a few hundred profiles (totally arbitrary estimate). Each user would fit into a profile (or a probability distribution over profiles) and these profiles should give a reasonable approximation to a large enough fraction of users. Google can train these profiles over past data and then just throw the old data out.

Let's also say that Google doesn't keep information about which user belongs to which group. This information belongs with the user. When I log in to personalized search, my computer sends my group ID to Google and its servers respond accordingly. If I wish to search for something secretly, I turn off personalized search. (Of course Google has to be trusted as far as this procedure is concerned, but I don't see why they wouldn't comply.)

As far as privacy is concerned, because Google is only keeping information that is broadly applicable (to large groups of people and not individuals), this should be acceptable (for example, w.r.t. concepts like k-anonymity, isolation, etc. that have appeared in privacy-related theoretical work recently).

Is there any reason this approach wouldn't work? I would suppose a good fraction of people would still be interested in using personalized search. Google seems to have enough data to train reasonably good models of profiles.
Is there reason to believe a few profiles do not capture a large fraction of users?

Posted by Shuchi