Sanitizing google search results
- To: [email protected]
- Subject: Sanitizing google search results
- From: [email protected] (StealthMonger)
- Date: Fri, 1 May 2015 13:45:42 +0100 (BST)
- In-reply-to: <CAD2Ti2_To=C330zU45XPHXrTxMew0xi1xFOCmmMx04zhtQ1t1A@mail.gmail.com> ([email protected]'s message of "Fri, 1 May 2015 02:51:24 -0400")
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
It usually used to be really easy [1] to sanitize Google search results,
stripping off all the tracking and leaving the target URL. For example,
http://www.google.com/url?q=http://bitcoin.stackexchange.com/questions/32835/safely-interrupt-reindex&sa=U&ei=70I-Vb3fHoPnaI6XgLgP&ved=0CBIQFjAC&usg=AFQjCNHjtJ6F8LTsfRiZ-bnBMjsb_HLY8A
would become
http://bitcoin.stackexchange.com/questions/32835/safely-interrupt-reindex
Now, more and more it seems, Google search results are encoded in a less
obvious way. Does anybody here know how they can be sanitized?
-----
[1] sed 's,^http://www.google.com/url?q=\(.*\)&sa=.*$,\1,'
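The one-liner in [1] depends on q= being the first parameter and on &sa=
following it directly. A slightly more tolerant variant might look like the
sketch below. It is only a sketch: the function name sanitize_google_url is
made up for illustration, the fallback to a url= parameter is an assumption
about what some redirect links look like, and it does nothing for result
links that do not carry the target URL in the query string at all.

```python
from urllib.parse import urlsplit, parse_qs


def sanitize_google_url(url):
    """Return the target of a Google /url redirect link, or the input unchanged.

    Hypothetical helper, not an official API of any kind.
    """
    parts = urlsplit(url)
    # Only touch links that look like the redirect wrapper from the example.
    if not parts.netloc.startswith("www.google.") or parts.path != "/url":
        return url
    # parse_qs percent-decodes values and ignores parameter order.
    query = parse_qs(parts.query)
    # Assumption: the target sits in q=, or failing that in url=.
    targets = query.get("q") or query.get("url")
    return targets[0] if targets else url


if __name__ == "__main__":
    example = (
        "http://www.google.com/url?q=http://bitcoin.stackexchange.com/"
        "questions/32835/safely-interrupt-reindex&sa=U&ei=70I-Vb3fHoPnaI6XgLgP"
        "&ved=0CBIQFjAC&usg=AFQjCNHjtJ6F8LTsfRiZ-bnBMjsb_HLY8A"
    )
    print(sanitize_google_url(example))
```

Run on the example above, this prints the bare
http://bitcoin.stackexchange.com/questions/32835/safely-interrupt-reindex
URL. Anything that is not a www.google.* /url link is passed through
untouched, so it should be safe to run over a mixed list of links.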
- --
-- StealthMonger
Long, random latency is part of the price of Internet anonymity.
Key: mailto:stealthsuite<>nym.mixmin.net?subject=send%20stealthmonger-key
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.9 <http://mailcrypt.sourceforge.net/>
iEYEARECAAYFAlVDVvsACgkQDkU5rhlDCl5nBACdFJ5ksGU2rpCXhdMTpGIe28pD
ecEAoJNQZIFj5iQ6cM+qRsBtxGfATdFu
=GE9k
-----END PGP SIGNATURE-----