The Great Ad Block Battle!

So, recently a reader of mine asked, on my earlier post Privacy Tools: Ghostery vs. Adblock Plus, which of these two was the better one. She also wanted to know what the differences between them were.

I thought this would be a good opportunity to compare not only those two extensions, but several others as well. You might assume that all ad blockers do the same thing, but this is definitely not true.

For example, Adblock Plus works by using “filter lists,” which are essentially sets of rules that tell it what to filter and what not to filter. Here’s one directory of filter lists that comes to mind: FilterLists.

filterlists

If you visit the site, you’ll see specific examples of domains and types of ads that are blocked, such as banner ads, adult site ads, tracking by ad agencies, and malware domains. The downside to this is that it may end up slowing down your browser (which can happen with any ad blocker that you use).

Several of the other popular blockers also use filter lists to determine which domains to block.
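To make the idea of a filter list a little more concrete, here is a minimal sketch in Python. It assumes a tiny subset of the Adblock Plus-style rule syntax ("||host^" to block a host, "@@||host^" as an exception), and the sample rules and function names are made up for illustration. Real blockers support far more rule types and are much more optimized.

```python
from urllib.parse import urlsplit

# Hypothetical sample rules in Adblock Plus-style syntax.
# "||host^" blocks a host and its subdomains; "@@||host^" is an exception.
SAMPLE_RULES = [
    "! this line is a comment",
    "||ads.example.com^",
    "||tracker.example.net^",
    "@@||cdn.tracker.example.net^",
]


def parse_rules(lines):
    """Split rule lines into a set of blocked hosts and a set of allowed hosts."""
    blocked, allowed = set(), set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):  # "!" starts a comment line
            continue
        is_exception = line.startswith("@@")
        rule = line[2:] if is_exception else line
        # Only the simple "||host^" form is handled in this sketch.
        if rule.startswith("||") and rule.endswith("^"):
            (allowed if is_exception else blocked).add(rule[2:-1].lower())
    return blocked, allowed


def host_matches(host, rule_hosts):
    """True if host equals a rule host or is a subdomain of one."""
    return any(host == r or host.endswith("." + r) for r in rule_hosts)


def should_block(url, blocked, allowed):
    """Exception rules win over blocking rules."""
    host = (urlsplit(url).hostname or "").lower()
    if host_matches(host, allowed):
        return False
    return host_matches(host, blocked)
```

Every request a page makes would be checked this way, which is also why a very long list can cost some speed.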

Ghostery

ghostery_logo

Just to clarify, Ghostery is a company that has designed several different types of privacy software. The one in question, in this case, is the Ghostery Browser Extension. Ghostery, as opposed to Adblock Plus, monitors the various web servers (in this case, trackers) that are being called by a given webpage, and gives you the option to block or allow any one of them.

It also gives you the option to “trust” or “restrict” any site that you use (or are directed to) on the web. The idea behind this, as you may have guessed, is to try to filter out malicious sites, and only allow ones that you accept.

ghostery_trackers

In addition, if you wish, you have the option of mapping the trackers through Evidon, which I assume is an affiliate of theirs. This, however, is a paid service.

Other Privacy Extensions

ublock_origin_element_picker

Adblock Plus and Ghostery are far from the only ad-blocking browser extensions available. Several other popular alternatives are uBlock Origin, Privacy Badger, and Adblock Fast.

A few of these are a bit more complex than Adblock Plus and Ghostery, but it all depends on what functions you need.

In the screenshot above, uBlock Origin is active, and its “element picker” function is being used, meaning that you can highlight specific parts of a webpage (for example, an ad) and analyze the actual code to see if there’s anything malicious to be concerned about.

ublock_origin_element_code

When you select a certain element, if you believe it to be malevolent, you can permanently “remove” that element so that it won’t attack you in any way. This gives you far more control over which elements to block and which to leave alone, which probably appeals more to the tech-savvy crowd than an extension that does all of this automatically.

Privacy Badger, on the other hand, also blocks trackers, but does so in a more automated way. The extension tries to detect all the different trackers (or domains that are being linked to) on a page, and then determines whether or not they are tracking you in some way, as below:

privacy_badger_trackers

If the sliders next to the domain names are colored green, this means that they appear not to be tracking you. However, if you think that they are, you can move the slider to yellow (which blocks cookies from that domain), or red (which blocks the domain altogether).

In addition, Privacy Badger gives you the option to “whitelist” different domain names that you trust, so that it knows not to block elements on that particular site:

privacy_badger_whitelist

One aspect of Privacy Badger that some may see as a disadvantage is its automated features, which may seem too “hands-off” for users who like to know what’s going on within the extension. It’s possible that P.B. may not catch all of the trackers on a page, or may miss other malicious elements.

On the other hand, it is a user-friendly way to block trackers on any webpage, and isn’t overly complicated.

Finally, there’s Adblock Fast, which describes itself as “the world’s fastest ad blocker.”

adblockfast-600

One of the reasons for this is that AF uses far fewer filtering rules than most other ad blockers, and thus it is quicker to launch. Also, compared to the other ad blockers we’ve discussed, it’s extremely simple.

You merely have to click the extension to turn ad blocking on or off on a particular page. There’s no element selecting, domain whitelisting, or tracker lists. For those of you who like your technology simple and to the point, I would recommend Adblock Fast as your ad blocker.

On the downside, it gives you very little control over what and how it blocks, so as I said before, if you’re more hands-on, something like uBlock Origin might be your cup of tea.

Any of these can be helpful; it’s really just a matter of preference and comfort…sort of like coffee flavors.

Speaking of which…I could really use a cup right now.



How to Use I2P on Android Devices

by Ciphas

i2p_android

I’m well aware that not all “dark web” users prefer the Tor network (which I’ve mentioned in a few previous posts).

As I wrote about in How to Access the Dark Web with I2P!, I2P is one of the three most popular anonymity networks at the moment, next to Tor and Freenet. Out of those three, however, it’s arguably the most complicated to use.

That aside, if you already use it, and are interested in the Android app, it’s simple to download. Go to I2P – Android Apps on Google Play, and install it.

If you’re already familiar with using Tor on Android, then you may know the browser Orfox; download that first, from Google Play – Orfox.

device-2015-06-30-133152

As with the standard version of I2P, you need to configure your proxy settings to be able to connect to it on your mobile device.

Depending on which device you have, these may be in a different area, but this tutorial explains it quite well. (With the exception that the Orweb browser is outdated.)

To sum up – you’ll need to configure your proxy settings to 127.0.0.1 (localhost), port 4444 (HTTP). After this is finished, open the I2P app again and hold down the button that says “Long press to start I2P.”
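If you'd like to see what that proxy setting amounts to outside of a browser, here is a small sketch using Python's standard library. It assumes an I2P router is already running and listening on 127.0.0.1:4444, as above. The function names are my own, and nothing here is specific to the Android app.

```python
import urllib.request

# The I2P HTTP proxy address from the settings above.
I2P_HTTP_PROXY = "http://127.0.0.1:4444"


def build_i2p_opener(proxy=I2P_HTTP_PROXY):
    """Return a URL opener that sends plain-HTTP requests through the I2P proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    return urllib.request.build_opener(handler)


def fetch_eepsite(url, timeout=60):
    """Fetch a page through the proxy; this only works while I2P is running."""
    opener = build_i2p_opener()
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

Calling fetch_eepsite() before the router has found peers will simply time out or fail, just as a browser would.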

i2p_longpress

Once you’ve started I2P, the app has to find peers on the network. This should only take a few minutes at most (depending on your connection, of course).

Finally, go to the “addresses” tab. There should be some default I2P sites (eepsites) listed there. You can add others if you wish. Actually, on my device, there was only one eepsite listed by default.

If you tap on the name of one of the eepsites, it may ask you which app you want to use to open it. Obviously, the tried and true Firefox is good. You can also use Orfox, as I mentioned.

Also, if you tap the “tunnels” tab, you’ll see which client tunnels and/or server tunnels are running. By default, some of the ones that run are the I2P HTTP/HTTPS Proxy, Irc2p, and smtp.postman.i2p (simple mail transfer protocol):

i2p_tunnels

You can, of course, customize it by adding your own client tunnels or server tunnels using the red “plus” button in the lower right-hand corner (maybe that could be a subject for a future blog post…yesssss….).

Interestingly, the tutorial I referenced above recommends Lightning Web Browser, because it’s open-source and built for privacy, speed, and efficiency. It can also send traffic through Tor or I2P, and can be set to use DuckDuckGo or StartPage as its default search engine. So give that one a try. If you’re curious about the source code, it’s here: GitHub: Lightning Browser.

Now, as for some other eepsites you can try out, here are some suggestions (but I haven’t vetted all of these, so some may not work):

https://sochi.i2p

https://speedie.i2p

https://sponge.i2p

https://nightfort.i2p

https://planet.i2p

https://oniichan.i2p

I hope that’s enough to get you started. Anyhow, have fun. I2P may not seem as “creepy” as Tor, but I would like to get a few more people to try it out, and maybe build more of a community on the network.

Enjoy your visit, friends!

Adblocking Adventures: Adblock Fast vs. Everyone?


Good day, readers!  I have to admit that I’m going through some stressful times at the moment, but what better way to deal with them than by writing?

That being said, in a couple of earlier posts, I reviewed such privacy tools as Adblock Plus, Ghostery, Redmorph Browser Controller, and uBlock Origin.

Recently on Twitter, Adblock Fast (@adblockfast), created by Rocketship, began following me, and I thought “Why not try this one out?”

Ad-archy in the U.K.

[Image: Adblock Fast logo]

In case you’re unfamiliar with it, that’s  Adblock Fast’s logo.  Is it just me, or is that the anarchy symbol?  Yeah, it is (according to my sources).

Anyhow, though many of these ad-blocking extensions (Adblock, Adblock Plus, uBlock) and apps have similar names, they function in rather different ways. Some use heuristic blocking (like Privacy Badger), while many others use filter lists, like EasyList, to block trackers.

Adblock Fast (“ABF”) is in the latter category, like some of its contemporaries.  According to their FAQ, ABF’s ruleset is derived from EasyList and that of Bluhell Firewall.  They also say that they’re in the process of testing a new alternative ruleset to improve the app’s blocking capabilities.

I have to give credit to ABF, though – it really is one of the simplest ad blockers I’ve ever used.  (Plus it’s free and open source; you can’t really fault them for that.)

According to their official site, many of the more popular ad-blocking plugins use an excessive number of filtering rules to prevent trackers, whereas ABF only uses seven.  What??  Seven???

Well, yes, if this chart isn’t one of those deceptive graphs:

adblock_fast_chart

I can’t resist; may I just take a moment and insert an original George Carlin image macro in here?

[Image: George Carlin image macro]

If you install ABF on Chrome or Opera, you should see a little button on the toolbar with the company logo on it.  If the “A” on the button has a circle around it, like in the picture above, ads are being blocked on the site.  If not, ads are allowed.  All you need to do to block or unblock ads is to click on the A button again.

I will say that for the techie crowd, ABF may seem a bit too simple (especially compared to more advanced blockers such as uMatrix). It’s not nearly as customizable (at least to my knowledge).

On the other hand, Adblock Plus, as I mentioned in a previous post, allows you to add custom filters and whitelisted domains, as well as to add filter subscriptions from the lists I mentioned before.  And blockers like uBlock Origin allow you to select specific elements within a page and disallow them.

Thus far, on alternativeTo – Crowdsourced software recommendations, Adblock Fast has only received three “likes,” but this may be because Google had temporarily banned ad blockers from the Play Store, and recently reversed the ban. Plus, it’s relatively new to the ad-blocking competition.  So they may need a little time to get their bearings.

The Androids are Coming

resized_winter-is-coming-meme-generator-brace-yourselves-android-users-are-coming-5c1b66

I had hoped to include the Android version of Adblock Fast in this review, but apparently that requires that I download Samsung Internet for Android, and I’m almost at my data limit for the month.

Currently, ABF is also available for Opera and for iOS 9 (on 64-bit devices: iPhone 5s and up, and iPad Mini 2 and up).

Perhaps this post will need a sequel…hmm? In any case, my final word is: Adblock Fast is a good blocker overall. It does its job quickly and efficiently, and is easy to learn. On the other hand, I don’t necessarily recommend it for people who like “manual transmission”-style privacy tools. For those folks, I think apps like uMatrix and uBlock Origin are more appropriate!

P.S. For those of you who might ask why I haven’t reviewed any iPhone apps yet, I don’t own one…but my wife does. Maybe she’ll let me borrow hers for one of these posts, if I bake her breakfast or something.

Privacy Tools Part 2: uBlock Origin, RedMorph Browser Controller

Believe it or not, what prompted this post was a comment on one of my older posts,  If We Built This Large Wooden Privacy Badger.  The commenter said that “…there are several other new extensions that are better than Privacy Badger. With tracker domains constantly changing and also first party websites directly loading tracker technology, Privacy Badger heuristic approach will not work.”

I have to admit that I considered this as well; how does Privacy Badger “know” which domains are safe and which aren’t?

According to the Electronic Frontier Foundation, who developed it:

…Privacy Badger keeps note of the ‘third party’ domains that embed images, scripts, and advertising in the pages you visit.  If a third party server appears to be tracking you without permission, by using uniquely identifying cookies…to collect a record of the pages you visit across multiple sites, Privacy Badger will automatically disallow content from that third party tracker.  In some cases a third-party domain provides some important aspect of a page’s functionality…[i]n those cases Privacy Badger will allow connections to the third party but will screen out its tracking cookies and referrers. [Full description available at site]
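Here is a rough sketch of the kind of heuristic that description implies, written in Python. To be clear, this is not Privacy Badger's actual code: the class, the method names, and the threshold of three sites are assumptions made for illustration.

```python
from collections import defaultdict

# Assumption for this sketch: a third party counts as a tracker once it has
# been seen setting an identifying cookie on this many distinct sites.
TRACKING_THRESHOLD = 3


class HeuristicBlocker:
    """Toy model of learning which third-party domains track you across sites."""

    def __init__(self, needed_for_functionality=()):
        # third-party domain -> set of first-party sites it tracked us on
        self.sites_seen_on = defaultdict(set)
        # domains a page needs in order to work (hypothetical list)
        self.needed = set(needed_for_functionality)

    def observe(self, first_party, third_party, sets_identifying_cookie):
        """Record one third-party request made while visiting first_party."""
        if sets_identifying_cookie and third_party != first_party:
            self.sites_seen_on[third_party].add(first_party)

    def decision(self, third_party):
        """Return 'allow', 'block cookies', or 'block' for a domain."""
        seen = len(self.sites_seen_on.get(third_party, ()))
        if seen < TRACKING_THRESHOLD:
            return "allow"
        if third_party in self.needed:
            return "block cookies"  # keep the content, drop the cookies
        return "block"
```

Notice that nothing is blocked until the same domain has been caught on several sites, which is exactly why a heuristic like this needs a little time to "learn."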

While this is all true, an algorithm can only be so smart. I suppose you could say that of any ad-blocking software, but there must be better options out there.

Therefore,  I realized it was time to begin exploring again.  The more I delve into this topic, the more I become aware of how many privacy tools are in existence (almost too many to count).  This does not, of course, mean that they are all effective, or even useful.

Just Because You’re Paranoid…


Previously, in Privacy Tools: Ghostery vs. Adblock Plus, I compared these two apps and their various pros and cons.  Also, in said post, I examined the app Privacy Badger, which performs similar functions (though you can use all three together).

So, when I started hunting for alternatives, I visited the site AlternativeTo.net: Privacy Badger Alternatives. Some of the software listed provides quite different functions from the aforementioned apps.

uBlock Origin definitely has a small and easy to use interface (do you like my poorly edited screenshot?):

ublock_origin

uBlock Origin blocks ads using filter lists such as EasyList, EasyPrivacy, Peter Lowe’s Ad Server List, and Malware Domain List. You can add additional domains to the list under the “My Filters” list in Settings.

As with all ad blockers, using uBlock Origin will occasionally interfere with the functionality of a site, and will also piss off certain site owners, who may “respond” with messages like this:

Adblock_Message_1b

And yes, I get that; I know that ads are how most sites make money.  I’m willing to turn off ad blockers on sites that I trust.  But there are others that just constantly bombard you with pop-ups (and I’m not just talking about porno sites here), to the point where you can barely use the site itself. Those are the sites that apps like uBlock Origin and Adblock Plus were designed for!

Of note – uBlock Origin also features an “element picker” mode (click on the little eyedropper icon), in which you can view the code of specific elements on a page, such as buttons or intrusive ads.  If that particular element is something you want to block, hit the “Pick” button.  This would likely be considered one of the “advanced” features, but it’s quite useful once you get the hang of it.

What I’ve also noticed is that UO appears to block more ads than some of its competitors (like, uhh…Adblock Plus). It also has an “advanced” mode, which you can toggle by checking the box below:

ublock_advanced

The “advanced user” settings pertain to things like behind-the-scenes network requests that the average user would likely be unfamiliar with. With the advanced settings enabled, you can custom block requests from specific hostnames (e.g. “wordpress.com”) or specific object types (e.g. 1st-party scripts). If this sort of thing is something that you understand, and would likely benefit from, then I would suggest checking it off. If not, don’t!


RedMorph Browser Controller 

Is it just me, or does the name “RedMorph” sound like a supervillain?  Well, thankfully it’s not, although the websites that rely on ads might disagree.

RedMorph Browser Controller, unlike some of its contemporaries (uBlock Origin, AdBlock, etc.) combines several different security aspects: privacy tool, ad-blocker, parental control device, and encrypted proxy all into one app.

For example, under its “Block Trackers and Content” feature, you have the option to block cookies, trackers, images, third party trackers, and social trackers.  (You can, of course, customize the level of security which you want to use.)

You also have the option of using “Website and Word Filters,” which are generally intended for parents and schools to use for their children (although I suppose you could censor the web for yourself, too):

redmorph_wordfilter

I confess I’m rather new with this app, but it seems to work very well so far.  RedMorph also includes a feature called SpyderWeb, which can give you a comprehensive overview of what domains (and third parties) are tracking you, and how.  It’s a little intimidating when you look at the graph:

spyderweb.png

Now do you see why I’m paranoid?  (I joke.)  RedMorph does give you a fair amount of options as to which trackers and domain names you can block, which is comforting.  It also offers a proxy feature called “Make Me Invisible,” through which you can select proxies in various locations.  On the downside, you have to be a paid member to use this feature.

All in all, I do like RedMorph as well; in fact, you might say it’s better than some of the other apps. Instead of installing a separate proxy, ad-blocker, and content filter, you can just have all of them together.

I have yet to try the full version of the program, but I trust that it does its job efficiently.  Heck, even Bane approves!

67795693

Of course, there are tons of other privacy tools out there, and I have yet to try them all.  But at least I can cross two off of my list.

Let the adventures continue!!

If It Weren’t for Those Snake People…

Snake-person-face-evra-von-6374874-300-202

I thought it would be fun to just change gears for a moment and talk about one of the funniest apps I’ve come across in a while.

It’s called “Millennials to Snake People,” (created by developer Eric W. Bailey) and it does exactly what it says. In any instance where the word “millennials” appears on a webpage, the app changes it to say “snake people.”  Or, in the instance of “millennial,” it says “snake person.”

The other funny translation it makes is that in any instance where “Generation Y” appears, that will change to “Serpent Society.”  But don’t take my word for it; look at some of these screenshots:

snakepeoplequiz

snakepeoplegraduate

snakepeoplemagazine.png

snakepeople_US
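For the curious, the word swaps described above can be sketched in a few lines of Python. This is only my guess at the general technique (a case-aware find-and-replace), not the extension's real code, and it only covers the three phrases mentioned in this post.

```python
import re

# The three substitutions mentioned in this post (keys are lowercase).
REPLACEMENTS = {
    "millennials": "snake people",
    "millennial": "snake person",
    "generation y": "serpent society",
}

# \b keeps the match on whole words, so "premillennialism" is left alone.
PATTERN = re.compile(r"\b(millennials|millennial|generation y)\b", re.IGNORECASE)


def match_case(original, replacement):
    """Roughly copy the capitalization of the original phrase."""
    if original.isupper():
        return replacement.upper()
    if original[0].isupper():
        return replacement.title()
    return replacement


def snakeify(text):
    """Replace every matching phrase in text with its snake equivalent."""
    def swap(match):
        found = match.group(0)
        return match_case(found, REPLACEMENTS[found.lower()])
    return PATTERN.sub(swap, text)
```

A browser extension would run something like snakeify() over each text node on the page; text that is loaded later or drawn inside an image would be missed.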

I don’t know about you folks, but I find this somewhat encouraging!!  I wasn’t sure if I was considered a millennial, but I guess it depends upon whom you ask.

The only disadvantage to the app that I’m aware of is that on certain sites, or in some minor instances, it doesn’t work.  That being said, this is really the exception rather than the rule.  Overall, I love it, and it brings a smile to my face.

Are Snake People going to be running the world?  Oh God, that’s a scary thought!!

In any case, for a good laugh, visit Millennials to Snake People – Chrome Web Store; you’ll thank me later.

Privacy Tools: Ghostery vs. Adblock Plus

they__re_watching_you__by_dharmainitiative2010-d34asq6.png

How many times have you heard this line?  “They’re watching you…” (A lot, I would imagine.)

Unfortunately, I’ve begun to realize that it’s true (at least with regard to the web).  Even when using the Tor network, which was created with privacy in mind, you’re still under surveillance, which is why some people have stopped using it altogether. (Although that hasn’t stopped me, the intrepid writer.)

Nonetheless, when you’re on the clearnet, there are some tools and plugins that can enhance your privacy (if not ensure it 100% of the time).

In a previous post, If We Built This Large Wooden Privacy Badger…, I discussed the plugin Privacy Badger, created by the Electronic Frontier Foundation (EFF).  For the most part, I’ve had a very positive experience with said Badger – he’s not a friend of trackers, trust me:

4935347

So, I thought it reasonable to compare some of the other popular privacy tools with Privacy Badger, to see which worked the best.

Do You Believe in Ghostery?

ghostery-logo-dark2

*ba-domp ching!* For those who haven’t heard of Ghostery, it’s a web privacy-themed company; they’re the developers of the Ghostery browser extension. The extension monitors the various web servers that are being called upon from any given webpage, and matches them against a list of data collection tools (a.k.a. trackers).
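As a rough illustration of that matching step, here is a short Python sketch. The tracker database, its entries, and the function name are all hypothetical; Ghostery's real tracker library is much larger and matches on more than just the host name.

```python
from urllib.parse import urlsplit

# Hypothetical tracker database: host -> (tracker name, category).
TRACKER_DB = {
    "ads.example.com": ("Example Ads", "Advertising"),
    "stats.example.net": ("Example Stats", "Analytics"),
}


def find_trackers(request_urls, tracker_db=TRACKER_DB):
    """Return {tracker name: category} for every request that hits a known tracker."""
    found = {}
    for url in request_urls:
        host = (urlsplit(url).hostname or "").lower()
        if host in tracker_db:
            name, category = tracker_db[host]
            found[name] = category  # repeat requests count once
    return found
```

The number of entries in the result corresponds to a per-page tracker count, and each entry is something you could choose to block or allow.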

And yes, I realize it’s already been reviewed on Lifehacker and other sites, but I still wanted to take a stab at it, and not just take everyone else’s word for it.

With Ghostery enabled, each time you visit a webpage, it searches for all the trackers connected to that site, and compiles them into a neat list, which it will display each time you access a new site:

ghostery blocklist

If you then look at the icon displayed on your menu bar, a little number should be showing next to it, indicating how many trackers have been found on that specific site.  Click that icon, and a dropdown menu (called the “Findings Panel”) will list the specific names of the trackers.  From that menu, you can choose to block or allow any specific tracker:

ghostery_trackers

Granted, as with Privacy Badger and some of the other privacy apps, if you disable all the trackers on certain sites, the sites won’t work properly.  This, of course, is why you have the option of enabling or disabling each tracker individually.

If you only want to temporarily pause blocking so that you can use all of a site’s functions, then that’s what the “Pause Blocking” button is for.  On the other hand, if you trust a site completely, you can click “Whitelist Site.”

Like this blog, right?  You trust me, don’t you??

frabz-Trust-me-Im-the-Doctor-8b7624

All in all, I’ve found Ghostery to be quite useful, but I choose to opt out of their Ghostrank™ feature, which “collects anonymous data about the trackers you’ve encountered and the sites on which they were placed.” In theory, this feature is used to help businesses market themselves more transparently (and in a less intrusive way), but it’s also a way for Ghostery to make money – hey, did you think they were doing this for free?

Finally, under its options, Ghostery will show you a list of trackers that it’s blocked, in different categories (e.g. Advertising, Analytics, etc.).  You can choose to enable or disable any of these functions in order to optimize your web experience.

ghostery_whitelist

Cockblock Plus…I mean…Adblock Plus

cbp

Excuse me, little Freudian slip there!! This is what I meant:

Adblock-plus-logo

Adblock Plus is, in a sense, very similar to Ghostery. Sometimes, however, they block different trackers (or different types of trackers).

Actually, one immediate difference that I noticed between ABP and Ghostery was that Ghostery tells you which specific domains it’s blocking, whereas ABP doesn’t. It merely tells you how many ads it’s blocked on that page, as well as how many in total.

At first, this appears to be a disadvantage, because it’s kind of an “all-or-nothing” approach. However, ABP has a different method for blocking specific elements on a page.

If you right-click on certain page elements, a menu like this should appear (this one’s for Chrome) :

dropdown menu

Click the option that says “Block element.”  Another window should appear, listing the specific page element – you can then add that to your “blacklist” of blocked elements.

blockelement

All in all, Adblock Plus works similarly to Ghostery, but after playing around with it a little, it seems slightly more geared toward the techies among us (me included)!  So really, which one you use (if any) is just a matter of personal preference.

That being said, these are far from the only privacy tools available – perhaps I shall save the rest for a future post.

In the meantime, I’m going to go back to hiding in my paranoia shelter.


A Chat With Jobi – Creator of Candle Search Engine

by Secrets of the Dark

Candle search

Those of you who’ve used the Tor network probably know that it can be very hard to navigate at times, even when using the different pages that share links.  In fact, I too, can relate to this – the first time I used it, I just relied on some of the link lists, which turned out to be semi-disastrous.

It does, of course, have its search engines, including not Evil, Ahmia, Grams, Sinbad, and the search engine in question – Candle, which can be accessed at Candle Search Engine. (Once again, don’t forget to access it through Tor.)

Candle’s memorable motto is “no parentheses, no boolean operators, no quotes, just words.”   I recently interviewed its creator, who goes by the name “Jobi.” If you’re unfamiliar with how search engines work in general, read on, and you’ll gain some insight!

In his words, he chose the name “Candle” because it:

  • “has the right amount of letters
  • Ends with ‘le’
  • Refers to a thing that brings light in darkness…
  • …but not a lot.” Reddit: Candle (a search engine)

46919-Candles-And-Bokeh

This is how I picture Candle – I’m visual that way.

When we spoke initially on Reddit, I had asked Jobi why he wrote Candle.  He said, “I wrote Candle because it was a challenge.  To see if I could do it and how it would turn out.  It was not designed to be a ‘dark net search engine’, just a search engine.  It could index anything.  I chose to index the Tor web for a couple of reasons.  Mostly because it is nice and small.

“Candle runs on a Macbook.  I don’t have fiber connected server farms.  For me, indexing the real web would be like sucking down an ocean through a garden hose; indexing the Tor web is like sucking down a bathtub through a straw.  Neither are ideal but the latter is not impossible.  Also, the Tor web isn’t that well indexed, so it would be more useful.”

If you happen to be on the Tor network and feel lost, I’d recommend trying out Candle; anyhow, on to the meat of the interview!

Secrets of the Dark: What is your background with regard to coding and web development? (i.e. Do you have formal schooling in programming?)

Jobi: Yes. I studied computer science, and have been coding professionally for almost 20 years.

I have very little experience in web development. I can write HTML 1.0 and…some [Javascript], but that’s it. Candle only produces very few different pages; they are pretty much identical and very simple. All self contained, no external resources.

SOTD: What have been your experiences with running a Tor node?  Have you experienced any harassment or difficulties in the process?

J: No. It just runs by itself. I have never talked to my ISP about it and they have never contacted me. Some web sites block me, but none that are important to me. My relay is not an exit. It is just a small relay on a low power machine, a single core 1.6 GHz Atom.

SOTD: Prior to creating Candle, what are some software projects you have worked on?

J: I created a clickable map of the universe of some space RPG.  It uses only HTML and javascript [sic].  I created a thing where you can upload a picture and it converts it into a format suitable to Flash on phones as a boot-up screen.  It uses PHP to invoke shell scripts.  This is probably [the] most serious web development project I’ve done.

SOTD: You said that you ‘wrote Candle because it was a challenge.’  Do you think that the result you came up with was a successful answer to that challenge?

J: I came across a bunch of issues that I didn’t know before I started.  Mostly things that are a bit fuzzy, that you can not just calculate.

It took a lot of tweaking and tuning in order to prevent lots of rubbish in the index, without filtering out good data. Wikis and forums have lots of links that are just not worth crawling. [My sentiments exactly! – Ed.]

I am very conservative about what I consider a ‘word’: Anything under 3 letters is not a word.  Anything with a non-letter in it is not a word.  Anything with more than 3x the same letter in a row is not a word.  Etc…

In the end I’m quite happy with the quality of the index.
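[The three rules Jobi lists can be written down almost directly. The Python sketch below is only my reading of them, not Candle's code (which is not public); the names are made up, and whatever falls under his “Etc…” is left out. – Ed.]

```python
import re

MIN_LETTERS = 3   # "Anything under 3 letters is not a word."
MAX_REPEAT = 3    # more than 3 of the same letter in a row is rejected

# Matches one character followed by at least MAX_REPEAT copies of itself,
# i.e. a run of four or more identical characters.
TOO_MANY_REPEATS = re.compile(r"(.)\1{%d,}" % MAX_REPEAT)


def is_word(token):
    """Apply the three stated rules to a single token."""
    if len(token) < MIN_LETTERS:
        return False
    if not token.isalpha():  # note: isalpha() also accepts non-English letters
        return False
    if TOO_MANY_REPEATS.search(token.lower()):
        return False
    return True
```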

SOTD: I’ve noticed that Candle only returns the top 20 search results (as opposed to all of them). Why did you design it this way?

J: It is part of keeping it lightweight. It also prevents Candle from becoming a tool for others to just suck down the entire index.

Having a ‘next page’ button would mean I’d either have to redo the query, or cache results in ‘sessions’.

SOTD: What kind of work do you do professionally? Is it related to software development, or is that a hobby?

J: I’m a software developer. My day to day work happens in C and C++.

SOTD: Even though a developer, like a magician, might ‘never reveal his secrets,’ would you be willing to give a basic explanation of how the Candle search engine is different from other popular search engines?

J: I don’t believe that Candle is ‘more special’ than others. It is different because I didn’t use any standard framework and came up with my own solutions for things like filtering and ranking.

Also, there is nothing secret about it. I just can not open source it because it uses proprietary libraries from work.

SOTD: Would you be willing to talk about yourself a little (like your educational background)?

J: As I said in question #1, I have studied computer science.

But before that I already coded. As a kid, I got an 8 bit micro. It came with a thick manual and I was curious enough to teach myself how to program it. First in BASIC, then in assembler. This was before the Internet was a thing. Later, I got (access to) a PC and started learning Pascal and C.

SOTD: Did you work with others on this project, or was Candle designed solely by you?

J: I did it solely by myself. At first I never even told anyone it was running. At some point [it] was discovered and the number of hits slowly started to ramp up.

SOTD: Have you ever used other anonymity networks besides Tor (like I2P, Freenet, or GNUnet)? If so, what has been your experience with them? (Has it been positive, negative, or something in between?)

J: I have not. I don’t use Tor that much either, but when I do, it works well enough and I don’t have problems.

SOTD: Is there any kind of content that you try to exclude from Candle search results (such as child pornography)?

J: No. That would be a very slippery slope. Once I start filtering out one thing, I implicitly start condoning everything else.

SOTD: What sorts of changes might you make to Candle’s search algorithms so that it could improve (if any)?

J: The crawling is as good as it gets.

The search result ranking is basically good, but I do still tweak it a little bit from time to time. I do not have a very satisfactory strategy to determine the order in which I visit pages. I have way more URLs than I can visit in a reasonable time, but some URLs deserve to be on a higher rotation than others.

I might add [an] ‘onion history’ feature, where it shows when an onion was up/down, when the home page title changed, things like that. I already keep track of some of that, and I would have to look into how clean and useful that data is.

SOTD: Have people in the Reddit community given you good feedback about Candle, or about Tor in general?

J: I have had a bit of good constructive feedback, but most of it was just ‘hey that looks nice’. Nobody was negative about it, i.e. ‘You suck for making this’.

SOTD: What advice might you give to someone who says, ‘I’d like to develop my own search engine – where should I start?’

J: You can always start with a crawler: read a page with links, parse it, extract the links, add those URLs to your list.

Have it crawl for a few hours, then look at your dataset and see what’s in there that shouldn’t [be].

Come up with filtering rules for those and then restart clean. Repeat this until you are happy with the dataset.

You should also determine your feature set early on. For example, in Candle you can only search for individual words, not phrases.

For certain features it might be necessary to keep copies of the content you index. I decided I didn’t want that.
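[For readers who want to try this, here is a minimal Python sketch of the crawl loop Jobi describes: read a page, extract its links, add new URLs to the list, and filter out the ones not worth visiting. It is an illustration under my own assumptions, not Candle's code. The fetch function is passed in, so you can test it on a few in-memory pages before pointing it at anything real. – Ed.]

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs (without #fragments) for all links in html."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(start_url, fetch, keep_url=lambda url: True, max_pages=100):
    """Breadth-first crawl.

    fetch(url) must return the page's HTML as a string, or None on failure.
    keep_url(url) is the filtering rule: return False for links not worth crawling.
    """
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen and keep_url(link):
                seen.add(link)
                queue.append(link)
    return visited
```

[Tightening keep_url after each run and then restarting clean is the “repeat until you are happy with the dataset” step. – Ed.]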

SOTD: You had told me that ‘With Candle, I try to deliver diverse results. It won’t return multiple results from the same onion, or from the same ‘identical/very similar’ onion.’ Would it be possible to explain a little about how this is done?

J: When you enter some words, I look up all the URLs that have those words in it. This might contain multiple URLs from the same onion domain. If so, I only keep the ‘best’ one. It also might contain URLs from onions that are mirrors/copies/clones of each other. This is harder to determine.

Since I don’t keep copies of content, I have to base ‘identicality’ on stats and metadata like title, size, number of words, links, etc. (Have you noticed the ‘onion:…’-link underneath each result?)

Which one is the best is based on how often the words occur, how strong those words are, how many words the page has, etc.
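[A simplified Python sketch of that idea follows. It is my own illustration, not Candle's code: the result fields and the score are hypothetical, and mirrors are detected here by an exact match on title, size, and word count, which is cruder than the “very similar” comparison Jobi describes. – Ed.]

```python
from urllib.parse import urlsplit


def keep_best(results, key):
    """Keep only the highest-scoring result for each key value."""
    best = {}
    for result in results:
        k = key(result)
        if k not in best or result["score"] > best[k]["score"]:
            best[k] = result
    return list(best.values())


def diversify(results):
    """One result per onion, then one per group of look-alike onions."""
    # Step 1: collapse multiple URLs from the same onion domain.
    per_onion = keep_best(results, key=lambda r: urlsplit(r["url"]).hostname)
    # Step 2: collapse mirrors, judged only by stored metadata (no page copies).
    per_fingerprint = keep_best(
        per_onion, key=lambda r: (r["title"], r["size"], r["words"])
    )
    return sorted(per_fingerprint, key=lambda r: r["score"], reverse=True)
```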

SOTD: What projects are you currently developing, or do you plan to develop, if given the time?

J: I got an Arduino for Christmas, so currently my evening hours are devoted to making LEDs flash.

Writing Candle was really just an exercise for myself. I am still surprised about the amount of use it gets every day.


(Well Jobi, I’m glad you created it – and I’m sure millions of other Tor users are too!)