Downloading the WordPress Codex

My company is shipping me off to Japan on Saturday for a month (possibly more) to train one of our partners on how best to use our system. It’s going to be an interesting experience. Not only will I be conversing regularly with people whose language I do not speak, but I’ll be completely immersed in a different culture about which I know very little. I mean, until now Japanese culture has played only two roles in my life: Pokémon and Dance Dance Revolution. Obviously, there’s much more to it, and I’m totally excited to learn all about it.

So far, there’s just one thing I’m nervous about: that dreaded plane ride.

In preparing for this trip, I’ve been creating massive to-do lists (Astrid is an amazing tool for this, by the way!). And I’ve identified some technical tasks that would be ideal for the 12+ hours I’ll be spending in the air. One of those tasks: convert a friend’s website from static HTML into a WordPress site. It’s pretty simple at the outset… However, WordPress has some complexity when it comes to templates and plug-ins, and so I’d like to have the WordPress documentation available to me while I’m offline.

The problem: WordPress documentation is a set of HTML pages, and cannot be easily downloaded. I searched around a bit, and it seems this question’s been asked before. And the answer is usually: use a tool to download a local copy of the website. The tool of choice: HTTrack. I’m on Ubuntu, so installing was easy:

sudo apt-get install httrack

Once installed, I gave it a shot:

tkelley:~/> httrack "http://codex.wordpress.org/"
Mirror launched on Wed, 09 May 2012 13:48:30 by HTTrack Website Copier/3.44-1+libhtsjava.so.2 [XR&CO'2010]
mirroring http://codex.wordpress.org/ with the wizard help..
Done.codex.wordpress.org/ (162 bytes) - 403
Thanks for using HTTrack!

…but that exited pretty quickly. Not what I was expecting for a full site download. During the execution, HTTrack generated a log. The log ended with this:

tkelley:~/> tail -n2 hts-log.txt
13:52:56	Error: 	"Forbidden" (403) at link codex.wordpress.org/ (from primary/primary)
13:52:56	Info: 	No data seems to have been transfered during this session! : restoring previous one!

403 (“forbidden”) error? That’s weird… I’m able to view it in my browser and via wget/curl without problems:

tkelley:~/> curl -G codex.wordpress.org --write-out %{http_code}"\n" -s -o /dev/null
200
tkelley:~/> wget codex.wordpress.org 2>&1 | egrep HTTP
HTTP request sent, awaiting response... 200 OK

HTTrack must be sending something that WordPress doesn’t like. Of the usual suspects, I’ve found that the most common is the user agent. In this case, it turns out that HTTrack passes its own user agent of “Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)”, and… wouldn’t you know… WordPress isn’t a fan:

tkelley:~/> wget -U "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" codex.wordpress.org 2>&1 | egrep HTTP
HTTP request sent, awaiting response... 403 Forbidden

Now that we know what’s causing it, all we need to do is play make-believe. Pass a “good” user agent string (from a browser that WordPress accepts, say, Chromium) using the -F flag, and we’re good to go:

tkelley:~/> httrack "http://codex.wordpress.org/" -F "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.19 (KHTML, like Gecko) Ubuntu/11.10 Chromium/18.0.1025.168"
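
If you’d rather check a handful of user agents in one go before kicking off a long mirror, a small helper along these lines might do it. This is only a sketch: the function names status_for_agent and compare_agents are made up for illustration, and it assumes curl is installed.

```shell
# Print the HTTP status code a server returns for a given user agent string.
# (Function names here are hypothetical helpers, not part of any tool.)
status_for_agent() {
    url=$1
    agent=$2
    # -s: silent, -o /dev/null: discard the body,
    # -w: print only the status code, -A: set the User-Agent header
    curl -s -o /dev/null -w '%{http_code}' -A "$agent" "$url"
}

# Print one "status  agent" line per user agent for the same URL.
compare_agents() {
    url=$1
    shift
    for agent in "$@"; do
        printf '%s  %s\n' "$(status_for_agent "$url" "$agent")" "$agent"
    done
}

# Example:
#   compare_agents "http://codex.wordpress.org/" \
#       "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" \
#       "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.19 (KHTML, like Gecko) Ubuntu/11.10 Chromium/18.0.1025.168"
```

Any agent that comes back with a 200 should be a reasonable candidate to hand to HTTrack’s -F flag.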

I honestly don’t know why WordPress would block HTTrack. After all, WordPress is licensed GPLv2, and I imagine its documentation is as well (derivative work?). So anyone should be able to download the entire thing for their own use. Perhaps it becomes a strain on their servers? If so, there are certainly better ways of blocking repeated requests from the same IP address. Anyone out there have any ideas? I’d love to hear them!

Airside

Well, it is a sad day indeed. My very favorite creative agency ever has finally closed up shop.

Airside got its start in 1998, back when web design was in its infancy, and the idea of pursuing a career in graphic arts was capricious at best. Things have changed over the last 14 years. The web looks totally different (anyone remember what Yahoo! looked like in 1998?), and now graphic design is a great career to have…especially if you’re good at it.

…and no one is better at it than Airside. They’ve already shut down their website, so it’s not easy to find a portfolio, but here are a few things I’ve found here and there:

That last one might look familiar. That’s because it was featured on the album cover for Lost Horizons by Lemon Jelly. In fact, Lemon Jelly’s Fred Deakin was one of the founding members of Airside, and the band and agency have collaborated a lot. Basically, all of Lemon Jelly’s music videos were done by Airside. A few of my favorites:

Anyway… As it turns out, Airside didn’t call it quits because of business or personal differences. Actually, they say that business has never been better, and that they are still very good friends. Instead, they say they are closing so they can pursue their own personal projects.

So that leaves me wondering… Lemon Jelly reunion?
::crossing my fingers soooo damn tightly::
:oD

Goals for 2012

We’re almost two months into the new year, and I’ve been meaning to write a post on New Year’s resolutions (already off to a bad start, I know…). I’ve had some interesting resolutions in the past: some I’ve attained, some I haven’t. Last year’s fell under the latter category. I had two resolutions last year: Develop an iPhone application, and read the entire series of James Bond books. In terms of iOS development, I got about two chapters into a coding book, and wrote a “Hello, world” app. In terms of James Bond, I didn’t even crack open a single book.

Any business grad worth his salt knows that goals need to be specific and measurable, and that’s how I designed those two resolutions. But I forgot something: goals also need to be attainable. These two goals were wildly ambitious, especially considering that I spend a good 70-80% of my waking hours either working, or learning things that will help me expand my career.

So this year, I’m taking a different approach. Smaller goals, and more of them. Ten, in fact. Ten specific, measurable, attainable goals. Here goes:

  • 1. Read a novel

    Ian Fleming wrote 12 James Bond books. How I ever thought I could find the time to read them all in a year is beyond me. That said, I need to start reading again. So for this year, a single novel will do.

  • 2. Weight Stasis

    I’ve come to the realization that I’m never going to really lose weight again. I don’t have the time or energy to put into it, and it’s just not high enough on my list of priorities. So for the time being, I’d like to just maintain my current weight of 220. If I lose anything, then that’s a bonus. But I’ll seriously be happy ending the year no fatter than I started it.

  • 3. Bike 200 Miles

    Ever since I moved to a company that gives laptops to their employees, my subway commute has become a valuable time resource for me. I can usually get an extra hour of uninterrupted work done on my way to and from my job. The side effect: I’ve stopped biking entirely. I think I can count on one hand the times I took my bike out last year. That needs to change.

  • 4. Mayor of the YMCA

    The last fitness-related resolution, I swear! Becoming the Foursquare mayor of my gym means that I’ll need to go more than any other Foursquare user over the course of a month. That’s doable, right?

  • 5. Escape the Continent

    Don’t get me wrong… North America is great. Love it to pieces. But I just want to leave for a week or three. In fact, I have five weeks of vacation that need to be used by the end of the year, so I really have no excuse not to.

  • 6. Build a Computer

    Something I’ve always wanted to do. And now I work with a ton of people who can guide me along the way. Plus, my old HP is totally busted. It’s time for my inner geek to shine through.

  • 7. Learn 100 words in Tagalog

    What? I want to impress the in-laws!

  • 8. Save 5%

    Ever since I moved to a company that doesn’t match 401(k) contributions, I’ve stopped saving for retirement altogether. I’m a few years behind now, so I really need to start investing again. I’ve already done my time as a Wal-Mart door greeter; I don’t want to have to do it again when I retire.

  • 9. 2,000 Reputation on Stack Overflow

    Stack Overflow is a site where coders help out other coders. I’ve learned so much from reading other people’s questions, and I’m finally getting to a point where I have enough knowledge to help others. It’s a pretty cool feeling. Anyway, the community rewards good answers with “reputation points”. I had 608 at the beginning of the year, and two months into the year, I’m at 813. Off to a good start already!

  • 10. Write a WordPress plug-in

    One of the reasons last year’s iPhone app didn’t work out was that not only was I going to have to learn the iOS SDK, but I’d have to learn an entirely new language (one that’s known for being difficult to get used to). I was in unfamiliar territory on all fronts. But WordPress? Now that’s more like it! I’ve been using WordPress for years, and I understand its API. Plus, I’m a pro at PHP. So this year, I’d like to publish a WordPress plug-in. Don’t know yet what it will be, so I’m open to ideas!

So there you have it — my 10 goals for the year. They’re all definitely attainable, so I’m really hoping for 100% success here. Just need to make sure I’m watching the calendar. Otherwise, you’ll be sure to see me at the gym on December 31, darting my eyes back and forth between the Casino Royale book in my left hand, and the WordPress API documentation on the phone in my right.

How Secure is Your Password?

Just saw this site on a friend’s Google+ feed:

http://howsecureismypassword.net/

No surprises as to what it does… You enter a password, and it tells you how long it would take a modern desktop computer to crack it using a brute force attack.

(Side note: I had some initial reservations about entering my password into a site like this, where it could so easily be captured. However, I checked Firebug for asynchronous network calls, and it turns out that everything is happening client-side, so the password actually never leaves your browser.)

Anyway, as it turns out, I’ve got a pretty strong password… It’d take about 6,000 years to crack. 6,000 years ago was when civilizations were first popping up in Mesopotamia. So I feel pretty safe.
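
For a sense of where numbers like that come from, here’s a back-of-the-envelope sketch. I don’t know the site’s actual formula, so this is an assumption: it takes the worst case of trying every combination (alphabet size raised to the password length) and divides by a guess rate that you supply. The function name brute_force_years is made up for illustration.

```shell
# Estimate worst-case brute-force time in years.
# Arguments: alphabet size, password length, guesses per second.
# All three are inputs you choose; none of them come from the site.
brute_force_years() {
    awk -v a="$1" -v n="$2" -v rate="$3" 'BEGIN {
        seconds_per_year = 365.25 * 24 * 60 * 60
        # keyspace = a^n; divide by the guess rate, then convert to years
        printf "%.3g\n", (a ^ n) / rate / seconds_per_year
    }'
}

# Example: 8 lowercase letters at a hypothetical 10 million guesses per second
#   brute_force_years 26 8 10000000
```

The takeaway is that length sits in the exponent, so every extra character multiplies the time by the size of the alphabet.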

But I’m curious… What if my password were as strong as a SHA or MD5 hash? I’m sure these calculations have been made a million times before, but here goes…

Start with the most common password of them all: “password”. The site tells me that it would be hacked “almost instantly”.

Apply an MD5 hash to “password” to get “5f4dcc3b5aa765d61d8327deb882cf99”
…and already “almost instantly” becomes “About 8 decillion years”. Wow.

SHA1(“password”) = “5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8”
…about 22 quattuordecillion years.

The SHA-2 hash comes in four flavors: SHA-224, SHA-256, SHA-384, and SHA-512.
SHA2(“password”, 224) = “d63dc919e201d7bc4c825630d2cf25fdc93d4b2f0d46706d29038d01”
About 180 duovigintillion years
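
If you want to generate these hashes yourself, the GNU coreutils tools that ship with Ubuntu should be enough. A minimal sketch, assuming md5sum, sha1sum and sha224sum are on your PATH (hex_digest is just a name I picked for the helper):

```shell
# Print the hex digest of a string using the named coreutils tool
# (md5sum, sha1sum, sha224sum, ...). printf '%s' avoids the trailing
# newline that echo would add, which would change the hash.
hex_digest() {
    tool=$1
    text=$2
    printf '%s' "$text" | "$tool" | cut -d ' ' -f 1
}

# Example:
#   for tool in md5sum sha1sum sha224sum; do hex_digest "$tool" password; done
```

The digests come out as 32, 40 and 56 hex characters respectively, which is why each one scores so much higher than the last.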

After this, the site’s JavaScript starts crapping out… It tells me it’ll take 302 quattuorvigintillion years to crack each of the 256-, 384-, and 512-bit hashes. Just to provide some color here, that’s
30,200,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 years.

But let’s be honest, given that I won’t live to see the 22nd century, I’m certainly not concerned about the 30 x 10^75th one. I think I can rest easy with my 6,000 years. Want to hack me? Go ahead and start now… I’ll check back with you in 8011.

Metropolis II


I might have to plan a trip to Los Angeles next year just to see this…

Can’t Hug Every Cat (+ Remix)


Yes, it’s completely fake. And she’s really annoying.
But I can’t legitimately call myself an internet nerd without posting this…

That was good. This is better:

Bubble Sort in Hungarian Dance

Know your algorithms!

Bach Cantata in Wood

Famous Objects from Classic Movies

Definitely just wasted a good hour or so on this game. The idea is: you’re given the silhouette of an object that relates to a classic movie. Then you guess the title Hangman-style.

I stopped at 100.
72% is a passing grade, right?

7 Billion