Spaghetti Programming

Litecoin

Hardware

I decided to jump on the cryptocurrency bandwagon and start mining. Here are the parts I selected for my rig:

  • Processor: AMD Athlon II X2 270 - Amazon Link

  • Motherboard: ASRock 970 Motherboard - Amazon Link

  • Memory: Corsair XMS3 4 GB DDR3 - Amazon Link

  • GPUs: 4 of the ATI Radeon 7970 - Amazon Link

  • Power Supply: Seasonic SS-1250XM - Amazon Link

  • Hard Drive: Spare 2.5 inch laptop SATA drive that I had lying around

  • 1x Riser Cables: 2 16x to 1x PCI-E Riser Cables - Amazon Link

  • 16x Riser Cables: 2 16x PCI-E Riser Cables - Amazon Link

  • Case: Old milk crate sitting in my basement

  • Operating System: GNU/Linux (is there really any other choice?)

The ATI 7970 is probably not the most cost-effective choice from a dollars-per-hash-rate perspective. As of now the 7950 seems to be the way to go, but I didn't find that out until after I had bought the hardware. The 7970 takes a good bit of tweaking to reach its optimal hash rate. People claim about 750 Kh/s, but I was never able to clock my cards high enough to hit those numbers. Still, I get reasonable performance out of them: about 715 Kh/s from each card, or roughly 2.86 Mh/s total. This setup is power-hungry and pulls about 1100 watts, as measured by a P3 Kill A Watt, when crunching hashes.

Software

The following software runs the rig. Once it is all installed, mining can begin.

  • Xubuntu 12.04
  • AMD APP SDK 2.7
  • AMD Catalyst Driver 13.4
  • cgminer 3.0.0

Unfortunately, X must be running for mining to happen. I'd rather not run a window manager at all, but that is just how the AMD driver works. Your xorg config file must contain the relevant sections to load the fglrx driver, and AMD provides a tool to generate them:

sudo aticonfig --adapter=all --initial

You can see what adapters the system has detected with:

sudo aticonfig --list-adapters

Once all cards are detected mining can begin. I use this simple script to start cgminer:

#!/bin/bash

export GPU_MAX_ALLOC_PERCENT=100
export GPU_USE_SYNC_OBJECTS=1
export DISPLAY=:0

/home/crypt/Downloads/cgminer-3.0.0-x86_64-built/cgminer

Here is the cgminer.conf which controls all the cgminer settings. The config file is just JSON. Make sure you put plenty of pool servers in the pools array in case of pool outages. I find it much easier to manage everything in the conf file than with a mile-long command line specifying every option.

{
"pools" : [
    {
        "url" : "stratum+tcp://server00:port",
        "user" : "worker.1",
        "pass" : "xxx"
    },
    {
        "url" : "http://server00:port",
        "user" : "worker.1",
        "pass" : "xxx"
    }
],
"intensity" : "13,13,13,13",
"vectors" : "1,1,1,1",
"worksize" : "256,256,256,256",
"kernel" : "scrypt,scrypt,scrypt,scrypt",
"lookup-gap" : "2,2,2,2",
"thread-concurrency" : "10240,10240,10240,10240",
"shaders" : "2048,2048,2048,2048",
"gpu-engine" : "1045,1045,1045,1045",
"gpu-fan" : "100,100,100,100",
"gpu-memclock" : "1500,1500,1500,1500",
"gpu-memdiff" : "0,0,0,0",
"gpu-powertune" : "0,0,0,0",
"gpu-vddc" : "0.000,0.000,0.000,0.000",
"temp-cutoff" : "95,95,95,95",
"temp-overheat" : "85,85,85,85",
"temp-target" : "75,75,75,75",
"api-port" : "4028",
"expiry" : "120",
"gpu-dyninterval" : "7",
"gpu-platform" : "0",
"gpu-threads" : "2",
"hotplug" : "5",
"log" : "5",
"no-pool-disable" : true,
"queue" : "1",
"scan-time" : "60",
"scrypt" : true,
"temp-hysteresis" : "3",
"shares" : "0",
"kernel-path" : "/usr/local/bin"
}
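Since the config is plain JSON, it is easy to sanity-check before launching. Here is a small sketch that verifies each per-GPU option has one value per card; `check_per_gpu_options` is a hypothetical helper of my own, not part of cgminer:

```python
import json

def check_per_gpu_options(conf, n_gpus):
    """Return names of per-GPU options whose comma-separated value
    count does not match the number of GPUs."""
    per_gpu = ["intensity", "vectors", "worksize", "gpu-engine",
               "gpu-memclock", "gpu-fan", "temp-target",
               "thread-concurrency"]
    return [k for k in per_gpu
            if k in conf and len(str(conf[k]).split(",")) != n_gpus]

# A trimmed sample with a deliberate mistake: gpu-engine lists only 3 values.
sample = '{"intensity": "13,13,13,13", "gpu-engine": "1045,1045,1045"}'
print(check_per_gpu_options(json.loads(sample), 4))  # ['gpu-engine']
```

A mismatch here usually means a card was added or removed without updating every per-GPU list, which cgminer will not always flag loudly.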

The rig is managed remotely via ssh, with cgminer running in a screen session. Sample output:
 cgminer version 3.0.0 - Started: [2013-05-08 19:18:30]
--------------------------------------------------------------------------------
 (5s):2.856M (avg):2.858Mh/s | A:10305  R:99  HW:0  U:80.7/m  WU:2609.0/m
 ST: 2  SS: 0  NB: 74  LW: 2533  GF: 0  RF: 0
 Connected to stratum.server00.com diff 32 with stratum as user worker.1
 Block: 26e37d2565fc4d97...  Diff:30.8M  Started: [21:25:55]  Best share: 445K
--------------------------------------------------------------------------------
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  71.0C 3138RPM | 717.3K/717.3Kh/s | A:2555 R:29 HW:0 U:20.00/m I:13
 GPU 1:  63.0C 3099RPM | 710.1K/711.0Kh/s | A:2564 R:28 HW:0 U:20.07/m I:13
 GPU 2:  72.0C 3070RPM | 714.9K/716.1Kh/s | A:2602 R:20 HW:0 U:20.37/m I:13
 GPU 3:  69.0C 3053RPM | 713.7K/714.6Kh/s | A:2585 R:22 HW:0 U:20.24/m I:13
--------------------------------------------------------------------------------
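Those per-GPU figures can be totalled mechanically to cross-check the header line. A small parsing sketch (the regex is my own, keyed to the status format shown above, and grabs the 5-second figure before the slash):

```python
import re

status = """\
 GPU 0:  71.0C 3138RPM | 717.3K/717.3Kh/s | A:2555 R:29 HW:0 U:20.00/m I:13
 GPU 1:  63.0C 3099RPM | 710.1K/711.0Kh/s | A:2564 R:28 HW:0 U:20.07/m I:13
 GPU 2:  72.0C 3070RPM | 714.9K/716.1Kh/s | A:2602 R:20 HW:0 U:20.37/m I:13
 GPU 3:  69.0C 3053RPM | 713.7K/714.6Kh/s | A:2585 R:22 HW:0 U:20.24/m I:13"""

# Capture the Kh/s value immediately after each "| " and before the slash.
rates = [float(m.group(1)) for m in re.finditer(r"\| (\d+\.\d+)K/", status)]
print(round(sum(rates), 1))  # 2856.0 Kh/s, matching the 2.856M header figure
```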

PostgreSQL

A while ago I had to set up streaming replication in Postgres for redundancy purposes. This guide is my notes on how to set up Postgres 9.1 on Ubuntu 10.04 LTS (Lucid Lynx) and configure a master with a single slave for database replication.

Installing Postgres 9.1 on Ubuntu 10.04

Add this to /etc/apt/sources.list:

deb http://ppa.launchpad.net/pitti/postgresql/ubuntu lucid main
deb-src http://ppa.launchpad.net/pitti/postgresql/ubuntu lucid main

Add the key of Martin Pitt (the author of the backported packages) from his PPA page: https://launchpad.net/~pitti/+archive/postgresql

Expand the “Technical details about this PPA” panel.

Under Signing Key, copy everything to the right of the slash.

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 8683D8A2 
sudo apt-get update

Install the postgres 9.1 packages:

sudo apt-get install postgresql-9.1 postgresql-client-9.1 postgresql-contrib-9.1 postgresql-plperl-9.1

Configuring Replication on Master

Create a role for yourself as the postgres user:

createuser -s -e -d my_user

On the master, change postgresql.conf to have these settings:

listen_addresses = '*' # to accept a connection from the slave 
 # Leaving ssl renegotiation turned on seems to cause issues when 
 # streaming a lot of wals.  The results are broken connections between 
 # master and slave which causes the slave to fall behind in writing 
 # wal segments.  Turning this off alleviates the problem. 
ssl_renegotiation_limit = 0     # amount of data between renegotiations 
 # My server's disk is capable of writing at ~ 70MB/sec. 
 # 70MB * 30 seconds = 2,100MB written in 30 seconds.
 # The checkpoint_segments value needs to be set high enough
 # to control the rate of checkpoints.
 # Setting it to 128 here gives us 128 * 16 = 2048MB between checkpoints.
 # At worst that is one checkpoint roughly every 30 seconds.
checkpoint_segments = 128   # in logfile segments, min 1, 16MB each
wal_level = hot_standby     # to generate WAL records for streaming replication.
max_wal_senders = 5         # to specify the max number of slaves.
 # Setting wal_keep_segments to 512 allows the slave to fall behind by
 # eight gigabytes, which should be a generous buffer.  The downside is
 # that it takes up eight gigabytes of disk space.
wal_keep_segments = 512         # in logfile segments, 16MB each; 0 disables
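The sizing arithmetic in the comments above can be double-checked, assuming the default 16 MB WAL segment size:

```python
SEGMENT_MB = 16                 # default WAL segment size
disk_write_mb_s = 70            # the server's measured write speed
checkpoint_segments = 128
wal_keep_segments = 512

between_checkpoints_mb = checkpoint_segments * SEGMENT_MB
print(between_checkpoints_mb)                    # 2048 MB between checkpoints
print(between_checkpoints_mb / disk_write_mb_s)  # ~29 s at full write speed
print(wal_keep_segments * SEGMENT_MB / 1024)     # 8.0 GB of slave headroom
```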

Depending on how trusting you are, on the master configure pg_hba.conf to allow a trusted connection for replication by the postgres user:

host    replication     postgres        192.168.1.11/32         trust

Configuring Replication on Slave

On the slave server, configure postgresql.conf and pg_hba.conf exactly the same as on the master.

In addition, add the following to the slave's postgresql.conf:

hot_standby = on       # to allow read-only queries on the slave (standby) node.

On the slave server, create a recovery.conf file in the $PGDATA dir like so:

  # Note that recovery.conf must be in $PGDATA directory.
  # Specifies whether to start the server as a standby. In streaming replication,
  # this parameter must be set to on.
  standby_mode          = 'on'
  # Specifies a connection string which is used for the standby server to connect
  # with the primary.
  primary_conninfo      = 'host=192.168.1.10 port=5432 user=postgres'
  # Specifies a trigger file whose presence should cause streaming replication to
  # end (i.e., failover).
  trigger_file = '/pg_failover'
  # Specifies a command to load archive segments from the WAL archive. If
  # wal_keep_segments is a high enough number to retain the WAL segments
  # required for the standby server, this may not be necessary. But
  # a large workload can cause segments to be recycled before the standby
  # is fully synchronized, requiring you to start again from a new base backup.
  # restore_command = 'cp /var/lib/postgresql/9.1/main/pg_xlog/%f "%p"'

Initial Data Sync and Starting Replication

Stop db on slave server.

Stop db on master server.

Copy the data files from master to slave. Conf files and server keys are not in the default data dir of Debian-based packages, but if Postgres was installed from source or by another method they may be; excluding them here will not hurt. The command below is assumed to be run locally on the master, with the destination being the slave (192.168.1.11).

  rsync -a -v -e ssh /var/lib/postgresql/9.1/main/ 192.168.1.11:/var/lib/postgresql/9.1/main/  --exclude postmaster.pid --exclude postgresql.conf --exclude pg_hba.conf --exclude server.crt --exclude server.key --exclude recovery.conf --exclude pg_ident.conf --exclude pg_ctl.conf --exclude start.conf

Note: not covered here is the setup of ssh keys for the postgres users on the master and slave servers, which rsync requires.

Start db on slave.

Start db on master.

Verify replication is working by checking log files.

Congratulations. You're done.

Monitoring Replication Lag

It is important to keep tabs on replication lag so that the slave does not fall too far behind the master. If the master removes a WAL segment that the slave has not yet processed, the two clusters fall out of sync and the slave must be reloaded from a full backup. There is a rather robust Nagios plugin called check_postgres that can monitor replication lag for us. This section describes two ways to monitor replication lag using it.

Monitoring Lag by Data Volume

One method of measuring the lag is to compare the offset the master is working on in the WAL against the offset the slave is working on. This is possible using the action parameter called hot_standby_delay. You can pass in two host names and two ports to monitor the lag between master and slave. The documentation is rather sparse and does not explicitly state this, but I believe the warning and critical parameters are in bytes. Thus, if you want to warn when the slave falls 2 gigabytes behind the master and go critical at 4 gigabytes, the command would look like this:

./check_postgres.pl --host=192.168.1.25,192.168.1.24 --dbport=5435,5432 --action hot_standby_delay -w 2139095040 -c 4278190080

In the above example, 192.168.1.25 with port 5435 denotes the master instance, and 192.168.1.24 with port 5432 denotes the slave instance.
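As a sanity check on the guess that the thresholds are in bytes, the two example values divide cleanly into mebibytes:

```python
MIB = 1024 ** 2
warn, crit = 2139095040, 4278190080
# 2040 MiB and 4080 MiB: just under 2 GiB and 4 GiB respectively,
# which is consistent with reading the thresholds as raw byte counts.
print(warn // MIB, crit // MIB)  # 2040 4080
```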

Monitoring Lag by Time

Another way to measure lag is by the amount of time it takes the slave to replicate a change made on the master. This is done with the replicate_row action of the check_postgres plugin, which works by making a change to a replicated table on the master and measuring how long it takes for that change to appear on the slave. An example check would look like this:

/usr/lib/nagios/plugins/check_postgres.pl --host=192.168.1.25,192.168.1.24 --dbport 5432,5432 --action replicate_row --warning='10 seconds' --critical='60 seconds' --dbname my_database_name,my_other_database_name --repinfo=replication_check,id,3,mvalue,wax_on,wax_off

This command takes parameters for the master and slave hosts, ports, and database names, the warning and critical thresholds, and the repinfo parameter. Repinfo contains the table name, the primary id column name, the primary id value, the data column name, and two alternating data values used for the replication test. In the above example the check will generate SQL like this for the test:

update replication_check set mvalue = 'wax_on' where id = 3;

This SQL runs on the master, and the plugin measures the time it takes for the change to propagate to the slave. The next time the test is run it uses the alternate value, like this:

update replication_check set mvalue = 'wax_off' where id = 3;

It will continue to alternate between those two values on each iteration. My database has a small table used explicitly for testing replication this way. Here is its dump:

  CREATE TABLE replication_check (
      id integer,
      mvalue text
  );

  COPY replication_check (id, mvalue) FROM stdin;
  1 do not modify or remove this
  2 used by nagios for checking the replication status
  3 wax_on
  \.
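The plugin's flip between the two values can be sketched like this (a simplification for illustration, not check_postgres's actual code):

```python
def next_value(current, a="wax_on", b="wax_off"):
    """Pick the alternate test value, mimicking how replicate_row
    flips between its two --repinfo values on each run."""
    return b if current == a else a

print(next_value("wax_on"))   # wax_off
print(next_value("wax_off"))  # wax_on
```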

OK. This is just hilarious and scary at the same time. Hilarious because people actually go around "liking" things on Facebook that could be embarrassing if it were ever publicly known that they liked them, and scary because Facebook is collecting and amalgamating all this information about people. actualfacebookgraphsearches.tumblr.com