category test



i’m alive

After a massive absence, I’ve decided to try and republish my blog.


who stood next to who(m)

I came across some old class photos one day. I was curious about who stood next to whom the most, and whether any pattern showed up.

So I used this as an excuse to play with graphviz and the dot graph language. In this case “graph” is used in the relationship sense, not the bar and pie graph sense, which is correctly called a chart.

The results are shown in the following graph. Blue nodes (the circles) are males, pink nodes are females. Blue edges (the lines) join people who stood next to each other in the back row, green edges join neighbours in the middle row, and pink edges join neighbours who sat in the front row of the photo.

A few things stand out:

  1. The front row normally consists mostly of females
  2. There is a lot more mixing between the back and middle rows than with the front row
  3. The strongest pairing that I can see is between M05 and F15

[images: classphoto, classphotoproximity]
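As a rough sketch of how a graph like this could be generated (this is not the script that produced the images; the input format and the `rows_to_dot` name are made up for illustration), each photo row can be turned into dot edges between neighbours. It assumes one row per line, written as the row name (back, middle or front) followed by the people from left to right, with male labels starting with M:

```shell
#!/bin/bash
# rows_to_dot: read "<row> <name> <name> ..." lines on stdin, print a dot graph.
# Hypothetical input format; node and edge colours follow the description above.
rows_to_dot() {
  awk '
    BEGIN {
      colour["back"] = "blue"; colour["middle"] = "green"; colour["front"] = "pink"
      print "graph classphoto {"
    }
    {
      for (i = 2; i <= NF; i++) {
        # declare each person once: blue for M.., pink for everyone else
        if (!(($i) in seen)) {
          seen[$i] = 1
          printf "  %s [color=%s];\n", $i, (substr($i, 1, 1) == "M" ? "blue" : "pink")
        }
        # join each person to the neighbour on their right, coloured by row
        if (i < NF)
          printf "  %s -- %s [color=%s];\n", $i, $(i + 1), colour[$1]
      }
    }
    END { print "}" }
  '
}
```

Something like `rows_to_dot < rows.txt | dot -Tpng -o classphoto.png` would then render it. Feeding in the rows from several photos means a pair that recurs shows up as several parallel lines, which is one way to spot the strong pairings.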


2013-01-29T1740 syntax error unknown user puppet in statoverride file

I used apt-get to remove the puppet packages, but I removed the puppet user by hand. Same happened for mediatomb.

After doing this, I kept seeing this error:

Extracting templates from packages: 100%
Preconfiguring packages ...
dpkg: unrecoverable fatal error, aborting:
 syntax error: unknown user 'puppet' in statoverride file
E: Sub-process /usr/bin/dpkg returned an error code (2)

and also

Extracting templates from packages: 100%
Preconfiguring packages ...
dpkg: unrecoverable fatal error, aborting:
 syntax error: unknown user 'mediatomb' in statoverride file
E: Sub-process /usr/bin/dpkg returned an error code (2)

Here’s how I fixed it, following the tips in “HowTo: Ubuntu unknown user in statoverride”:

sudo sed -i '/puppet/d' /var/lib/dpkg/statoverride


sudo sed -i '/mediatomb/d' /var/lib/dpkg/statoverride

Back up the file first, to taste.
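For what it’s worth, here is a minimal sketch that rolls the backup and the sed into one step. The `drop_statoverride_user` name is made up, it assumes GNU sed (for `-i`), and like the commands above it deletes every line that mentions the name anywhere, not just in the user column:

```shell
#!/bin/bash
# drop_statoverride_user USER [FILE]
# Copies FILE to FILE.bak, then deletes every line of FILE that contains USER.
# FILE defaults to the dpkg statoverride path used above (needs root there).
drop_statoverride_user() {
  local user="$1"
  local file="${2:-/var/lib/dpkg/statoverride}"
  cp -p "${file}" "${file}.bak" || return 1
  sed -i "/${user}/d" "${file}"
}
```

Run as root it would be called as `drop_statoverride_user puppet`, and the untouched copy stays next to the original as `statoverride.bak` if anything goes wrong.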


siracusa on application management in mac os x lion

John Siracusa on app management in Mac OS X Lion.

Well put:

The jump in complexity from the Dock to the Finder, I think, needs less explanation. As a general rule, novice users just don’t understand the file system. They don’t understand the hierarchy of machines, devices, and volumes; they don’t grasp the concept of the current working directory; they don’t know how to identify a file or folder’s position within the hierarchy. Fear of the file system practically defines novice users; it is usually the last and biggest hurdle in the journey from timid experimentation to basic technical competence.



How to work out percentages, I always forget:
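A small sketch so I can stop forgetting: the percentage is part ÷ whole × 100, and going the other way, a percentage of something is percent ÷ 100 × whole. The function names below are just made up for the example:

```shell
#!/bin/bash
# percent PART WHOLE: what percentage PART is of WHOLE.
percent() {
  awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", part / whole * 100 }'
}

# percent_of PCT WHOLE: the amount that PCT percent of WHOLE comes to.
percent_of() {
  awk -v pct="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", pct / 100 * whole }'
}

percent 25 200      # 12.5
percent_of 15 80    # 12.0
```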


gawd sometimes i’m stupid



I was looking for a word that would sit at the end of a list of tasks similar to the way “preparation” sits at the start.

It appears that “postparation” isn’t really a “real” word, but I think it’s perfectly cromulent.

So from now on I’m going to try and use postparation and embiggen the English language.


linux date command

this is cool, i used to do all sorts of hacks to do this at one time:

davidmarsh@server:~$ date +%A --date today-1days

davidmarsh@server:~$ date +%A --date today+1days

davidmarsh@server:~$ date +%A --date yesterday

davidmarsh@server:~$ date +%A --date "2 days ago"

works as expected for other variables

davidmarsh@server:~$ date +%Y --date "last year"

davidmarsh@server:~$ date --date "next monday"
Mon Mar 19 00:00:00 EST 2012


The --date=STRING is a mostly free format human readable date string such as “Sun, 29 Feb 2004 16:21:42 -0800” or “2004-02-29 16:21:42” or even “next Thursday”. A date string may contain items indicating calendar date, time of day, time zone, day of week, relative time, relative date, and numbers. An empty string indicates the beginning of the day. The date string format is more complex than is easily documented here but is fully described in the info documentation.
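The commands above give a different answer every day, so here is a small sketch pinned to a fixed reference date to show the relative items at work. It assumes GNU date (BSD and macOS date take different flags), and `day_name` is a made-up helper rather than anything date provides:

```shell
#!/bin/bash
# day_name REF OFFSET: weekday name of the date REF shifted by OFFSET.
# LC_ALL=C and TZ=UTC keep the weekday names and day boundaries predictable.
day_name() {
  LC_ALL=C TZ=UTC date --date "$1 $2" +%A
}

# 2012-03-14 was a Wednesday
day_name 2012-03-14 "+1 day"      # Thursday
day_name 2012-03-14 "-1 day"      # Tuesday
day_name 2012-03-14 "2 days ago"  # Monday
```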

damn you info docs!


large files

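A small script to find large files (over 1 MiB) under the current mount point. Useful in linux, and it probably works in bsd and generic unixes:

#!/bin/ksh
#20091201:dmarsh:find large files under current mount point
#20100413:dmarsh:changed echo cmd to print -u2 so message goes to stderr
set -e
set -u
#-xdev stops find from crossing into other mounted filesystems
#-size +1048576c matches files bigger than 1048576 bytes, du -sk reports KB
cmd="/usr/bin/find . -xdev -type f -size +1048576c -exec /usr/bin/du -sk {} \; 2>/dev/null | /usr/bin/sort -n"
#show the command on stderr, then run it
print -u2 "${cmd}"
eval "${cmd}"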

how to use putty and ssh

How to use Putty and SSH:

  1. Download the latest version of putty from the putty download page. You’ll want putty.exe from the release version in the “Windows on Intel x86” section (direct download link)
  2. Put putty.exe somewhere in your PATH; I normally just use c:\windows
  3. Run putty.exe
  4. Fill in the “Host Name (or IP address)” field with the username and the hostname or IP address (e.g. username@hostname or username@ipaddress):
    [screenshot: putty configuration options]
  5. Make sure the Connection type is set to SSH
  6. Click the Open button. If it’s the first time connecting to the host, you’ll see a dialog box like this:
    [screenshot: putty security alert]
  7. If everything has gone well, you’ll connect as username and be able to type in your password:
    [screenshot: putty login]
  8. You should now be connected to the server

flexget and pdf files

I’ve recently started looking at the awesome flexget and thought it would solve a problem with my son’s school newsletter.

Like most schools, the one our son attends has a weekly newsletter to inform the parents of upcoming activities and events at the school. When he started, the school gave us the option to either have a physical printout of it sent home with our child [1], or to just download the pdf from their website.

We opted-out of the printed newsletter and were happy to check the website for the pdf version of the newsletter.

As we’re both busy and sometimes forget to check the website we’d occasionally miss things.

I wanted to download the pdf automatically when it appeared on the website and email it to us so we wouldn’t have to remember to check for the latest version.

Initially I used wget called from cron to check the website like this:

/usr/bin/wget -r -l1 -N --no-verbose --continue --no-parent \
--no-directories --no-host-directories --reject html,htm,txt \
--accept .pdf -o /var/log/newsletters.log \
--directory-prefix=/srv/samba/newsletters \

I used the --continue flag so that it wouldn’t download the same pdfs over and over. Even taking this into consideration this method still felt like a brute force approach.

(I won’t go into it here, but I then use incron to look for changes to the /srv/samba/newsletters, which calls another script and emails the file as an attachment)

I like how flexget remembers what it has seen in a database and doesn’t download that file again. I thought this would solve the problem very elegantly, and as I couldn’t find much info about getting files from a URL automatically, I thought I’d share my config here so others looking to do the same could benefit.
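To make the “remember what it has seen” idea concrete, here is a toy sketch of the same behaviour in plain shell. It is not how flexget works internally (flexget keeps its own database); the `fetch_if_new` name, the seen-file format and the example URL in the usage note are all made up for illustration:

```shell
#!/bin/bash
# fetch_if_new URL SEEN_FILE [FETCH_CMD ...]
# Runs FETCH_CMD (default: wget --no-verbose) on URL unless URL is already
# listed in SEEN_FILE, and records URL there after a successful fetch.
fetch_if_new() {
  local url="$1" seen="$2"
  shift 2
  local fetch=("$@")
  if [ "${#fetch[@]}" -eq 0 ]; then
    fetch=(wget --no-verbose)
  fi
  touch "${seen}"
  # -x whole line, -F fixed string: only an exact URL match counts as seen
  if grep -qxF -- "${url}" "${seen}"; then
    return 0
  fi
  "${fetch[@]}" "${url}" && printf '%s\n' "${url}" >> "${seen}"
}
```

Calling `fetch_if_new http://school.example/newsletter.pdf ~/.newsletters.seen` twice would only hit the website the first time. flexget does this bookkeeping for me, along with the filtering and the emailing.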

Here’s my newsletter.yml config:

      path: /srv/samba/newsletters
      space: 1 #make sure there's Xgb free before downloading more
    domain_delay: 10 seconds
      active: True
      from: davidmarsh
    interval: 6 hours
      title_from: link
        - quakers_whisper*
      rest: reject
    download: /srv/samba/newsletters

This will:

  • In the global section
    1. Check there’s 1gb free on /srv/samba/newsletters (which is on the same disk as /)
    2. Wait 10 seconds between checks (even though there’s only one check, I wanted this here in case I add more later)
    3. email me if it downloads something
  • In the feeds section:
    1. wait 6 hours between checks (if called sooner it will not run the check)
    2. use the url for checks
    3. name the downloaded files from their link title
    4. accept any file starting with “quakers_whisper” (the name of the newsletter)
    5. reject any other links it finds that don’t match the above
    6. download the matching links to the /srv/samba/newsletters directory

I call it from cron with this command:

/usr/local/bin/flexget --cron -c /home/davidmarsh/.flexget/newsletter.yml

It doesn’t really matter how often it runs as it will only actually hit the website every 6 hours due to the interval: 6 hours option in the yml file. [2]

(like before, I’m still using incron to call scripts to email the files)

Now we get an email with the newsletter attached within 6 hours of a new newsletter appearing on the school’s website.

  1. Which may or may not make it home scrunched up in a small ball in the bottom of a school bag 

  2. Of course I could make it more frequent, but 6 hours seemed good enough. 


crossing a threshold

On episode 73 of the talk show, Dan asked John if the time he’d been with his (current) wife was longer than the time before marrying her.

I thought I’d work out when that boundary crossing was for me and how old I will be when I cross that line with Sam:

  • My Birth: Thursday, March 11, 1976
  • Married: Saturday, April 19, 2003

Using the awesome wolfram alpha which makes this sort of calculation insanely easy:

  • Birth to Marriage (Thursday, March 11, 1976 to Saturday, April 19, 2003): 27 years, 1 month, 8 days
  • Marriage to Current (Saturday, April 19, 2003): 8 years, 9 months, 3 days

I will have been married to Sam for longer than I was not married on (Saturday, April 19, 2003 + 27 years 1 month 8 days): Monday, May 27, 2030.

I’ll be (Thursday, March 11, 1976 to Monday, May 27, 2030): 54 years, 2 months, 16 days old.
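The same sums can be checked from the command line. This is only a sketch, assuming GNU date; `add_span` is a made-up helper name:

```shell
#!/bin/bash
# add_span START YEARS MONTHS DAYS: the date that lies that far after START.
# LC_ALL=C and TZ=UTC keep the names in English and avoid daylight-saving jumps.
add_span() {
  LC_ALL=C TZ=UTC date --date "$1 +$2 years +$3 months +$4 days" "+%A, %B %d, %Y"
}

# wedding day plus the time I spent unmarried
add_span 2003-04-19 27 1 8    # Monday, May 27, 2030
# birthday plus my age on that day
add_span 1976-03-11 54 2 16   # Monday, May 27, 2030
```

Both lines land on the same day, which is a handy cross-check on the figures above.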


os x lion

finally upgraded to lion:

about this mac


calepin tips

I like using calepin; using markdown + dropbox is a great combination.

Below are a few tips on things that I found work well:


Needs some work:

  "default_date_format": "%Y-%m-%d",
  "google_analytics": "UA-27986535-1"

remember to remove the last “,” or it doesn’t work


I add a footer to my posts like this:

<footer><div class="metadata">...</div></footer>


I add ads to my site like this:

<div style="z-index: 2; position: absolute; right: 10px; top: 165px;">

I need to fix this so it works better on mobile.


My file naming convention is “yyyy-mm-ddTHHMM title.md”, so I use this script to check that the date and the title in the file name match the ones in the blog post:

Note: this is a work in progress!


#!/bin/bash
set -e
set -u

for post in *.md ; do

    ### check dates
    #todo, check for valid dates
    #file name is "yyyy-mm-ddTHHMM title.md", so the date is the first word
    file_date="$(echo "${post}" | cut -d" " -f1)"
    #turn "Date: yyyy-mm-dd HH:MM" into yyyy-mm-ddTHHMM
    blog_date="$(grep '^Date:' "${post}" | cut -d: -f2- | sed -e 's/^ //' -e 's/ /T/' -e 's/://')"

    if [[ "${file_date:=""}" != "${blog_date:=""}" ]] ; then
        echo "----------------------------------------"
        echo "dates are different: ${post}"
        echo "1. file date: ${file_date}"
        echo "2. blog date: ${blog_date}"
        echo -n "which one is correct [1,2]:"
        read -r choice
        case ${choice} in
            1)
                #file date is correct
                echo "you chose 1"
                #convert file date to blog format for use later
                new_date="$(echo "${file_date}" | sed -e 's/T/ /')"
                echo "so im gonna change ${blog_date} to ${new_date}"
                ;;
            2)
                #blog date is correct
                echo "you chose 2"
                #convert blog date to file date for use later
                new_date="$(echo "${blog_date}" | sed -e 's/ /T/' -e 's/://')"
                echo "so im gonna change ${file_date} to ${new_date}"
                ;;
            *)
                echo "skipping, doing nothing for ${post}"
                ;;
        esac
    fi

    ### check titles and slugs
    file_title="$(echo "${post}" | cut -d" " -f2- | sed -e 's/\.md$//')"
    blog_title="$(grep '^Title:' "${post}" | cut -d: -f2- | sed -e 's/^ //')"

    if [[ "${file_title:=""}" != "${blog_title:=""}" ]] ; then
        echo "----------------------------------------"
        echo "titles are different: ${post}"
        echo "1. file title: ${file_title}"
        echo "2. blog title: ${blog_title}"
        echo -n "which one is correct [1,2]:"
        read -r choice
        case ${choice} in
            1)
                #file title is correct
                new_title="${file_title}"
                echo "so im gonna change ${blog_title} to ${new_title}"
                ;;
            2)
                #blog title is correct
                new_title="${blog_title}"
                echo "so im gonna change ${file_title} to ${new_title}"
                ;;
            *)
                echo "skipping, doing nothing for ${post}"
                ;;
        esac
    fi

done

#find drafts
echo "----------------------------------------"
echo "drafts:"
#read names line by line so file names with spaces stay in one piece
grep -l "^Status: " *.md | while IFS= read -r post ; do
    echo "${post}"
done

I publish using this script:

#copies md and settings files from blog to calepin
#copies /images to /Public/images

set -e
set -u

uname="$(uname)"  #needed by the platform checks below

if [[ "${uname}" == "Darwin" ]] ; then



function publish_posts() {
  #replace the published posts with the current ones from the blog dir
  rm -f "${postdir}"/*.md
  cp "${blogdir}"/*.md "${postdir}"

  cp "${blogdir}/settings.json" "${postdir}"

  cp -R "${blogdir}/images" "${publicdir}"
}

function append_posts() {
  #add a separator, then the footer and the ad snippet, to every post
  for post in "${postdir}"/*.md ; do
    echo "${post}"
    echo >> "${post}"
    echo "----" >> "${post}"
    for append in "${footer}" "${adfile}" ; do
      echo >> "${post}"
      cat "${append}" >> "${post}"
    done
  done
}

if [[ "${uname}" == "Darwin" ]] ; then