Hiltmon

On walkabout in life and technology

Sanity Saver: Detox Expands t.co Links

If you use Twitter and click on t.co links, or use Google search and follow their links, chances are your browser history is a mess of t.co/C*R@A%P& or google.com/S$H@I*T#. Which means there is no way to use your browser history to find that page you just saw (or saw a few days ago) and accidentally closed. And autocomplete does not work either.

Shaun Inman (of Mint and The Last Rocket fame) created a Safari extension called Detox a while ago that automatically expands those pesky t.co links and funky Google links back into their original URLs and titles. Making your browser history and autocomplete useable again.

Download it from here and double click the Detox.safariextz file to install it. And fuggedaboudit!

Highly recommended for all OS X Safari users.

Thank you Shaun for a great Sanity Saver.


Rails Tricks - Sharing the Model

I am building a series of Rails applications for different users and use cases, but they all hang off the same database schema. Using canonical Rails, that means a single, massive rails app with a bunch of controllers, a heap of views and a complex security model.

I prefer small, focussed apps, so I decided to share the model instead. Here’s how it works.

Sharing the model

Let's call the first Rails project Master. Generate it as usual and create a few models and migrations to set up the database. In my case, the Master Rails application does nothing more than import, export and allow users to view the data in the model's database tables.

rails new Master --database postgresql
cd Master
rails generate model model1 ...
...
rake db:create
rake db:migrate

Let's now create a reporting app, called Reporting, that will use the same model as Master.

cd ..
rails new Reporting --database postgresql
cd Reporting

Do not create any models or migrations in this project; they are all done in Master.

To copy the models, create a new rake task in a new file in Reporting called lib/tasks/sync.rake:

namespace :sync do

  desc 'Copy common models and tests from Master'
  task :copy do
    source_path = '/Users/Hiltmon/Projects/Master'
    dest_path = '/Users/Hiltmon/Projects/Reporting'

    # Copy all models & tests
    %x{cp #{source_path}/app/models/*.rb #{dest_path}/app/models/}
    %x{cp #{source_path}/test/models/*_test.rb #{dest_path}/test/models/}

    # Fixtures
    %x{cp #{source_path}/test/fixtures/*.yml #{dest_path}/test/fixtures/}

    # Database YML
    %x{cp #{source_path}/config/database.yml #{dest_path}/config/database.yml}
  end
end

Change the source_path to point to the root of the Master project and the dest_path to point to the root of the Reporting project. Then run it:

rake sync:copy

The script copies the app/models, test/models, test/fixtures and database.yml files from Master to Reporting. As far as Rails is concerned, these were created via genuine rails commands and the Rails engine will make these models available in your reporting views and controllers. The database.yml file will also make Rails point to the same database as Master.
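
Once copied, these models behave exactly like locally generated ones. As a minimal sketch (assuming Master defined a hypothetical Price model and Reporting exposes a prices resource), a Reporting controller needs nothing special:

# app/controllers/prices_controller.rb in Reporting
# Price is one of the model files copied across by rake sync:copy,
# and database.yml points at the same database as Master.
class PricesController < ApplicationController
  def index
    @prices = Price.order(:price_date).limit(100)
  end
end

The controllers and views are the only code that lives in Reporting; the model layer is entirely borrowed from Master.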

Two rules need to be followed to make this work:

  1. All models or migrations are done in Master and only in Master. That includes any changes to model files.
  2. Run a rake sync:copy every once in a while to bring the Reporting project up to date after changes in Master.

Top Tips:

  • If you run rake db:migrate in the Reporting project, no harm will be done (because there are no migrations in it). But the schema.rb file will be regenerated. Which is cool.
  • You can add additional projects that use the same shared model; I have five at the moment. (A sketch of syncing to multiple projects follows below.)
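
If you do share the model with several projects, the sync task generalizes naturally. Here is a rough sketch (the Billing path is hypothetical) of a variant that pushes to every destination in one go:

namespace :sync do

  desc 'Copy common models and tests from Master to all sibling projects'
  task :copy_all do
    source_path = '/Users/Hiltmon/Projects/Master'
    dest_paths = [
      '/Users/Hiltmon/Projects/Reporting',
      '/Users/Hiltmon/Projects/Billing'   # hypothetical second project
    ]

    dest_paths.each do |dest_path|
      puts "Syncing #{dest_path}..."
      %x{cp #{source_path}/app/models/*.rb #{dest_path}/app/models/}
      %x{cp #{source_path}/test/models/*_test.rb #{dest_path}/test/models/}
      %x{cp #{source_path}/test/fixtures/*.yml #{dest_path}/test/fixtures/}
      %x{cp #{source_path}/config/database.yml #{dest_path}/config/database.yml}
    end
  end
end

One rake sync:copy_all then brings every project up to date after a change in Master.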

Rejected Options

I could have just symlinked the model and db folders between the projects, and therefore not needed the rake task. But that only works for a single development system and single developer. I would have to relink for each new computer or developer that needs to work on these projects.

I also could have used git submodules to provide a common repository for the copied code, but that would mean that I could not use the regular rails and rake commands for models and migrations, which defeats the purpose of using Rails.

Result

As a result of sharing (ok, copying) the models this way, I have several Rails applications running off different servers providing different services to different users all running off the same back-end database. Each application is smaller, simpler, easier to manage and easier to code and maintain. As long as discipline is maintained in that the models and migrations are only performed in the Master project, this trick works great.


Why Microsoft Word Must Die

Charles Stross, yes, that Charles Stross writes in Why Microsoft Word must Die:

The .doc file format was also obfuscated, deliberately or intentionally: rather than a parseable document containing formatting and macro metadata, it was effectively a dump of the in-memory data structures used by word, with pointers to the subroutines that provided formatting or macro support.

He focuses on the usability issues, the compatibility issues, the inconsistencies and the forced need to upgrade just to read other people’s documents - and the idiocy of using this incomprehensible, short-lived format in publishing.

And that’s the key: the file format is a disaster. If you use Word, these days you cannot open Word files you created more than 10 years ago! I tried! Then again, those files may just be corrupted from years of copying from device to device as I upgraded and changed systems. But that’s the point: a parseable format would still enable me to get some content back!

Like Charles, I use Scrivener for the big stuff and plain old Markdown files for everything else. Like Charles, I’m not worried about being able to read my documents 5 years or 50 years into the future anymore. And like Charles, if someone wants a file in Word format, I’ll deliver a copy and keep the original in a manageable format.


Using Mac Navigation Keys in Visual Studio

As a developer, sometimes I am forced to use Visual Studio to code for Windows. Just like a lot of Mac developers, I therefore run Visual Studio in a VMWare Fusion VM. And it works great.

Except it drives me nuts that keyboard navigation in the Visual Studio editor does not use the same keys as on the Mac. Which means I am constantly seeing the window move around when I am trying to get to a line end. And even if I did remap my brain to Visual Studio’s keys, it assumes I have a Home and an End key to use, which the Mac mini wireless keyboard and laptops do not have.

I tried changing keys in Tools / Options / Environment / Keyboard, but Visual Studio studiously ignores any shortcuts that use the Windows key, which happens to be the Command key on the Mac keyboard, which in turn is the modifier most commonly used for navigation shortcuts on the Mac.

The solution I found is to install AutoHotKey on the VM and remap the Visual Studio keys you need. Here is my AutoHotKey script file so far:

; AutoHotKey Script

SetTitleMatchMode, 2  ; Move this line to the top of your script

#IfWinActive, Microsoft Visual Studio
#Right::End
#Left::!Left
#Up::^Home
#Down::^End
!Right::^Right
!Left::^Left
#b::^B
#K::^M
#/::!/

#IfWinActive

I have set this up so the changes only apply to Visual Studio; all other Windows applications remain unaffected. The SetTitleMatchMode, 2 line is needed so that the matcher will find the Microsoft Visual Studio window by a partial title match.

The #IfWinActive, Microsoft Visual Studio line means that all changes that follow until the next #IfWinActive apply only to Visual Studio windows.

My mapping changes are as follows:

  • ⌘→: Edit.LineEnd (Default windows key End, original behavior: Snap window to right of screen)
  • ⌘←: Edit.LineStart (Default windows key Home, original behavior: Snap window to left of screen) but I had to map this to ALT-← and change the mapping in Visual Studio (see below). This is because Windows 8 intercepts the Home key, minimizes all other windows and does not pass it on to Visual Studio.
  • ⌘↑: Edit.DocumentStart (Default windows key CTRL+HOME)
  • ⌘↓: Edit.DocumentEnd (Default windows key CTRL+END)
  • ⌘/: Edit.CommentSelection (Visual Studio CTRL+K,CTRL+C) mapped to ALT+/ and remapped in Visual Studio.
  • ⌥→: Edit.WordNext (Default windows key CTRL+RIGHT) but somehow this combination does not work on my system natively or adjusted.
  • ⌥←: Edit.WordPrevious (Default Windows key CTRL+LEFT) also not working.
  • ⌘b: Build.BuildSolution (Visual Studio CTRL-SHIFT-B) to match Xcode.
  • ⌘K: Build.CleanSolution (No key) Manually mapped to CTRL-SHIFT-M.

Some of these mappings require you to change the Visual Studio keys. To do so, go to Tools / Options / Environment / Keyboard.

First, find the key mapping you would like to change (See 1). In this case, I am remapping Edit.LineStart. You should see Home (Text Editor) under Shortcuts for selected command: (I have already changed mine).

To set the new shortcut, see 2:

  • Make sure that Text Editor is chosen under Use new shortcut in:
  • Press the shortcut key in the blank space and make sure that it gets shown correctly.
  • Click Assign to save it.

I have only just started using this solution, but already it’s 100% easier for me to navigate around my Visual Studio code files (and the window no longer jumps around).

If you try this out and find any other tricks, let me know in the comments and I’ll add the best ones here.

Back to coding in Visual Studio on a Mac!


Homebrew Happiness

If you are expecting an article about beer, this is not it. This is about the best product that helps install and manage the Open Source software on the Macintosh computer that Apple decided not to include in OS X.

In short, I use a lot of Open Source products for work, like postgresql, redis, mongo, node, boost libraries and rbenv. Installing and managing them natively on a Mac was a pain. Homebrew makes installing and maintaining these easy, safe and pleasant without messing up my system.

Before Homebrew

Since OS X is really a Darwin-flavored BSD UNIX, with much of its userland borrowed from FreeBSD, installing Open Source software on it was never really that hard. The recipe is as easy as frying an egg:

  1. Download the tar file: curl -O <path to source tarfile>
  2. Unpack it: tar zxvf <downloaded file>
  3. CD to it: cd <unpacked folder>
  4. Configure it: ./configure --prefix=/usr/local
  5. Make it: make
  6. Install it: sudo make install

Most Open Source packages just worked. But many did not. They may have needed some dependencies to be installed first or the configure needed special settings to work. Fortunately, the internet was full of fine folks who tried and created findable pages that explained the dependencies or settings. Follow their recipes, and you could too.

And that is the way I installed and maintained Open Source packages on my Mac for a decade.

Aside: MacPorts and Fink

Two major projects sprang up to solve this mess, MacPorts and Fink. They took the standard recipe, or the tweaked ones, and made easy-to-use installers for the Mac. They became very popular, to the point that tweaked recipes started to become scarce.

But I did not use them.

I tried both, for a while. But while I may be a messy person in real life, I am OCD tidy on my computer. And both of these projects would basically mess up my system. They’d install products in non-standard locations, and leave a mess when removing products. Which meant that when I tried new things on my OS X install, sometimes things that should have worked did not because of this mess. In my humbly correct opinion, /opt is not where you install core products. So I blew my machine away and went back to native build and installs.

The Negative of Native

Native installs meant I always used the vendor or developer’s version of the code (which was good) and installed it where I wanted it to go. But that too led to other messes.

  • Updates were a pain. Not all Open Source products would overwrite older versions cleanly. And updates often relied on undocumented dependency updates which led to install hell.
  • The installation process was time consuming. Download, wait, unpack, configure, wait, make, wait, install, wait. And that assumes it all worked the first time. Usually the configure step would fail and then time needed to be spent figuring out what setting to change or dependencies to install.
  • Products that depended on the products I natively installed would often make assumptions about the location of the dependency, making their installs more hairy too.
  • Matching the server versions of products was nasty. I run CentOS on all my servers and the yum install for each Open Source product was simple and easy, not so much when you do it yourself.
  • And half the time I would have no idea what was installed and where it was on my system.

Homebrew Happiness

Enter Homebrew by “Splendid Chap” Max Howell stage left.

It solves all the problems mentioned above, including:

  • Dependency management is in each install formula, so dependencies get installed automatically, by Homebrew, where they should be.
  • Applications get installed safely out of the way in Homebrew's Cellar (still under /usr/local), so the system remains clean, and are then symlinked into the places under /usr/local where your deity expects them to be.
  • Installs just work.
  • Updates just work.
  • Removals just work, and leave no mess or trace.

Using Homebrew

To get started with Homebrew, you will need a Mac with OS X and Xcode installed (for the developer tools). Then go to the Homebrew web site and copy and paste the install command into a terminal (or use this one):

ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"

The installer will explain what it does and set up the Homebrew environment. Once it’s done, do a

brew update

to ensure all the formulae are up to date. I heartily recommend doing this every time before making any Homebrew changes on your system.

To install a product, just

brew install <formula>

To see what formulae are available, use:

brew search

If you have an idea what you are looking for, use:

brew search <formula>

To see what you have installed:

brew list

To update an installed project:

brew upgrade <formula>

If you ever want to know where the settings files for a project are, just use

brew info <formula>

To cleanly uninstall a project:

brew uninstall <formula>

And if you are OCD clean like me, run this to be sure your Homebrew stuff is all copacetic:

brew doctor

The only negative I have found is that Homebrew does not have every Open Source project or product in it. In some cases, the Homebrew brewmasters have even removed formulae which were not being used or made no sense. In my case, I use QuickFIX and the C++ libmongoclient.a, which I still had to install manually. But this point is almost moot, as creating a new Homebrew formula to install these is child’s play.
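
Rolling your own formula really is that simple. As a rough, hypothetical sketch (the name, URL and sha1 here are placeholders; a real formula needs the actual values, and brew create <url> will generate a template for you):

# mylibrary.rb - a hypothetical Homebrew formula, not a real package
require 'formula'

class Mylibrary < Formula
  homepage 'http://example.com/mylibrary'
  url 'http://example.com/downloads/mylibrary-1.0.0.tar.gz'
  sha1 '0000000000000000000000000000000000000000'

  def install
    # The usual configure / make / make install dance, with the prefix
    # pointed at Homebrew's Cellar instead of straight at /usr/local
    system './configure', "--prefix=#{prefix}"
    system 'make'
    system 'make', 'install'
  end
end

Drop it into your local formula directory (or pass the file path straight to brew install) and Homebrew takes care of the build, the linking and the cleanup.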

With Homebrew, I have an OCD clean Mac OS X installation running all my favorite and necessary Open Source packages in a clean, consistent and reliable way. And I hope I never need to deal with manual build and install madness again.

Homebrew is yet another indispensable tool in my Mac toolbox.


User Experience Shmecsperience

We designers of software spend a lot of our time thinking about user experience. We try to eliminate complexity, simplify interactions and to create delightful experiences.

Reality: Most users don’t care about experience.

We, who make and live in software, value such software, value our time and seek out such experiences. We understand how hard it is to make them. We take the time to learn products and find ways to integrate them into our lives.

Reality: Most users just want to get something done and go home.

We communicators of software take the time to make our icons pretty, our layouts typographically correct, our documentation structured and our charts clear. And we seek out others who do the same. And deride those who do not.

Reality: Most users do not RTFM[1] or even see the changes.

We seek out the best products that suit our needs, continually testing others to see if they are better for us. We read about and discuss the benefits and nuttiness of products to help us choose. And we refine the processes we use to make us more efficient or just happier.

Reality: Most users use what they are given or is cheapest or what they learned first.

In the real world, users use Windows and Excel for everything and Powerpoint for everything they cannot do in Excel. Or they use whatever they get on their phone or tablet that the carrier salesperson recommended.

They are not interested in the experience of using the software or in having it change. They want to do what they do, get it done and do something else. They do not want to spend time learning or experimenting, they want to get trained and then jump right in and have that training last a lifetime.

That does not mean we designers and developers and communicators should stop doing what we do. Our work has made phones and computers and amazing new services available to these users, and created the initial experiences that they now understand and expect. Users now know how to use touch screens and swipe around and communicate in short sentences.

But they still expect mail clients to be mail clients (and as much like Outlook as possible), browsers to have the URL bar at the top (just like Netscape did), spreadsheets to be Excel like, to do everything else in Powerpoint and to find whatever they are trained in to remain constant.

We may have adopted the cool new mail client, Markdown and Soulver, but they will not. Because they do not care. They can get what they want done the way they do now, no matter how long it takes or how convoluted the process.

For most of our customers, as long as it works without crashing and they know how to use it, it’s good enough.

That does not mean that it should remain good enough for us. We do care. And even though these real users will not normally notice it (and put up a fight when they do), our work does make things easier, faster and better for them. And that’s why we do it.

They may think “Experience Shmecsperience”, we know better. And I am glad we do.



  1. Read The Ferblenzende Manual

SonicWall NetExtender for OS X Mavericks

UPDATE: 10.9 or above users, use the Sonicwall Mobile Connect app on the Mac App Store (or learn more at Sonicwall Mobile Connect for OS X Mavericks).

TL;DR: Download from https://sslvpn.demo.sonicwall.com/cgi-bin/welcome. Follow the admin login instructions, then look for NetExtender / Client Downloads.

UPDATE: Saved a copy of the DMG at https://hiltmon.com/files/NetExtender-7.5.757.dmg as the normal login seems to be disabled. (WARNING: Link will eventually get stale).

UPDATE 2: Your company can also register and get the latest versions from MySonicwall.com.

Many of us corporate drones need SonicWall’s NetExtender for remote access to our company networks. And the way we get it is to go to the company IP address IT gives us and download it. And then install the Java plugin. And then install the java runtime.

Unfortunately, the version provided by most of these sites is out-of-date as most SonicWall VPN devices never get updated.

In my case, the version of NetExtender for Mac, 6.0.719, on my company SonicWall works on 10.8 Mountain Lion, but fails on OS X 10.9 Mavericks.

One solution is to upgrade all the company SonicWalls. I may as well pack my snowboard for a lovely eternity riding the frozen volcanoes in hell. Yes, and I am the CTO! Still not going to do it.

The solution that works is to somehow install the latest copy of the NetExtender application without upgrading the SonicWall, and I finally found a place that actually allows you to download it.

Their demo site.

Just go to https://sslvpn.demo.sonicwall.com/cgi-bin/welcome and log in using the provided demo password. Then click the big NetExtender button to install the latest client version, just like you did on your IT provided site. I got 7.0.752. Which works on OS X Mavericks.

Be aware though that once that is installed and running, you will find yourself connected to the demo site, not your company network. Simply disconnect, hit the dropdown arrow to choose your old NetExtender settings and connect happily/slavishly to your company network.

Back to work.


Setup ODBC for R on OS X

Hi Folks, please do not follow this advice; it’s an old article that has somehow stayed way past its welcome in the search engines. Use the RPostgreSQL package, which connects directly using the C drivers; it’s a lot faster to set up and run. ~Hilton.

At work, we use R to analyze data and calculate risk. The data is in a PostgreSQL database, so we use the RODBC package to access the database from R.

But this does not work under Macintosh OS X.

For two reasons.

One, it uses the iODBC libraries that come with OS X and these do not work out of the box. And two, even if you do install the ODBC Administrator tool and configure iODBC, it does not work with unicode drivers and databases.

So here’s how to set up R with unixODBC on OS X to access a PostgreSQL database.

Prerequisites

It is assumed you have the following installed and running:

  • Xcode with command line tools installed (needed to compile the ODBC drivers and RODBC)
  • R
  • RStudio (optional, but so much better than the standard R GUI)
  • The database (in my case PostgreSQL, with developer libraries, so that the compiles will work. I installed mine using Homebrew)
  • Homebrew

Install unixODBC

This turns out to be easy, just use Homebrew:

$ brew update
$ brew install unixodbc

Install the PostgreSQL ODBC Driver

The best way to install this is to download and compile it manually. Download the latest driver file from the postgres file browser and then (only commands shown):

$ tar zxvf psqlodbc-09.02.0100.tar.gz
$ cd psqlodbc-09.02.0100/
$ ./configure
$ make
$ sudo make install

Setup the ODBC Driver

Establish the driver in /usr/local/etc/odbcinst.ini:

[PostgreSQL]
Description     = PostgreSQL ODBC driver (Unicode 9.2)
Driver          = /usr/local/lib/psqlodbcw.so
Debug           = 0
CommLog         = 1
UsageCount      = 1

And set up a connection in /usr/local/etc/odbc.ini:

[ODBC Data Sources]
database1 = My Cool Database

[database1]
Driver      = PostgreSQL
ServerName  = db.domain.local
Port        = 5432
Database    = database1
Username    = hiltmon
Password    = itsasecret
Protocol    = 9.2
Debug       = 1

You could also set the connection in ~/.odbc.ini but I find system level connections work better, especially when moving code into production.

Test the connection using isql

$ isql -v database1

and you should see

+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL>

Ok, ODBC is working and set up.

Compile and install RODBC

Warning: If you install RODBC the usual way, i.e. install.packages("RODBC"), you will get a version compiled with the non-working iODBC libraries. If you have already done this, run remove.packages("RODBC") in R to get rid of it.

The goal here is to link RODBC with the installed unixODBC library to make it work.

Download the source code of RODBC from CRAN RODBC and choose the package source link (refers to the 1.3-8 version).

To compile and install it properly, first do:

$ export DYLD_LIBRARY_PATH=/usr/local/lib

This tells the compiler to use the unixODBC ODBC libraries installed by Homebrew in /usr/local/lib and not the iODBC ones installed by Apple in /usr/lib.

Then manually compile and install RODBC (you need the full path to the downloaded file for it to work):

$ R CMD INSTALL /Users/Hiltmon/Downloads/RODBC_1.3-8.tar.gz

After a pile of configure and compile messages, you should see:

** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
   'RODBC.Rnw'
** testing if installed package can be loaded
* DONE (RODBC)

Test

All the installer does is test that the library loads. To see if it actually works, launch RStudio and type:

library("RODBC")
odbcDataSources()

The response should be: database1 “PostgreSQL”

If you get an error message that contains [iODBC] in it, or a message that says named character(0), it means you are using the wrong library version (the default and not the newly compiled one). Remove RODBC and start again.

Let me know if this works for you too.

Off to write some database queries now.


A Reliable Script File Layout

I write a surfeit[1] of Ruby scripts that I use every day at work, everything from sending, transforming and retrieving files to monitoring systems to gluing platforms together. Keeping these under control, forgettable[2] and maintainable is very important to me. So over the past few years, I have migrated to a file layout and programming pattern that enables me to spin these up quickly, forget about them and yet get back and maintain them when things change.

In this post, I’ll share this format and the thinking behind it as well as a few Ruby tricks that I rely on. And yes, it works just as well for Python or whatever other language you prefer to use.

The Script File Layout

Traditional no layout version

Most people write scripts the old-fashioned linear way. For example, a script to get a file from an FTP server and upload it into a database would look something like this (not real code):

Linear Example
#!/usr/bin/env ruby
  
require 'csv'
require 'net/ftp'
require 'pg' # PostgreSQL Database
  
conn = PG.connect(dbname: 'd1')
  
Net::FTP.open('source.com') do |ftp|
  ftp.login('hiltmon', 'itsasecret')
  ftp.chdir('pub/data')
  ftp.gettextfile('the_data.csv')
end
  
CSV.foreach("the_data.csv") do |row|
  conn.exec("INSERT INTO t1 (c1, c2) VALUES ('#{row[0]}', '#{row[1]}');")
end

In short, connect to the database, get the file, parse the file and brute force insert it into the database. Simple, linear and it works!

The output is … there is none.

There is nothing wrong with this. Unless you have an excessive number of these to support and maintain. And you can remember the file format (what does row[0] and row[1] contain?) and know the assumptions implicit in this script (the retrieved file is saved in the same folder as the script) and know if it ran or not (there is no visible output or logging).

Layout Version

Instead, I over-engineer all of these scripts and lay the code out logically. It requires more lines of code, but when things change, I find them a lot easier to find, read and maintain.

In short, all my Ruby scripts have the following common traits:

  • They are created as classes where the class name is the Camel Case version of the file name (the Rails standard). That way the logic can be reused elsewhere. get_data_file.rb becomes GetDataFile.
  • All constants appear at the top of the file where it’s easy to find and change them.
  • All classes have a run method that contains the main loop and each step is a function call, even if it is a single line step.
  • All assumptions are documented in the file, but I prefer to make them explicit (for example, declaring where a file is saved).
  • Called functions are always higher up in the file than the caller (the old C model still works) so navigating is easier.
  • All classes are liberally festooned with puts statements which can be redirected to a log or be used when testing or manually running to see what is happening.

So let's look at the same script using my standard format:

Standard Model
#!/usr/bin/env ruby
  
require 'csv'
require 'net/ftp'
require 'pg' # PostgreSQL Database

class GetDataFile

  SOURCE_URL = 'source.com'
  SOURCE_PATH = 'pub/data'
  FTP_USER = 'hilton'
  FTP_PASSWORD = 'itsasecret'
  FILE_NAME = 'the_data.txt'
  DEST_PATH = '/tmp/'

  DATABASE = 'd1'

  def initialize
    @conn = PG.connect(dbname: DATABASE)
  end

  # File name is overwritten every day
  def get_file
    puts "  GetFile #{FILE_NAME}..."
    dest_file_path = "#{DEST_PATH}#{FILE_NAME}"
    Net::FTP.open(SOURCE_URL) do |ftp|
      ftp.login(FTP_USER, FTP_PASSWORD)
      ftp.chdir(SOURCE_PATH)
      ftp.gettextfile(FILE_NAME, dest_file_path)
    end
    puts "  File Saved to #{dest_file_path}..."
    dest_file_path
  end

  # 2010-01-01,99.0
  # 2010-01-02,99.17
  # 2010-01-03,99.51
  # 2010-01-04,98.73
  # 2010-01-05,99.23
  # 
  # File contains 2 columns, price date and the price
  # There is no header
  def save_file_database(file_path)
    puts "  Loading into Database..."
    count = 0
    CSV.foreach(file_path) do |row|
      @conn.exec("INSERT INTO t1 (c1, c2) VALUES ('#{row[0]}', '#{row[1]}');")
      count += 1
      if count % 1000 == 0
        print "  Loading #{count} rows...\r"
        STDOUT.flush
      end
    end
    puts "  Loaded #{count} rows..."
  end

  def run
    puts "GetDataFile Starting..."
    file_path = get_file
    save_file_database file_path
    puts "GetDataFile Done..."
  end

end

app = GetDataFile.new
app.run()

That’s a lot more code, 67 lines vs 17, but it is so much more readable and maintainable.

The output is more explicit too:

GetDataFile Starting...
  GetFile the_data.txt...
  File Saved to /tmp/the_data.txt...
  Loading into Database...
  Loaded 5016 rows...
GetDataFile Done...

Let's look at the key features:

  • The constants are explicit and at the top. If the password changes, for example, it is easy to find (even easier with OS X Spotlight).
  • The script is a class, with explicit methods for each step, making it easier to find which step has failed and fix that.
  • Each step prints out when it starts, keeps you posted on what is happening and prints again when it is finished. Debugging is built right in.
  • The output also explicitly states where things are so you can find them, for example, where the file is saved.
  • The order of the steps is explicit in the run function.
  • A sample of the data being processed is placed in the comment above where it is used so I do not need to open the file to see what is there. I know what to expect.
  • The code at the very bottom instantiates the class and kicks it off (see the reuse sketch below).
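
Since each script is a class, the same logic can be pulled into another script or an irb session. The listing above just runs the class unconditionally at the bottom; if you want the file to be reusable without auto-running, one common Ruby idiom (a sketch, not part of the listing above) is to guard the kickoff:

# At the bottom of get_data_file.rb
if __FILE__ == $PROGRAM_NAME
  app = GetDataFile.new
  app.run()
end

# In another script or an irb session, reuse the class without auto-running
require_relative 'get_data_file'
GetDataFile.new.run()

Either way, the entry point stays at the very bottom of the file where it is easy to find.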

This may seem like overkill for such a simple 2 step script, but when you get to 5 or 10 step scripts, and a profusion of them, this pattern starts to make a lot more sense. As does the ability to add or remove steps as needed.

File Naming Conventions

There are only two hard things in Computer Science: cache invalidation and naming things.

Phil Karlton

When you have only a few scripts, naming is quite easy. When you have a plenitude of them, not so much.

I use the following approach for script names wherever possible:

keyword - source - data - transform - action - destination

Where

  • keyword: What is the script doing?
  • source: From where does it get its data?
  • data: What is it working on?
  • transform: If it does any additional work, what is it?
  • action: What does it do with the data?
  • destination: Where does it put it?

For example:

  • load_yahoo_prices_into_d1: Loads data from yahoo that happens to be prices into database 1.
  • start_risk_server: Kicks off a risk server daemon.
  • run_calculate_profit_and_loss: A script to perform a single task.
  • load_city_temperatures_as_celcius_into_d1: A load script with a transform.
  • send_d1_prices_to_freddy: A script that gets prices from database d1 and sends them to whoever freddy is.

The keywords I use to indicate behavior include:

  • start… implies kicking off a daemon
  • kill… implies terminating a daemon
  • run… implies a task that starts and finishes
  • load… implies an import of data
  • send… implies delivery of data

With this pattern, I do not have to remember what a script is called; I can guess its name based on what I expect it to do. Also, the name of the script tells us all what it does.

Tips and Tricks

Some tips and tricks I use a lot in these scripts:

  • Starting and Done: The use of starting in a print statement indicates that a step is commencing. I often precede that with the function name to make it easy to see where the process is or where it failed. I use the done word to indicate successful completion.
  • Indentation: I indent step messages by 2 spaces, and sub steps by an additional 2 spaces. This makes the depth of the message also explicit and yet I can see where a step starts and finishes, just like functions in code.
  • Color: For more complex scripts, I also use terminal color output. Warnings are in yellow, errors in red, info in white and success in green. (A small helper sketch follows this list.)
  • Displaying Progress: Look at the print statement followed by the STDOUT.flush statement above. For long running steps, it’s really nice to see progress, but it sucks if that progress causes the terminal to scroll. The Ruby print statement presents the text to the console without a new line (puts adds one). Since there is no new line, the terminal does not display the text yet; the STDOUT.flush command forces it to be shown. Note that the \r at the end of the text string causes the terminal caret to return to the start of the line so that the next print overwrites it. So instead of seeing

    Processing 1000 rows…
    Processing 2000 rows…
    Processing 3000 rows…
    Processing 4000 rows…

    you get the same line being overwritten instead:

    Processing 4000 rows…
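
The color output mentioned in the Color tip needs nothing more than ANSI escape codes. A minimal sketch (the module and method names here are illustrative, not from any library):

# Hypothetical ANSI color helpers for script output
module ScriptColor
  def self.yellow(text); "\e[33m#{text}\e[0m"; end  # warnings
  def self.red(text);    "\e[31m#{text}\e[0m"; end  # errors
  def self.white(text);  "\e[37m#{text}\e[0m"; end  # info
  def self.green(text);  "\e[32m#{text}\e[0m"; end  # success
end

puts ScriptColor.yellow("  Warning: file is larger than expected...")
puts ScriptColor.green("GetDataFile Done...")

The escape sequences work fine in Terminal and iTerm; redirect the output to a log file and you will see the raw codes, so strip them (or skip the color) when logging.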

Properly Named and Laid Out Scripts

There are a lot of reasons for writing scripts, but I feel there is no reason not to do them properly and in a maintainable way. It does not take more than a few moments more to name them properly, code the structure and self-document the script, which will save you hours later on when things change. And they always do.



  1. excess, overfill, abundance, bellyful, bucketload, glut – a lot!

  2. No matter how good or smart you are, there is no way you can keep all of the file names, functions and purposes in your head over the long haul. Especially as things change often. Knowing you can forget something yet find it again later is far more valuable.

Not the Clicky Keyboard

It’s pretty popular these days for bloggers, writers and programmers to use mechanical-switch, or clicky, old-school keyboards like the Das Keyboard, old IBM keyboards or the original Apple Extended Keyboard. So much so that Jeff “Coding Horror” Atwood went as far as to design the CODE Keyboard with Weyman Kwong of WASD Keyboards, which looks droolworthy.

To be fair, these keyboards are amazingly good, reliable, feel great and are the best you can get. I really love them too. All the promises made in their advertising regarding your performance and pleasure are true.

But they are not for me.

For one reason.

Muscle memory.

I work all day in desktop mode at work and spend all my evenings and weekends in laptop mode at home.

I used to use the old Dell mechanical switch keyboards when I was a Windows programmer, and the pre-aluminum Apple white keyboard with the crumb-catching clear sides at work, but the key positioning and spacing differed radically from whatever my then laptop’s keyboard had. Which meant I was continually pressing the wrong keys when I switched from desktop to laptop, and then again when I switched back to desktop. And that became annoying.

This issue was partially resolved when I purchased my first wired Apple Aluminum keyboard. The keys (except for a very few) were pretty much in the same place and my error rate was reduced when switching from desktop to laptop daily.

Apple Wired Desktop Keyboard Overlaying the Macbook Air Keyboard

As you can see in the above overlaid image, most of the keys were the same size and in the same position, but the Caps Lock was a tad smaller, and the bottom row was all different. Since I program a lot, I use the ⌘ (Command), ⌥ (Option) and ⌃ (Control) keys a lot, and having them in different places was quite annoying. I would continually hit the fn key when aiming for control. And it took a long time to get used to the soft, mushy, short-travel keys.

So I switched to the Apple Wireless Keyboard.

Fade from Macbook Air (black keys) to Wireless Keyboard (white keys)

These keyboards match each other exactly. The keys are the same size and are exactly in the same relative locations. Which means that the muscle memory to use them remains the same, and switching from desktop to laptop and back causes me fewer problems.

Of course, the mushiness of the ‘chicklet’ keyboard, the lack of keycap shape and texture, the fact that the wireless keyboard is so light and tends to drift around the desktop, and the different key travel depth between devices do provide the worst of all keyboarding worlds.

But for a self-taught typist who programs all day and night, having the keys in the same place when I use the wrong fingers to press them makes jumping between devices just so much easier.

So go on using your big, heavy, reliable, great-feeling, seriously productive mechanical-switch keyboard. I truly am jealous that you can jump between that and the laptop mushy chicklet keyboard with ease.

I’ll stay with perfectly matched “not the clicky keyboard” for now.
