Category: Tech & CS

Mongo Native TTL

Turns out MongoDB supports a native TTL mechanism that automatically removes documents once a date field is older than a given number of seconds, much like Redis’s key auto-expiry. (The background task that reaps expired documents runs periodically, roughly once a minute, so deletion is not instantaneous.) Here’s an example usage flow:

> db.foo.insert({time: new Date(), name: "Jones"})
> db.foo.find()
{ "_id" : ObjectId("53ce96f03fdf121aa2e36622"), "time" : ISODate("2014-07-22T16:53:04.001Z"), "name" : "Jones" }
> db.foo.ensureIndex( { "time": 1 }, { expireAfterSeconds: 5 } )
> db.foo.find()
{ "_id" : ObjectId("53ce96f03fdf121aa2e36622"), "time" : ISODate("2014-07-22T16:53:04.001Z"), "name" : "Jones" } //still there, less than 5 seconds have passed
> //wait 5 seconds
> db.foo.find()
> //empty
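The rule the TTL monitor applies is simple to model: a document becomes eligible for deletion once its indexed date field plus expireAfterSeconds is in the past. A minimal JavaScript sketch of that rule (expiredDocs is a hypothetical helper for illustration, not Mongo’s internals):

```javascript
// Return the documents that a TTL index with the given
// expireAfterSeconds would consider expired at time `now`.
function expiredDocs(docs, expireAfterSeconds, now) {
  return docs.filter(function (doc) {
    return doc.time.getTime() + expireAfterSeconds * 1000 <= now.getTime();
  });
}

var docs = [
  { name: 'Jones', time: new Date('2014-07-22T16:53:04.001Z') },
  { name: 'Smith', time: new Date('2014-07-22T16:53:30.000Z') }
];

// A 5-second TTL, evaluated ten seconds after the first insert:
var now = new Date('2014-07-22T16:53:14.001Z');
expiredDocs(docs, 5, now).map(function (d) { return d.name; }); // ['Jones']
```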

Mongo Awesomeness FTW. 

Mongo Console BI syntax essentials

When messing around with the Mongo console I’ve found that Mongo is not just JavaScript: the JS-fu and standard APIs you might be used to in the browser or in Node are not always available. Where’s console.log, for Console’s sake? (The shell’s equivalents are print() and printjson().)

Here’s a head start on how to perform some basic manipulations, assuming prior familiarity with JavaScript and Mongo themselves.

So suppose:

  1. You’ve got your data spread across multiple collections (suppose they’re named ‘employees1’, ‘employees2’, ‘employees3’),
  2. You want to query a field of a nested document (say, the field ‘city’ of the nested document ‘address’, which not all employees have).

You might want something like this:

mongo > res = [];

mongo > for (i = 1; i <= 3; i++) { res = res.concat(db.getCollection('employees' + i).find({"address": {$exists: true}}).toArray()); }

mongo > res.length // number of employees that have an 'address' sub-document

mongo > res.forEach(function(x){ print(x.address.city) }) // print out all the cities
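The concat-across-collections pattern is plain JavaScript, so it can be sketched outside the shell too. Here arrays stand in for the collections, and findWithAddress is a made-up stand-in for find({"address": {$exists: true}}).toArray():

```javascript
// Arrays standing in for the three employee collections.
var collections = {
  employees1: [{ name: 'Ann', address: { city: 'Haifa' } }, { name: 'Bob' }],
  employees2: [{ name: 'Carol', address: { city: 'Tel Aviv' } }],
  employees3: [{ name: 'Dave' }]
};

// Stand-in for db.getCollection(name).find({"address": {$exists: true}}).toArray()
function findWithAddress(name) {
  return collections[name].filter(function (doc) { return 'address' in doc; });
}

var res = [];
for (var i = 1; i <= 3; i++) {
  res = res.concat(findWithAddress('employees' + i));
}

console.log(res.length); // 2 employees have an 'address' sub-document
res.forEach(function (x) { console.log(x.address.city); }); // Haifa, Tel Aviv
```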

Hooray for science.

Node Background Processing Awesomeness

Unless I’m misunderstanding something, a Node.js callback can replace a full system architecture for handling async background processing.

Suppose you have a webapp route that initiates a heavy background task: POST ‘/do_heavy_lifting’

You don’t want your users to wait forever while you process; that’s sucky UX and it blocks your app’s precious resources. So, you might drop an event onto some messaging queue (say, Rabbit), and write up a worker that listens to that queue and performs the heavy lifting in the background. Now you’ve got to maintain the worker, the queue, the event, and keep all the different parts in sync… This only becomes an issue once you reach a certain scale, but once you do, it’s quite an issue.

However, there is a better way.

As you surely know, you can use Node.js’s callback awesomeness to perform heavy background processing (codename “I/O”) without blocking the code execution flow. The twist: if you respond to the request without waiting for the I/O to finish, you free the connection immediately while the background processing completes in its own time!

This lets you handle arbitrarily heavy background tasks while responding to the client right away, clearing the HTTP connection immediately. Holy smokes!

Sample code, using writing content to a file as an example of heavy lifting (observe the response.end outside the callback):

// Require what we need
var http = require('http');
var fs = require('fs');

function now(){
    var d = new Date();
    return d.getHours()+':'+d.getMinutes()+':'+d.getSeconds()+'.'+d.getMilliseconds();
}

var app = http.createServer(function(request, response) {
  response.writeHead(200, { 'Content-Type': 'text/plain'});

  fs.writeFile('baz', 'baz contents: '+now(), function(err) {
    if (err) { console.error(err); return; }
    console.log('created file. Now running arbitrary background code');
    console.log('before Second Response: '+now());
    response.end('Redundant response: '+now()+'\n');
  });

  console.log('before First Response: '+now());
  response.end('First Response: '+now()+'\n');
});

app.listen(1337, 'localhost');
console.log('Server running at http://localhost:1337/');

Copy the above into foo.js, run $ node foo.js, and (in a separate terminal) $ curl localhost:1337. Notice:

  1. You will always get ‘First Response’, which is expected.
  2. The ‘redundant response’ is never used. In fact we can comment it out; same behavior. (It’s redundant in our case because it’s a background operation, and we don’t need it to finish in order to respond to the user.)
  3. The I/O operation does get executed, as well as its callback (which can be arbitrarily extended).
  4. Under ‘arbitrary background code’, we can now execute whatever we want; the client has already received its response.

You have just implemented a full-blown asynchronous background-processing paradigm, with about a line and a half of code and some Node.js-fu.
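Stripped of HTTP, the fire-and-forget shape looks like this (setImmediate stands in for the heavy I/O; handleRequest and log are made-up names for illustration):

```javascript
var log = [];

function handleRequest(respond) {
  // Kick off the heavy lifting without waiting for it.
  setImmediate(function backgroundWork() {
    log.push('background work done');
  });
  respond('First Response'); // the client is answered immediately
}

handleRequest(function (msg) { log.push(msg); });
console.log(log); // only ['First Response']; the background work runs on a later tick
```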

Copy Local Files into Remote Server through n>1 SSH Hops

Copy your local files into a remote environment with an arbitrary number of hops using the following:

$ rsync -hav -e "ssh -A -t user@middleman ssh -A -t user@destination" source_path :target_path

As in, copy your local directory ‘foo’ to the root path of your staging server using:

$ rsync -hav -e "ssh -A -t user@middleman ssh -A -t user@destination" foo :

Notes:

  1. If you are copying a directory, remember the target path should be the target directory’s parent folder.
  2. If you are using this (‘hot-deploying’) instead of a traditional ‘deploy’ from a version-controlled repository, exercise the appropriate caution: verify your files have been copied to the correct location, remember they haven’t necessarily been source-controlled, and so on.
  3. I learned this from: http://mjbright.blogspot.co.il/2012/09/using-rsync-over-multi-hop-ssh.html

Hooray for Science!

Ruby source_location

I was having a very hard time finding how and where a specific legacy method was defined in our Ruby codebase. Every conventional approach failed to find it, including searching the entire project for the method name.

To my rescue came a built-in way of locating a method’s source definition at runtime:

Foo.instance_method(:bar).source_location

Available in Ruby 1.9+. (That covers instance methods; for a class method, use Foo.method(:bar).source_location.)

(Indeed, the method name is dynamically defined by a grep over /in_cents/ columns, using dark Ruby such as class_eval. Whodathunk.)

My Git Book & Sandbox

tl;dr: I am not satisfied with the Git tutorials I have seen so I have written one myself. You can see its draft here: bit.ly/1iryAMW.

We have had 14(!) new developers join our team(s) in the last 8 weeks. Every single one of them will be using Git from day one. We have an aggressive deployment process using a centralized workflow – no dictator/lieutenants or any of that, every single developer pushes and pulls to and from the main repository. Our new devs are new – often new to Unix, new to Web development, and certainly new to Git.

Git is hard. Git is complicated. All the more so as the team size grows. Every single dev is expected to handle the complications of branching, staging, diffing, committing, merging, remotes, resets, rebasing, conflict resolution, and so on. HEAD. Index. Stash.

Git follows the Unix philosophy – everything is possible, but nothing is obvious. Git’s “porcelain” or ‘surface’ commands would be considered ‘plumbing’ anywhere else. One solution would be to use a GUI, but that would just be hiding the problem (though perhaps this option should not be dismissed so easily).

Regrettably, it has been my experience that most Git documentation and tutorials are either of the ‘Hello Branch’ variety or of the ‘git plumbing internals’ persuasion, and fail to strike the balance of robustness and accessibility new devs need. This is par for the course with many complicated technical topics: those who understand them well struggle to explain them to those less comfortable. You can always tell people to read the man pages, but when it’s your devs and your code on the line, fixing the problem is more important than displaying your hard-coreness. Otherwise, new devs just pick up the minimum they need in order to survive, and (ab)use Git to their own (and our) detriment.

Lastly, explaining something is always the best way to learn it, and explaining Git has been a great way for me to improve my own understanding of it.

So anyway, I am undertaking a humble effort to create a better Git tutorial. This will include:

  • A single-command setup for a local Git sandbox, so devs can experiment with advanced commands without fear of destroying their (or others’) work.
  • A step-by-step walk-through of important Git commands and settings, explaining the reasoning and practicing together in said tutorial.
  • An (admittedly) opinionated dictation of the correct workflow when working on a feature.
  • A (hopefully) reference-style index for those just coming for a quick ‘how do I do that’.

Necessarily, many of the issues will be simplified or approximated, to facilitate understanding over exactness.

I have personally found myself asking about and later explaining these topics over and over. It took me some time to come around to creating my own Git sandbox to practice. Hopefully this will at least help some people, some of the time.

Naturally, this is not the first or last Git tutorial, sandbox, or even book. That’s fine – I am not trying to compete with anyone; I am just trying to fill, in my own personal way, a void I felt existed.

Further drafts will hopefully be prettified and strengthened. Contributions will of course be welcome.

Optimal Notes Strategy For Productivity Freaks

I’ve got lots of stuff to remember, so I take notes. I needed a system better than any existing tool I knew of, so I spent some time thinking about it, and now I have a system, and it is BOSS.

Use-cases are:

  1. Find a note by name (like ‘git’ or ‘deploy_process’ or ‘electricity bill’)
  2. Find a note by content
  3. Edit, save, sync note to cloud
  4. Access anywhere, anytime, any device
  5. Port all my notes somewhere else in 2 years

The requirements (must be…):

  1. Super-fast to find a note by searching for name.
  2. Super-fast to find a note by content.
  3. Super-fast and easy to edit a note and sync to cloud, all without using mouse.
  4. Available online via web, all platforms
  5. Easily exportable (No lock-in)
  6. Complete revision history

Failed attempts: Evernote and GoogleDocs. Both are too heavy, cumbersome, and slow. GoogleDocs is bottlenecked by the network and is in-browser, which is crowded enough. Evernote has a lock-in, no revision history, and is basically über cumbersome. They might work for you if you’re infinitely patient but I am kind of the opposite.

Best solution:

  1. Create a folder inside your Dropbox folder. Call it ‘notes’.
  2. Use your favorite text editor (which should be Sublime Text) to find, add, and edit simple text notes.

Shazam, you’re done. Let’s examine how this answers the requirements and use-cases.

  1. Find a note by name: ST’s fuzzy-search for file names is famously boss.
  2. Find a note by content: Ditto, ST’s search-function is famously sweet.
  3. Editing notes: By definition, it’s already your favorite text editor. Saving syncs it automatically to the cloud (dropbox).
  4. Access anywhere, anyhow: Dropbox.com, and you’re golden.
  5. Exportable: it’s just a folder of text files! Evernote, Dropbox, and Google[Docs] may come and go (not to mention other app solutions like SimpleNote, etc.), but text files will be here for a long time. Whenever you want you can copy the whole folder off to your Box.com folder, or your SugarSync magic briefcase, and edit the files with vim instead of Sublime. No lock-in, ever.
  6. Dropbox saves all your revision history.

As you can see, these cover the requirements perfectly. To me, this clearly outclasses using GoogleDocs or Evernote for any personal use. I’ve been riding this for about 4 months at time of writing, and it’s been a dream. Anything I need to remember can easily be jotted down to the relevant note and retrieved at the speed of light (well, at the speed of Sublime Text, but it’s pretty damn close). Zero upkeep. Totally free. And I know that when I grow tired of Sublime (and/or switch computer, OS, polar meltdown, etc.) my notes will be there for me. This very post’s draft was written in my ‘blog’ note, where I keep a list of topics and first drafts of posts.

Basically, this all just rides on the awesomeness of Sublime Text and Dropbox. You lose small things like clickable links and rich-text formatting, but those are luxuries anyway.

Extra points for using ST+Dropbox for notes:

  • Use a specific Sublime theme for your ‘notes’ folder, so you can easily tell the ‘notes’ Sublime instance from code instances. I use Solarized (Light) because I like the light/dark contrast with code’s Monokai.
  • Set a syntax highlighting that complements free-text notes. I use AppleScript for no reason at all.
  • Organize your files into folders, if you feel like it, but with Sublime’s great search capabilities – it doesn’t really matter. I’ve got 150+ notes by now on everything from Apple to ZenDesk and don’t need the folders.

So: never remember anything again. You’ve got your computer to remember, now that you’ve got your llama-whooping notes mechanism set to retrieve and update notes at whiplash speed.

Now, go forth and multiply. Or at least use efficient notes.