On notes, and why I like to write in outline format

  1. I write in outline format because:
    1. It helps to organize my thoughts, which hopefully helps others understand them.
    2. It enforces concision; I try to stick to one line per point.
    3. If 1.A and 1.B are done right, it’s easier and faster to consume than a paragraph.
  2. I use numbered, not bulleted outline format, because
    1. Things are easily referenced if needed, e.g. 1.C above.
  3. Gmail : the best tool for starting a new set of notes in, because:
    1. It’s always with you
    2. It’s easily and quickly searched, including built-in timeline
    3. It takes the least number of extra steps to share, compared to other tools.
    4. Quite often, notes should be shared.
    5. It’s a tool you have to know very well anyway.
    6. It’s got nifty keyboard shortcuts, which make things faster, which is critical. While composing an email on Mac:
      1. Command + Shift + 7 : numbered list
      2. Command + ] : indent more
      3. Command + [ : indent less
  4. I sometimes take notes, send to myself, and then just archive the email. Rare.
    1. But just the act of taking them helps me remember important points.
  5. More often, I review them before archiving them, which helps with recall and CI WTF
  6. More often than that, I refactor them and send to someone else.
  7. Talking with one or more people for more than ~15 minutes about something you’re trying to get done?
    1. Face it : you’re in a meeting.
    2. Meetings that don’t produce anything are the worst kind of wasted time.
    3. So at the very least, I try to send to the participants.
  8. Playing back conversation P0′s…is a P0.
    1. More disconnects than you’d expect are discovered.
    2. You get bigger ears, critical for happy customers.
  9. Act on them or archive them – they’re either work in progress, or DONE.
    1. Lean : minimize work in progress, stop starting and start finishing.
  10. I use a common coding best practice, standard searchable notation to highlight needed action:
    1. TODO means “this thing should have action taken on it”
    2. VERIFY means “I’m not sure I know WTF I’m talking about; do you?”
    3. P0 means “do this highest priority next, along with the other highest priorities”
    4. WTF e.g. “why the f*** would someone care enough to write notes about notes?”
      1. Minutiae : it’s not just the small and trivial; small and precise fits the definition also.
      2. Tequila: it’s not just liquid sitting beautifully in a bottle on the shelf.  Just a reminder.

Birdie

  1. “birdie” : a term Jacquie has me hooked on, that she uses to describe the act of changing the subject of discussion. I’ve started using it to describe any distraction that partially or fully changes the direction of the conversation.
    1. Casual conversation is like this, organic, and highly so between her and her friends.
    2. Meetings, engineering discussions, etc typically disallow birdies by design.
      1. Birdies are distracting context switches, when you’re trying to drive towards specifics, draw lines, or define things.
    3. I’m really starting to love the term, for various reasons.
    4. Watching the barn swallows outside my house, they appear to dive and weave randomly over the lake
      1. as if they raided some Rainier Beach crystal meth lab
    5. But the reality is that they’re eating insects with missile-guided accuracy
      1. organically, by nature.
    6. It’s fun to riff, to improvise, to birdie.
      1. Far more so, and easier than sticking to a structured path to a goal (i.e. “work”).
    7. Listen to the legendary Charlie “Bird” Parker’s Ornithology and you can’t miss it.
    8. Shooting birdies sucks, and it’s more exhausting than herding cats, if you know what I mean.
  2. Gazing out the window at a stoplight this morning after experiencing some synchronicity on this topic with Jacquie, I was thinking about the beautiful ways in which nature “just works” with apparent randomness
    1. like the way tree limbs weave their way towards the sky
    2. or like the way birds weave their way through it
    3. and how much effort people put into structure and control
    4. then wondering how much of it is unneeded, or worse, preventing desired outcomes
    5. and how hungry I am
  3. Then a van pulls up next to us at the stop light
    1. I can only see one woman’s arm, the passenger
      1. she’s making a weaving motion
    2. Then she drops her arm, and the driver’s arm appears
      1. she points to the sky

5 Attributes Of My Favorite Developers

I gave a talk on this subject to the first ever Code Fellows class, in Seattle last Friday.  Here’s the slide deck, hope you like it.

The wrong way to backup, store and protect customer data

Years ago, before great services like Dropbox were available, I purchased a NAS device for my home network.   If you aren’t familiar, the idea is essentially to hook up a storage device that’s available while you’re at home, so that

  1. All your home computers can access any files or data that you’ve stored there, over a fast local network.
  2. All of your data and files are backed up redundantly, to multiple disks.  These devices mirror (copy and synchronize) files across two or more hard disk drives.  They’re designed so that if one disk goes bad, all of your data is still OK, and you can replace the bad drive while the device is running and fully functional.
  3. You get a lot of headroom in terms of storage capacity.  Many of these devices allow you to store petabytes (a shiz ton) of data.

I set up my NAS with two 500G drives, and proceeded to copy over my 20-year old CD collection (in MP3 format), all of my photos, and a lot of other important data.  I have multiple computers, on various operating systems, in various rooms at my house, and they could all access my music and photos over both wireless and wired connections – it worked pretty well for a while.

Then one day, one of my drives went bad.  I got on Amazon, searched for one of the drives on the very specific compatibility list the vendor published, found one, purchased and installed it.  I hot-swapped it in for the bad drive and everything continued to work great.

Being a generally paranoid person, and having experienced pains in this arena before, I also wanted all of my stuff backed-up somewhere outside of my house.   So I shopped around, and decided to use the Memopal service.   There were a number of nascent services coming on to the market at the time, and theirs seemed the most reputable and solid.  I set things up so that the Memopal software was synchronizing files from my NAS to their cloud, set the account to auto-renew yearly, and for the most part left it alone.  There were a few times when I checked on it, or when I had to fuss with their software to get it to work again, but otherwise it seemed to work pretty well.

Years went by, and in the last three years I got more and more busy with the start-up I helped co-found, BigDoor.  My personal email queue was the last thing on the list to maintain, unfortunately.

A couple of weeks back, a very unlikely event occurred: both of the 500G NAS disks had problems.  The manufacturer of the device, Infrant, was purchased by Netgear in 2007. I’ve been working with their support on this issue, and have high confidence that I’ll be able to recover my data after spending more money.   Their support has been pretty good so far, and I’ll report back here if it goes south.

On the flip side, when I contacted Memopal support, I learned that even though they’d taken my money last year and this year via auto-renew, all of my data was deleted and can’t be recovered, because I didn’t enter the license code into their software last year.   This approach is new to me; I’m used to the standard “if you pay for something, we’ll at the very least not destroy what you’ve paid for” vs. “to prevent the irreparable deletion of all your data, it’s not enough just to pay us, you have to put the code we emailed you into our software”.  Below is the support thread, in case you’re as incredulous as I continue to be.

With product decisions like this, it’s no wonder they’re getting their asses kicked by Dropbox and other new services.  I’m curious to hear if you can think of a good reason why an online data back-up service would collect payment, but then delete your data (without reasonable warning) because you didn’t enter a license code.

[Screenshots: Memopal support thread]

2/1/2013 Update : fairness in reporting; below is how their customer service responded.  Amazing.

[Screenshot: Memopal’s customer service response]


Database Sharding : dbShards vs. Amazon AWS RDS

A friend was recently asking about our backend database systems.  Our systems successfully handle high-volume transactional traffic through our API from various customers with vastly different spiking patterns, including traffic from a site that’s in the top-100 list for highest traffic on the net.   Don’t get me wrong; I don’t want to sound overly impressed with what our little team has been able to accomplish, we’re not perfect by any means and we’re not talking about Google or Facebook traffic levels.  But serving requests to over one million unique users in an hour, and doing 50K database queries per second, isn’t trivial either.
I responded to my friend along the following lines:
  1. If you’re going with an RDBMS, MySQL is the right, best choice in my opinion.  It’s worked very well for us over the years.
  2. Since you’re going the standard SQL route:
    1. If your database is expected to grow in step with traffic, and you’re thinking about sharding early – kudos.  You’re likely going to have to do it, sooner or later.
      1. Sooner vs. later if you’re in the cloud and running under its performance constraints.
      2. Do it out of the gate, if you have time, after you’ve figured out how you’re going to do it (i.e. whether you’re going to leverage a tool, DIY, etc).
        1. In other words, if you have time, don’t “see how long you can survive, scaling vertically”.
          1. Sharding while running the race : not a fun transition to make.
      3. I realize what I’m saying is counter to popular thinking, which is “don’t shard unless you absolutely have to”.
        1.  Without the assumption that your data set is going to grow in step with your traffic, I’d be saying the same thing.
    2. Designing your schema and app layer for sharding, sharded on as few keys as possible, ideally just one, is not future-proofing, it’s a critical P0 (a rough sketch of what single-key routing looks like at the app layer appears after this list).
  3. Since you’re going to be sharding MySQL, your options are relatively limited last I checked.
    1. Ask for input from folks who have done it before.
    2. The other sharding options I started looking at over two years ago all had disallowing limitations, given our business model.
    3. At quick search-glance just now, it also does appear that dbShards is ruling this space at this point.
  4. So barring any other options I’m missing, your best options that I’m aware of:
    1. dbShards
      1. Definitions we/they use, to help clarify discussion:
        1. global tables : tables that contain the same data on every shard, consistency managed by dbShards.
        2. shard : two (primary and secondary) or more hosts that house all global table data, plus any shard-specific data.
        3. shard tree : conceptually, the distribution of sharded data amongst nodes, based on one or more shard keys.
        4. reliable replication : dbShards proprietary replication, more details on this below.
      2. pros
        1. The obvious : you’ll be able to do shard-count more reads and writes than you’d otherwise be able to do with a monolithic, non-sharded backend (approximately).
          1. Alternatively, with a single-primary read-write or write-only node, and multi-secondary read-only nodes you could scale reads to some degree.
            1. But be prepared to manage the complexities that come along with eventual read-consistency, including replication-lag instrumentation and discovery, beyond any user notifications around data not being up-to-date (if needed).
        2. It was built by folks who have only been thinking about sharding and its complexities, for many years
          1. who have plans on their roadmap to fill any gaps with their current product
            1. gaps that will start to appear quickly, to anyone trying to build their own sharding solution.
              1. In other words, do-it-yourself-ers will at some point be losing a race with CodeFutures to close the same gaps, while already trying to win the race against their market competitors.
        3. It’s in Java, vs. some other non-performant or obscure (syntactically or otherwise) language.
        4. It allows for multiple shard trees; if you want (or have to) trade in other benefits for sharding on more than one key, you can.
          1. Benefits of just sharding on one key include, amongst other things, knowing that if you have 16 shards, and one is unavailable, and the rest of the cluster is available, 1/16th of your data is unavailable.
            1. With more than one shard tree, good luck doing that kind of math.
        5. It provides a solution for the auto-increment or “I need unique key IDs” problem.
        6. It provides a solution for the “I need connection pooling that’s balanced to shard and node count” problem.
        7. It provides a solution for the “I want an algorithm for balancing shard reads and writes” problem.
          1. Additionally, “I want the shard key to be based on a column I’m populating with the concatenated result of two other string keys”.
        8. It has a distributed-agent architecture, vs. being deeply embedded (e.g. there are free-standing data streaming agents, replication agents, etc instead of MySQL plugins, code modules, etc ).
          1. Provides future-proofing, scalability and plug-ability.
          2. Easier to manage than other design approaches.
        9. Streaming agents allow you to plug into the update/insert stream, and do what you like with changes to data.
          1. We use this to stream data into Redis, amongst other things.  Redis has worked out very well for us thus far, by the way.
          2. Other dbShards customers use this to replicate to other DBMS engines, managed by dbShards or not, such as a column store like MonetDb, InfoBright, even a single standalone MySQL server if it can handle the load.
        10. It supports consistent writes to global tables; when a write is done to a global table, it’s guaranteed to have been applied on every shard.
        11. It doesn’t rely on MySQL’s replication and its shortcomings, but rather on its own robust, low-maintenance and flexible replication model.
        12. Its command-line console provides a lot of functionality you’d rather not have to build.
          1. Allows you to run queries against the shard cluster, as if you were at the MySQL command line.
          2. Soon they’re releasing a new plug-compatible version of the open source MyOSP driver, so we’ll be able to use the same mysql command line to access both dbShards and non-dbShards managed MySQL databases.
        13. Its web console provides a lot of functionality you’d rather not have to build.
          1. Agent management and reporting, including replication statistics.
          2. Displays warning, error, and diagnostic information, and graphs query counts by type.
          3. Done via the “dbsmanage” host, which provides centralized shard node management as well.
        14. It’s designed with HA in mind.
          1. Each shard is two (or optionally more, I think) nodes.  We put all primary nodes in one AWS availability zone, secondaries in a different one, for protection against zone outages.
          2. Write consistency to two nodes; in other words DB transactions only complete after disk writes have completed on both nodes.  Secondary writes only require file-system (vs. MySQL) disk writes.
          3. Managed backups with configurable intervals; MySQL EC2/EBS backups aren’t trivial.
          4. Web-console based fail-over from primary to secondary; this is very helpful, particularly for maintenance purposes.
        15. Proven to work well in production, by us and others.
          1. We’ve performed 100K queries per second in load-testing, on AWS/EC2, using m1.xlarge instances.
        16. Designed with the cloud and AWS in mind, which was a great fit for us since we’re 100% in AWS.
        17. “dbsmanage” host
        18. Drivers included, of course.
          1. In addition to MyOSP, they have JDBC, PQOSP (native Postgres), ADO OSP (for .NET), and soon ODBC.
        19. Go-fish queries allow you to write standard SQL against sharded data
          1. e.g. sharded on user.id : SELECT * FROM user WHERE FirstName='Foo';
            1. will return all results from all shards performing automatic aggregation
              1. sorting using a streaming map-reduce method
        20. Relatively easy to implement and go live with; took us about six weeks of hard work, deadline-looming.
        21. It’s the market-leading product, from what I can tell.
          1. 5 of the Top 50 Facebook apps in the world run dbShards.
        22. It supports sharding RDBMSs besides MySQL, including Postgres, DB2, SQL Server, MonetDb, others coming.
        23. Team : top-notch, jump-through-their-butts-for-you, good guys.
        24. Ability to stream data to a highly performant BI backend.
      3. cons
        1. As you can see, some of these are on the pro list too, double-edged swords.
        2. Cost – it’s not free obviously, nor is it open source.
          1. Weigh the cost against market opportunity, and/or the additional headcount required to take a different approach.
        3. It’s in Java, vs. Python (snark).  Good thing we’ve got a multi-talented, kick-ass engineer who is now writing Java plugins when needed.
        4. Doesn’t rely on MySQL replication, which has its annoyances but has been under development for a long time.
          1. Nor is there enough instrumentation around lag.  What’s needed is a programmatic way to find this out.
        5. Allows for multiple shard trees.
          1. I’m told many businesses need this as a P0, and that might be true, even for us.
          2. But I’d personally prefer to jump through fire in order to have a single shard tree, if at all possible.
            1. The complexities of multiple shard trees, particularly when it comes to HA, are too expensive to justify unless absolutely necessary, in my humble opinion.
        6. Better monitoring instrumentation is needed, ideally we’d have a programmatic way to determine various states and metrics.
        7. Command line console needs improvement, not all standard SQL is supported.
          1. That said, we’ve managed to get by with it, only occasionally using it for diagnostics.
        8. Can’t do SQL JOINs between shard trees.  I’ve heard this is coming in a future release.
          1. This can be a real PITA, but it’s a relatively complex feature.
          2. Another reason not to have multiple shard trees, if you can avoid them.
        9. Go-fish queries are very expensive, and can slow performance to a halt, across the board.
          1. We’re currently testing a hot-fix that makes this much less severe.
          2. But slow queries can take down MySQL (e.g. thread starvation), sharding or no.
        10. HA limitations, gaps that are on their near-term roadmap, I think to be released this year:
          1. No support for eventually-consistent writes to global tables means all primaries must be available for global writes.
            1. Async, eventually consistent writes should be available as a feature in their next build, by early October.
          2. Fail-over to secondaries or back to primaries can only happen if both nodes are responding.
            1. in other words, you can’t say via the console:
              1. ‘ignore the unresponsive primary, go ahead and use the secondary’
            2. or:
              1. ‘stand me up a new EC2 instance for a secondary, in this zone/region, sync it with the existing primary, and go back into production with it’
          3. Reliable replication currently requires two nodes to be available.
            1. In other words, if a single host goes down, writes for its shard are disallowed.
              1. In the latest versions, there’s a configuration “switch” that allows for failing down to the primary
                1. But not failing down to the secondary.  This is expected in an early Q4 2012 version release.
          4. dbsmanage host must be available.
            1. dbShards can run without it for a bit, but stats/alerts will be unavailable for that period.
          5. Shard 1 must be available for new auto-increment batch requests.
          6. go-fish queries depend on all primaries (or maybe all secondaries via configuration, but not some mix of the two as far as I’m aware) to be available
    2. DIY
      1. I can rattle off the names of a number of companies who have done this, and it took many months longer than our deployment of dbShards (about six weeks, largely because the schema was already designed for it).
      2. Given a lot of time to do it, appeals to me even now, but I still wouldn’t go this route, given the pros/cons above.
    3. The latest release of MySQL Cluster may be an option for you; it wasn’t for us back with MySQL 5.0, and not likely now, due to its limitations (e.g. no InnoDB).
    4. AWS RDS was an option for us from the outset, and I chose to manage our own instances running MySQL, before deciding how we’d shard.
      1. For the following reasons:
        1. I wanted ownership/control around the replication stream, which RDS doesn’t allow for (last I looked) for things like:
          1. BI/reporting tools that don’t require queries to be run against secondary hosts.
            1. This hasn’t panned out as planned, but could still be implemented, and I’m happy we have this option, hope to get to it sometime soon.
          2. Asynchronous post-transaction data processing.
            1. This has worked out very well, particularly with dbShards, which allows you to build streaming plugins and do whatever you want when data changes, with that data.
              1. Event-driven model.
              2. Better for us than doing it at the app layer, which would increase latencies to our API.
        2. Concern that the critical foundational knobs and levers would be out of our reach.
          1. Can’t say for sure, but this has likely been a good choice for our particular use-case; without question we’ve been able to see and pull levers that we otherwise wouldn’t have been able to, in some cases saving our bacon.
        3. Their uptime SLAs, which hinted at unacceptable downtime for our use-case.
          1. Perhaps the biggest win on the decision not to use RDS; they’ve had a lot of down-time with this service.
        4. Ability to run tools, like mk-archiver (which we use extensively for data store size management), on a regular basis without a hitch.  Not 100% sure, but I don’t think you can do this with RDS.
        5. CloudWatch metrics/graphing is a very bad experience, and we want/need better operational insight than it provides.  Very glad we don’t depend on CW for this.
      2. All of these reasons have come at considerable cost to us as well, of course.
        1. Besides the obvious host management cycles, we have to manage :
          1. MySQL configurations, that have to map to instance sizes.
          2. Optimization and tuning of the configurations, and poor-performance root-cause analysis.
          3. MySQL patches/upgrades.
          4. maybe more of the backup process than we’d like to.
          5. maybe more HA requirements than we’d like to; although I’m glad we have more control over this, per my earlier comment regarding downtime.
          6. maybe more of the storage capacity management than we’d like to.
        2. DBA headcount costs.
          1. We’ve gone through two very expensive and hard-to-find folks on this front, plus costly and often not-helpful, cycle-costing out-sourced DBA expertise.
          2. Currently getting by with a couple of experienced engineers in-house and support from CodeFutures as-needed.
      3. As I’ve seen numerous times in the past, AWS ends up building in features that fill gaps that we’ve either developed solutions for, or worked around.
        1. So if some of the RDS limitations can be worked-around, there’s a good chance that the gaps will be filled by AWS in the future.
        2. But it’s doubtful they’ll support sharding any time soon; there are too many design and application-layer inter-dependencies involved.  Maybe I’m wrong, that’s just my humble opinion.
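
For anyone weighing the DIY route, here’s the rough sketch promised in 2.2 above: single-key shard routing plus a scatter-gather (“go-fish”-style) query at the app layer. To be clear, this is a hypothetical illustration, not dbShards internals and not our production code; the hash scheme, host names, table layout and PyMySQL usage are all stand-in assumptions.

# Hypothetical sketch: single-shard-key routing and a "go-fish"-style
# scatter-gather query.  Illustration only -- not dbShards internals.
import hashlib
from contextlib import closing

import pymysql  # assumes the PyMySQL driver; any MySQL DB-API driver would do

# One entry per shard: the primary host for that shard.
SHARD_HOSTS = ["shard0-primary", "shard1-primary", "shard2-primary", "shard3-primary"]

def shard_for(user_id):
    """Map the single shard key (user id) to a shard index.

    A stable hash keeps a given user's rows on one shard, so point reads and
    writes touch exactly one host; losing one host affects roughly 1/N of
    users -- the one-shard-key math called out in the pros above.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % len(SHARD_HOSTS)

def connect(shard_index):
    """Open a connection to a shard's primary (no pooling in this sketch)."""
    return pymysql.connect(host=SHARD_HOSTS[shard_index], database="app",
                           user="app", password="secret",
                           cursorclass=pymysql.cursors.DictCursor)

def get_user(user_id):
    """Point lookup: the predicate is on the shard key, so one shard is touched."""
    with closing(connect(shard_for(user_id))) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT * FROM user WHERE id = %s", (user_id,))
            return cur.fetchone()

def find_users_by_first_name(first_name):
    """Scatter-gather: the predicate isn't on the shard key, so every shard
    gets asked and the results are merged -- which is exactly why go-fish
    queries are expensive and need to stay rare."""
    results = []
    for idx in range(len(SHARD_HOSTS)):
        with closing(connect(idx)) as conn:
            with conn.cursor() as cur:
                cur.execute("SELECT * FROM user WHERE FirstName = %s", (first_name,))
                results.extend(cur.fetchall())
    return results

The point-lookup vs. scatter-gather contrast is the whole argument above in miniature: queries on the shard key touch one host, everything else touches all of them, and a tool like dbShards is mostly about handling everything around that (replication, failover, rebalancing, unique IDs) that this sketch ignores.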

This was originally posted last week here, but I wanted to re-post here and will be updating with our latest status and learnings, if there’s any interest.  Let me know.

Gluecon 2012 and the Conference Non-Con Postulation : how to measure the ROI of a conference

I’ve never been a big fan of meetings, so naturally conferences were on my no-fly list for a long time: a big building with a big meeting in the early AM, followed by a mitosis into smaller meetings, followed by more and more meetings, all gradually shrinking throughout the day, finally to be absorbed by the nearest bar once some sort of conferential Hayflick limit is reached.
I’m happy to say that I was wrong about them; over the last few years I’ve attended some fantastic, rewarding conferences.   I’ve also attended some anti-awesome ones.  Being keen on agile retrospectives and data-driven decision-making, I’ll posit the following as a formula for measuring the efficacy and ROI of any given conference, spec-style.  You may be privy to my Ruger Fault Equivalency; this is my Conference Non-Con Postulation:
  1. notes : total line count of notes taken during a conference.  A good conference causes me to write furiously; even though there may be slides or notes offered online afterwards, this is the best way for me to internalize, to any degree.
  2. refactor index : number of minutes I spend cleaning up my notes, so that I can share them.  A good conference will cause me to review my notes, clean and boil some of the salient moments up into a handful of takeaways.
  3. note virality : number of people I share my notes with afterward.  A good conference will inspire me to inflict my notes on my team at work, at which point they will be thankful for a high note refactor index.
  4. players : people I had the pleasure of spending time with, who also impressed, inspired, or gave me a laugh at the conference.  Expressed as a quality score, 1 to 3.  A good conference will even have a few folks that exhibit all of these traits (e.g. Keith Smith, Brad Feld, Ryan McIntyre), which would incur a score of 3.
  5. injuries : number of bodily injuries incurred at said conference, not caused by dude-hold-my-beer moments or other virtuous activity.   Good to have a low count of these.
  6. bullshit : number of times I look at the ceiling, for reasons other than math or loudspeaker brand detection.
Given those input definitions, the Conference Non-Con Postulation is as follows:
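The formula was originally an image; a reconstruction consistent with the definitions above and the worked example below is, roughly:

CNCP = ((notes × refactor index) ^ virality) ^ players − (injuries + bullshit)

where the refactor index is the harmonic sum 1 + 1/2 + … + 1/m over the m minutes spent refactoring.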
A few things to note:
  1. The number of minutes spent refactoring my notes provides diminishing returns.  Also, it’s no coincidence that the refactor index summation will result in a harmonic number.
  2. Note that virality has an exponential effect.
  3. Player count wraps everything with an even greater exponential effect, and it only takes one great player to make a huge difference.
  4. While bullshit and injuries ultimately decrease overall ROI, they will only have material effect when other inputs (e.g. notes, virality) are low.
If I apply this to GlueCon 2012, I get the following:
  1. notes : 310
  2. refactor index : ~3.318 (15m)
  3. virality : 20 (prior to this post)
  4. players : 3
  5. injuries : 0
  6. bullshit : 1
Which results in a CNCP score of 5.4236E+180, which I believe is an impossibly gigantic number.
That maps perfectly to my previous anecdotal, non-scientific assessment that Gluecon is one of the world’s best conferences.
Hopefully at this point you’re laughing, but buying that last claim.
Here are my net-notes, 2012 best-of list, let me know if you want to see the more detailed ones:
  1. Best technology I’ve never heard of but would love to explore (Dan Lynn from FullContact): storm
  2. Best new-to-me buzz phrase (James Governor from RedMonk): “quantified self”
  3. Best well-worn technologist strategy (Mike Miller from Cloudant)  : “so what’s next? let’s see what Google’s up to, reading their latest white-papers”
  4. Best game-related quote  (James Governor from RedMonk) : “I’d rather give my son Angry Birds than Ritalin”
  5. Best presenter that I wish we could hire as a DBA but who already works for one of our awesome partners : Laurie Voss from awe.sm

Throwing GoDaddy Under The Startup Bus

Ever since we registered our startup’s domain with them years ago, we’ve been anxious to get off the free DNS provided by GoDaddy at the least, and ideally change registrars as well.  With all the issues other companies have had with them + their political positioning … we just want out.   It’s actually embarrassing to admit we were in this situation for so long, but I’m swallowing my pride in hopes that this will help others out – open-sourced embarrassment (O-ASSMENT).  Until recently, we really haven’t had the time/resources to tackle it without affecting product development efforts and higher priorities.  One of our senior guys has been exploring options for weeks, and we thought we were in a good position to make a change.

There are two parts of this puzzle that need to be fit: GoDaddy is (was) the registrar of our domain, and they also are hosting DNS for us.   That’s a typical set-up when you first register your domain these days; most registrars also offer managed DNS.  But it’s not a good practice to leave your DNS hosted with your registrar – it’s better to separate them right when you register the domain, if you can.

The two parts (registrar and managed DNS) are intertwined; I’m trying to avoid DNS details for the non-technical, but essentially/simply/horribly put: DNS is much like a big phone book that yourDomain.com has a page in; that page maps IP addresses to friendly names like www.yourDomain.com and api.yourDomain.com.   One particularly critical mapping provides the IPs pointing to our authoritative name servers.  This mapping is also stored in the index of the phone book, by a higher DNS authority…like Elvis.  Servers that need to know where www.yourDomain.com is (in other words, its IP address) look in the index if they need to, and then get the IP from our page in the book.  This is where the registrar comes in – you can only change the IP of the authoritative name servers through the registrar of the domain.  Otherwise, with regard to DNS/WHOIS records, the registrar is just a text string, a name without a number.
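
To make the phone-book analogy a bit more concrete, here’s a minimal sketch using the dnspython library (my choice for illustration; dig or nslookup will tell you the same thing), with yourDomain.com as a placeholder:

# Minimal sketch: who are the authoritative name servers for a domain, and
# what IP does a friendly name map to?  Uses dnspython; names are placeholders.
import dns.resolver

domain = "yourDomain.com"

# The "index of the phone book": which name servers are authoritative?
for record in dns.resolver.resolve(domain, "NS"):
    print("authoritative name server:", record.target)

# The "page in the phone book": what IPs does www map to?
for record in dns.resolver.resolve("www." + domain, "A"):
    print("www A record:", record.address)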

But this makes registrars ultimately all-powerful; you can make all the DNS changes you want, but if the authoritative name servers are changed and pointed to hosts that don’t have our DNS information, or don’t have the right information – you’re totally FUBAR’d.

We shopped around for a different registrar, and at one point were ready to sign an expensive deal with MarkMonitor, who from all accounts is the market leader in terms of locking things down from a security standpoint.  But they couldn’t seem to get their act together fast enough and were too expensive for our growth stage anyway.   We decided to go with NetworkSolutions, the “first” registry operator and registrar for the com, net, and org registries.

GoDaddy offers free DNS when you register your domain with them, but they also offer Premium DNS.  We upgraded to premium weeks ago, to get a better idea for our DNS traffic and to price out competitors.  To be totally clear, at this point in the story we’re paying GoDaddy for their premium DNS hosting option.   GoDaddy offers this to their customers as a stand-alone service; in other words, you can use GoDaddy just as a managed DNS provider (as long as you have a domain or two registered with them, I’d assume).

So, given that we wanted to move our registrar (because we didn’t want GoDaddy to own the gate to our authoritative name servers), and our DNS, we had a few options:

  1. Try to move both at once.  Not a good-feeling option for probably obvious reasons.
  2. Move to a different managed DNS provider, then once that’s complete, move registrars.  Moving DNS is more complicated and in theory (or logically) more risky than moving registrars.
  3. Move registrars, and once that’s complete move to a different managed DNS provider.  This seemed like the lowest risk option, given all the inputs at the time, and it’s what we tried to do.

Here’s the relative timeline, what happened, and what we expected/should have happened:

  1. Our senior engineer talked on the phone with folks at GoDaddy and our new registrar, NetworkSolutions, both of which confirmed our understanding and expectation that during and after the change, the name server addresses would remain pointed at GoDaddy’s name servers until we took action to change them.  The only thing that was supposed to change was the registrar’s name.  We reiterated with them that downtime wasn’t an option for us, and they reassured.
  2. Our engineer initiated the transfer song-and-dance.  The first thing he noticed was that we couldn’t get to any DNS information in GoDaddy anymore, including the NS records.  OK….kinda makes sense to prevent changes while the transfer is happening I suppose, but we should at least have read-access to the current records, right?
  3. So he called GoDaddy, who pointed us to a page where we could access the current DNS records if we did a ‘view source’ (!), and also pointed us to a ‘pending transfers’ section of their site that would expedite the acceptance process.  No email or other instructions about this bit were previously given; this whole process normally takes place mostly via automated email, and registrar documentation on all of this is sh** across the market.
  4. Then we took a step that I’m quite glad about : we saved all of our zone file information and DNS records to a spreadsheet (a rough script for doing the same thing is sketched just after this list).   Go do this now for yourself, if you are in the same kind of situation.  Seriously.
  5. As instructed by someone at GoDaddy, we then ‘accepted’ the registrar transfer on their site.
  6. At this point, we’re thinking that our premium DNS is going to sit there untouched, and that it’s going to be five days before the registrar is transferred.  5 days because the registry operator – the root authority for the .com domain –  has that as a grace period before making the change, in case any party cancels the transfer.  Wrong on both counts.
  7. Shortly after accepting the transfer in GoDaddy’s web interface, they deleted our DNS records. We had a short time-to-live setting on the records, so after 30 minutes, hosts weren’t able to look up what IP to use for any ourDomain.com services.  The name server entries weren’t changed of course, because GoDaddy is no longer in a position to do so.  But the information sitting on those name servers that pointed IPs to our services was gone.   That meant that slowly, across the net, customers stopped being able to access services on ourDomain.com - including email.
  8. Our engineer called them immediately, described what happened, and asked why our DNS records disappeared.   Answer : “because you moved your domain to another registrar“.
  9. OK, can we get that re-instated?  Answer : “I can give you access to the DNS manager page again for this domain, but you have to put all the information in yourself.”  I’m pretty sure they’re required to keep this information for some period of time, to be in compliance with their registrar agreement with ICANN.
  10. So our engineer gets out our trusty spreadsheet, and manually copies the information back in.  Shortly thereafter we start to see a gradual recovery, as clients start to be able to resolve hostnames to IP addresses again.
  11. Hours later, the registrar transfer actually happens and is verified via email from NetworkSolutions.
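
As promised in step 4 above, here’s a rough sketch of a script that snapshots DNS records to a CSV before you touch anything.  It assumes the dnspython library, and the hostnames and record types are examples you’d replace with your own:

# Rough sketch: snapshot DNS records to a CSV before a registrar or DNS move.
# Assumes dnspython; the hostnames and record types below are examples only.
import csv

import dns.resolver

RECORDS_TO_SAVE = [
    ("yourDomain.com", "NS"),
    ("yourDomain.com", "MX"),
    ("yourDomain.com", "A"),
    ("www.yourDomain.com", "A"),
    ("api.yourDomain.com", "A"),
]

with open("dns_snapshot.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "type", "ttl", "value"])
    for name, rtype in RECORDS_TO_SAVE:
        try:
            answer = dns.resolver.resolve(name, rtype)
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            continue  # nothing published for this name/type; skip it
        for record in answer:
            writer.writerow([name, rtype, answer.rrset.ttl, record.to_text()])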

Fortunately our end-user-facing product fails gracefully under these circumstances and customer impact was minimal.

That whole escapade pretty much escalated the priority of us getting off their managed DNS, which we did in the next week.   After looking at various (mostly expensive) options, we moved over to Amazon’s AWS Route53, which went relatively seamlessly.  The nice thing about Route53 is that it’s accessible programmatically and can be managed via scripts just like the rest of our AWS resources.
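
To give a flavor of what “accessible programmatically” looks like, here’s a minimal sketch using the current boto3 SDK (which post-dates our actual migration), with the hosted zone ID and record values as placeholders:

# Minimal sketch of scripting Route53 with boto3; placeholders throughout.
import boto3

route53 = boto3.client("route53")
ZONE_ID = "Z1EXAMPLE"  # placeholder hosted zone ID for yourDomain.com

# List what's currently in the zone -- handy for the "snapshot first" habit.
response = route53.list_resource_record_sets(HostedZoneId=ZONE_ID)
for record in response["ResourceRecordSets"]:
    print(record["Name"], record["Type"], record.get("TTL"))

# Create or update a record as part of a deploy script.
route53.change_resource_record_sets(
    HostedZoneId=ZONE_ID,
    ChangeBatch={
        "Comment": "point api at the new load balancer",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "api.yourDomain.com.",
                "Type": "A",
                "TTL": 300,
                "ResourceRecords": [{"Value": "203.0.113.10"}],
            },
        }],
    },
)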

I totally get it: the herky-jerky that comes with WHOIS-on-first (name server and DNS transfer of ownership) puts registrars in an odd situation, one that requires competitors to coordinate if they’re going to act in the best interest of their soon-to-be/just-cancelled customers.  But there’s got to be a better way than this ridiculous bullsh** we just went through.  Registrars who offer DNS hosting as a service have an obligation to publish the ‘how do I get out without getting ass-f*****’ instructions at the very least. Better yet, for a grace period, leave DNS the way it is until an NS record gets changed at the root level, messaging their customers about what’s coming in the meanwhile.   I know that some registrars do provide a grace period like this.

I’m obviously not a registrar,  and admit that my proposed solutions may not be tenable.  But there’s got to be a better way.

We’re not the only startup in the bus that’s running over GoDaddy, there’s pretty much wide agreement on this topic.  I’m glad we’re over that speed-bump and the startup bus is barreling forward at high-speed as usual.

I’m tempted to turn this into an ICANN complaint – any input on whether that would hold up, or be worthwhile?  (to comment you have to be on this post’s page, rather than the blog home page)

Update : in case it’s helpful for anyone, I’ve started gathering some numbers on what some other friends’ startups are using (without major complaint) for registrar and hosted DNS, and will update here for now.   Please email me directly if you’d like me to add something to this list.

  • Registrars (count of companies using them)
  • Hosted DNS providers
    • EasyDNS.com (2)
    • Zerigo (2)
    • AWS Route53 (2)

The Brightest Fires

This morning before our board meeting (I guess it was officially yesterday morning, now) I had the wonderful good fortune of giving my friend Andy a big congratulatory hug, in celebration of his beating cancer – he had just gotten the confirmation an hour or two before. Fantastic. Incredible, for so many reasons. I’m thankful and happy for him.

Hours later after leaving the office late I was still working in bed, typically banging away at email on my phone around 2:30 AM, when I got an email from a girlfriend I had in college, a very close friend who I hadn’t talked to in years. She thought I already knew, but it came as a total shock : that’s how I found out that one of my best college friends, roommate and co-conspirator in mischief of all kinds, had passed away in February at the young age of 43. I was completely stunned.

The first thing I did was start typing his name into Google on my phone, which started auto-completing before I could finish his last name.

My heart sank when I saw the word ‘cancer’, and my mind went into an odd state of surreal incredulity.  I got out of bed, put a sweatshirt on, and went to sit in front of the Mac in my studio.  I did the same Google search, which right away produced a full-page of results specifically about my friend Adam Adamowicz, headlined by a Wikipedia entry.  Just now, wondering how deep the relevant results went, I opened page 13 of the results and it was still a full-page of links to sites discussing and mourning his death.

I opened all of the search result links on that first page, and started looking through the pages – then my Mac crashed.  Already straining to see clearly through teary eyes, now swearing at this inanimate object that was acting as the proxy for my dead friend, I moved to a PC laptop which was sitting on the floor and started again.

The New Funeral, for old out-of-touch friends in the virtual age. A strange but wonderful way to say goodbye. Really, I feel lucky to be able to do it.

Adam was a concept artist, and if you play video games (Fallout, Skyrim), more than likely you’ve lived in worlds that were born from his insanely quick, twisted, wonderful, hilarious mind.

Clearly it was easy for people who loved Adam to paint their version of the artist.  This page and the quotes from his friends really nail it.   Here’s a link to the New York Times article about him.  Awesome Robo created a wonderful post; if you get a chance to view the video on that page you’ll see him at 1:29, sitting among his vast creation.   The Reddit post about him is long. It looks like his artwork is on display at the Smithsonian.  It goes on and on, he touched so many people – truly amazing and wonderful.

I first met Adam when I was a freshman in college at CU Boulder.  I’d been assigned to an unusual room on the basement floor, in the corner of the building, with two other guys.  There was a door on the back wall of the room, which I discovered with great happiness could be opened with the appropriate amount of cajoling, leading to a room roughly an eighth of the size of our regular dorm room.  It was totally overrun with dust and spider webs, pipes running everywhere – I fell in love with it instantly, moving all of my loaned CU furniture out of its clean and safe haven to the dark windowless hovel that became my home for the next two years (I successfully petitioned to stay in the same room the next year).  One roommate hated me, the other was amused.

One of my roommates was the quiet, introverted, artsy music-head type of guy that I could totally get along with. I think he was impressed with my obsession with living in a dank closet-like hovel and making odd senseless contraptions, and that’s likely what led to him introducing Adam and me. That first year for Halloween, after painting his face a gruesome green and donning his black leather biker jacket that we bike-less artist types consistently wore those days, my roommate took me to the next dorm over to meet him.  When we walked into his room (probably with Skinny Puppy blaring), I recall being instantly wowed by the costume makeup he had self-administered.  It was an odd mix of Hollywood horror steampunk, and it garnered instant respect.

Those costumes got more and more amazing every year.

I remember taping cheap manila paper up on the walls of my hovel, buying a bunch of pens, and luring Adam over with a 12-pack to get him to draw art on my walls. He got wise to that, but not until after I got a few drawings and after he’d laughed his ass off at my ‘thinking chair’ (my loaned CU lounger up on cinder blocks, with an attached clothes-hanger frame suspending a beer-can mobile powered by a fan motor which caused Bud aluminum to spin around your head at high speed). I remember the beer-ish night we were walking around campus and I introduced backwards-man to him, who comes out as my alter ego sometimes after putting foot in mouth; I did a full audio and physical rewind that made him roar and take up.

My buddy Joel and I got our own place the next year. We didn’t have enough money for a real apartment, so we rented the attic of a house.

The fascist owner of the house had turned the attic into a lie of an apartment, a rube’s cheap trick that we had to stoop to walk around in unless we were walking the center line. We were glad to have a cheap roof over our heads. One person was cramped in that hot cubby; two people were constantly bumping into each other. Adam needed a place to stay, so of course he started staying with us. There were only two beds, divided by a make-shift wall made out of unwanted propped-up doors and cardboard, so we took turns sleeping on the floor.

I’d just discovered audio sampling, and had bought a new keyboard, so there were many drunken nights of tempo-shifted bodily noise hilarity. We had a scorpion living in a gutted TV set, and Joel had his homemade weight-lifting rack up there in the attic, for those late-night bench-press contests that our neighbors loved. I’d always wanted a gargoyle at my castle’s entrance, so I made one out of a motion detector, some duct tape and broomsticks (fail). Adam was drawing for comic books and he had me pose in my overcoat for a cover of 2,000 maniacs, brandishing a knife. I later framed it, it’s in my master bath and I look at it every day – so he’s never quite left me.

Then Adam brilliantly found an incredible warehouse in north Boulder to live in, right next to the strip club, auto shops and welding studios.  Here he is in a junk car in his backyard, back then.

The warehouse was perfect for him, and he turned it into an artist’s wonder home. He had fantastic parties there; I remember lighting Joel’s pant leg on fire with lighter fluid, him returning the favor (pre-planned of course, we wore layers), golfing off the roofs of junked cars, and how he took a Sawzall and cut the roof off of his already debilitated Saab because he wanted a convertible. At one point he was making great money creating art for raves; crazy huge foam stuff that hung from the ceiling for the trippers.

He worked as a bartender at a local nightclub downtown; friends, brother and I would go and visit and play pool past closing time. Some weekend nights we’d go to Denver, and Adam and I climbed a fair number of fire escapes to be on the roofs of buildings in the middle of the night, with not a small amount of beer driving us skyward. Adam was full of fire, an adventurer. He was the kind of guy you always wanted to hang out with, it was always fun.

I lived in Denver for a while, and these shots were supposed to tell a story, a ludicrous art collaboration, something along the lines of:
“Whoah, that’s a pretty tall tree, Adam. Think you could climb it, and lower yourself on a rope, with a 10-speed bike?”

“Hmmm….let’s seeeeeee here…”

“Yup, wheeee”

“Ha ha!  I was hiding in a large pile of leaves with the hose, to shoot you down!”

Then it came time to move away from Colorado, him to San Fran and I to Seattle.  On one of the various trips back and forth to the coast, we packed his stuff into a cargo van I had at the time and drove west, stopping at Lake Powell for an awesome time.  This is an amazing place that you really should check out sometime.

I dropped him off in San Fran and headed to Seattle to start a new life.

I’ve never known anyone to write letters regularly, except my brother and Adam.  And he wrote hilarious, entertaining, endearing letters.  It may sound odd, but I couldn’t throw them away – I’ve kept them in a box with other important personal stuff I’ve collected over the years.  They’re that great.

Some quotes:

  • I can magically turn a twelve pack of beer into streams of hot pee.  It’s all part of the great circle.  -Twelvpac
  • Facebook…Sucks. Twatter…Sucks. Thanks for the old fashioned email, like we all used to do in the days of yore when computers were carved… from logs.
  • My head is like a treeshredder to any beer unfortunate to come within an armspan, and brother, I got a reach like a goddamn albatross

After a few more years we’d largely lost touch, and even had a significant falling out (my fault), but he remains on my short list of all-time favorite people ever.

I wrote this largely for myself, to record all of the memories flooding back in an instant after hearing the word.  But I also wanted to share my perspective on the man who left us, and I hope you find it as entertaining as the memories I have of our friendship.

I am entirely shocked, saddened, and full of regret.  My heart goes out to all the people who were close to him and loved him, to our mutual friends wherever you are, and most of all to his family.

Remarkably, one of the last things I said (wrote) to him was “I love you, man”, via email back in March of 2010.   Lucky.

Adam, you made the world a better place.

Jeff Malek

April 26, 2012

Fuck cancer with a flaming barge pole.


Getting back to it

Wow, since I blogged last I chose Python/Django and we rocked the gamification world with it.  Lots of stuff I should have been chronicling.  Trying to get back to it…but there’s a single event that prompted the return; I’ll be posting shortly on that.

Python vs. Ruby, Rails vs. Django

I’ve been learning the basics of these programming languages and development platforms lately. I’m impressed with the huge Rails movement, and Django seems like a very cool project. Both platforms allow web sites to be built at an incredible pace.

Both Ruby and Python appear to be relatively straightforward. But I must say that at this still-early stage in my learning, Python wins for simplicity and ease. Ruby’s “code blocks”, for example, are a bit obtuse.

From what I currently understand about Rails and Django, their strengths lie in auto-generated code, whether that’s application-supporting code derived from the database schema (Rails) or database schema derived from application code (Django). Microsoft’s LINQ, which we’ve been using a bit at work, is similar but not foundational to the platform. It’s all cool stuff that makes DBAs anxious, which is always fun.
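
As a small, hypothetical illustration of the Django side of that: you describe your schema as plain Python classes, and Django generates the corresponding database tables from them.

# Hypothetical Django model: the database tables are generated from this class
# (via syncdb in that era, migrations today), rather than the other way around.
from django.db import models

class Article(models.Model):
    title = models.CharField(max_length=200)
    body = models.TextField()
    published_at = models.DateTimeField(auto_now_add=True)

    def __unicode__(self):  # Python 2-era Django, per the timeframe of this post
        return self.title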

After trying to test out Google’s AdWords client API using their supplied code samples, finding that those code samples require old libraries that the latest Python release doesn’t support, and realizing that Django itself won’t support the latest version of Python (3.0) for a year or more to come, I’m wondering what it would be like to have a wonderfully supported and rich platform like Rails tooled with a simple and evolved language like Python.

Now I’m wondering what I’m going to have for dinner. I’ll probably get more mileage out of this wonderment.
