Written by Luther Rochester on — 15:58 reading time
Updated 2019-02-08 to reflect newer Sonar version config changes
A while back I wrote about integrating our Windows systems and SQL Server into our Nagios implementation. These days we’re looking to replace Nagios (and Ganglia) with Prometheus for metrics collection, monitoring, and alerting.
While exporters already exist for most of our Linux systems, it seems like not too many people are integrating their Windows metrics yet. There look to be two primary exporters as of this writing: a WMI exporter, and a package called Sonar that can export both WMI and Windows Performance Monitor counters. I chose to implement Sonar because we were already using several Windows perfmon counters that can’t be duplicated with WMI, including custom SQL Server counters, and I appreciate the flexibility of being able to use either metric type. I found the maintainers of Sonar to be helpful and responsive when I had questions. It also seems to have nice Docker integration capabilities that we aren...
Written by Luther Rochester on — 05:51 reading time
As we’ve moved increasing portions of our infrastructure to AWS, we’ve found more use cases for stashing things in S3 that we used to use traditional backup software to store. We keep our SSIS code in version control (currently TFS, grumble grumble), but we aren’t storing a copy of each database object definition there. Recently we undertook setting up a job that would script out each object’s DDL and store it in an S3 bucket. We’re already doing this for our Postgres servers using pg_dump; we found a great little Python app called mssql-scripter that works similarly to script out the objects for us. This runs in a batch script, along with an aws-cli command to upload the resulting files. The batch script is then called from a SQL Agent job and run on a schedule. The AWS bucket has a lifecycle policy to handle retention for us.
You’ll need a few tools
The server that will run the code will need a few things installed on it. It’s easiest if you use the same server that SQL Agent...
Ansible inventory management is generally very simple and intuitive, but there are some infrastructure configurations that are difficult to express and configure with the built-in inventory and configuration variable functionality. Specifically, it is difficult to configure lists of a certain resource type (like users) across machines that are in different but overlapping groups (like QA environments overlapping with datacenters).
These situations are certainly possible to manage in Ansible, but we at Leapfrog wrote an Ansible plugin that offers a different way of declaring variables that we think can sometimes be more clear. That plugin is available here:
When you are new to Scrum, it’s easy to focus on the concept of finishing everything you have committed to within an iteration. Because when the process instructs that you write user stories of manageable size and deliver small increments of functionality, it’s unsatisfying when the work is not done by the Sprint Review or the next Planning session, right? That is how we define carryover – work that was not done in the sprint but is still valuable, so the story needs to be carried into the next sprint for completion.
Let’s be honest. It’s not just people who are new to Scrum who feel this way. It’s veterans, too. Carryover tends to create uncertainty because it doesn’t seem to fit cleanly into the Scrum framework. But what we are learning is that carryover is as much a part of Scrum as the Daily Stand-up (if it weren’t, would it have ever received a name?). There are a number of reasons that carryover can happen; these are just a few we witness at Leapfrog:
All software has bugs. From Microsoft Office to joeschmoe.com. Bugs are a reality. And a major part of representing a product is being a first line of defense against “bugs.” If users are allowed to, they will go and tell a developer every time they think they’ve found a bug. That’s the fastest way to a resolution. If what has been found is indeed a bug, the fastest way to fix it is to have someone who writes code be aware of it and write code to fix it. And in a perfect world, where users were wildly competent at diagnosing a system, this behavior might be okay. But this is not a perfect world. And it isn’t fair to assume that someone who casually uses software is going to know the ins and outs of each piece of functionality. And if every user had access to developers, nothing would ever get done. So if it’s your responsibility to own a product, it’s your responsibility to be the first responder to “bugs.”
The first and most important thing to establish here, is you need users and...
Last week a few in the Leapfrog team attended the Anyone Can Learn To Code (ACLTC) showcase at 1871 Chicago. Our goals in attending the showcase were to: share with these graduates who Leapfrog is, increase the pool of candidates for Leapfrog engineering job openings, and get an early chance to talk to people as they transition to their software engineering careers. The ACLTC showcase was from 11am-1pm, during which we had the opportunity to converse with almost all the graduates. The conversations started with the graduates individually sharing a demo of their project at their tables, our team asking questions, and us sharing who Leapfrog is and what we do.
Demos from graduates were generally succinct and well communicated. The demo format allowed the audience to understand the purpose of the project, its functionality, and, in some instances, features that the graduates were planning on adding. The best demos gave us an insight into the graduates’ development process: the development...
After some failed attempts at propping up a satisfactory WSUS server to manage patching our Windows hosts, we finally achieved a non-right-clicky Windows Update solution using Ansible and Powershell. Like any marriage of Linux and Windows, it wasn’t without its frustrations. Thankfully we continue to make use of the painfully achieved scaffolding by ansible-izing tasks such as SQL Server ChatOps and Hyper-V VM pause and resume.
Ansible + Windows: much pain, great reward
A while back Ansible announced support for Windows and provided some example scripts to do things like install software and run Windows Updates. We were already using Ansible to script maintenance operations on our Linux servers so we were excited to use the same tool in our Windows environment. Unfortunately these example scripts did not fully work as advertised. Despite great documentation from Ansible and plenty of blogposts about Powershell and Windows Updates, we still spent a fair amount of time gluing all...
The practice of estimating work in software development is a challenge that we often encounter. It is not always easy to estimate effort, especially when the complexity and uncertainty of the story increases. The purpose of estimation using story points is to agree on the effort and complexity involved in a user story in order to better plan and execute product development. Some other goals of estimating using story points are as follows:
To be prepared to answer “What can be completed in two weeks?” during sprint planning
To increase the accuracy and precision of ^ that answer
To estimate against the team and not an individual
To set expectations and align as a team
To be able to forecast and prioritize the backlog
What is a story point and why do teams use them instead of measures of time?
A story point is a relative measure used by teams to evaluate the effort and uncertainty involved with implementing a story. Story points make...
Written by Morgan Delagrange on — 05:54 reading time
TL;DR Hubot Test Helper works great with the most recent versions of Hubot and Node. With older versions, YMMV. You can find sample tests here and here. They use some convenience methods found here.
Recently our team decided to upgrade to the latest version of Hubot. We’re running version 2.10.0, pretty far behind the current version (2.16.0). We had attempted to upgrade previously, but we discovered that later versions of Hubot broke our test suite. It’s time to find a new way to test!
Let’s start with some research. I found an issue on GitHub called “Define and document testing patterns for scripts”. There are pointers to two testing options: Hubot Mock Adapter and Hubot Test Helper. To me, Hubot Mock Adapter looks like a dead end. It hasn’t been updated for well over a year, and based on a comment from willdurand it won’t work with Hubot 2.14+. So that leaves Hubot Test Helper. Let’s try it out.
[I’ve set up a GitHub repository with all the steps described in this...
I was inspired by a blog post I read the other day about a junior software engineer’s experience of being hired by a company, knowing she was junior, and being let go because she was “too junior”. In her blog post, Jenn identifies several issues a company should consider before hiring junior engineers. As a company that regularly hires junior engineers I thought it would be a good idea to share our experience.
For all of my time at Leapfrog (going on seven years at this point) the technology department has been referred to as a teaching hospital. We hire junior engineers and give them the environment and tools they need to succeed as software engineers.
We view our junior engineers (test and software) as the latest batch of interns to our fine hospital under the tutelage of our other software engineers. We give the junior engineers the opportunity to apply what they have learned academically on the job and expose them to technologies, tools, and situations that they probably would...
A while back, Kris wrote about some of the interesting ways redis configuration choices can harm the performance of your applications. The investigation he described was the most illuminating so far in our efforts to understand and tame the microservice beasts we have bred, but I’d like to talk a bit about my own pet preoccupations and how I came by them: specifically, how in translating the library we use to build Python workers into Clojure, I learned a few new things about the code that had been under my nose all along.
A Pattern for Distributing Synchronous Tasks Using Redis
So Kris mentioned PSH, which stands for Psomething Service Hub. (Well, almost—the Psomething part is Top Psecret!) To build it, we wrote a couple of “tasklib” libraries, one in Ruby and one in Python, to enqueue synchronous “tasks”—remote function calls that return all the disparate pieces of context we need to build our pages—in redis. On the other end, worker processes use one...
NOTE: This is somewhat of a departure from our normal nerdery, but a pretty decent indication of how and why we’re building our Conversion Platform and the reason why we think it’s going to provide some meaningful value.
The term programmatic appears often in discussions of digital marketing and advertising. It’s most commonly deployed in compound form as programmatic media or programmatic buying, referring to the collection of information about consumer preferences, publisher inventory and pricing, advertiser goals, and the use of software programs to automate the dependent processes of data collection, purchasing, ad trafficking and performance management.
The big idea here is about efficiency, both in the sense that the software can perform calculations and make decisions at a scale and speed that eclipses that of human media buyers, and in the sense of efficient markets. As of the first half of 2015, the use of programmatic media, or programmatic buying, has become not just all...
Postgres has supported the ltree datatype as far back as 8.3 (potentially even further back but 8.3 is the oldest version of docs that were available). The ltree datatype is Postgres’ implementation of materialized paths, allowing the storage of label trees (hence the name) that represent the path to a record in a hierarchical structure. Materialized paths, adjacency lists and nested sets represent the three primary methods for modeling hierarchical data in a relational database. I’ll give a brief overview of those methods but will not delve too deeply as there are plenty of online resources that discuss the varying approaches at length. Instead, I’ll talk about how and why I decided to use ltree for FOX, our content management system.
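To make the materialized-path idea concrete, here is a minimal Python sketch. It is not ltree itself and not FOX’s schema: the sample paths and the helper names `is_descendant`, `subtree`, and `depth` are assumptions made up for illustration. Each record stores its full dot-separated path from the root, so “everything under this node” becomes a label-by-label prefix comparison, which is roughly what ltree’s `<@` (is descendant of, or equal) operator answers inside Postgres.

```python
from typing import List

# Hypothetical sample data: each record stores the full path from the root.
PATHS = [
    "Top",
    "Top.Science",
    "Top.Science.Astronomy",
    "Top.Science.Astronomy.Cosmology",
    "Top.Hobbies",
    "Top.Hobbies.Amateurs_Astronomy",
]


def is_descendant(path: str, ancestor: str) -> bool:
    """True if `path` equals `ancestor` or sits beneath it.

    Labels are compared one by one, so "Top.Science" is not treated as
    a descendant of "Top.Sci" the way a raw string prefix test would.
    """
    labels = path.split(".")
    prefix = ancestor.split(".")
    return labels[: len(prefix)] == prefix


def subtree(paths: List[str], ancestor: str) -> List[str]:
    """Return every stored path at or below `ancestor`, in input order."""
    return [p for p in paths if is_descendant(p, ancestor)]


def depth(path: str) -> int:
    """Number of labels in the path (the root has depth 1)."""
    return len(path.split("."))


print(subtree(PATHS, "Top.Science"))
print(depth("Top.Science.Astronomy"))
```

The trade-off this illustrates: reading a whole subtree needs no recursion, but moving a node means rewriting the stored path of every record beneath it.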
Modeling a Hierarchy in a Relational Datastore
A problem that every software engineer will encounter in their lifetime is how to model hierarchical data in a relational database. They will have to make the choice between three modeling methods: adjacency...
Redis has become an increasingly important part of our production stack at Leapfrog. We use it in two critical functions: as a session store and as the transport medium for our microservice implementation (PSH). The throughput of Redis was a big draw when looking at technologies to use for PSH and the list data types are also very useful for feeding tasks to workers. However, over time we started to identify some performance issues with task throughput. Overall, the performance was great but we would occasionally see spikes of 20+ seconds when pushing a new task (a small JSON blob with parameters) onto the end of a list. Thus began the great redis investigation of 2015.
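For readers who haven’t used Redis lists this way, the general shape of “push a small JSON blob onto the end of a list, let a worker pop it off the front” can be sketched as follows. This is an illustrative sketch only, not PSH or the tasklib code: the queue name, the task format, and the functions `enqueue` and `work_once` are all assumptions. To keep it runnable without a server, it uses a small in-memory stand-in that mimics the two list commands involved (`rpush` and `blpop`); a real redis-py client exposes methods with the same names. Returning results to the caller is left out.

```python
import json
from collections import defaultdict, deque


class InMemoryListStore:
    """Stand-in for a Redis client, covering only the two calls used below."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def rpush(self, name, *values):
        # Append to the tail of the list and return its new length.
        self._lists[name].extend(values)
        return len(self._lists[name])

    def blpop(self, keys, timeout=0):
        # A real BLPOP blocks until an item arrives or the timeout passes;
        # this stand-in simply returns None when every list is empty.
        for key in ([keys] if isinstance(keys, str) else keys):
            if self._lists[key]:
                return key, self._lists[key].popleft()
        return None


def enqueue(client, queue, task_name, **params):
    """Serialize a task as a small JSON blob and push it onto the list tail."""
    payload = json.dumps({"task": task_name, "params": params})
    return client.rpush(queue, payload)


def work_once(client, queue, handlers):
    """Pop one task from the list head and run its handler, if any is waiting."""
    item = client.blpop(queue, timeout=1)
    if item is None:
        return None
    _key, raw = item
    task = json.loads(raw)
    return handlers[task["task"]](**task["params"])


store = InMemoryListStore()
enqueue(store, "tasks", "add", a=2, b=3)
print(work_once(store, "tasks", {"add": lambda a, b: a + b}))
```

In this shape, the push is the step where the latency spikes described above would be felt by the caller.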
Under The Microscope
The entire ordeal began as we prepared to migrate datacenters. Every component of the infrastructure was under the microscope to make sure that the new data center was running properly. As our primary platform (FOX) was being tested, we started taking note of these latency spikes. We noticed the...
I’ve had the luxury of having a lot of great managers in my career. I’ve also had the luxury of having some really awful ones. And while I’ve learned a lot from my great managers, I’ve also learned a lot from the bad ones. In fact, I’ve probably learned a whole lot more from the bad ones.
The great managers I’ve had were strong proponents of the one on one. The bad managers didn’t know what a one on one was.
The Manager Tools site has a great Podcast on the subject that still resonates even though it’s almost 9 years old. Their description:
The single most effective management tool.
If Podcasts aren’t your cup of tea, or if you stopped reading once you saw the word management then maybe Rands’ take on management is more your style:
First, those [people] don’t work for you; you work for them. Think of it like this: if those [people] left, just left the building tomorrow, how much work would actually get done?
The deeper I’ve gotten into management, the more I appreciate...
I remember flipping through the channels one Saturday morning, trying to find something good to watch, when I suddenly came across a goofy looking man with a paintbrush and one of the best damn afros you’ve ever seen. That man was none other than Bob Ross, artist and host of the PBS series, “The Joy of Painting.”
It’s very easy to get distracted by the physical appearance of Mr. Ross, but all joking aside, there was something quite magical about his style of painting. I distinctly remember being mesmerized by his methodical, yet simplistic approach. Ross was an amazing teacher with this innate ability to take something that most people viewed as complex and make it approachable. This trait, along with patience and passion for his craft, is what made him so great. In fact, the more I reflect, the more I can’t help but think he would have made a fantastic developer.
That’s a great story, but what does this have to do with animation? Well, as someone still relatively new to the...
While the recipe for JSON-SOAP interfaces with CXF, Jackson, and Spark seems great on paper, it’s important to keep in mind that the majority of time spent working on such a SOAP integration is likely to be in configuration. Consider the scenario where you already have base classes to handle run-time configuration, client set-up, and service routes. Now it’s just a matter of setting up data structures to accept JSON inputs, and passing the data along to the SOAP port, right? Wrong.
Here we list some issues we’ve run into with SOAP integrations, and barebones solutions.
An endpoint with a self-signed certificate
Consider a development endpoint provided by a vendor that uses HTTPS but has a self-signed SSL certificate. Access to the endpoint is only available through a VPN between you and the vendor. In this circumstance, you probably don’t care too much about the validity of the development endpoint’s HTTPS certificate.
There are multiple solutions to this problem, but one is to simply...
Often we don’t have a choice over the systems we integrate with. Not everything can be a simple, self-documented, “RESTful” API.
Consider a SOAP API that requires a standard but uncommonly used feature. The average library in your programming language of choice may not support every SOAP feature under the sun. Even Leapfrog’s own SOAP library for Python is far from feature-complete.
Now consider a diverse stack made up of multiple distinct systems, written in multiple programming languages, for both consumer-facing and internal business procedures.
The promise of SOA through SOAP solutions becomes less and less appealing. At Leapfrog, we’ve found that adopting JSON as a lightweight alternative to SOAP in a SOA-inspired platform offers the best of all worlds.
So what about all your vendors that use SOAP?
Fear not! There’s a recipe to solve all your problems:
“If you want to make God laugh, tell him about your plans.” Woody Allen
Happy Friday the 13th.
The single incontrovertible fact in anything is that things are going to go wrong. Oh, you can try and plan for it. You can develop resiliency, elasticity, scalability and all the other ty’s you want, but things are still going to go wrong. Hypervisors are going to fail, deployments are going to botch things even though CI said it wouldn’t, message queues are going to go haywire, and things are going to crash.
Brent Chapman gave a great talk at O’Reilly’s 2008 Velocity conference titled “Incident Command for IT: What We Can Learn from the Fire Department.” After looking at this presentation, our team started using ICS anytime things went south. At first, ICS can seem a little heavyweight, especially in the heat of the moment. You might honestly think “Do we need to use ICS for this? Let’s just figure out why the message queue isn’t draining!” And sometimes, that’s the right thinking ...
When I was first introduced to Unix systems, I was very happy to just type away not caring much about the interface. I thought in a very classic 13 year old, naive manner: It’s just the command line, what’s there to change?! I was content with the bare minimum shell and prompt. I didn’t have any knowledge that I could change anything, nor did I have any experience as to why I would want to change it. It’s the command line after all.
In the beginning, and my recollection is hazy, it probably looked something like this:
Over time, after logging into different systems and seeing different shells and prompts, I came to the realization that I could change the interface and that I had actual reasons to change it. But really the primary reason to change it was that I could change it!
I started changing not just my default shell but my default PS1. But little stuff. Add a \u here, maybe a \W and so on and so forth. I never thought much about it, it was very fickle and it never really felt...
Written by Luther Rochester on — 05:45 reading time
Our web systems live on UNIX-y hosts, and we’ve got a robust Nagios implementation to monitor and alert us for all those systems. However, our BI platform is Windows and SQL Server based, and we didn’t want to have a separate monitoring system for those servers and databases. We came up with some tricks that have worked well for us to integrate Nagios into our Windows ecosystem.
Install Nsclient++ on Windows boxes
In order to get monitoring stats into Nagios, we’ve installed the Nsclient++ application on all our Windows machines. This is a very lightweight, handy client that enables all sorts of monitoring data. For our purposes, we’ve got it configured to pass Nagios checks to Windows Performance Monitor counters via the check_nt protocol.
Nsclient++ utilizes a simple ini file to set the basic configuration. Our Nagios server was added to the Allowed Hosts section, and “NSClientserver” was set to 1 to allow the check_nt command to flow through.
RESTful APIs have become commonplace in the world of the web.
Given that “REST” itself is a concept and not a protocol, RESTful APIs must only obey a minimal set of rules that can be abused by developers in any way they like.
While the transport protocol might often be HTTP, the rules for interacting with the API (i.e., the API’s protocol) are largely defined by the API itself. I almost want to label these as “simple HTTP services”, because many APIs that label themselves as “RESTful” really aren’t. Consider a JSON-over-HTTP API that happens to maintain state. It’s not RESTful, but it uses elements commonly found in RESTful APIs.
The simple HTTP service has gained popularity for a few reasons. It uses the same transport protocol as traditional webpages, and if it’s a RESTful service, it’s modeled on how the web works: stateless and resource-oriented. An HTTP service that uses JSON as the payload format enables quick and easy interoperability with a web browser—and nearly every modern...
In Part 3 of this series, we had a look at the find command and the set builtin, two tools that have a lot to offer and come with some lengthy manuals.
Commands dealing with file permissions have short manuals. They shouldn’t!
Unix file permissions seem to be another thing about the command-line that annoys people. You have these commands, chmod, chown, chgrp, etc., and while the latter two seem reasonable enough, the first can accept numbers or letters as arguments; neither technique is all that intuitive. When Terry asked us during a lightning talk, “do you use numbers or letters?”, most responded, to his astonishment, with “numbers.”
I like both methods for different reasons. But let’s back up first. The most basic file permissions dictate if a file is readable, writeable, or executable (or any combination of them). A directory is a type of a file, but it behaves a little differently than what you might expect depending on its readable, writeable, and executable status...
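The numbers-versus-letters question is easier to reason about once you see that both notations describe the same nine bits. The following Python sketch (an illustration, not part of the original walkthrough; `symbolic_to_octal` is a made-up helper name) converts between the two. It relies on the standard library’s `stat.filemode`, which renders a mode the way `ls -l` does.

```python
import stat


def symbolic_to_octal(perms: str) -> int:
    """Convert nine permission letters (e.g. "rwxr-xr-x") to a numeric mode.

    Each position is one bit: a letter means the bit is set, "-" means it
    is clear. Read three bits at a time and you get the familiar octal
    digits (read=4, write=2, execute=1) for owner, group, and others.
    """
    if len(perms) != 9:
        raise ValueError("expected exactly nine characters, like 'rw-r--r--'")
    bits = 0
    for ch in perms:
        bits = (bits << 1) | (ch != "-")
    return bits


def octal_to_symbolic(mode: int, is_dir: bool = False) -> str:
    """Render a numeric mode in `ls -l` style, including the file-type column."""
    file_type = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(file_type | mode)


print(oct(symbolic_to_octal("rwxr-xr-x")))
print(octal_to_symbolic(0o644))
print(octal_to_symbolic(0o750, is_dir=True))
```

The numeric form is compact when you want to set every bit at once; the letter form is handier when you want to describe one change in words.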
In Part 2 of this series, we had a look at looping strategies to help us determine which strategy might be best for us under particular circumstances.
One of the other options that we didn’t touch on was using the find command to help us operate on a group of files. We’ll look into find here, and also the set builtin: both tools that have many options, are a source of confusion, and can also be very useful.
Stop fearing find
The find command always seems to bother people. When using find, think, “I’m looking somewhere for something.”
Right now, I have three files in the directory called “stuff” within my home directory: fobar, foobar, and fooxbar. Let’s say I want to get a list of all files matching “foo” in there:
As a computer-something student, one of the things I felt was under-emphasized in my college education was the importance of the command-line in “real life.” Fire up an IDE with auto-completion, code away, and deploy the source to production—erm, professor—for evaluation by paper. Pray for low concentrations of red ink in the result set. That blasphemous blob of junk you delivered ran at least once (your classmate swears she saw it run, too); whether it runs again or not doesn’t matter so much. What matters more, in this setting, is the student’s interpretation of a narrowly-defined problem and the demonstrated application of theory to solve it.
This method of learning might be considered harmful. We jump to the meat of the “problem” before we get our feet wet in the “meta-basics.”
Why is it that, when I paired with a fellow classmate, we would often have ideas for how to approach a problem, but lacked a sense for where to begin?
It’s been a little over 3 years since Ethan Marcotte coined the term “Responsive Web Design”, and I think we can all agree that it has proven to be more than just a fad. As the Web continues to evolve into a larger part of our everyday lives, it will become increasingly important to deliver optimal user experiences for the growing number of Web-enabled devices and contexts.
“Day by day, the number of devices, platforms, and browsers that need to work with your site grows. Responsive web design represents a fundamental shift in how we’ll build websites for the decade to come.” Jeffrey Veen
Clearly, the days of designing fixed-width websites are in the rearview mirror. Creating a unique experience for every device is hardly an efficient long-term solution, so where does that leave us? How do we navigate an evolving landscape with problems that are constantly in flux? Furthermore, how do we address these concerns in an efficient and cost effective manner? These are all difficult...
Written by Morgan Delagrange on — 09:10 reading time
In this article, I’ll show you how to configure Boxen to work with GitHub Enterprise, sharing some of the tricks I’ve picked up by using Boxen here at Leapfrog Online.
On February 15th, 2013, GitHub launched a cool configuration management system for Macs called Boxen. If you have a bunch of developers, designers and operations folks in your organization, Boxen is a great tool for getting software installed on their workstations quickly and consistently. Under the hood, Boxen uses two key technologies: 1) Puppet, which installs software on the workstations, and 2) GitHub, which stores the Puppet configurations, propagates configuration changes to users, and handles user authentication and authorization.
At launch, Boxen worked really well for users with github.com accounts, but you couldn’t configure it to work with GitHub Enterprise. Over the following months, the Boxen community worked to add support for GitHub Enterprise installations, and today Boxen works equally...
“The internet needs another blog!” Never Said by Anyone, Ever
Every day we’re bombarded by a sea of what someone who used to be cool called too much information. It’s the nature of the internet. It’s a strength and a weakness. It doesn’t matter if you’re trying to decipher the meta-nature of True Detective or if you’re trying to divine a way to get Docker containers to work at scale; there’s an answer out there or someone has a point of view that they’d like to share.
The point of this blog is two-fold:
We hope that this series of posts and articles can help out others.
We hope that the writing of these posts and articles can help us.
The content contained in this blog will be a combination of tips and tricks, short and long form articles with a decided bent towards all things technology. We will share comparisons, guides, insights, investigations, observations and tips that we have found useful on our collective journeys. We’ll try to avoid metal gifs, well at least this author...