Redis TTL is no Free Lunch


Do you know how Redis cleans up memory? If not, keep reading. You see, I now know much more about how Redis cleans up memory than I ever wanted to. Recently, I built a system with a variable load that gets hammered during the fall foliage season, and we ran into a little memory issue with my favorite database. We had to keep scaling it up, and it was getting expensive.

TTL and a Red Bird

The REDBird system uses Redis and writes and reads a lot of time series data. Truth be told, the system creates a lot of keys and leaves them on the system for Redis to clean up. Kind of like that brother who doesn’t clean up after himself in the kitchen and thinks mom will do it for him. There is only one problem: we write and expire keys faster than Redis can clean them up in its default configuration.

Keys in Redis are expired in two ways: a passive way and an active way. When a client tries to read an expired key, the key is deleted on the spot; that’s the passive way. The active way entails the system sampling 20 keys at random, testing them for an expired Time To Live (TTL), and deleting any that have expired. If it finds that more than 25% of the sample was expired, it runs again.
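That sampling loop is easier to see in code. Here is a tiny Python model of the active cycle; the dict-backed `db`, the function name, and the simplified stop rule are my own sketch, not Redis internals:

```python
import random
import time

def active_expire_cycle(db, sample_size=20, threshold=0.25):
    # `db` models keys that have a TTL set: key -> absolute expiry timestamp.
    while db:
        sample = random.sample(list(db), min(sample_size, len(db)))
        expired = [k for k in sample if db[k] <= time.time()]
        for k in expired:
            del db[k]
        # Stop once 25% or fewer of the sampled keys were expired;
        # otherwise assume many more are lurking and go around again.
        if len(expired) <= threshold * len(sample):
            break
```

The takeaway: when a large fraction of your keyspace expires at once, the loop has a lot of rounds to chew through, and it only gets scheduled so often.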

This is where things get a little murky for me. The Redis documentation says that the cleanup function will run 10 times a second, but that it will stop if it does not find greater than 25% of the sampled keys expired. I have some questions. Stop for how long? And what does “start again if it finds greater than 25% are expired” actually mean?

hz and Frequency

The answer to most of my questions was inside the redis.conf file. In the file, under the hz configuration setting, we get this little gem of a comment.

Redis calls an internal function to perform many background tasks, like closing connections of clients in timeout, purging expired keys that are never requested, and so forth.

Wow, and when we keep reading, we get this bonus.

By default “hz” is set to 10. Raising the value will use more CPU when Redis is idle, but at the same time will make Redis more responsive when there are many keys expiring at the same time, and timeouts may be handled with more precision.
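For anyone running their own Redis, the setting itself is a one-liner in redis.conf. A sketch of what we would have changed; the right value depends entirely on your workload:

```conf
# redis.conf: raise the background-task frequency from its default of 10
hz 100
```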

There it is. The magical number 10. Now I know what you are thinking: adjust the “hz” configuration value and REDBird will fly again. The problem is, Amazon ElastiCache does not give you access to this configuration setting. I wonder why? Not really, I know why. Moving “hz” to 100 or 500 would reduce the total number of Redis instances that Amazon can deploy on the same hardware compared to an “hz” of 10. Ok, that’s fair, but it sucks. Moving on.

Fly REDBird Fly

So how did we fix this? We took some inspiration from this and another blog post and started asking Redis to look up random keys in a fire-and-forget pipeline batch. We started out with batches of 50K every second and eventually found that a 10K batch every couple of seconds gave us a good balance of memory cleanup and system load. Today REDBird is flying even during the fall, and we were able to reduce the size of our cluster, saving more money for the chipmunks and lumberjacks’ retirement fund.
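The trick works because merely touching an expired key triggers Redis’s passive expiration. A minimal sketch of the idea using the redis-py client API; the function name and cadence are mine, not a library feature:

```python
def sweep_expired(client, batch_size=10_000):
    # Reading a key with an expired TTL triggers Redis's *passive*
    # expiration, so a big pipelined batch of RANDOMKEY commands
    # nudges Redis into reclaiming memory without waiting for the
    # active (hz-driven) cycle to catch up.
    pipe = client.pipeline(transaction=False)
    for _ in range(batch_size):
        pipe.randomkey()
    pipe.execute()  # fire and forget: we never look at the keys returned
```

Run it on a timer, e.g. `sweep_expired(redis.Redis(), 10_000)` every couple of seconds, and tune the batch size and interval against your memory and CPU graphs.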

GraphConnect 2018

One of the best parts of my job is how much I get to learn about topics that I have previously only heard of. One of those things was graph databases and how they can be used and abused for connected data.


I was lucky enough to go to this year’s GraphConnect 2018 in New York City this September 20-21 at the Marriott in Times Square. As a person who grew up on Long Island and worked in New York City, this was an interesting homecoming for me.

The Conference

The conference was a well-put-together affair, and I learned a ton about how graphs are used in other industries and about the intersection of graph databases, Machine Learning, Deep Learning, and AI in general.

Talks

  • State of the Graph: Opening Keynote
  • The Present and Future of Artificial Intelligence and Machine Learning
  • Neo4j Graph Platform for Demographics Master Data at Citigroup
  • Predictive Analysis from Massive Knowledge Graphs on Neo4j
  • Neo4j Morpheus: Interweaving Table and Graph Data in Spark
  • Large-scale real time recommendations with Neo4j
  • DeepWalk – Turning Graphs into Features via Network Embeddings
  • How Graph Technology is Changing Artificial Intelligence and Machine Learning
  • Infinite Segmentation: Scalable Mutual Information Ranking on real world graphs
  • Machine Learning Algorithms in Neo4j

Classes

  • Neo4j for Data Science and Machine Learning
  • Graph Modeling Clinic

Taking a Walk

I had a lot of fun at the conference, meeting new people and vendors selling everything from graph-based analytics to ML training systems. The best talk of the whole conference was “DeepWalk – Turning Graphs into Features via Network Embeddings”. It was standing room only for one of the authors of the paper, and it was a lot of fun walking through the paper and discussing how graphs can be used for feature extraction from our connected datasets and ML training pipelines.

The first day was all talks, while the next day was all learning. I took the Graph Modeling Clinic to help me get over my propensity to design a data model like an RDBMS ERD diagram, and the “Data Science and Machine Learning” course for the fun of learning more about Machine Learning and model training.

All in all, a nice conference with interesting speakers and training, without a lot of pushy salespeople. I have to hand it to Neo4j; they put on a nice conference.

The Hyper Logger

The other day while working at Chipmunks &amp; Lumberjacks, I was asked to create a computer system to count the distinct set of leaf shapes, sizes, and colors of every tree in the forest. I know what you are thinking: why would anyone want to know this? For C&amp;L it’s really important; they use it when scoring whether a tree can be harvested.

The naive implementation of this system would be to add all of the leaves into a database and run distinct queries. That might work for a while, but what happens when the total number of trees and their associated leaves climbs into the billions or trillions? Will it still perform? The short answer is no; the longer answer is, absolutely not, are you kidding me!

Probabilistic Data Structures

So, what is a geek working for C&amp;L to do? Enter the probabilistic data structure HyperLogLog. This sleek and beautiful data structure will tell you the distinct count of items entered into it with about a 2% error rate. Ok now, come on, 2% is not that bad, especially when you can count around 10^9 items in just 1.5 KB of memory. Flajolet’s probabilistic counting, the 1980s ancestor of HyperLogLog, might be the best thing to come out of that decade besides Nintendo and the Sony Walkman.

HyperLogLog data structures have an Add, a Count, and a Merge function. Add inserts a new value into the data structure. Count tells you the approximate number of distinct items added. Merge takes two HyperLogLog data structures and returns a new one representing the distinct set of the two combined.
Add and Count are the meat and potatoes of HyperLogLog, but don’t discount Merge; some very interesting features can be built with it. For C&amp;L, we needed the 1, 7, and 30 day counts for our tree leaves. To save some space, we used one data structure per day and merged the days to find the week and month values. Neat, right?
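To make the three operations concrete, here is a toy, self-contained Python sketch of a HyperLogLog. The class name, register count, and hash choice are all mine; this is the textbook algorithm, not Redis’s implementation:

```python
import hashlib
import math

class HLL:
    def __init__(self, p=14):
        self.p = p            # 2^p registers; more registers = lower error
        self.m = 1 << p
        self.reg = [0] * self.m

    def add(self, value):
        # Hash the value to 64 bits; the raw value is never stored.
        h = int.from_bytes(hashlib.sha1(str(value).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                    # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros + 1
        self.reg[idx] = max(self.reg[idx], rank)    # re-adding is a no-op

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        est = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.reg)
        zeros = self.reg.count(0)
        if est <= 2.5 * self.m and zeros:           # small-range correction
            est = self.m * math.log(self.m / zeros)
        return int(est)

    def merge(self, other):
        # Register-wise max gives the union; both sides must share p.
        out = HLL(self.p)
        out.reg = [max(a, b) for a, b in zip(self.reg, other.reg)]
        return out
```

The day/week/month trick falls out of `merge`: keep one `HLL` per day and merge seven of them for the weekly count, thirty for the monthly.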

Benefits

There are some significant side benefits of using this type of data structure over the typical row in a database. Three of them are GDPR compliance, easy replay and size.

For GDPR, if you are just exchanging the data structures, there is no recognizable information stored inside them: the original value is hashed during the Add command, and only a small register value derived from that hash is kept.

Replay is all about recovering from a distributed-system failure: inserting the same value into a HyperLogLog data structure twice has no effect on the count. It’s basically a no-op, so you can safely re-send data you aren’t sure was recorded.

You have to admit that storing a bazillion items in 1.5 kilobytes is pretty neat. The small size makes sharing and transmitting the structure between data centers, or between servers in the same center, relatively easy.

A Red Bird


The new system, named REDBird, is built and deployed to multiple forests around the world, and C&amp;L is very happy with it. If you have a cardinality challenge at work, you might want to learn more about probabilistic data structures and HyperLogLog.

Learning Clojure

It’s that time of year again at work: Code Freeze. The time of year where the code in production needs to be highly stable and predictable, as opposed to the rest of the year, where it needs to be highly stable and predictable 😜. The code is not truly frozen, it’s kind of slushy, but the benefit is that I can focus some of my extra energy on learning something new, or something old, like Lisp.

Yes, Lisp

Those of you who know me know that I had a ball learning from “The Land of Lisp”, with its catchy show-tune-like videos, and from “The Realm of Racket”, with all of its interactive games. My next fun coding book is going to be “Clojure for the Brave and True”. The book looks like a lot of fun, and as a bonus I will get more exposure to Java and the JVM.

The learning plan is to work through the book during “code freeze” and hopefully do some fun side projects with Clojure. I always find it better to “use it in anger” to really learn a topic instead of just following a prescribed plan. I don’t have any idea what the side project will be yet but I think the book will give me some good ideas.

The GitHub repo has been created, and I have my IDE and REPL set up and working. All that’s left is the fun part: reading and learning.

My Personal C# Style Guide

The other day I was asked at work to write a style guide for our C# practice. So what is a developer to do? Googling it was a good place to start, but none of the ones I found quite fit. Some were 80 pages long and made me nauseous and others completely lacked any detail.

I settled on a modified corefx style guide. So, what do you think? I will be adding to this guide in the future and wanted to make sure I didn’t misplace it again. It will now live on in blog form, at least for now.

1 – Use Allman style braces, where each brace begins on a new line. A single line statement block can go without braces but the block must be properly indented on its own line and it must not be nested in other statement blocks that use braces.

2 – Use four spaces of indentation (no tabs).

3 – Use camelCase for internal and private fields and use readonly where possible. When used on static fields, readonly should come after static (i.e. static readonly not readonly static).

4 – Avoid this. unless absolutely necessary.

5 – Always specify the visibility, even if it’s the default (i.e. private string foo not string foo). Visibility should be the first modifier (i.e. public abstract not abstract public).

6 – Namespace imports should be specified at the top of the file, outside of namespace declarations and should be sorted alphabetically.

7 – Avoid more than one empty line at any time. For example, do not have two blank lines between members of a type.

8 – Avoid spurious free spaces. For example avoid if (someVar == 0)..., where the dots mark the spurious free spaces. Consider enabling “View White Space (Ctrl+E, S)” if using Visual Studio, to aid detection.

9 – Do not commit large blocks of commented-out code; use the source code repository’s history for this. Any commented-out code in an actively edited file should be removed. That said, please do not sweep through the code base deleting commented-out code just to check that in.

10 – Do not create an interface unless there are two or more non-test scenario implementations that use that interface.

11 – Within a class, struct, or interface, elements should be positioned in the following order:

  • Constants
  • Fields
  • Constructors
  • Finalizers (Destructors)
  • Delegates
  • Events
  • Enums
  • Interfaces
  • Properties
  • Indexers
  • Methods
  • Structs
  • Classes

static elements have to appear before instance elements.

Elements should be ordered by access:

  • public
  • internal
  • protected internal
  • protected
  • private

12 – Avoid putting multiple top-level classes/interfaces/enums in the same file.

13 – Avoid the use of regions in code unless it is surrounding auto-generated code. Do not use regions to separate the different types of class members.

14 – Use var when it’s obvious what the variable type is (i.e. var stream = new FileStream(...) not var stream = OpenStandardInput()).

15 – We use language keywords instead of BCL types (i.e. int, string, float instead of Int32, String, Single, etc.) both for type references and for method calls (i.e. int.Parse instead of Int32.Parse). See issue 391 for examples.

16 – We use PascalCasing to name all our constant local variables and fields. The only exception is for interop code where the constant value should exactly match the name and value of the code you are calling via interop.

17 – We use nameof(...) instead of "..." whenever possible and relevant.

18 – When including non-ASCII characters in the source code use Unicode escape sequences (\uXXXX) instead of literal characters. Literal non-ASCII characters occasionally get garbled by a tool or editor.

19 – If a file happens to differ in style from these guidelines, update the file if you are actively working on that file.

Use the .NET Codeformatter Tool to ensure the code base maintains a consistent style over time; the tool automatically fixes the code to conform to the guidelines outlined above.

Example File:

ObservableLinkedList.cs:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.ComponentModel;
using System.Diagnostics;
using Microsoft.Win32;

namespace System.Collections.Generic
{
    public partial class ObservableLinkedList&lt;T&gt; : INotifyCollectionChanged, INotifyPropertyChanged
    {
        private ObservableLinkedListNode&lt;T&gt; head;
        private int count;

        public ObservableLinkedList(IEnumerable&lt;T&gt; items)
        {
            if (items == null)
                throw new ArgumentNullException(nameof(items));

            foreach (T item in items)
            {
                AddLast(item);
            }
        }

        public event NotifyCollectionChangedEventHandler CollectionChanged;

        public int Count
        {
            get { return count; }
        }

        public ObservableLinkedListNode&lt;T&gt; AddLast(T value)
        {
            var newNode = new ObservableLinkedListNode&lt;T&gt;(this, value);

            InsertNodeBefore(head, newNode);
            count++;
            return newNode;
        }

        protected virtual void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
        {
            NotifyCollectionChangedEventHandler handler = CollectionChanged;
            if (handler != null)
            {
                handler(this, e);
            }
        }
    }
}

Here is a Visual Studio 2013 .vssettings file for enabling C# auto-formatting conforming to the above guidelines. Note that rules 7 and 8 are not covered by the vssettings, since those rules are not currently supported by VS formatting.

<UserSettings>
    <ApplicationIdentity version="12.0"/>
    <ToolsOptions>
        <ToolsOptionsCategory name="TextEditor" RegisteredName="TextEditor">
            <ToolsOptionsSubCategory name="AllLanguages" RegisteredName="AllLanguages" PackageName="Text Management Package"/>
            <ToolsOptionsSubCategory name="CSharp" RegisteredName="CSharp" PackageName="Text Management Package">
                <PropertyValue name="TabSize">4</PropertyValue>
                <PropertyValue name="InsertTabs">false</PropertyValue>
                <PropertyValue name="IndentSize">4</PropertyValue>
                <PropertyValue name="BraceCompletion">true</PropertyValue>
            </ToolsOptionsSubCategory>
            <ToolsOptionsSubCategory name="CSharp-Specific" RegisteredName="CSharp-Specific" PackageName="Visual C# Language Service Package">
                <PropertyValue name="NewLines_QueryExpression_EachClause">1</PropertyValue>
                <PropertyValue name="Space_Normalize">0</PropertyValue>
                <PropertyValue name="Space_AroundBinaryOperator">1</PropertyValue>
                <PropertyValue name="Formatting_TriggerOnPaste">1</PropertyValue>
                <PropertyValue name="NewLines_Braces_Method">1</PropertyValue>
                <PropertyValue name="Indent_CaseLabels">1</PropertyValue>
                <PropertyValue name="Formatting_TriggerOnBlockCompletion">1</PropertyValue>
                <PropertyValue name="CodeDefinitionWindow_DocumentationComment_IndentOffset">2</PropertyValue>
                <PropertyValue name="NewLines_Braces_ControlFlow">1</PropertyValue>
                <PropertyValue name="NewLines_Braces_AnonymousMethod">0</PropertyValue>
                <PropertyValue name="Space_WithinOtherParentheses">0</PropertyValue>
                <PropertyValue name="Wrapping_KeepStatementsOnSingleLine">1</PropertyValue>
                <PropertyValue name="Space_AfterBasesColon">1</PropertyValue>
                <PropertyValue name="Indent_Braces">0</PropertyValue>
                <PropertyValue name="Wrapping_IgnoreSpacesAroundVariableDeclaration">0</PropertyValue>
                <PropertyValue name="Space_WithinMethodCallParentheses">0</PropertyValue>
                <PropertyValue name="Space_AfterCast">0</PropertyValue>
                <PropertyValue name="NewLines_Braces_CollectionInitializer">0</PropertyValue>
                <PropertyValue name="NewLines_AnonymousTypeInitializer_EachMember">1</PropertyValue>
                <PropertyValue name="NewLines_Keywords_Catch">1</PropertyValue>
                <PropertyValue name="NewLines_Braces_ObjectInitializer">0</PropertyValue>
                <PropertyValue name="NewLines_Braces_ArrayInitializer">0</PropertyValue>
                <PropertyValue name="Space_WithinExpressionParentheses">0</PropertyValue>
                <PropertyValue name="Space_InControlFlowConstruct">1</PropertyValue>
                <PropertyValue name="Formatting_TriggerOnStatementCompletion">0</PropertyValue>
                <PropertyValue name="NewLines_Keywords_Finally">1</PropertyValue>
                <PropertyValue name="Space_BetweenEmptyMethodDeclarationParentheses">0</PropertyValue>
                <PropertyValue name="Indent_UnindentLabels">0</PropertyValue>
                <PropertyValue name="NewLines_ObjectInitializer_EachMember">1</PropertyValue>
                <PropertyValue name="NewLines_Keywords_Else">1</PropertyValue>
                <PropertyValue name="Space_WithinMethodDeclarationParentheses">0</PropertyValue>
                <PropertyValue name="Space_BetweenEmptyMethodCallParentheses">0</PropertyValue>
                <PropertyValue name="Space_BeforeSemicolonsInForStatement">0</PropertyValue>
                <PropertyValue name="Space_BeforeComma">0</PropertyValue>
                <PropertyValue name="Space_AfterMethodCallName">0</PropertyValue>
                <PropertyValue name="Space_AfterComma">1</PropertyValue>
                <PropertyValue name="Wrapping_IgnoreSpacesAroundBinaryOperators">0</PropertyValue>
                <PropertyValue name="Space_BeforeBasesColon">1</PropertyValue>
                <PropertyValue name="Space_AfterMethodDeclarationName">0</PropertyValue>
                <PropertyValue name="Space_AfterDot">0</PropertyValue>
                <PropertyValue name="NewLines_Braces_Type">1</PropertyValue>
                <PropertyValue name="Space_AfterLambdaArrow">1</PropertyValue>
                <PropertyValue name="NewLines_Braces_LambdaExpressionBody">0</PropertyValue>
                <PropertyValue name="Space_WithinSquares">0</PropertyValue>
                <PropertyValue name="Space_BeforeLambdaArrow">1</PropertyValue>
                <PropertyValue name="NewLines_Braces_AnonymousTypeInitializer">0</PropertyValue>
                <PropertyValue name="Space_WithinCastParentheses">0</PropertyValue>
                <PropertyValue name="Space_AfterSemicolonsInForStatement">1</PropertyValue>
                <PropertyValue name="Indent_CaseContents">0</PropertyValue>
                <PropertyValue name="Indent_FlushLabelsLeft">1</PropertyValue>
                <PropertyValue name="Wrapping_PreserveSingleLine">1</PropertyValue>
                <PropertyValue name="Space_BetweenEmptySquares">0</PropertyValue>
                <PropertyValue name="Space_BeforeOpenSquare">0</PropertyValue>
                <PropertyValue name="Space_BeforeDot">0</PropertyValue>
                <PropertyValue name="Indent_BlockContents">1</PropertyValue>
                <PropertyValue name="SortUsings_PlaceSystemFirst">1</PropertyValue>
                <PropertyValue name="SortUsings">1</PropertyValue>
                <PropertyValue name="RemoveUnusedUsings">1</PropertyValue>
            </ToolsOptionsSubCategory>
        </ToolsOptionsCategory>
    </ToolsOptions>
</UserSettings>

Ruby and Docker

I maintain the Ruby client wrapper for my company’s REST API. I’m not exactly sure how this happened; I might have volunteered, or it could have been bad luck. Either way, it was time for a little bug fixing, and as it so happens, I got a new laptop a couple of weeks ago. You know what that means: starting from scratch, again.

This time will be different. I’ll do it right this time, I tell myself. Maybe a VM, or a Vagrant file, or something hot like Docker.

Oooh, the new hotness, Docker; that would be fun.

I’ve been learning about Docker and Kubernetes for a bit so creating a development environment for some quick Ruby fixes would be fun and educational. Especially, since I already have Docker installed.

With a 5-line Dockerfile and a Ruby 2.2 base image, I was set and ready. A one-line code fix, a version bump, and our Ruby gem was updated. So smooth it was impressive and exciting.

My Dockerfile

FROM ruby:2.2
RUN apt-get update && apt-get install -y build-essential
RUN mkdir -p /app
WORKDIR /app
RUN gem install bundler

With a docker build -t devtheruby . and a docker run -it -v $(pwd):/app devtheruby bash, I had my environment up and running. With the volume mount, I could edit the code from VS Code on the host and build and run tests in the connected terminal. Not a bad way to work.

Docker was really easy to set up, and as a bonus, I added the Dockerfile to the Git repository for future me. I try to be nice to future me; he has to fix all my mistakes. For quick fixes, this seems like a good alternative to VMs. I’m just not sure if this is the way I would want to develop every day. I think I’m going to leave that up to future me.

My Re:Invent 2017


The conference is over, I’m back home and it’s time to digest and make plans. AWS re:Invent 2017 was my first big conference and it was a lot of fun. I learned a lot, and now I have a bunch of new stuff to learn. When will it end? Hopefully never!

Announcements

Here is the list of announcements: some I was hoping for, and some I never dreamed they would make.

DynamoDB Global Tables

The biggest and most welcome announcement at re:Invent for me was DynamoDB Global Tables. Yep, Kubernetes support, graph databases, and Golang for Lambda are great and all, but cross-region replication is going to be a hit at work. No more loading hundreds of GB in three regions from a Data Pipeline job that takes days. YAS!!!

Neptune

The announcement of AWS Neptune was a complete shocker for me. I’ve been researching graph databases for a couple of months and the industry is decidedly against using the common graph database flavors in a transactional system like the one I work on.

Time will tell, but from what I’ve read, Neptune is limited to an in-memory store with read replicas in the same region. For me, that won’t work. I still think Dgraph would be a better fit, with its distributed transactions and multi-machine support. I might start a Dgraph POC in a few weeks, and I’ll try to blog about the results.

Kubernetes

Most cloud providers now have Kubernetes support and it could not have come at a better time. I still need to dig into Kubernetes more and start using it in anger. This announcement was expected but nice to hear.

Go for Lambda

This was a surprise and a very welcome announcement. It’s not released yet, and we did not get a good timeframe for its release, or at least not one good enough for me. When it is released, we will be using it heavily at work.

My Schedule

For those of you that have never been to one of these conferences, here is what my days looked like. I was lucky that all of my sessions were in the Venetian Hotel. Other people I went with had to move between two or three hotels during the day.

Code Description Time
ATC303 Cache Me If You Can: Minimizing Latency While Optimizing Cost Through Advanced Caching Strategies 11/27/17 10:45 AM
ATC301 1 Million bids in 100ms – using AWS to power your Real-Time Bidder 11/27/17 12:15 PM
GAM307 Ubisoft: How For Honor Runs Using Amazon ECS 11/27/17 1:45 PM
GAM401 Designing for the Future: Building a Flexible Event-based Analytics Architecture for Games 11/27/17 3:15 PM
DAT321 From Minutes to Milliseconds: How Careem Used Amazon ElastiCache for Redis 11/28/17 9:15 AM
ARC209 A Day in the Life of a Netflix Engineer III 11/28/17 10:45 AM
ARC401 Serverless Architectural Patterns and Best Practices 11/28/17 1:45 PM
DEV206 Life of a Code Change to a Tier 1 Service 11/28/17 3:15 PM
DAT305 ElastiCache Deep Dive: Best Practices and Usage Patterns 11/28/17 4:45 PM
GEN25 Andy Jassy’s Keynote 11/29/17 8:00 AM
DAT328 Tinder and DynamoDB: It’s a Match! Massive Data Migration, Zero Down Time 11/29/17 11:30 AM
DEV314 Monitoring as Code: Getting to Monitoring-Driven Development 11/29/17 1:45 PM
DAT304 DynamoDB – What’s new 11/29/17 3:15 PM
GEN06 Werner Vogels’ Keynote 11/30/17 8:30 AM
ARC319 How to Design a Multi-Region Active-Active Architecture 11/30/17 11:30 AM
DAT327 DynamoDB adaptive capacity: smooth performance for chaotic workloads 11/30/17 2:30 PM
ARC321 Models of Availability 11/30/17 4:00 PM

Going Again

I think I’m going to need the rest of 2017 to recover, but all in all, it was worthwhile to go to the conference even though you could watch the replays on YouTube. If you ever have the chance to go, you should take it. One of the benefits of going is speaking to the engineers who built the systems you use; I had a great conversation and a good chuckle about AWS support with a few of them. See you next year at re:Invent 2018.