Meet our writer
Sam Moore
Principal Engineer, Betterment
Sam is a Senior Staff Engineer at Betterment, where he collaborates with various teams to build better software and a shared culture of values and principles. At work, you'll usually find him solving problems that make his and his teammates' lives easier. Outside of his passion for programming, Sam has a deep appreciation for The Lord of the Rings, garden gnomes, and tortellini.
Articles by Sam Moore
End-to-end-ish tests using fake HTTP in Flutter
Feb 23, 2022

Writing end-to-end tests is pretty expensive. Typically, they use real devices or sometimes a simulator/emulator and real backend services. That usually means that they end up being pretty slow and they tend to be somewhat flaky. That isn't to say that they're not worth it for some teams or for a subset of the features in your app. However, I'm here to tell you (or maybe just remind you) that tests and test coverage aren't the goal in and of themselves. We write tests in order to prove our features work as intended, and we run those tests consistently to prove that our features don't stop working as intended. That means that our goal when writing tests should be to figure out how to achieve our target level of confidence that our features work as intended as affordably as possible.

It's a spectrum. On one end is 100% test coverage using all the different kinds of tests: solitary unit tests, sociable more-integrated tests, and end-to-end tests; all features, fully covered, no exceptions. On the other end of the spectrum there are no tests at all; YOLO, just ship it. At Betterment, we definitely prefer to be closer to the 100% coverage end of the spectrum, but we know that in practice that's not really a feasible end state if we want to ship changes quickly and deliver rapid feedback to our engineers about their proposed changes. So what do we do? Well, we aim to find an affordable, maintainable spot on that testing spectrum a la Justin Searls' advice. We focus on writing expressive, fast, and reliable solitary unit tests, some sociable integrated tests of related units, and some "end-to-end-ish" tests. It's that last bucket of tests that's the most interesting, and it's what the rest of this post will focus on.

What are "end-to-end-ish" tests?

They're an answer to the question "how can we approximate end-to-end tests for a fraction of the cost?"

In Flutter, the way to write end-to-end tests is with flutter_driver and the integration_test package. These tests use the same widgetTester API that regular widget tests use, but they are designed to run on a simulator, emulator, or preferably a real device. These tests are pretty easy to write (just as easy as regular widget tests) but hard-ish to debug and very slow to run. Where a widget test will run in a fraction of a second to a second, one of these integration tests will take many seconds. We love the idea of these tests, the level of confidence they'd give us that our app works as intended, and how they'd eliminate manual QA testing, but we loathe the cost of running them, both in terms of time and actual $$$ of CI execution. So, we decided that we really only want to write these flutter_driver end-to-end tests for a tiny subset of our features, almost like a "smoke testing" suite that would signal us if something was seriously wrong with our app. That might include a single happy-path test apiece for features like log-in and sign-up. But that leaves us with a pretty large gap where it's way too easy for us to accidentally create a feature that depends on some Provider that's not provided and our app blows up at runtime in a user's hands. Yuck!

Enter, end-to-end-ish tests (patent pending 😉). These tests are as close to end-to-end tests as we can get without actually running on a real device using flutter_driver. They look just like widget tests (because they are just widget tests) but they boot up our whole app, run all the real initialization code, and rely on all our real injected dependencies, with a few key exceptions (more on that next). This gives us the confidence that all our code is configured properly, all our dependencies are provided, our navigation works, and the user can tap on whatever and see what they'd expect to see. You can read more about this approach here.
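To make that concrete, here's a rough sketch of what one of these tests can look like. The app widget, the buildFakeDependencies helper, and the on-screen text are hypothetical stand-ins rather than our real code; the point is that it's an ordinary widget test that pumps the entire app with its dependencies swapped for fakes.

```dart
import 'package:flutter_test/flutter_test.dart';

// Hypothetical imports, for illustration only.
import 'package:my_app/app.dart' show App;
import 'package:my_app/test_support/fakes.dart' show buildFakeDependencies;

void main() {
  testWidgets('user can log in and see the home screen', (tester) async {
    // Real widget tree, real initialization; only HTTP and native plugins
    // are replaced with fakes.
    final dependencies = buildFakeDependencies();
    await tester.pumpWidget(App(dependencies: dependencies));
    await tester.pumpAndSettle();

    // Drive the app the way a user would.
    await tester.tap(find.text('Log in'));
    await tester.pumpAndSettle();

    expect(find.text('Welcome back'), findsOneWidget);
  });
}
```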
"With a few key exceptions"

If the first important distinction of end-to-end-ish tests is that they don't run on a real device with flutter_driver, the second important distinction is that they don't rely on a real backend API. That is, most apps rely on one (or sometimes a few) backend APIs, typically powered by HTTP. Our app is one of those apps. So, the second major difference is that we inject a fake HTTP configuration into our network stack so that we can run nearly all of our code for real but cut out the other unreliable and costly dependency.

The last important hurdle is native plugins. Because widget tests aren't typically run on a real device or a simulator/emulator, they run in a context in which we should assume the underlying platform doesn't support using real plugins. This means that we have to also inject fake implementations of any plugins we use. What I mean by fake plugins is really simple: when we set up a new plugin, we wrap it in a class that we inject into our app. Making a fake implementation of that plugin is typically as easy as making another class, prefixing its name with Fake, and having it implement the public contract of the regular plugin class with suitably real but not quite real behavior. It's a standard test double, and it does the trick. It's definitely a bummer that we can't exercise that real plugin code, but when you think about it, that plugin code is tested in the plugin's test suite. A lot of the time, the plugin code is integration tested as well because the benefits outweigh the costs for many plugins, e.g. the shared preferences plugin can use a single integration test to provide certainty that it works as intended. Ultimately, using fake plugins works well and makes this a satisfyingly functional testing solution.
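For illustration, a plugin wrapper and its fake can be as small as this sketch (the DeviceStorage name and its methods are hypothetical, not one of our actual wrappers):

```dart
// Hypothetical wrapper around a plugin. In the real app, the production
// implementation would delegate to the actual plugin package.
abstract class DeviceStorage {
  Future<String?> read(String key);
  Future<void> write(String key, String value);
}

// Fake implementation injected in end-to-end-ish tests: same public
// contract, "suitably real but not quite real" behavior backed by a map.
class FakeDeviceStorage implements DeviceStorage {
  final Map<String, String> _values = {};

  @override
  Future<String?> read(String key) async => _values[key];

  @override
  Future<void> write(String key, String value) async {
    _values[key] = value;
  }
}
```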
About that fake HTTP thing

One of the most interesting bits of this solution is the way we inject a fake HTTP configuration into our network stack. Before building anything ourselves, we did some research to figure out what the community had already done. Unfortunately, our google-fu was bad and we didn't find anything until after we went and implemented something ourselves. Points for trying though, right? Eventually, we found nock. It's similar to libraries for other platforms that allow you to define fake responses for HTTP requests using a nice API and then inject those fake responses into your HTTP client. It relies on the dart:io HttpOverrides feature. It actually configures the current Zone's HTTP client builder to return its special client so that any code in your project that finds its way to using the dart:io HTTP client to make a request will end up routed right into the fake responses. It's clever and great. I highly recommend using it. We, however, are not using it.

How we wrote our own fake HTTP Client Adapter

As I said, we didn't find nock until after we wrote our own solution. Fortunately, it was a fun experience and it really took very little time! It also meant that we ended up with an API that fit our exact needs rather than having to reframe our approach to fit what nock was able to offer us. The solution we came up with is called charlatan, and it's open-source and available on pub.dev. Both libraries are great and each is designed for a specific challenge; check both of them out and decide which one works for your needs.

Here's what our API looks like and how we use it to set up a fake HTTP client for our tests (sketched below). We construct an instance of the Charlatan class and then use its methods like whenGet to configure it with fake responses that we want to see when we make requests to the configured URLs. We've also created an extension method withDefaults that allows us to configure a bunch of common, default responses so that we don't have to specify those in each and every test case. This is useful for API calls that always behave the same way, like POSTs that return no body, and to provide a working foundation of responses. When a test case cares about the specifics of a response, it can override that default. The last important step is to convert the Charlatan instance into an adapter and pass that into our HTTP client so that the client will use it to fulfill requests.
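Here's a minimal sketch of that setup, based on the description above. The endpoint path and response body are illustrative, and the exact charlatan method names may differ slightly from this sketch; the package's documentation on pub.dev is the source of truth.

```dart
import 'package:charlatan/charlatan.dart';
import 'package:dio/dio.dart';

Dio buildFakeHttpClient() {
  final charlatan = Charlatan();

  // Register a fake response for a templated URL; any GET matching the
  // template is served this body.
  charlatan.whenGet(
    '/users/{id}',
    (request) => {'id': 1, 'name': 'Jane'},
  );

  // Convert the Charlatan instance into an adapter and hand it to dio so
  // the client fulfills requests from the fakes configured above.
  return Dio()..httpClientAdapter = charlatan.toFakeHttpClientAdapter();
}
```

In a test, a client built this way is what gets injected into the app in place of the real one, and an individual test case can layer its own fake responses on top of the shared defaults from withDefaults.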
Under the hood, the Charlatan API is just collecting fake responses and organizing them so that they're easy to access later. The internals are pretty tiny. We provide a class that exposes the developer-friendly configuration API for fake responses, and we implement the HttpClientAdapter interface provided by dio. In our app we use dio and not dart:io's built-in HTTP client, mostly due to preference and slight feature set differences. For the most part, the code collects fake responses and then smartly spits them back out when requested. The key functionality (Ahem! Magic ✨) is only a few lines of code. We use the uri package to support matching templated URLs rather than requiring developers to pass in exactly matching strings for the requests their tests will make. We store fake responses with a URI template, a status code, and a body. If we find a match, we return it; if we don't, we throw a helpful exception to guide the developer on how to fix the issue.

Takeaways

Testing software is important, but it's not trivial to write a balanced test suite for your app's needs. Sometimes, it's a good idea to think outside the box in order to strike the right balance of test coverage, confidence, and maintainability. That's what we do here at Betterment. Come join us!

WebValve – The Magic You Need for HTTP Integration
Apr 26, 2019

Struggling with HTTP integrations locally? Use WebValve to define HTTP service fakes and toggle between real and fake services in non-production environments.

When I started at Betterment (the company) five years ago, Betterment (the platform) was a monolithic Java application. As good companies tend to do, it began growing—not just in terms of users, but in terms of capabilities. And our platform needed to grow along with it. At the time, our application had no established patterns or tooling for the kinds of third-party integrations that customers were increasingly expecting from fintech products (e.g., how Venmo connects to your bank to directly deposit and withdraw money). We were also feeling the classic pain points of a growing team contributing to a single application. To keep the momentum going, we needed to transition towards a service-oriented architecture that would allow the engineers of different business units to run in parallel against their specific business goals, creating even more demand for repeatable solutions to service integration.

This brought up another problem (and the starting point for this blog post): in order to ensure tight feedback loops, we strongly believed that our devs should be able to do their work on a modern, modestly-specced laptop without internet connectivity. That meant no guaranteed connection to a cloud service mesh. And unfortunately, it's not possible to run a local service mesh on a laptop without it melting. In short, our devs needed to be able to run individual services in isolation; by default, the services were set to communicate with one another, meaning an engineer would have to run all of the services locally in order to work on any one of them. To solve this problem, we developed WebValve—a tool that allows us to define and register fake implementations of HTTP services and toggle between real and fake services in non-production environments. I'm going to walk you through how we got there.

Start with the test

Here's a look at what a test might look like to verify that a deposit from a bank was initiated:
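The original post embeds the spec as an image; the following is an illustrative reconstruction of the shape it describes. The endpoints, payloads, and helpers (deposits_path, the user factory, sign_in) are hypothetical.

```ruby
require 'rails_helper'

RSpec.describe 'initiating a deposit', type: :request do
  before do
    # Stub #1: user data from the collaborator service. In the real spec this
    # stub carries 100+ lines of JSON that can't be shared between tests.
    stub_request(:get, 'http://trading.test/api/users/42')
      .to_return(
        status: 200,
        headers: { 'Content-Type' => 'application/json' },
        body: {
          id: 42,
          first_name: 'Jane',
          # ... 100+ lines of user/account JSON omitted ...
        }.to_json
      )

    # Stub #2: the request the test actually cares about.
    stub_request(:post, 'http://trading.test/api/deposits')
      .to_return(status: 201, body: { status: 'initiated' }.to_json)
  end

  it 'initiates a deposit from the linked bank account' do
    user = create(:user, id: 42)
    sign_in user
    post deposits_path, params: { amount: 100, account_id: 'bank-1' }
    expect(response).to have_http_status(:created)
    expect(a_request(:post, 'http://trading.test/api/deposits')).to have_been_made
  end
end
```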
The five lines of code at the bottom are the meat of the test. Easy, right? Not quite. Notice the two WebMock stub_request calls at the top. The second one has the syntax you'd expect to execute the test itself. But take a look at the first one—notice the 100+ lines of (omitted) code. Without getting into the gory details, this essentially requires us, for every test we write, to stub a request for user data—and with differences across minor things like ID values, we can't share these stubs between tests. In short, it's a sloppy feature spec.

So how do we narrow this feature spec down to something like this? Through the magic of libraries. First things first—defining our view of the problem space. The success of projects like these doesn't come down to the code itself—it comes down to the 'design' of the solution based on its specific needs. In this case, it meant paring the conditions down to making it work using just Rails. Those come to life in four major principles, which guide how we engage with the problem space for our shift to a service-oriented architecture:

- We use HTTP & REST to communicate with collaborator services
- We define the boundaries and limit the testing of integrations with contract tests
- We don't share code across service boundaries
- Engineers must remain nimble and building features must remain enjoyable

A little bit of color on each, starting with HTTP and REST. For APIs that we build for ourselves (e.g. internal services) we have full control over how we build them, so using HTTP and REST is no issue. We have a strong preference to use a single integration pattern for both internal and external service integrations; this reduces cognitive overhead for devs. When we're communicating with external services, we have less control, but HTTP is the protocol of the web and REST has been around since 2000—the dawn of modern web applications—so the majority of integrations we build will use them. REST is semantic, evolvable, limber, and very familiar to us as Rails developers—a natural 'other side of the coin' for HTTP to make up the lingua franca of the web.

Secondly, we need to define the boundaries in terms of 'contracts.' Contracts are a point of exchange between the consumption side (the app) and the producer side (the collaborator service). The contract defines the expectations of input and output for the exchange. They're an alternative to the kind of high-level systems integration tests that would include a critical mass of components that would render the test slow and non-repeatable.

Thirdly, we don't want to have shared code across service boundaries. Shared code between services creates shared ownership, and shared ownership leads to undesirable coupling. We want the API provider to own and version their APIs, and we want the API consumer to own their integration with each version of a collaborator service's API. If we were willing to accept tight coupling between our services, specifically in their API contracts, we'd be well-served by a tool like Pact. With Pact, you create a contract file based on the consumer's expectations of an API and you share it with the provider. The contract files themselves are about the syntax and structure of requests and responses rather than the interpretation. There's a human conversation and negotiation to be had about these contracts, and you can fool yourself into thinking you don't need to have that conversation if you've got a file that guarantees that you and your collaborator service are speaking the same language; you may be speaking the same words, but you might not infer the same meaning. Pact's docs encourage these human conversations, but as a tool it doesn't require them. By avoiding shared code between services, we force ourselves to have a conversation about every API we build with the consumers of those APIs.

Finally, these tests' effectiveness is directly related to how well we can apply them to reality, so we need to keep things simple—we want to be able to test and build features without connections to other features. We want them to be able to work without an internet connection, and if we do want to integrate with a real service in local development, we should be able to do that—meaning we should be able to test and integrate locally at will, without having to rely on cumbersome, extra-connected services (think Docker, Kubernetes; anything that pairs cloud features with the local environment). Straightforward tests are easy to write, read, and maintain. That keeps us moving fast and not breaking things.

So, to recap, there are four principles that will drive our solution:

- Service interactions happen over HTTP & REST
- Contract tests ensure that service interactions behave as expected
- Providing an API contract requires no shared code
- Building features remains fast and fun

Okay, okay, but how?

So we've established that we don't want to hit external services in tests, which we can do through WebMock or similar libraries. The challenge becomes: how do we replicate the integration environment without the integration environment? Through fakes. We'll fake the integration by using Sinatra to build a rack app that quacks like the real thing. In the rack app, we define the routes we care about for the things we normally would have stubbed in the tests. From here, we do the things we couldn't do before—pull real parameters out of the requests and feed them back into the fake response to make it more realistic. Additionally, we can use things like ActiveRecord to make these fake responses even more realistic based on the data stored in our actual database.

So what does the fake look like? It's a class with a route defined for each URL we care about faking.
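Something along these lines. This is a sketch with a hypothetical "trading" service; the routes and payloads are made up, and a real fake would return richer, parameter-driven responses.

```ruby
require 'json'
require 'sinatra/base'
require 'webmock'

# A hand-rolled fake: a Sinatra app with a route for each URL we care about.
class FakeTrading < Sinatra::Base
  get '/api/users/:id' do
    content_type :json
    { id: params[:id].to_i, first_name: 'Jane' }.to_json
  end

  post '/api/deposits' do
    content_type :json
    { status: 'initiated' }.to_json
  end

  # Any route we haven't defined falls through to Sinatra's default 404.
end

# Route matching requests into the fake rack app so they're served in-process.
WebMock.stub_request(:any, /trading\.test/).to_rack(FakeTrading)
```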
We can use WebMock to wire the fake to requests that match a certain pattern. If we receive a request for a URL we didn't define, it will 404. Simple.

However, this doesn't solve all of the things we were aiming for. What's missing?

First, an idiomatic setup stance. We want to be able to define fakes in a single place, so when we add a new one, we can easily find it and change it. In the same vein, we want to be able to answer similar questions about registering fakes in one spot. Finally, convention over configuration—if we can load, register, and wire up a fake based on its name, for example, that would be handy.

Secondly, it's missing environment-specific behavior, which, in this case, translates into the ability to toggle the library on and off and separately toggle the connection to specific collaborator services on and off. We need the library to be active when running tests or doing local development, but we do not want it running in a production environment—if it remains active in a real environment, it might affect real customer accounts, which we cannot afford. But there will also be times when we're running in a local development environment and we want to communicate with a real collaborator service to do some true integration testing.

Thirdly, we want to be able to autoload our fakes. If they're in our codebase, we should be able to iterate on the fakes without having to restart our server; the behavior isn't always right the first time, and restarting is tedious and it's not the Rails Way.

Finally, to bolt this on to an IRL application, we need the ability to define fakes incrementally and migrate them into our existing integrations one by one.

Okay, brass tacks. No existing library allows us to integrate this way and map HTTP requests to in-process fakes for integration and development. Hence, WebValve. TL;DR—WebValve is an open-source gem that uses Sinatra and WebMock to provide fake HTTP service behavior. The special sauce is that it works for more than just your tests: it allows you to run your fakes in your dev environment as well, providing functionality akin to real environments, with the toggles we need to access the real thing when we need to. Let's run it through the gauntlet to show how it works and how it solves for all of our requirements.

First we add the gem to our Gemfile and run bundle install. With the gem installed, we can use the generator rails g webvalve:install to bootstrap a default config file where we can register our fakes. Then we can generate a fake for our "trading" collaborator service using rails generate webvalve:fake_service Trading. This gives us a class in a conventional location that inherits from WebValve::FakeService. This looks very similar to a Sinatra app, and that's because it is one—with some additional magic baked in.
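The generated class looks roughly like this once we fill in the routes we care about. The FakeTrading class name follows the generator's convention; the routes and payloads are illustrative.

```ruby
# Generated by `rails generate webvalve:fake_service Trading`, then filled in
# with the endpoints we want to fake (illustrative routes shown here).
class FakeTrading < WebValve::FakeService
  get '/api/users/:id' do
    content_type :json
    { id: params[:id].to_i, first_name: 'Jane' }.to_json
  end

  post '/api/deposits' do
    content_type :json
    { status: 'initiated' }.to_json
  end
end
```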
To make this fake work, all we have to do is define the conventionally-named environment variable TRADING_API_URL. That tells WebValve which requests to intercept and route to this fake. By inheriting from this WebValve class, we gain the ability to toggle the fake behavior on or off based on another conventionally-named environment variable, in this case TRADING_ENABLED.

So let's take our feature spec. First, we configure our test suite to use WebValve with the RSpec config helper require 'webvalve/rspec'. Then we look at the user API call and define a new route for it in FakeTrading. Then we flesh out that fake route by scooping out the JSON from the test file and probably making it a little more dynamic when we drop it into the fake. Then we do the same for the deposit API call. And now our test, which doesn't care about the specifics of either of those API calls, is much clearer. It looks just like the ideal spec we set out to write at the beginning.

We leverage all the power of WebMock and Sinatra through our conventions and the teeniest configuration to provide all the same functionality as before, but we can write cleaner tests, we get the ability to use these fakes in local development instead of the real services—and we can enable a real service integration without missing a beat. We've achieved our goal—we've allowed for all the functionality of integration without the threats of actual integration. Check it out on GitHub.