Category Archives: software

OOP is dead, long live OOP

Raise your hand if you remember the golden hammer of Object Oriented Programming. You know, where you only have to code something once and then reuse it in all your subclasses? Remember those days?

Maybe this is what they still teach in dusty lecture halls of today’s Computer Science departments. But if you have spent any significant time coding in the trenches, you will have come to realize what a lie this mantra is.

You see, grandiose hierarchies of objects over time become nigh impossible to manage. At a certain point, one more subclass bolted onto the grand structure is deemed impossible and you must fork the solution. No, there is a different structure out there that we must adopt. And Spring is the platform that carries its banner.

Interface Oriented Programming. Let’s call it IOP since there aren’t enough acronyms in our industry. IOP is the premise that different slices and layers should talk to outsiders through a nice, clean interface. The backing service on the other side should provide a concrete implementation, but the caller need not know about it.

Why do I mention Spring as the champion of IOP? Because Rod Johnson’s foray onto the Java scene was to craft a dependency injector whereby this interface-based contract could be satisfied. Before I learned of Spring, the concept of Java interfaces was foreign. Sure, they were mentioned in my college textbook Java in a Nutshell (dating myself?), but I saw little value in using them. Why?

Because when you are “newing” everything yourself, there appears little value in defining an interface and then assigning the object to it. You already know what it is! What is to be gained from all the extra typing? But delegate that task to a DI container, and suddenly the cost vanishes. Expressing dependency graphs between beans with interfaces becomes much smoother when the container takes over the job of creating everything.
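
Here’s a minimal sketch of that payoff, with hypothetical bean names (an illustration, not code from any particular project):

import org.springframework.stereotype.Component;
import org.springframework.stereotype.Service;

// The caller depends only on this contract...
interface PaymentGateway {
	void charge(int cents);
}

// ...while the container supplies whichever concrete implementation is registered.
@Component
class StripeGateway implements PaymentGateway {
	public void charge(int cents) { /* call the backing service */ }
}

@Service
class CheckoutService {

	private final PaymentGateway gateway;

	// Spring injects the PaymentGateway bean; CheckoutService never "news" a concrete class.
	CheckoutService(PaymentGateway gateway) {
		this.gateway = gateway;
	}
}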

This goes along with the Liskov Substitution Principle, where you can plug in any implementation without the caller knowing what it is. Now sometimes people drag out the old square-is-a-rectangle example. I hate that one, because geometry is a terrible domain for modeling software concepts.

Digging in, interfaces don’t have to be complex. Instead, think of a handful of “getters” and go from there.
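
In sketch form, something like this (the names here are illustrative, not the actual Spring HATEOAS source):

import java.util.List;

// A small, framework-neutral contract: nothing but a handful of "getters".
public interface Affordance {

	String getName();                   // e.g. "updateEmployee"

	String getHttpMethod();             // e.g. "PUT"

	List<String> getInputProperties();  // fields a client may submit
}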

This fragment is part of an ongoing effort to add Affordances to Spring HATEOAS. Since we might support multiple web frameworks, having a clean, unencumbered interface is critical for getting started. It also helps avoid saddling the interface with details from Spring MVC, Spring WebFlux, or JAX-RS. Instead, this interface avoids all of that, forcing the concrete details to be nicely contained apart from each other.

Abstract classes and long hierarchies are often tricky to evolve over time, so I try to dodge that as much as possible. Composition of objects through IOP-driven strategies tends to be more amenable to change. Having said all that, what happens when you need this?
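
Something along these lines (a hypothetical sketch, not the project’s actual source):

import org.springframework.http.MediaType;

public abstract class AffordanceModelFactory {

	// Shared logic is the ONLY reason this is an abstract class. On Java 8,
	// supports() could become a default method on an interface.
	public boolean supports(MediaType mediaType) {
		return getMediaType().equals(mediaType);
	}

	public abstract MediaType getMediaType();
}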

This is ONLY an abstract class because the project is currently on Java 6 and we can’t plug in a default implementation for supports(). With Java 8, this whole thing can be rolled back into an interface. Abstract vs. default interface methods aside, this is the NEW OOP. Do as much as you can to code one interface to many classes, avoiding intermediaries. Sometimes it’s unavoidable, but don’t give up too fast.

Let IOP be your OOP.

And the more you learn about the whole code base, be willing to revisit some of those hierarchies and see if you can’t “move around” stuff to get away from the hierarchies. You might be surprised at how perky your code becomes. Some of those requested features might suddenly seem possible.

The Power of REST – Part 1

I was kind of shocked when I saw Yet Another Posted Solution to REST. I sighed and commented, and drew the ire of many. So I figured this might be a good time to gather some thoughts on REST.

The article starts by criticizing REST and offering their toolkit as the solution. What’s really funny is that the problems they ding are not RESTful issues.

REST requires lots of hops

Let’s start with this one:

As you might notice, this is less than ideal. When all is said and done we have made 1 + M + M + sum(Am) round trip calls to our API where M is the number of movies and sum(Am) is the sum of the number of acting credits in each of the M movies. For applications with small data requirements, this might be okay but it would never fly in a large, production system.

Conclusion? Our simple RESTful approach is not adequate. To improve our API, we might go ask someone on the backend team to build us a special /moviesAndActors endpoint to power this page. Once that endpoint is ready, we can replace our 1 + M + M + sum(Am) network calls with a single request.

This is the classic problem you run into when fetching a 3NF (3rd normal form) data structure served up as REST.

Tip #1: REST doesn’t prevent you from merging data or offering previews of combined data. Formats like HAL include the ability to serve up _embedded data, letting you give clients what they need. Spring Data REST does this through projections, but you can use anything.
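
For instance, a HAL response can embed related resources right inside the parent. A hypothetical, trimmed-down payload:

{
	"title": "Some movie title",
	"_links": {
		"self": { "href": "/movies/1" }
	},
	"_embedded": {
		"actors": [
			{
				"name": "Some actor name",
				"_links": { "self": { "href": "/actors/7" } }
			}
		]
	}
}

One fetch, one hop, and the client still has links for anything it wants to chase later.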

In fact, server-side providers will probably have better insight into the volume of traffic fetching such data than clients will. And through the power of hypermedia, the API can evolve to add such links without breaking existing clients. Old clients can do multiple hops; new clients can proceed to consume the new links, with full backwards compatibility.

REST serves too much data

If you look closely, you’ll notice that our page is using a movie’s title and image, and an actor’s name and image (i.e. we are only using 2 of 8 fields in a movie object and 2 of 7 fields in an actor object). That means we are wasting roughly three-quarters of the information that we are requesting over the network! This excess bandwidth usage can have very real impacts on performance as well as your infrastructure costs!

Just a second ago, we complained that the REST API was serving up too little data, forcing us to take multiple hops. Now we are complaining that it serves too much data and is wasting bandwidth.

The example in the article is a bit forced, given we are probably talking a couple tweets’ worth of data. It’s not like they are shipping 50MB too much. In fact, big-volume data (images, PDFs) would best be served as links in the hypermedia. This would let the browser efficiently fetch a linked item once and lean on its cache.

But I sense the real derision in the article is because the endpoint isn’t tailored to the client’s precise demands. No, the real point of the example is to set up a query technique on the client.

Just put SQL in the client already!

Wouldn’t it be nice if we could build a generic API that explicitly represents the entities in our data model as well as the relationships between those entities but that does not suffer from the 1 + M + M + sum(Am) performance problem? Good news! We can!

With GraphQL, we can skip directly to the optimal query and fetch all the info we need and nothing more with a simple, intuitive query.
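
The kind of query they mean looks something like this (field names assumed from their movie/actor example; the schema is hypothetical):

{
	movies {
		title
		image
		actors {
			name
			image
		}
	}
}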

So now we get to the real intent of the article: introducing a query language. Presumably it solves REST’s straw-man “problems” (it doesn’t).

If you want to write a highly detailed query, just open a connection to the data store and query directly. That’s what query languages are for. Why invent something that’s weblike, but really just Another Query Language?

What problem are you solving?

GraphQL takes a fundamentally different approach to APIs than REST. Instead of relying on HTTP constructs like verbs and URIs, it layers an intuitive query language and powerful type system on top of our data. The type system provides a strongly-typed contract between the client and server, and the query language provides a mechanism that the client developer can use to performantly fetch any data he or she might need for any given page.

This query technology may be quite handy if you must write intense, focused queries. If cutting a couple text-based columns makes that much difference, then REST may not be the solution you seek. (Of course, at that point why not just have your JavaScript frontend open a SQL/MongoDB/Neo4j connection?)

What does REST solve? REST solves the brittleness problem that arose with CORBA and SOAP.

REST makes it possible to evolve APIs without forcing you to update every client at once.

Think about that. When web sites make updates, does the web browser require an update? And why?

It’s no light feat of accomplishment. People were being beaten up left and right as APIs evolved. Updates were tough. Some clients would get broken. And availability is key for web-scale business. So adopting into API design the tactics that made the web resilient sounds like a keen idea to try.

Too bad not enough people actually KNOW what these concepts are, and press on to criticize REST while offering “fixes” that don’t even address its fundamentals. The solution served in the article would put strong domain knowledge into the client, resulting in tight coupling. REST doesn’t shoot for this.

Am I making this assessment up?

This “virtual graph” is more explicitly expressed as a schema. A schema is a collection of types, interfaces, enums, and unions that make up your API’s data model. GraphQL even includes a convenient schema language that we can use to define our API.

Agreed. Continuing with more tight coupling instead of letting server side logic remain server side would align with SOAP 2.0, in my opinion. And it’s something I don’t much care for.

To dig in a little more about how REST makes it possible to evolve APIs with minimal impact, wait for my next article, The Power of REST – Part 2.

The myth of polymorphism

I remember reading about polymorphism for the first time. I was in high school, and boy it sure looked cool! Too bad I didn’t realize that the myth of polymorphism was a bunch of poppycock.

You see, polymorphism never seems to be presented in its real state. Instead, we get this goofy, toy-app type presentation. Does Shape -> Rectangle -> Square ring a bell?

The fallacy of geometrical shapes being polymorphic

One of the simplest ways people seem to introduce polymorphism is using geometric shapes. We all know that a Square is a Rectangle. We covered that nicely in geometry, right? Problem is, geometry doesn’t equal software.

When discussing things in light of geometry, the reason we value this relationship is because we are looking at things like angles, parallelism, vertices, and intersections. Hence, squares carry all the attributes of rectangles. They simply have the same width and height.

But software isn’t geometry. The things we construct we must also interact with. The shapes must afford us operations to grab them, interrogate them, draw them, and manipulate them. A rectangle has two attributes: width and height. A square has one: width.

If we grabbed a square, set its width, then set its height, what should happen is unclear. Should a square morph into a rectangle? Or should setting the height induce the side effect of also updating the width? Either way is bad form. Hence, it’s best to break apart this faulty geometric relationship and realize that squares are NOT rectangles.
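
A sketch of the trap (hypothetical classes):

class Rectangle {
	protected int width, height;
	void setWidth(int width) { this.width = width; }
	void setHeight(int height) { this.height = height; }
	int area() { return width * height; }
}

class Square extends Rectangle {
	// To preserve "squareness", each setter must side-effect the other dimension.
	@Override void setWidth(int width) { this.width = width; this.height = width; }
	@Override void setHeight(int height) { this.width = height; this.height = height; }
}

Any caller that reasons about a Rectangle is now broken: hand it a Square, call setWidth(2) then setHeight(5), and area() returns 25 instead of the 10 the Rectangle contract implies.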

The fallacy of inheriting behavior

So shaking off the trivial example of geometric shapes, another common example is to talk about the glorious ability to reuse code. With polymorphism, it will be SO easy to code something once, and then reuse it across dozens if not hundreds of classes! And with late binding options, gobs of 3rd party libraries can go forth and reuse your code.

The problem is, no programming language has been invented that can adequately capture ALL the semantic nuances of code. As more and more classes extend a given class, they either realize EXACTLY what the abstract parent class does and agree with it, or they discover some new wrinkle not quite handled. The API may be supported, but some underlying assumption is buried that requires an update.

As the tail of inheritance grows, maintainers are less likely to accept new changes to the shared code. The risk of breaking someone else grows, because everyone knows the ENTIRE nature of the code cannot be captured.

Some of the avenues to remedy this involve opening up the API a bit more. Perhaps a private utility method is needed by a new extender. But opening it up introduces more maintenance down the road. Or more opportunities for others to abuse things that used to be kept tightly controlled.

History has proven that composition beats inheritance for sustainability. Raise your hand if maintenance, not new development, doesn’t encompass much of your work.

The alternative is more sophisticated languages where you can capture more of the concepts. Yet these languages come across as too complex to many, arguably because CAPTURING all semantics is inherently challenging. And more often than not, we don’t KNOW all these semantics on the first round.

The myth of polymorphism vs. the reality

One thing that has emerged is programming to interfaces. Interfaces provide a nice contract to work against. Naturally, we can’t capture everything. But at least every “Shape” can implement the defined methods. In Java, interfaces can be combined, allowing multiple behaviors to be pulled together.
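
For example (hypothetical types):

interface Drawable {
	void draw();
}

interface Movable {
	void moveTo(int x, int y);
}

// Pull together small behaviors instead of inheriting a deep hierarchy.
class Icon implements Drawable, Movable {
	private int x, y;
	public void draw() { /* render at (x, y) */ }
	public void moveTo(int x, int y) { this.x = x; this.y = y; }
}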

So when it comes time to abstract and refactor, think about extracting interfaces where possible and delegating instead of inheriting.

Strangely enough, despite my consternation with static variables, I’ve come to appreciate static methods in the right scenario. Factories that support interfaces can be quite handy dandy. But a well modeled inheritance hierarchy is hard to accomplish. If possible, try to avoid multiple layers, and see if it’s not possible to extract shared code into various utility classes.

And when the entire hierarchy is nicely contained INSIDE your framework, and not as an extension point to consumers, the fallacies of polymorphism can be contained. But watch out! It takes much effort to put this together. Never think it will be easy.

Of course, I could be entirely wrong. If you have some keen examples, please feel free to share!

Happy coding.

Why take a break?

Good developers take breaks. But why? Last night, I attended a men’s session where two very different people talked on stage about very different breaks. The first was a pastor who was granted a six month sabbatical. The second was someone that was fired, and instead of lunging for the next job, spent time evaluating things carefully. As I listened, it made me reflect upon my own discoveries about taking breaks.

Why do we need breaks?

To be honest, it’s quite easy to burn yourself out. Maybe when you’re a bit wet behind the ears and single, not so much. I once worked extensive hours (despite not getting paid overtime), and I don’t regret that.

But today I can’t do the same thing. I have three kids and many other responsibilities that preclude me from working ALL THE TIME. Well, a big difference between now and then is that back then, I needed a lot of “tinkering” to figure things out. I was short on experience, but long on drive. And working extra hours helped me learn a lot of valuable lessons.

Today, the valuable lessons are taking breaks, and letting my subconscious continue chugging away. Even today, I’m working on a major review/refactor/rebase task that has already cost me two weeks and will probably cost a couple more. Yet I don’t dread this effort. In fact, I’m not trying to rush it. Because the hours I’m away from my laptop (like now), my mind is still noodling things out. And when my fingers rejoin the keyboard, I feel like my knowledge of this branch is growing at a comfortable pace. If I tried burning ten hours a day, I’d burn out and torch it all.

But by letting my family pull me away, and focusing on other important things, I am kept from burning out. And I can avoid spinning my wheels in the mud of frustration. In the past five years, I feel as if my productivity has actually increased, because I am working smarter, not harder.

What good do breaks do?

Let’s just jump right in and state something critical – your work isn’t everything. When you pass from this world, people are less likely to remember the code you wrote, and instead remember the impact you made on your children, your spouse, and your community.

I remember a graphic that came out of Microsoft showing how you could access every aspect of your job from everywhere – a kid’s soccer game, waiting for an appointment, during breakfast, etc. 37Signals released a mocked-up variant with the same style, only they dubbed it Work Can Wait (which you can see linked here). The thing basically stresses the importance of having a life.

But in addition to balancing work and life, I’ve discovered that my work actually flourishes when I get away from it. I once sat in on a talk by Dr. Venkat Subramaniam, who described that classic software eureka moment: the answer to the problem you battled all day arriving the next morning, in the shower. I’ve experienced that a fair number of times, moreso since I started taking more breaks.

I enjoy, on occasion, merging a GitHub pull request after hours. But I don’t go out of my way to do that. It’s usually something small, like a typo in a guide. I enjoy being able to shuffle that off my plate in a bit of idleness. But pulling myself AWAY from the keyboard sometimes forces my mind to assess what just happened, and what’s coming next. Seeing that I have less time to spend, my brain hammers out certain details to make better use of our next keyboard session. And I sincerely believe I code with higher quality because of this.

Try it. You might be surprised.

I’ll close by sharing a reflection I received from Mark Fisher, founding member of the Spring team. He once mentioned that if he had been forced to work nose to the grindstone, and hadn’t taken all those various walks and simple “think about it” moments, Spring Integration might never have happened. Not everything is a burn-baby-burn coding moment.

Happy coding!

Guten Tag Deutschland. Ich bin da!

I just arrived in Germany for our big Spring Data summit. Our team is having a get-together to do some planning and scoping of work for the next year. And I couldn’t be more excited.

I arrived at 7:00 am local time, and waited two and a half hours to discover the airline had lost my bag. But pay it no mind. I packed the critical items in my carry-on, so with paperwork filed, it’s no sweat. The fun part (so far) was having to leverage my German to interact with my taxi driver, who spoke hardly any English.

Something I was able to contemplate while traveling is that one of the biggest things I missed while working on Spinnaker was the jürgenized code base of Spring. The clean, elegant nature of all the portfolio projects really gives me a warm fuzzy.

It’s hard to explain if you haven’t seen clean code before. But having a strongly enforced, clean, consistent coding policy eases the mind. When we write code, we need to understand it. Sifting through inconsistent spacing, sloppy names, tabs AND spaces, and other aspects of code slows down the brain, making it hard to digest the functional aspects.

My father-in-law once shared that driving while it’s raining tires you out faster. He once had to drive at night through a lot of rain and it wore him out. The same can be said for sifting through sloppy, inconsistent code.

So this week I celebrate being back amongst my Spring Data mateys as we continue hammering out what has to be one of my favorite umbrella projects: applying the concepts of the Spring programming model to data management.

Cheers!

How to write a tech book, or how I stopped worrying and learned to love writing

I just sent in the last chapter of Learning Spring Boot 2nd Edition’s 1st draft. And my brain has collapsed. I’ve been working for several months on this 10-chapter book that embraces Spring Boot 2.0 and reactive programming. There are several books out there on reactive programming, but I believe this will be the first to hit the market about Project Reactor.

I’m not done, not by a long shot. I told my publisher that we’d need at least one big round to make updates to ALL the code, because I started writing when not everything was in place. And it’s still true. But editing, polishing, and updating an existing repository of code and manuscript is easier than creating one out of thin air.

I wanted to write just a little bit about how I approach writing something like this. Maybe you have been thinking about writing a book yourself, and you’re curious what goes on. This isn’t the only way, but it’s the way that works for me.

Tools

To write a book, you need a mechanism to capture prose and code. For fiction, I use Scrivener, but when it comes to technical writing, where the code, screenshots, and text are tightly integrated, I use Asciidoctor. With Asciidoctor, the overhead of a word processor is removed, and instead I can focus on pure content.

Also, using Asciidoctor lets me pull in the code to generate the manuscript sent in to my publisher. This way, I have Sublime Text in one window viewing the source prose and IntelliJ open in another viewing the source code. To top it off, I have a Ruby guardfile configured to constantly regenerate an HTML proof of the chapter I’m writing, refreshing via LiveReload in my browser.

This combination gives me a quick feedback loop as I write.

What to write

This may be the biggest hurdle for some. When you’ve picked the technology, set up your tools, and finally have the editor opened up, what do you type into that black, blank screen?

Waiting for magical words to arrive? Or perhaps you hope elves will scurry in and leave something? Nope. This is where the rubber hits the proverbial road and you have to push yourself to start typing.

What do I do? I actually start earlier than that. From time to time, I have a crazy idea about something I want to show to an audience at a conference. Some demo I want to give with a few pieces of the Spring portfolio. I begin to noodle out code to make that happen. Once, I asked “can I snap a picture of the audience and upload it from my phone to a webpage the audience is watching on the overhead?” Thus was born my Spring-a-Gram demo.

That demo has morphed many times to the point that I have built a full blown, cloud native, microservice-based system. And guess what. It’s the system we get to explore in Learning Spring Boot 2nd Edition.

So when I sit down to write a chapter, I first start writing the code I want to walk through. Once it’s humming inside my IDE, I start to typeset it in Asciidoctor. And from pages of code fragments, I begin to tell a story.

Weaving a story

When writing technical articles, getting started guides, and books, everything is still a story. Even if this isn’t a novel, it’s still a story. People that grant you the honor of reading your work want to be entertained. When it comes to tech work, they want the ooh’s and ahh’s. They want to walk away saying, “That was cool. I could use that right now.”

At least, that’s how I read things. So my goal when I write is to make it fun for me. If it’s fun for me, I trust it will be fun for others.

If I sift through a chapter, and it’s just a boring dump of code, then it’s sad. And that’s not what I want. I can’t promise that all my writing has upheld this lofty goal. But it’s my goal nonetheless.

So oftentimes, I will typeset the code, hang some descriptive details around it, then read it again, top to bottom, and add extra stuff. Paragraphs talking about why we’re doing this. Mentioning tradeoffs. Issues that may exist today and where we have to make a choice. Ultimately, I want readers to understand not just what the code does but why it does it this way. And what other options are out there.

Letting go

At some point, after all the writing and polishing and fine tuning, you have to turn in your work. I don’t know if there is ever a time where I’m 100% satisfied. I’m kind of picky. But the truth is – you’ll never find every typo, every bug.

My code fidelity is much higher ever since I started using Asciidoctor. But stuff happens. And you have to be happy turning in your work.

You see, if you’ve acquired enough skill to sit down and write a book without someone leaning over your shoulder and coaching you, you might have a lot of value that other developers seek. Eager coders will be able to read what you wrote, look past small mistakes, and most importantly, grok the points you make. That’s what is key.

And one thing is for certain – writing makes you better. I have found that any gaps in my own understanding of certain parts of code lead me to chase down and grasp those bits. And then I want to share them with others. Which is what writing books is all about.

Happy writing!

Layering in new behavior with React

I’ve talked in the past about how I like the approach React leads me to when it comes to building apps. How does such grandiose talk play out when it’s time to add a new, unexpected feature? Let’s check it out. I’ve been building an installation app for Spinnaker, and one of our top notch developer advocates gave it a spin.

Results? Not good. Too many presumptions were built into the UI, meaning he had no clue where to go. Message to me? Fix the flow so it’s obvious what must be done and what’s optional.

So I started coding in a feature to flag certain fields REQUIRED and not allow the user to reach the installation screen without filling them out. Sounds easy enough in concept. But how do you do that?

With React, what we’re describing is an enhancement to the state model. Essentially, keep filling out fields, but earmark certain fields as required, and adjust the layout of things to show that, while barring other aspects of the interface in the event those same fields aren’t populated.

So I started with a little bit of code to gather a list of these so-called required fields, and it looked like this:
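
In sketch form (the state shape and the initialSettings() helper are hypothetical):

constructor(props) {
	super(props)

	// Build the initial state model first. Hypothetical shape: each
	// setting carries its value and a required flag.
	this.state = {
		api: initialSettings()
	}

	// Derive the required list from the state just built. Deliberately NOT
	// this.setState(): nothing is mounted yet, and we can't reference
	// this.state.api inside the very call that initializes it.
	this.state.required = Object.keys(this.state.api)
		.filter(name => this.state.api[name].required)
}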

If you have done any React programming, assigning something via this.state[…] = foo should set off bells in your head. You always, always, ALWAYS use this.setState(…). So what’s up?

Rules are rules until they aren’t. This is a situation that defies the convention. I don’t WANT to set the state such that it triggers a ripple through the DOM. Instead, this code happens right after the initial state model is initialized. And I’m setting it using values populated in the previous line, because you can’t initialize required, pointing at this.state.api, in the same call that initializes this.state.api itself!

With this list of required fields set up, we can start marking up the fields on the UI to alert the user. I have a handful of React components that encapsulate different HTML inputs. One of them is dedicated to plain old text inputs. Using the newly minted required list, I can adjust the rendering like this:
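
Something like this (component and prop names hypothetical):

render() {
	return (
		<label>
			{/* Flag the label when this input's name is on the required list. */}
			{this.props.settings.required.includes(this.props.name)
				? <span>{this.props.label} <em className="required">(required)</em></span>
				: <span>{this.props.label}</span>}
			<input type="text"
				name={this.props.name}
				value={this.props.value}
				onChange={this.props.handleChange} />
		</label>
	)
}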

Notice the little clause where it checks this.props.settings.required.includes(this.props.name)? That is a JavaScript ternary operation that, if true, renders the label with extra, highlighted text. Otherwise, it just renders the same label as always.

By applying this same tactic to the other React components I have for rendering each selection on the UI, I don’t have to go to each component and slap on some new property. Instead, the designation for what’s required and what’s not is kept up top in the state model, making it easier to maintain and reason about.

At the top of the screen is a tab the user clicks on to actually install things and track their progress. To ensure no one clicks on that until all required fields are populated, I updated that tab like this:
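
Roughly like this (handler name hypothetical):

{/* Only wire up onClick when every required field is populated. */}
{this.requiredFieldsFilledOut()
	? <li onClick={this.handleInstallTab}>Installation</li>
	: <li>Installation</li>}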

A little function detects whether or not all required fields have been filled out; if it passes, the HTML LI is rendered with its onClick property filled out with the handler. If NOT, then it renders the same component, but NO SUCH onClick property is present, meaning it just won’t respond.

This is the nature of React. Instead of dynamically adjusting the DOM model, you declare variant layouts based on the state of the model. This keeps pushing you to put all such changes up into the model, and writing ancillary functions to check the state. In this case, let’s peek at requiredFieldsFilledOut:
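
In sketch form (assuming the hypothetical state shape from earlier):

requiredFieldsFilledOut() {
	// Count the required fields holding a "truthy" value; all of them
	// must be filled out before the Installation tab goes live.
	return this.state.required
		.filter(name => this.state.api[name].value)
		.length === this.state.required.length
}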

This tiny function checks this.state.required, counts how many are “truthy”, and if the count matches the size of this.state.required itself, we’re good to go.

In case you didn’t know it, React encourages you to keep moving state-based functions up, closer to the state model itself. It’s super simple to pass along a handle to the function to lower level components, so they can still be invoked at lower levels. But anytime one component is trying to invoke another one in a separate part of the hierarchy, that’s a React smell hinting that the functions should be higher up, where they can meet. And then the results can trickle down to lower level components.

I encountered such a function stuffed down below in a lower level component. The need to trigger it sooner, based on a change in the state, had me move it up top. Now, when the Installation tab is clicked, that function is run, instead of waiting for the user to click some button down below.

Suffice it to say, I rehabbed the UI quite nicely. The thing is, with React it’s not hard to effect such a change in a couple days, compared to the days or weeks of effort combined with testing it might have taken with classic manipulate-the-DOM apps, where you have to hunt down gobs of wired event handlers and find every little nuanced operation you coded to make things operate correctly.

Check out my @SpringData and @SpinnakerIO talks from SpringOne Platform @S1P

Recently, my latest conference presentations have been released. You are free to check them out:

In the Introduction to Spring Data talk, I live code a project from scratch, using start.spring.io, Spring Data, and other handy Spring tools.

In the Spinnaker: Land of a 1000 Builds talk, I present the CI/CD (continuous integration/continuous delivery) multi-cloud tool Spinnaker.

Enjoy!

Tuning Reactor Flows

I previously wrote a post about Reactively talking to Cloud Foundry with Groovy. In this post, I want to discuss something of keen interest: tuning Reactor flows.

When you use Project Reactor to build an application, is the style a bit new? Just trying to keep your head above water? Perhaps you haven’t even thought about performance. Well, at some point you will. Because something big will happen. Like a 20,000 req/hour rate limit getting dropped on your head.

Yup. My development system mysteriously stopped working two weeks ago. I spotted some message about “rate limit exceeded” and rang up one of my friends in the Ops department to discover my app was making 43,000 req/hour. Yikes!

As I pored over the code (big thanks to the Ops team for a spreadsheet showing the biggest-to-smallest calls), I started to spot patterns that seemed like things I had seen before.

Reactor tuning is a lot like SQL tuning

Long, long ago, I learned SQL. As the saying goes, SQL isn’t rocket science. But understanding what is REALLY happening is the difference between a query taking twenty minutes and one taking sub-second time to run.

So let’s back up and refresh things. In SQL, when you join two tables, it produces a cartesian product. Essentially, a table with n rows joined with a table with m rows will produce a table with n x m rows, combining every possible pair. From there, you slim it down based on either relationships or filtering of the data. What DBMS engines have had decades to learn is how to read your query and figure out the BEST order to do all these operations. For example, many queries will apply filtering BEFORE building the cartesian product.

In Reactor, when you generate a flux of data and then flatmap it to another flux, you’re doing the same thing. My Reactor flow, meant to cache up a list of apps for Spinnaker, would scan a list of eighty existing apps and then perform a domain lookup…eighty times! Funny thing is, they were looking up the same domain EIGHTY TIMES! (SQL engines have caching…Reactor doesn’t…yet.)

So I rang up my most experienced Reactor geek, and he told me that it’s more performant to simply fetch all the domains in one call, first, and THEN do the flatmap against this in-memory data structure.

Indexing vs. full table scans

When I learned how to do EXPLAIN PLANs in SQL, I was ecstatic. That tool showed me exactly what was happening in what order. And I would be SHOCKED at how many of my queries performed full table scans. FYI: they’re expensive. Sometimes it’s the right thing to do, but often it isn’t. Usually, searching every book in the library is NOT as effective as looking in the card catalog.

So I yanked the code that did a flatmap way at the end of my flow. Instead, I looked up ALL the domains in a CF space up front and passed along this little nugget of data hop-to-hop. Then when it came time to deploy this knowledge, I just flatmapped against this collection of in-memory data. Gone were all those individual calls to find each domain.

.then(apps ->
	apps.stream()
		.findFirst()
		.map(function((org, app, environments) -> Mono.when(
			Mono.just(apps),
			CloudFoundryJavaClientUtils.getAllDomains(client, org))))
		.orElse(Mono.when(Mono.just(apps), Mono.empty())))

This code block, executed right after fetching application details, pauses to call getAllDomains(). Since that lookup should only be done once, we grab just one instance from our passed-along data structure. The collection is gathered, wrapped up in a nice Mono, and passed along with the original apps. Optionally, if there are no domains, an empty is passed along.

(NOTE: Pay it no mind that after all this tweaking, the Ops guy pointed out that routes were ALREADY included in the original application details call, eliminating the need for this. The lesson on fetching a whole collection up front is still useful.)

To filter or not to filter, that is the question

Filtering is an art form. Simply put, a filter is a function to reduce rows. Being a part of both Java 8’s Stream API as well as Reactor’s Flux API, it’s pretty well known.

The thing to watch out for is whether the filter operation is expensive and whether it sits inside a tight loop.

Loop? Reactor flows don’t use loops, right? Actually, that’s what flatmaps really are. When you flatmap something, you are embedding a loop that goes over every incoming entry, possibly generating a totally different collection. If this internal operation inside the flatmap involves a filter that makes an expensive call, you might be repeating that call too many times.

I used to gather application details and THEN apply a filter to find out whether or not this was a Spinnaker application vs. someone else’s non-Spinnaker app in the same space. Turns out, finding all those details was expensive. So I moved the filter inward so that it would be applied BEFORE looking up the expensive details.

Look at the following code from getApplications(client, space, apps):

return requestApplications(cloudFoundryClient, apps, spaceId)
	.filter(applicationResource ->
		applicationResource.getEntity().getEnvironmentJsons() != null &&
		applicationResource.getEntity().getEnvironmentJsons().containsKey(CloudFoundryConstants.getLOAD_BALANCERS())
	)
	.map(resource -> Tuples.of(cloudFoundryClient, resource))
	.switchIfEmpty(ExceptionUtils.illegalArgument("Applications %s do not exist", apps));

The code above is right AFTER fetching application information, but BEFORE going to related tables to find things such as usage, statistics, etc. That way, we only go for the ones we need.

Sometimes it’s better to fetch all the data, fetch all the potential filter criteria, and merge the two together. It requires a little more handling to gather this together, but again this is what we must do to tailor such flows.

Individual vs. collective fetching

Something I discovered was that several of the Cloud Foundry APIs have an “IN” clause. This means you can feed it a collection of values to look up. Up until that point, I was flatmapping my way into these queries, meaning that for each application name in my flux, it was making a separate REST call.

Peeking at the lower level APIs, I spotted where I could give it a list of application ids vs. a single one. To do that, I had to rewrite my flow. Again. By putting together a collection of ids, NOT flatmapping against them (which would unpack them), but instead using collectList, I was able to fetch the next hop of data in one REST call (not eighty), shown below:

return PaginationUtils
	.requestClientV2Resources(page -> client.spaces()
		.listApplications(ListSpaceApplicationsRequest.builder()
			.names(applications)
			.spaceId(spaceId)
			.page(page)
			.build()))
	.map(OperationUtils.<ApplicationResource, AbstractApplicationResource>cast());

cf-java-client has a handy utility to wrap paged result sets, iterating and gathering the results…reactively. Wrapped inside is the gold: client.spaces().listApplications(). There is a higher level API, the operations API, but its focus is replicating the CF CLI experience. The CF CLI isn’t built to do bulk operations, but instead operates on one application at a time.

While nice, it doesn’t scale. At some point, it can be a jump to move to the lower level APIs, but the payoff is HUGE. Anyhoo, by altering this invocation to pass in a list of application names, and following all the mods up the stack, I was able to collapse eighty calls into one. (Well, two, since the page size is fifty.)

You reap what you sow

By spending about two weeks working on this, I was able to replace a polling cycle that performed over seven hundred REST calls with one that makes less than fifty. That’s basically a 95% reduction in network traffic, and it nicely put my app in the safe zone for the newly imposed rate limit.

I remember the Ops guy peeking at the new state of things and commenting, “I’m having a hard time spotting a polling cycle” to which the lead for Cloud Foundry Java Client replied, “sounds like a good thing.”

Yes it was. A VERY good thing.

Reactively talking to Cloud Foundry with Groovy

I’ve been working on this Spinnaker thing for over a year. I’ve coded support so Spinnaker can make continuous deployments to Cloud Foundry. And the whole thing is written in Groovy. I recently upgraded things so that I can now talk reactively to Cloud Foundry with Groovy.

And it’s been a nightmare.

Why?

Groovy is pretty darn wicked. Coding Spring Boot apps mixed with Spring MVC controllers in the terse language of Groovy is nothing short of gnarly. But it turns out there’s a couple things where Groovy actually gets in your way.

Reactor + Cloud Foundry

Want a taste? The code fragment below shows part of a flow used to look up Spinnaker-deployed apps in Cloud Foundry:

operations.applications()
  .list()
  .flatMap({ ApplicationSummary appSummary ->
    operations.applications()
      .getEnvironments(GetApplicationEnvironmentsRequest.builder()
        .name(appSummary.name)
        .build())
      .and(Mono.just(appSummary))
  })
  .log('mapAppToEnv')
  .filter(predicate({ ApplicationEnvironments environments, ApplicationSummary application ->
    environments?.userProvided?.containsKey(CloudFoundryConstants.LOAD_BALANCERS) ?: false
  } as Predicate2))
  .log('filterForLoadBalancers')
  .flatMap(function({ ApplicationEnvironments environments, ApplicationSummary application ->
    operations.applications()
      .get(GetApplicationRequest.builder()
        .name(application.name)
        .build())
      .and(Mono.just(environments))
  } as Function2))

This is the new and vastly improved Cloud Foundry Java SDK built on top of Project Reactor’s async, non-blocking constructs (Mono and Flux with their operations). Every function call is an async, non-blocking operation fed to the next function call when the results arrive.

What does this code do? It looks up a list of Cloud Foundry apps. Iterating over the list, it weeds out anything that doesn’t have a LOAD_BALANCERS environment variable, a tell for Spinnaker-deployed apps. Finally, it looks up the detailed record for each application.

The heart of the issue

What’s nestled inside several of these “hops” in this flow is a tuple structure. In functional flows like this, where each hop gets a single return value, we often need to pass along more than one piece of data to the next hop. It’s the side effect of not using the imperative style of building up a set of variables, but instead passing along the bits in each subsequent function call.

cf-java-client has TupleUtils, a collection of functions meant to pack and unpack data, hop to hop. It’s elegant and nicely overloaded to support up to eight items passed between hops.

And that’s where Groovy falls flat. Groovy has this nice feature where it can coerce objects. However, with all the overloading, Groovy gets lost and can’t tell which TupleUtils function to target.

So we must help it by coercing things into the right structure. See those “as Function2” and “as Predicate2” calls? They help Groovy figure out the type of lambda expression to slide things into.

And it’s dragging me down!

The solution

So I finally threw in the towel and converted this one class into pure Java.

Yes, I ditched hip and cool Groovy in favor of the old warhorse Java.

You see, when something is so dependent on every character being in the right place, we need all the static support from the IDE we can get. Never fear; I’m not dropping Groovy everywhere. Just this one class.

And here is where Groovy’s interoperability with Java shines. Change the suffix of one file. Make the changes I need. And both the IDE and the compiler are happy, giving me an operational chunk of code.

I had to rewrite a handful of collections, but it wasn’t the worst thing in the world. In half a day, I had successfully moved the code. And now as I’m working on another flow, the pain of Groovy’s need for coercion specification is no longer wreaking havoc.

Cheers!