Replicate traffic from production

For most of our services we have custom JMeter-based performance tests that try to mimic production usage of a given service. JMeter has built-in support for generating random values; you can also add a CSV file with values copied straight from production, add custom delays, etc. It is nice, but it is in fact an approximation. There is nothing like production ;-). Testing in production still sounds a bit scary, but one of the techniques named among the Testing in Production (TiP) patterns is traffic replication.

This can be approached from several different angles. If you have something like Varnish or HAProxy in front of your service, most likely you can configure this quite easily. Another way is to log requests in your application and then replay them using a simple curl (or something similar), but this requires additional work.
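To give a feeling for the "log and replay" approach, here is a minimal Python sketch. It assumes a hypothetical log format with one `METHOD /path` entry per line and a placeholder staging host; request bodies and headers are ignored, so treat it as an illustration rather than a real tool.

```python
import urllib.request


def build_replay_requests(log_lines, target_base):
    """Turn 'METHOD /path' log lines (assumed format) into Request objects."""
    requests = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines in the log
        method, path = line.split(maxsplit=1)
        url = target_base.rstrip("/") + path
        requests.append(urllib.request.Request(url, method=method))
    return requests


def replay(requests, timeout=5):
    """Send the requests one by one and print the outcome of each."""
    for req in requests:
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                print(req.get_method(), req.full_url, resp.status)
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            print(req.get_method(), req.full_url, "failed:", exc)
```

Calling `replay(build_replay_requests(open("requests.log"), "http://staging.example.com:8080"))` would then fire the logged requests at the staging host; both the file name and the host are placeholders.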

After looking at different possibilities I came across Gor. Gor gives you something a bit different from the two use cases mentioned above. In general, it can give you a flavour of continuous performance testing. So, long story short: what can Gor do for you?

First, you need to install Gor on one of your production machines.

wget https://github.com/buger/gor/releases/download/v0.10.1/gor_0.10.1_x64.tar.gz
tar -xvf gor_0.10.1_x64.tar.gz

Gor is written in Go, so after executing this you will end up with a single, self-contained executable file ;-). As a warm-up, let’s record some requests into a file.

sudo ./gor --input-raw :8080 --output-file requests.gor --http-allow-method POST

As you can see in the example above, you can limit the recorded requests to POSTs only (a complete list of selectors can be found in Gor’s documentation). You can also rewrite recorded requests in all sorts of ways using your own custom scripts. Nice. It is time for the first downside of Gor: there is no stop button, so in order to stop the recording you need to shut down the Gor process (correct me if I am wrong).

In order to replicate traffic, you need to cast the following spell:

./gor --input-file requests.gor --output-http "http://test01-example.com:8080"

Gor has some nice additions which allow you to use this traffic for performance tests: you can specify a multiplier which will increase (or decrease) the number of requests.

./gor --output-http-stats --input-file "requests.gor|200%" --output-http "http://test01-example.com:8080"

If you want to set up constant traffic replication from your production machine into staging, you can run Gor in the following way:

sudo ./gor --input-raw :8080 --output-http "http://test01-example.com:8080"

It is worth mentioning that this is the original use case of Gor and the reason why it was created (this explains why it does not have a STOP button).

You can now sit and look at the dashboards of your test instance and compare response times, error rates, etc. with production. In practice this may of course be a bit more complicated, but this is a perfect case for read-only services, which are not that rare these days.

As for downsides:

  • Both traffic recording and replication add load to your production machine; you need to keep this in mind. In general I would prefer setting up traffic replication using an external proxy, but in my current use case this fits perfectly
  • Gor requires root access, which might be problematic in some hosting environments
  • Gor can lose a significant portion of requests, which is definitely not nice, but it is getting better with every release.
  • The lack of a switch which would allow you to disable/enable Gor for a while is a bit annoying
  • Replaying requests from a file does not preserve the original time spacing between requests 🙁
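If you end up writing your own replayer, keeping the original spacing is not hard as long as you log a timestamp with every request. The Python sketch below is only an illustration of that idea, not a description of how Gor works internally; the function names and the `speed` parameter are hypothetical.

```python
import time


def replay_delays(timestamps, speed=1.0):
    """Seconds to wait before each request, relative to the previous one.

    timestamps: recorded request times in seconds, in recording order.
    speed: 2.0 replays twice as fast, 0.5 replays at half speed.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    delays = []
    previous = None
    for ts in timestamps:
        # The first request is sent immediately; out-of-order timestamps
        # are clamped so that we never compute a negative delay.
        delays.append(0.0 if previous is None else max(0.0, ts - previous) / speed)
        previous = ts
    return delays


def replay_with_spacing(timestamps, send, speed=1.0, sleep=time.sleep):
    """Call send(index) for every recorded request, waiting as in the recording."""
    for index, delay in enumerate(replay_delays(timestamps, speed)):
        sleep(delay)
        send(index)
```

The `sleep` argument is injectable so that the timing logic can be checked without actually waiting.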

Take a look at Gor; it is not perfect, but it embodies a very useful idea. I wonder what you think about it and how you deal with traffic replication.

Everything you ever wanted to know about uservices

A few weeks ago I had a chance to participate in the GeeCON microservices (http://2015.microservices.geecon.org/) conference. We (the GeeCON team) have always dreamed of having a conference by the seaside, so Sopot sounded like a perfect place. In this post I want to summarize my observations and notes taken during the conference. Not all presentations are described here; for the rest of my impressions, take a look at my twitter feed.

“It’s Not Just MicroServices: Areas of focus for MicroService Success”, Fred George

During his opening keynote Fred introduced the Cynefin framework, which to some extent explains the nature of complex systems. The Cynefin framework divides problems into 5 domains/types: obvious (or simple), complicated, complex, chaotic (we can make some assumptions and look for the answer) and disorder (lack of knowledge about the nature of the problem). As far as I understand, every problem-solving quest starts in disorder and can afterwards be categorized as one of obvious, complicated, complex or chaotic. The problem is that in the beginning we don’t know what kind of problem we are facing, so we don’t know what kind of tools are applicable (the Cynefin framework makes some suggestions in this matter). Fred mentioned that most of the problems faced by modern companies can be categorized as chaotic. In this case we need to look for new knowledge and learn quickly to get closer to the optimal solution, which is unknown.

What can be done to address such a use case? On the technical level, microservices look promising here. This paradigm is well suited for cloud environments, assuring scalability and productivity. It stresses the discardability of software, which is essential when we need to build a prototype to verify our hypothesis and discard it when the hypothesis is wrong. Fred presented the most common uservice architecture with a shared, dumb event bus, which gives async access to events to any service interested in a particular piece of the problem domain. After diving into technical aspects, Fred jumped into the organization of work in microservice environments. I will just refer to my twitter notes here:

  • Kill the specialists, we need full stack problem solvers. In theory specialists are more productive, but in practice specialization causes communication overhead and may lead to delays.
  • A new project means you will get a new job title
  • Fun was one of the most frequent words in Fred’s keynote – when people have fun, they are more engaged and effective
  • It takes 3 to 10 weeks to build an efficient team, so forming a team is a cost; that is why you should bring the work to the team

Microservices are usually mentioned as a technical paradigm but after Fred’s keynote it was clear that this approach needs to be reflected in the organizational structure.

“Swimming upstream in the container revolution”, Bert Jan Schrijver

Bert gave a very nice talk in which he described a few practices which he and his teams have adopted. Once again I will refer to my twitter notes and I will mention only the most interesting things from my personal point of view:

  • “Hands off” policy – no logging into servers; if you need a change or an update, just update your Puppet manifest. This one is crucial if we want to treat our infrastructure as code.
  • Nevertheless, testing Puppet manifests before pushing them into production might be tricky
  • AWS has its limits (e.g. your peak in traffic might occur at the same time as others’)
  • “Don’t depend on availability of Ops experts” – this kills team progress.

Slides can be found here.

“Microservices – enough with theory, let’s do some coding”, Tomek Szymański and Marcin Grzejszczak

This was the only presentation with live coding: Tomek and Marcin produced around 5 lines of code ;-). Nevertheless, it was the most technical presentation of the conference. What sleeps under the hood of the microservices ecosystem? Monitoring, alerting, deployment, dependency resolution and discoverability of services, all this on top of ZooKeeper, Grafana, Ansible, ELK, Spring Boot, Hystrix, Rundeck and Slack ;-). Since “sudo apt-get install microservices” still does not work, we need to deal with a plethora of tools to automate all the steps necessary to build microservices. All the code examples were based on Accurest and 4Finance spring boot micro infra (https://github.com/4finance/micro-infra-spring).

Among other things, Tomek and Marcin mentioned a very important practical feature/pattern: the correlation id. In order to track the flow of a particular request through a stack of a few dozen microservices, you need to assign a unique identifier to every user request and pass it downstream. This allows you to visualize the request flow and debug/monitor your system.
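The idea fits in a few lines. The Python sketch below is my own illustration, not code from the talk or from micro-infra-spring; the header name `X-Correlation-Id` and the helper names are assumptions.

```python
import uuid

# Assumed header name; use whatever your services agree on.
CORRELATION_HEADER = "X-Correlation-Id"


def with_correlation_id(incoming_headers):
    """Return headers for downstream calls.

    Reuses the id from the incoming request if there is one, otherwise
    creates a new one (we are the first service in the chain).
    Note: this plain dict lookup is case-sensitive.
    """
    correlation_id = incoming_headers.get(CORRELATION_HEADER) or uuid.uuid4().hex
    return {CORRELATION_HEADER: correlation_id}


def log_line(correlation_id, message):
    """Prefix every log message with the id so logs can be joined later."""
    return f"[{correlation_id}] {message}"
```

Every service calls `with_correlation_id` on the incoming headers, attaches the result to its outgoing requests and includes the id in each log line; searching the aggregated logs for one id then shows the whole path of a single user request.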

In general it was a very nice presentation; nevertheless, 30 minutes was definitely not enough to cover all the topics prepared by the presenters.

“Scaling microservices at Gilt”, Adrian Trenaman

Adrian described the whole evolution of the architecture at Gilt, from a successful startup to a mature company that still wants to innovate and grow.

  • Voluntary adoption – give engineers the freedom to choose the tools they like. This is something similar to the theory of evolution 😉 production use will verify which tools are adequate. Devs are smart enough to choose tools which work instead of a new, shiny javascript framework (no offence).
  • Once again it was stressed that microservices are tightly connected to how the organization operates.
  • Dark canary and Testing in Production (TiP) are preferred over traditional testing. Adrian mentioned that in the case of read-only services TiP is very natural 😉 especially when implemented in a facebook way.
  • It’s starting to be a nice tradition: after the open-sourcing of Hermes by Allegro at GeeCON in May, Gilt.tech also decided to release their tool Ion roller during GeeCON in Sopot! Hope to see more open source activity during GeeCONs 😉

I was hugely impressed by Adrian’s presentation and am really glad that he decided to visit Sopot. Slides can be found here.

“Architecture without Architects”, Erik Dörnenburg

Erik’s talk was the closing keynote of the entire conference. He talked about the role of the architect in the modern software industry. Comparing the software industry to traditional architecture is wrong; software architecture is more like town planning, e.g. take a look at the evolution of London and compare it to long-living IT systems.

Erik presented a few very striking examples of how diagram abstractions are implemented by real systems; we as developers need to be aware of the abstractions which surround us. That is why architecture (abstraction/planning) and development cannot be separated. In conclusion, Erik stated that there is no reason to maintain the role of the architect as we know it, because most of the activities performed by architects can and should be handled by developers.

Summary

I really enjoyed this conference from both perspectives: as an organizer and as a participant. After this gig I am a great fan of single-track conferences. This requires a lot more from the organizers, who have to select only the best talks (which is not that hard when you deal with a focused event). But you don’t have this unpleasant feeling that you are missing something interesting, as one usually has with multiple tracks.

Thanks to everyone who was involved in the organization, especially to Kuba Marchwicki, Tom Bujok and Michał Gruca, our awesome colleagues from Tricity JUG. Apart from these guys I need to mention Adrian Nowak (you’re awesome!) and Łukasz Stachowiak, who were the most active during the organization of the event.

Algorithms of the intelligent web – review

Thanks to MEAP and Poznań JUG I had a chance to read “Algorithms of the intelligent web” by Haralambos Marmanis and Dmitry Babenko. The content is organized into seven chapters, starting with a general introduction which gives a broad overview of the state of the art in the field of modern web applications. The second chapter offers a few bites of theory and finally a practical example of building a simple search engine. You can also find information about using classifiers, creating recommendation systems and document clustering. The final chapter presents a complete example of a news portal which incorporates all the introduced techniques in a neat working solution.

Chapters two to six have a similar structure, starting with some theory necessary to understand the presented concepts, followed by clear examples presenting real-world usage. The examples are extended with some additional, more advanced features, but everything is still perfectly understandable. Readers will learn how to adopt existing APIs (e.g. digg.com) and how to aggregate and transform content in order to create innovative mashups. After the practical part, readers will find some notes about using the presented solution in production. The authors describe common mistakes which lead to dead ends during the implementation of modern intelligent web applications, and this is definitely one of the biggest advantages of this book. What is also worth mentioning, Marmanis and Babenko emphasize the role of the quality of results and show general ways in which one can evaluate the obtained outcome. At the end of each chapter readers can find TODOs, a section with tasks that may be done in order to make better use of the presented solutions.

All examples are delivered in BeanShell and Java. Nowadays, in the age of frameworks like Grails or Ruby on Rails, the choice of BeanShell is quite unexpected. Examples in JRuby or Groovy could simplify adoption of the presented solutions in real-life web applications. But this is a minor thing; BeanShell is very similar to Java, so no Java developer should have problems understanding the examples. In the MEAP copy of the book which I evaluated there was also no information about how to run the presented examples, nor that knowledge of Java or BeanShell is required. I hope that will be improved in the final release of the book (from what I’ve read in the answer to my feedback, those issues were addressed in the final version). The authors present quite a few open source libraries which can easily be used not only when creating intelligent web applications but also in the everyday work of a Java developer.

What’s missing? I would love to read more about the OpenSocial API, which is only mentioned in the first chapter of the book. Also missing are references to the so-called Web 3.0; I’m constantly looking for a comprehensive overview of semantic web applications (e.g. OpenCalais, Hakia). Creating a small semantic-enabled application would definitely be a plus.

“Algorithms of the intelligent web” is definitely worth recommending to all developers who want to gain knowledge about some useful information retrieval and machine learning techniques. Those techniques are presented in a very clear and understandable way. The book contains universal methods and algorithms, and knowledge like this does not get old so fast (unlike, for example, web frameworks). I would definitely come back and read this book again.