Developing Resilient Systems in the Blockchain Age


For over nine years, Coin Metrics Software Engineer Roman Rashtiehaiev has been building scalable, fault-tolerant backends. His experience spans enterprise software development at EPAM Systems, senior engineering roles at AgileEngine, CARRIYO, and VerifyAffiliate, and nearly three years designing data infrastructure for blockchain analysis.

Today, we talk to Roman about the evolution of resilient architecture, how his early experience with infrastructure and DevOps shaped his engineering mindset, what it takes to design systems that hold up under real-world conditions, and how the rise of blockchain data has transformed the meaning of scalability and reliability.

You started out at EPAM Systems and progressively advanced to more senior roles. Which key lessons from those early years still influence your engineering approach today?

Those early years at EPAM set the tone for how I perceive software engineering as an engineering discipline. I was fortunate to be introduced to infrastructure and DevOps engineering from day one, particularly CI/CD pipelines, automation, and configuration management. That exposure has stayed with me and continues to shape my approach to system design to this day.

What surprised me about that time was the emphasis on clean code and architecture. It wasn't just about making things work; it was about making them testable and maintainable. Doing that in an enterprise environment taught me how processes must scale in larger organizations, and I gained huge respect for the discipline it takes to make that work.

Another key thing I learned was how to communicate with various stakeholders. Explaining technical concepts in ways non-technical people can understand, and vice versa, is something I practice every day. Perhaps the most important thing I realized, however, was that things rarely turn out exactly as you envisioned. That unpredictability taught me the habit of building systems and processes that are resilient by nature, not designed only for the happy path.

You've worked in several environments, from VerifyAffiliate's affiliate networks to Carriyo's logistics. How did you need to adapt your architectural thinking when moving between such disparate problem spaces?

Each domain has different priorities, and you need to understand them early on. In logistics, for example, integrations and reliability are of the utmost importance. You're working with many systems that have to interoperate without a hitch. In affiliate systems and analytics, the priorities shift toward data throughput and latency management.

What worked for me was using modular microservice patterns whenever feasible while tailoring the specifics to the domain. That makes it simpler to maintain the system and isolate concerns regardless of business context. Across all these domains, I prioritized observability and integration quality from the beginning. They surface underlying bottlenecks early, before they become real issues.

The main lesson was to fit the architecture to actual business constraints instead of shoehorning the business into a preconceived style. Every domain has its non-negotiables, and good architecture respects them while remaining technically sound.

At Coin Metrics, you spent almost three years building blockchain and digital asset data systems. What unconventional engineering challenges did you face there, compared with building a standard backend system?

Blockchain data produces some interesting engineering problems. We developed high-throughput APIs that must serve both real-time and historical cryptocurrency data at scale, and that requires a different mental model than traditional backend work. The volume and velocity are considerable, but the really interesting part is handling the immutability of blockchain data. It introduces a completely different paradigm for reconciliation and updates compared with traditional database-backed systems.

Data accuracy and reliability then became the focus. We were consuming data from heterogeneous sources, each with its own quirks and failure points. Observability within each of these data streams was essential to being able to rely on the system.

One of my proudest achievements is deploying the APISIX API Gateway. It greatly enhanced our rate limiting, scalability, and system observability. Beyond the technical implementation, I helped define and enforce cross-team API design consistency, which reduced integration friction and improved reliability throughout the platform. With a shared API language across teams, collaboration becomes much smoother.
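To make the gateway piece concrete: this is not the actual Coin Metrics configuration, but a minimal sketch of how per-route rate limiting is typically expressed in APISIX with its limit-count plugin (the route path and upstream node here are illustrative):

```json
{
  "uri": "/timeseries/*",
  "plugins": {
    "limit-count": {
      "count": 100,
      "time_window": 60,
      "rejected_code": 429,
      "key_type": "var",
      "key": "remote_addr"
    }
  },
  "upstream": {
    "type": "roundrobin",
    "nodes": { "api-backend:8080": 1 }
  }
}
```

Here each client IP gets 100 requests per 60-second window on this route, with excess requests rejected with HTTP 429, all without touching backend code.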

Coin Metrics sits at the intersection of data and finance. How did you ensure scalability, compliance, and robustness under those constraints?

Robustness isn't a choice in the world of finance; it's a necessity. We prioritized system resiliency through strict design reviews, defensive architecture patterns, and fault-tolerant services designed from the ground up. Each component needed to fail gracefully without cascading problems through the rest of the system.

Observability and monitoring were designed in at the platform level, not bolted on as an afterthought. That meant rich API Gateway metrics gave us visibility into system behavior in real time. Consistent API standards, versioning schemes, and schema validation also let us enforce backward compatibility and data integrity. These might sound like stodgy practices, but they're crucial for systems financial institutions rely on.
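As a small, hypothetical illustration of backward-compatible schema evolution (the models and fields are invented for this example, not taken from the Coin Metrics API), a new API version can add only optional fields so that existing payloads and clients keep working:

```kotlin
// v1 response model: existing clients depend on exactly these fields.
data class TradeV1(val symbol: String, val price: String, val time: Long)

// v2 adds a field as nullable with a default, so v1 payloads still
// deserialize, and older clients can simply ignore the extra field.
data class TradeV2(
    val symbol: String,
    val price: String,
    val time: Long,
    val venue: String? = null
)
```

Schema validation at the gateway or service boundary can then reject changes that drop or retype existing fields, which makes the compatibility guarantee enforceable rather than aspirational.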

Cross-team consistency was another essential strategy. Driving architectural coherence across teams allowed us to balance compliance needs with the need for innovation. Here, auditability and transparency are not nice-to-haves; they're essential architectural values. Everything must be traceable and comprehensible, particularly when handling financial information.

Having worked through several stages of team maturity, from startups to long-established companies, how did you balance process discipline with agility in software delivery?

It's always a delicate dance. I've learned it boils down to using lean processes that introduce reliability without getting in the way of delivery. Too many heavyweight processes exist for their own sake and create friction, but the right amount of structure actually helps teams deliver faster and with greater confidence.

Integration test pipelines and CI/CD automation are a perfect case in point. They provide confidence for fast iteration without requiring manual checks at every step. I've long believed in team ownership and responsibility rather than imposing heavy bureaucracy. When engineers are responsible for their systems, they gravitate toward practices that guarantee quality by default.
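A minimal sketch of what such a pipeline can look like, assuming GitHub Actions and a Gradle build (the job name and Gradle tasks are illustrative, not a real project's setup):

```yaml
name: ci
on: [push, pull_request]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: '21'
      # Fast feedback: unit tests run on every push.
      - run: ./gradlew test
      # Slower, higher-confidence checks before merge.
      - run: ./gradlew integrationTest
```

The point is that every change passes the same automated gates, so nobody has to remember to run the checks by hand.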

Context matters enormously here. In large organizations, you need more structure to align across teams and stay consistent. In startups, you need more agility to respond quickly. The mistake is trying to apply the same strategy everywhere. Across all of this, maintaining a culture of technical excellence through mentorship, code review, and architecture alignment ensures that speed doesn't come at the expense of quality.

You've worked intensively with Java and Kotlin on backend technologies. How has your stack or approach to backend development changed over the years, particularly with cloud-native and microservice ecosystems?

My personal style has indeed changed quite a lot. I spent most of my early career working with Java-style designs, but I've since moved to Kotlin-based microservices with Spring Boot and structured concurrency. Kotlin's expressiveness and safety have genuinely improved code quality and developer productivity.
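To illustrate the structured-concurrency style (a sketch with invented fetchers, assuming the kotlinx.coroutines library), child coroutines are scoped so that a failure in one cancels its siblings and nothing leaks:

```kotlin
import kotlinx.coroutines.*

// Hypothetical stand-ins for two concurrent backend calls.
suspend fun fetchPrices(): List<Double> { delay(50); return listOf(101.2, 99.8) }
suspend fun fetchVolumes(): List<Long> { delay(50); return listOf(1_200L, 900L) }

suspend fun loadMarketData(): Pair<List<Double>, List<Long>> = coroutineScope {
    // Both calls run concurrently; if either throws, the scope
    // cancels the other and propagates the failure to the caller.
    val prices = async { fetchPrices() }
    val volumes = async { fetchVolumes() }
    prices.await() to volumes.await()
}

fun main() = runBlocking {
    val (prices, volumes) = loadMarketData()
    println("prices=$prices volumes=$volumes")
}
```

Because the scope owns its children, there is no way to forget a dangling task, which is exactly the kind of safety-by-construction that makes concurrent backend code easier to trust.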

Speaking of productivity, that has become a growing area of interest for me. I've spent time building internal starter projects and tooling that let teams move faster without re-implementing common patterns. More recently, I've adopted AI-powered development tools like Cursor and Copilot, and even created prompt rules to maintain code quality while getting the most out of these tools.

The cloud-native era changed the nature of system-building. Kubernetes, AWS, Docker: these toolkits made automation a first-class concept instead of an afterthought. I used to chase frameworks for the sake of frameworks, though. Now my focus is maintainability, observability, and good abstractions. Those principles aren't tied to any technology fad, and they build systems that teams can grow to trust over time.