Bengaluru · Petasense/open to what's next

I build the invisible.

The AI infrastructure, the reliability layer, the systems that keep the whole thing honest.

Akash Singh, full-stack engineer, 4 years building production AI and distributed systems. Currently at Petasense, where the data comes from physical machines and the stakes are real.


The Origin

How I got here.

I fell into distributed systems through the back door: debugging race conditions in a real-time contact center platform at 2 a.m. and trying to figure out why Kafka lag spiked during peak hours. That question led to three others, and I've been following the thread ever since.

Now I build full-stack: frontend product surface, backend services, and the AI layer connecting them. At Petasense the data comes from physical machines and the stakes are real: equipment failure. The lessons translate anywhere reliability matters.

I care about reliability not as a metric, but as a design philosophy. The systems I build are meant to outlast the sprint that created them: observable, recoverable, and honest about what they don't know.

“Build systems that outlive the sprint. Optimize for reliability, not novelty.”
personal principle
01

Production AI is 80% infrastructure, 20% model.

02

Observability isn't optional. It's how you earn the right to ship.

03

The senior engineer asks what happens when it doesn't work.

What I Believe

The principles.

Not a list of technologies.

I don't optimize for knowing the most tools. I optimize for knowing which tool, and why. Five principles shape how I work.

01

Event-driven over polling

02

Observe before you optimize

03

AI needs a foundation

04

Infrastructure as code, always

05

Build it once, read it forever

primary stack
React
Python
TypeScript
Kafka
Kubernetes
Docker
Postgres
Redis

The Journey

Work.

Across every role, the same question: how do we make this more reliable, observable, and intelligent?

Chapter I · Apr 2025 – Present
60%

faster engineer analysis

Petasense

Software Engineer · current

Bengaluru, India

  • Engineered a Spectrum Overlay feature for vibration analysis, enabling multi-signal comparison with real-time FFT rendering at 60fps using WebGL, cutting engineer analysis time by 60%.
  • Designed a high-availability Interactive Annotations System using microservices, Redis caching, and RBAC, delivering sub-second response times, >99% uptime, and full audit logging.
  • Built a GenAI-powered conversational analytics agent with LangChain, NLP-driven dynamic SQL generation, and multi-turn context memory, achieving 95% query accuracy.
Chapter II · Jun 2023 – Apr 2025
25%

CSAT improvement

Exotel

Software Development Engineer

Bengaluru, Karnataka

  • Architected the frontend for the AMC (Automated Contact Center) product using Next.js, Turborepo monorepo, and Tailwind CSS, achieving 30% faster load times.
  • Built an end-to-end speech intelligence service processing call recordings via OpenAI Whisper, delivering automated transcripts, summaries, and sentiment analysis.
  • Spearheaded GenAI integration into the contact center platform, delivering LLM-powered call summarization and intent detection, driving a 25% improvement in CSAT scores.
Chapter III · Jul 2022 – Jun 2023
66%

scheduling efficiency

Cogno AI (Exotel)

Software Development Engineer

Mumbai, Maharashtra

  • Architected a distributed scheduler microservice with event-driven task queuing and intelligent retry logic, increasing scheduling efficiency by 66% and saving 110 engineering hours/week.
  • Deployed a WebRTC-based video conferencing solution using Django, WebSockets, and Dyte SDK, reducing video drop-off issues by 30%.
Chapter IV · Apr 2022 – Jul 2022
22%

customer interaction increase

Cogno AI (Exotel)

SDE Intern

Mumbai, Maharashtra

  • Integrated Meta's Graph API to onboard Instagram and Facebook channels into the LiveChat Fusion backend, expanding platform reach and increasing customer interactions by 22%.

Selected Work

Things I built.

Built to understand something deeply, solve something real, or because the problem was too interesting to leave alone.

distributed · 2024

DistributedKV

Fault-Tolerant Distributed Key-Value Store

The Problem

I wanted to understand consensus at the protocol level, not just use it.

The Approach

Fault-tolerant distributed key-value store built from scratch in Go, implementing the Raft consensus algorithm for strong consistency. Features consistent hashing with virtual nodes, automatic leader election, read replicas, and a custom binary wire protocol for low-latency inter-node RPC.

Built a production-grade consensus system from first principles. Under 2ms p99 read latency under real load.

  1. Raft consensus guaranteeing strong consistency across nodes
  2. Consistent hashing with virtual nodes for even distribution
  3. < 2ms p99 read latency under load
Go · Raft Consensus · gRPC · Consistent Hashing · Docker
[Architecture diagram, system view: nodes L, N1, N2, N3, N4; Raft replication]
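
A minimal sketch of consistent hashing with virtual nodes, written in Python for brevity rather than the project's Go. It is not the project's code: the `HashRing` class, the SHA-256 hash, and the default of 100 virtual nodes per physical node are all illustrative assumptions.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical ring where each physical node owns many virtual nodes."""

    def __init__(self, vnodes: int = 100) -> None:
        self.vnodes = vnodes   # virtual nodes per physical node (assumed default)
        self._positions = []   # sorted ring positions
        self._owner = {}       # ring position -> physical node name

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:
                continue       # skip the (very unlikely) position collision
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._positions:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key, wrapping past the end.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]


ring = HashRing()
for name in ("n1", "n2", "n3", "n4"):
    ring.add_node(name)
owner = ring.get_node("sensor:42")  # one of n1..n4
```

Spreading each physical node over many ring positions evens out the key distribution, and removing a node only reassigns the keys that node owned.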

systems · 2025

Spectrum Overlay Engine

High-Performance Signal Processing & Visualization

The Problem

Vibration data from industrial machines is messy. I wanted to see what GPU-accelerated rendering could do with real FFT signals.

The Approach

High-performance signal processing tool for vibration analysis in predictive maintenance. Engineers overlay multiple frequency spectrums on a single canvas, apply real-time FFT filters, zoom into anomalies, and export comparative reports, rendered at 60fps with WebGL acceleration.

Reduced engineer analysis time by 60%, making the invisible signal visible at 60fps on consumer hardware.

  1. 60fps rendering via WebGL GPU acceleration
  2. 60% reduction in analysis time per engineer
  3. Real-time FFT computation and signal filtering
React · WebGL · Node.js · D3.js · Signal Processing
[Diagram, system view: AMP vs. FREQ plot showing a peak, signal A, and signal B]
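
A minimal sketch of the compute side of a spectrum overlay, in Python and standard library only. It is not the tool's implementation, which renders with WebGL: the function names, the radix-2 FFT, and the synthetic 50 Hz and 120 Hz test signals are assumptions chosen for illustration.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return list(samples)
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(signal, sample_rate):
    """Return (frequency_hz, amplitude) pairs for the positive-frequency half."""
    n = len(signal)
    bins = fft([complex(x) for x in signal])
    # 2/n scaling gives single-sided amplitude for every bin except DC.
    return [(k * sample_rate / n, 2 * abs(bins[k]) / n) for k in range(n // 2)]


def dominant_frequency(spectrum):
    """Frequency of the largest peak, ignoring the DC bin."""
    return max(spectrum[1:], key=lambda pair: pair[1])[0]


# Two synthetic signals sharing one frequency axis, as an overlay would.
RATE, N = 256.0, 256
signal_a = [math.sin(2 * math.pi * 50 * i / RATE) for i in range(N)]
signal_b = [0.5 * math.sin(2 * math.pi * 120 * i / RATE) for i in range(N)]
spectrum_a = magnitude_spectrum(signal_a, RATE)
spectrum_b = magnitude_spectrum(signal_b, RATE)
peak_a = dominant_frequency(spectrum_a)  # 50.0
peak_b = dominant_frequency(spectrum_b)  # 120.0
```

Because both spectra use the same bin spacing (`sample_rate / n`), their points line up on one frequency axis and can be drawn on a single canvas.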

More work

ai · 2025

GenAI Analytics Agent

Conversational AI for Complex Datasets

why → Most data tools assume SQL fluency. Most users don't have it.

AI-powered conversational agent enabling non-technical users to query industrial datasets in plain English. Combines NLP intent parsing, LangChain-driven dynamic SQL generation, Redis caching, and multi-turn context memory.

  • 95% query accuracy on production industrial data
  • 60% reduction in data analyst workflow time
  • Multi-turn conversation with persistent session memory
Python · LangChain · PostgreSQL · Redis · OpenAI · FastAPI

observability · 2025

CloudWatch · Infra Observability Dashboard

Full-Stack Monitoring & Alerting System

why → I kept building systems and then flying blind after deploying them.

End-to-end observability platform aggregating metrics, logs, and traces from distributed microservices into a unified real-time dashboard. Features Kafka-based ingestion, anomaly detection, and auto-discovery of service topology.

  • Sub-second metric ingestion via high-throughput Kafka pipeline
  • ML-based anomaly detection with configurable alerting
  • Auto-discovery of service topology and dependency maps
React · Node.js · Kafka · Elasticsearch · Prometheus · Docker · AWS

devops · 2024

Kube Deploy · GitOps CI/CD Platform

Automated Deployment Pipeline

why → Manual deployments were the most boring and error-prone part of the week.

Self-hosted GitOps-style deployment platform that watches GitHub repos, builds Docker images in parallel, runs tests in isolated containers, and deploys to Kubernetes with automatic rollback on health check failure.

  • Zero-downtime rolling deployments to Kubernetes
  • Automatic rollback triggered by health check failures
  • Parallel build pipeline with layer caching
Node.js · Docker · Kubernetes · GitHub API · PostgreSQL · Redis
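
A minimal sketch of rollback on health check failure, in Python rather than the platform's Node.js. It is not the platform's code: `deploy_with_rollback`, its parameters, and the injected `apply` and `is_healthy` callables are hypothetical stand-ins for whatever actually talks to the cluster.

```python
import time
from typing import Callable


def deploy_with_rollback(
    new_version: str,
    previous_version: str,
    apply: Callable[[str], None],        # hypothetical: rolls out a version
    is_healthy: Callable[[str], bool],   # hypothetical: probes the running version
    checks: int = 3,                     # assumed number of consecutive probes
    interval_s: float = 1.0,             # assumed delay between probes
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Roll out new_version; revert to previous_version on any failed probe.

    Returns the version that is live when the function exits.
    """
    apply(new_version)
    for _ in range(checks):
        if not is_healthy(new_version):
            apply(previous_version)  # automatic rollback
            return previous_version
        sleep(interval_s)
    return new_version


# Usage with in-memory fakes: the new version never becomes healthy.
applied = []
live = deploy_with_rollback(
    "v2", "v1",
    apply=applied.append,
    is_healthy=lambda version: False,
    sleep=lambda seconds: None,
)
```

Injecting the probe and the rollout step keeps the decision logic testable without a cluster; here `applied` ends up recording the rollout followed by the revert.
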
real-time · Jul 2024

Next Chat

Real-Time Communication Platform

why → Dealing with WebSocket scaling at work convinced me I needed to understand it deeply myself.

Production-grade real-time messaging platform built on a micro-frontend architecture with Turborepo. Implements WebSocket messaging, presence indicators, typing status, and horizontal scaling via Redis pub/sub.

  • Redis pub/sub enables horizontal scaling across nodes
  • 30% faster builds via Turborepo monorepo caching
  • Real-time presence, typing indicators & message reactions
Next.js · Tailwind CSS · WebSocket · NextAuth · Turborepo · Redis
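
A minimal sketch of why pub/sub lets WebSocket servers scale horizontally, in Python with an in-memory broker standing in for Redis. Nothing here is the project's code or the Redis client API: `InMemoryBroker`, `ChatNode`, and the list-based "inboxes" that stand in for open sockets are all illustrative assumptions.

```python
from collections import defaultdict


class InMemoryBroker:
    """Stand-in for a pub/sub service such as Redis (assumption, not its API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in list(self._subscribers[channel]):
            callback(message)


class ChatNode:
    """One server process holding only its own locally connected clients."""

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self._inboxes = defaultdict(list)  # room -> inboxes of local clients
        self._subscribed = set()

    def connect(self, room):
        """Register a local client; the returned list stands in for its socket."""
        inbox = []
        self._inboxes[room].append(inbox)
        if room not in self._subscribed:
            # One broker subscription per room per node, shared by local clients.
            self.broker.subscribe(room, lambda msg, r=room: self._deliver(r, msg))
            self._subscribed.add(room)
        return inbox

    def send(self, room, text):
        # Publish instead of writing to sockets directly, so every node hears it.
        self.broker.publish(room, text)

    def _deliver(self, room, message):
        for inbox in self._inboxes[room]:
            inbox.append(message)


broker = InMemoryBroker()
node_a, node_b = ChatNode("a", broker), ChatNode("b", broker)
alice = node_a.connect("general")
bob = node_b.connect("general")
node_a.send("general", "hi")  # reaches bob even though he is on node_b
```

Each node only needs to know its own connections; the broker carries messages between nodes, which is what allows adding nodes without sharing socket state.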

What's On My Mind

Currently thinking.

Problems I'm turning over. Raw curiosity, not polished positions. Good conversation starters.

actively exploring

Multi-Agent Orchestration

Not the hype. The hard part: how do agents share state without collisions when multiple tools are running in parallel?

reading into

Consensus Without Raft

Is Paxos actually simpler when you control the failure model? Exploring alternatives for single-datacenter deployments.

actively exploring

Performance at the Edge

Sub-millisecond latency when the data lives on the device, not in the cloud. IIoT is forcing me to rethink what 'fast' means.

reading into

LLM Failure Modes in Production

What happens when a 95%-accurate SQL agent is wrong? Designing graceful degradation for AI systems that touch real data.

Let's Talk

Let's build something real.

If you're working on something that needs a full-stack engineer who thinks about reliability, observability, and AI infrastructure, or you just want to think through a hard problem together, I'd like to hear about it.

I'm particularly drawn to problems at the intersection of real-time systems, AI pipelines, and the infrastructure that makes them trustworthy in production.

akashsingh095@gmail.com

I respond to interesting problems within 24 hours.

What I'm open to

Full-stack product roles where the backend quality and the frontend craft are both part of the job description.

AI infrastructure challenges: agent systems, RAG pipelines, and the reliability work that makes AI production-ready.

Platform and developer tooling, especially where distributed systems and observability are first-class concerns.

Less interested in roles where infrastructure quality is someone else's problem.